[
https://issues.apache.org/jira/browse/PARQUET-299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14588122#comment-14588122
]
Dong Chen commented on PARQUET-299:
-----------------------------------
Thanks [~nezihyigitbasi], I will try to handle this on the Hive side.
If the number of rows in a vector depends on the data pages and is not
constant, I have a question about the array {{values}} in {{ColumnVector}}.
For example, the {{int[] values}} in {{IntColumnVector}} is initialized with a
default size of 1K. I guess this array is designed to store the decoded values
when decoding eagerly. Since the actual number of rows is always a little
bigger, how will we handle this? Resize it to {{ColumnVector.numValues}}, or
are there other thoughts?
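To make the resize option concrete, here is a minimal sketch of what growing the array on demand could look like. It is only an illustration under assumptions: the class {{IntColumnVectorSketch}}, the {{DEFAULT_SIZE}} constant, and the {{ensureCapacity}} and {{append}} methods are hypothetical names, not the actual code in the vector branch. Only the {{values}} and {{numValues}} fields and the 1K default come from the discussion above.

```java
import java.util.Arrays;

// Hypothetical stand-in for IntColumnVector; not the real parquet-mr class.
class IntColumnVectorSketch {
    // Assumed default capacity, matching the 1K size mentioned above.
    static final int DEFAULT_SIZE = 1024;

    int[] values = new int[DEFAULT_SIZE];
    int numValues = 0;

    // Grow the backing array only when the requested row count exceeds it.
    // Doubling keeps repeated small overflows from copying on every call.
    void ensureCapacity(int required) {
        if (required > values.length) {
            int newSize = Math.max(required, values.length * 2);
            values = Arrays.copyOf(values, newSize);
        }
    }

    // Store one decoded value, resizing first if the array is full.
    void append(int value) {
        ensureCapacity(numValues + 1);
        values[numValues++] = value;
    }
}
```

With this approach the array is never shrunk, so a vector that is reused across batches would pay the copy cost only the first time a larger batch arrives. Resizing to exactly {{ColumnVector.numValues}} would instead be a single {{ensureCapacity(numValues)}} call before decoding.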
> [Vectorized Reader] ColumnVector length should be in terms of rows, not
> DataPages
> ---------------------------------------------------------------------------------
>
> Key: PARQUET-299
> URL: https://issues.apache.org/jira/browse/PARQUET-299
> Project: Parquet
> Issue Type: Sub-task
> Components: parquet-mr
> Reporter: Zhenxiao Luo
>
> In https://github.com/zhenxiao/incubator-parquet-mr/tree/vector
> ColumnVector length is in terms of DataPages; it needs to be in terms of rows.