I think "value" is the problem word. I'm not sure whether it fits the groupings or the cells better in the case of repeated types. What does Parquet use?
I'd also like to see this proposal in the context of a larger proposed design spec for that jira.

On Feb 26, 2015 5:52 PM, "Hanifi Gunes" <[email protected]> wrote:

> Hey everyone,
>
> Scalar ValueVector(VV) types implement getValueCount method, which returns
> the number of "value"s stored in the vector. I would expect the same be
> true for RepeatedVVs as well. However, getValueCount on repeated types
> report number of inner/sub-values stored and introduces another method
> called groupCount to report actual number of "value"s stored.
>
> This becomes really confusing and somewhat inconsistent (especially for
> RepeatedList) as one would expect #getValueCount should report the number
> of values regardless if the stored value type is nested or flat.
>
> As part of DRILL-2150, I am refactoring VVs so that getValueCount
> universally returns the number of values stored. Alongside, I plan to
> introduce a new method getCellCount that reports total number of
> sub-values/cells stored in a repeated vector.
>
> I'd like to probe if anyone has any concerns relating to this. Please let
> me know.
>
>
> Thanks.
> -Hanifi
>
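
To make sure we are all reading the proposed semantics the same way, here is a minimal sketch of how I understand them. This is not Drill's actual code: the class name RepeatedIntVectorSketch, its constructor, and the offsets/cells layout are assumptions made up for illustration. Only the method names getValueCount and getCellCount come from the proposal above.

```java
// Hypothetical sketch only; not Drill's actual ValueVector classes.
// It illustrates the proposed meaning of the two counts on a repeated vector.
final class RepeatedIntVectorSketch {
    // Assumed layout: offsets[i] .. offsets[i + 1] delimit the cells of value i.
    private final int[] offsets;
    private final int[] cells;

    RepeatedIntVectorSketch(int[][] values) {
        offsets = new int[values.length + 1];
        for (int i = 0; i < values.length; i++) {
            offsets[i + 1] = offsets[i] + values[i].length;
        }
        cells = new int[offsets[values.length]];
        for (int i = 0; i < values.length; i++) {
            // Copy each (possibly empty) value into the flat cell buffer.
            System.arraycopy(values[i], 0, cells, offsets[i], values[i].length);
        }
    }

    // Proposed: number of values stored, whether the value type is nested or flat.
    int getValueCount() {
        return offsets.length - 1;
    }

    // Proposed: total number of sub-values/cells across all values.
    int getCellCount() {
        return cells.length;
    }
}
```

Under this reading, a vector holding [1, 2, 3], [] and [4, 5] has a value count of 3 and a cell count of 5. If I understand the description correctly, today getValueCount would report the 5 and groupCount the 3.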
