Hi Wes,
Thanks for the feedback.
I actually share your reservations regarding performance. I just think that
the Arrow structure seems ideal for working with tabular data (especially
for efficient filtering and selection), and that a final step would
(I think) often involve traversing the remaining data in a row-oriented
fashion. You probably have a good overview of the ecosystem around
Arrow: aren't there any SQL engines etc. using Arrow that have probably
already put some thought into this? Or was your answer really
limited to the specific Java case, and such a concept does exist somewhere
else, e.g. in the C++ library?
I'll certainly put some thought into this, and if I come up with a
sensible solution, I'd be happy to contribute it.
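To make the access pattern concrete, here is a rough sketch in plain Java of
what is meant by "filter column-wise first, then walk the remaining rows". It
deliberately does not use the Arrow Java API: ColumnBatch, RowCursor and all
of their methods are hypothetical names made up for illustration, and the
columns are ordinary Java arrays standing in for Arrow vectors. The cursor is
a reusable view over the columns, so no object is allocated per row.

```java
// Hypothetical sketch: none of these types exist in the Arrow Java library.
// Plain arrays stand in for Arrow column vectors.
final class ColumnBatch {
    final int[] ids;
    final String[] names;
    final double[] scores;
    final int rowCount;

    ColumnBatch(int[] ids, String[] names, double[] scores) {
        if (ids.length != names.length || ids.length != scores.length) {
            throw new IllegalArgumentException("columns must have equal length");
        }
        this.ids = ids;
        this.names = names;
        this.scores = scores;
        this.rowCount = ids.length;
    }

    // Column-oriented step: scan only the score column and return the
    // indices of the rows that pass the filter.
    int[] selectScoreAtLeast(double threshold) {
        int count = 0;
        for (int i = 0; i < rowCount; i++) {
            if (scores[i] >= threshold) count++;
        }
        int[] selection = new int[count];
        int out = 0;
        for (int i = 0; i < rowCount; i++) {
            if (scores[i] >= threshold) selection[out++] = i;
        }
        return selection;
    }

    // Row-oriented step: a cursor over the selected rows only.
    RowCursor rows(int[] selection) {
        return new RowCursor(this, selection);
    }
}

// A reusable row view: it holds a position, not a copy of the row.
final class RowCursor {
    private final ColumnBatch batch;
    private final int[] selection;
    private int pos = -1;

    RowCursor(ColumnBatch batch, int[] selection) {
        this.batch = batch;
        this.selection = selection;
    }

    // Advances to the next selected row; returns false when exhausted.
    boolean next() {
        if (pos + 1 >= selection.length) return false;
        pos++;
        return true;
    }

    int id() { return batch.ids[selection[pos]]; }
    String name() { return batch.names[selection[pos]]; }
    double score() { return batch.scores[selection[pos]]; }
}

final class RowCursorDemo {
    // Filters by score, then renders the surviving rows as "id:name;".
    static String render(ColumnBatch batch, double threshold) {
        StringBuilder sb = new StringBuilder();
        RowCursor row = batch.rows(batch.selectScoreAtLeast(threshold));
        while (row.next()) {
            sb.append(row.id()).append(':').append(row.name()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ColumnBatch batch = new ColumnBatch(
            new int[] {1, 2, 3},
            new String[] {"a", "b", "c"},
            new double[] {0.5, 2.0, 3.5});
        System.out.println(render(batch, 1.0));
    }
}
```

Whether a cursor like this performs acceptably on top of real Arrow vectors is
exactly the open question raised above; the sketch only shows the shape of
the interface, not a measured design.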
Kind regards,
Simon
BTW: I've watched quite a few of your talks (on YouTube) and read some of your
articles while looking into Arrow and its surrounding ecosystem,
so: thanks for all you have done and invested for Arrow in
particular and for the open source community in general! I (as probably
many others) very much appreciate that!
On 30 August 2019 at 19:27:31, Wes McKinney <[email protected]> wrote:
Hi Simon -- I don't think there is any such Row accessor class in Java,
but you are welcome to contribute one to the project. For
performance-sensitive applications, using a record interface might not be the
best idea, but I can understand the convenience for some use cases.
- Wes
On Fri, Aug 30, 2019 at 4:55 AM Simon Dumke <[email protected]> wrote:
Hi all,
I did not find anything (and so no definite answer) in the docs, so I
thought I'd ask here:
Does Arrow (and at this point my main concern is Arrow for Java) support
any kind of concept that allows "record-level access" (so, a "row") to
data in an Arrow RecordBatch or Table? I would have thought that even in
column-oriented analytics etc. this would be a common last-step access
pattern across many use cases, but I could not find any references to such a
thing.
Thanks and kind regards,
Simon