[GitHub] [arrow] vertexclique commented on pull request #8430: ARROW-10249: [Rust] Support nested dictionaries inside list arrays

GitBox Mon, 12 Oct 2020 13:57:27 -0700


vertexclique commented on pull request #8430:
URL: https://github.com/apache/arrow/pull/8430#issuecomment-707341797



   @nevi-me 
   
   > I didn't do a detailed review, but I'm happy with the changes. It's been a 
while since I looked at the JSON reader, how hard/easy do you think it would be 
for us to support the outstanding work on 
https://issues.apache.org/jira/browse/ARROW-4534?
   
   Thanks! Especially for the nested reading part 
(https://issues.apache.org/jira/browse/ARROW-4544), it would be nice to reuse 
builders at entry. A recursive reader with a `recursion_limit` set should be 
good enough. If we go with the iterative approach instead, we would have to 
generate a macro that expands at compile time with the depth embedded in it. 
That might slow down compile times and create larger binaries.
   
   The good part of the recursive approach is that it is limited only by the 
stack size (though there might be a growable-stack implementation), which the 
user can increase by hand. The bad part is that the recursion limit we define 
must be reached before the default stack size is exhausted, so we need to find 
a sweet spot for it.
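   
   A minimal sketch of what a depth-limited recursive walk could look like. 
The `Value` enum, `RecursionLimitExceeded`, and `count_leaves` are hypothetical 
names made up for illustration; they are not the Arrow JSON reader's types, and 
only the `recursion_limit` idea comes from the discussion above.

```rust
/// Hypothetical stand-in for a parsed, possibly nested JSON value.
/// This is NOT the Arrow reader's representation.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null,
    Number(f64),
    List(Vec<Value>),
}

/// Error returned when nesting goes deeper than the configured limit.
#[derive(Debug, PartialEq)]
struct RecursionLimitExceeded {
    limit: usize,
}

/// Recursively counts non-list values, refusing to descend past
/// `recursion_limit` so that deep input fails with an error instead of
/// overflowing the stack.
fn count_leaves(
    value: &Value,
    depth: usize,
    recursion_limit: usize,
) -> Result<usize, RecursionLimitExceeded> {
    if depth > recursion_limit {
        return Err(RecursionLimitExceeded { limit: recursion_limit });
    }
    match value {
        Value::List(items) => {
            let mut total = 0;
            for item in items {
                // Each nested list costs one more level of depth.
                total += count_leaves(item, depth + 1, recursion_limit)?;
            }
            Ok(total)
        }
        // Null and Number are leaves.
        _ => Ok(1),
    }
}
```

   The limit is checked before descending, so input that nests too deeply 
produces an `Err` rather than a stack overflow, provided the limit is small 
enough for the available stack.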
   
   About the other ticket that is still open 
(https://issues.apache.org/jira/browse/ARROW-4803): type inference for the 
schema might be hard at first. Even so, we can do assumption-based parsing by 
reading the first record's data (or the first +2) to infer the type. When the 
type is given, though, we can try to parse everything in ISO format.
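   
   A rough sketch of assumption-based inference from the first few records. 
`InferredType`, `infer_scalar`, `widen`, and `infer_from_samples` are 
hypothetical names, and the input is simplified to raw string fields rather 
than real JSON values; the widening rules are an assumption, not what the 
reader does today.

```rust
/// Hypothetical set of candidate types for one column.
#[derive(Debug, Clone, Copy, PartialEq)]
enum InferredType {
    Boolean,
    Int64,
    Float64,
    Utf8,
}

/// Guesses the type of a single raw field, trying the narrowest type first.
/// Note that Rust's `f64` parser also accepts strings such as "inf" and "NaN".
fn infer_scalar(raw: &str) -> InferredType {
    if raw == "true" || raw == "false" {
        InferredType::Boolean
    } else if raw.parse::<i64>().is_ok() {
        InferredType::Int64
    } else if raw.parse::<f64>().is_ok() {
        InferredType::Float64
    } else {
        InferredType::Utf8
    }
}

/// Combines two guesses: equal types stay, ints widen to floats,
/// and anything else falls back to a string column.
fn widen(a: InferredType, b: InferredType) -> InferredType {
    match (a, b) {
        (x, y) if x == y => x,
        (InferredType::Int64, InferredType::Float64)
        | (InferredType::Float64, InferredType::Int64) => InferredType::Float64,
        _ => InferredType::Utf8,
    }
}

/// Looks at no more than `max_records` leading samples and returns the
/// widened guess, or `None` when there is nothing to inspect.
fn infer_from_samples(samples: &[&str], max_records: usize) -> Option<InferredType> {
    samples
        .iter()
        .take(max_records)
        .map(|s| infer_scalar(s))
        .reduce(widen)
}
```

   Because only the leading records are inspected, a later record can still 
contradict the guess; that is the assumption being made.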
   
   > I also have the feeling that the reader might be slower than other 
readers. What has been your experience @vertexclique?
   
   I didn't test the performance; I needed the reader because I was using it 
in tests. Maybe we can create a benchmark for r/w? wdyt?


