neopaf commented on issue #1644:
URL: https://github.com/apache/orc/issues/1644#issuecomment-1791363254

   @guiyanakuang came across this quick description: 
https://orc.apache.org/specification/ORCv0/
   
   Looks like I had a total misconception in my head: that if we have an 
array, the rows should somehow be duplicated. Not at all!
   
   The on-disk data structure allows for storing nested information, and the 
column index is just an arbitrarily assigned (most probably left-to-right, 
depth-first) value. 
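
   To make the numbering concrete, here is a minimal sketch of a pre-order 
(parent first, then children left to right) walk over a nested schema. It 
does not use the ORC library; `TypeNode`, `assign_column_ids`, the example 
schema and the child name `element` are all hypothetical names made up for 
illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TypeNode:
    """Hypothetical stand-in for one node of a nested schema."""
    kind: str                      # e.g. "struct", "array", "int"
    name: str = ""                 # field name inside the parent, "" for the root
    children: List["TypeNode"] = field(default_factory=list)


def assign_column_ids(root: TypeNode) -> List[Tuple[int, str, str]]:
    """Number every node in pre-order: the parent first, then its children."""
    ids: List[Tuple[int, str, str]] = []

    def visit(node: TypeNode, path: str) -> None:
        # The next free id is simply the number of nodes seen so far.
        ids.append((len(ids), path, node.kind))
        for child in node.children:
            child_path = f"{path}.{child.name}" if path else child.name
            visit(child, child_path)

    visit(root, "")
    return ids


# Assumed example schema: struct<id:int, tags:array<string>, pos:struct<x:double, y:double>>
schema = TypeNode("struct", children=[
    TypeNode("int", "id"),
    TypeNode("array", "tags", [TypeNode("string", "element")]),
    TypeNode("struct", "pos", [TypeNode("double", "x"), TypeNode("double", "y")]),
])

for column_id, path, kind in assign_column_ids(schema):
    print(column_id, path or "<root>", kind)
```

   Note that the array and its element each get their own id, as do the 
nested struct and its fields, so a "column" here is any node of the type tree.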
   
   As you wrote: "write child column". One should not think of it as 
literally a column, but as a value, no matter how complex 
(struct/array/whatever).
   
   On reading, the magic goes in reverse: one gets the proper nested 
structure, and it is unwound back to whatever level of detail we need.
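
   A rough sketch of why no row duplication is needed for arrays: the list 
column can keep one length per row, while the child column keeps all elements 
in one flat sequence. This is plain Python, not the ORC writer or reader; 
`write_list_column` and `read_list_column` are hypothetical helper names, and 
null handling and encodings are left out.

```python
from typing import List, Tuple


def write_list_column(rows: List[List[str]]) -> Tuple[List[int], List[str]]:
    """Split rows of arrays into per-row lengths and a flat child value list."""
    lengths = [len(row) for row in rows]
    values = [value for row in rows for value in row]
    return lengths, values


def read_list_column(lengths: List[int], values: List[str]) -> List[List[str]]:
    """Reverse of the above: slice the flat values back into one array per row."""
    rows: List[List[str]] = []
    pos = 0
    for n in lengths:
        rows.append(values[pos:pos + n])
        pos += n
    return rows


rows = [["a", "b"], [], ["c"]]
lengths, values = write_list_column(rows)
print(lengths)   # one entry per row
print(values)    # all elements, stored once each
print(read_list_column(lengths, values) == rows)
```

   Three rows go in and three lengths come out; the elements are stored once 
each in the child, and reading slices them back into the original arrays.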
   
   Russian dolls.
   
   Thanks again for pushing me in the right direction!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
