neopaf commented on issue #1644: URL: https://github.com/apache/orc/issues/1644#issuecomment-1791363254
@guiyanakuang I came across this short description: https://orc.apache.org/specification/ORCv0/

It looks like I had a complete misconception in my head: I thought that if a row contains an array, the row would somehow have to be duplicated. Not at all! The on-disk data structure allows nested information to be stored directly, and the column index is just an arbitrarily assigned value (most probably left to right, depth first).

As you wrote: "write child column". One should not think of it as literally a column, but as a value, no matter how complex (struct, array, whatever). On reading, the magic runs in reverse: one gets the full nested structure, and it is unwound back to whatever level of detail we need. Russian dolls.

Thanks again for pushing me in the right direction!

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
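To illustrate the column numbering described in the comment, here is a minimal Python sketch that assigns ids to a nested type by a depth-first, left-to-right (pre-order) walk, which is how the ORC specification describes flattening the type tree. It does not use any ORC library: `TypeNode`, `assign_column_ids`, the example schema, and the `element` name for the array's child are all hypothetical names made up for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TypeNode:
    """A node in a nested type tree (hypothetical helper, not an ORC API)."""
    name: str                      # field name within its parent
    kind: str                      # e.g. "struct", "list", "int", "string"
    children: List["TypeNode"] = field(default_factory=list)


def assign_column_ids(root: TypeNode) -> List[Tuple[int, str, str]]:
    """Return (column_id, path, kind) for every node, numbered in pre-order."""
    out: List[Tuple[int, str, str]] = []

    def visit(node: TypeNode, path: str) -> None:
        # The parent gets the next free id before any of its children.
        out.append((len(out), path, node.kind))
        for child in node.children:
            visit(child, f"{path}.{child.name}")

    visit(root, root.name)
    return out


# Assumed example schema:
# struct<id:int, tags:array<string>, address:struct<city:string, zip:int>>
schema = TypeNode("root", "struct", [
    TypeNode("id", "int"),
    TypeNode("tags", "list", [TypeNode("element", "string")]),
    TypeNode("address", "struct", [
        TypeNode("city", "string"),
        TypeNode("zip", "int"),
    ]),
])

columns = assign_column_ids(schema)
for column_id, path, kind in columns:
    print(column_id, path, kind)
# 0 root struct
# 1 root.id int
# 2 root.tags list
# 3 root.tags.element string
# 4 root.address struct
# 5 root.address.city string
# 6 root.address.zip int
```

Note that the array gets its own id (2) and its element type gets another (3), so a row holding an array is still a single row: the nesting is carried by the child columns rather than by duplicating the parent.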
