abellgithub commented on issue #530:
URL: https://github.com/apache/parquet-format/issues/530#issuecomment-3446778855

   I think the largest problem with the Thrift encoding of metadata is that 
you can't find anything -- you have to read all the data before locating the 
thing you want. It seems unfortunate to re-encode all the metadata in another 
format if you could find a way to provide offsets to things people need to 
find. It wouldn't need to provide 100% direct access, just enough to let people 
locate things without decoding much that they don't want. This could be done as 
some binary blob, which is essentially what you're proposing with flatbuf, but 
you could do something simpler than encoding all (or most) of the data 
with flatbuf.
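
   As a rough illustration of the offset idea, here is a minimal sketch. It is 
not a proposed format: the index layout (a little-endian count followed by 
offset/length pairs) and the names `build_index` and `lookup` are made up for 
this example, and the byte strings merely stand in for Thrift-encoded pieces 
of metadata.

```python
import struct


def build_index(records):
    """Concatenate opaque records and build a small offset table for them.

    Hypothetical layout: uint32 count, then one (uint32 offset, uint32 length)
    pair per record, all little-endian.
    """
    index = struct.pack("<I", len(records))
    pos = 0
    for rec in records:
        index += struct.pack("<II", pos, len(rec))
        pos += len(rec)
    return b"".join(records), index


def lookup(blob, index, i):
    """Return record i by reading only its index entry, not earlier records."""
    (count,) = struct.unpack_from("<I", index, 0)
    if not 0 <= i < count:
        raise IndexError(i)
    off, length = struct.unpack_from("<II", index, 4 + 8 * i)
    return blob[off:off + length]


# Stand-ins for encoded metadata of three row groups (not real Thrift bytes).
records = [b"row-group-0-metadata", b"rg1", b"row-group-2-meta"]
blob, index = build_index(records)
second = lookup(blob, index, 1)
```

   The existing encoding stays untouched here; the index is only a side table 
of offsets, so a reader that ignores it loses nothing.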
   
   Also, having two sets of metadata in one file can lead to inconsistencies -- 
which data do you believe if things don't match?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

