alamb commented on issue #746: URL: https://github.com/apache/arrow-rs/issues/746#issuecomment-916460502
As mentioned above, the challenge with log writing is that it often requires "schema evolution": new fields may appear in subsequent messages. Handling this takes some non-trivial engineering when working with formats and systems that want "relational" data, i.e. data in rows and columns where the columns and their types (the schema) are known in advance.

The following blog post from Uber gives some idea of what games they play to put such data into a relational format: https://eng.uber.com/logging/

The arrow crate also has code for "schema merging", which is the foundation of most schema evolution schemes: `Schema::try_merge` https://github.com/apache/arrow-rs/blob/master/arrow/src/datatypes/schema.rs#L120

You will still have to handle details such as what happens when new records appear with an attribute that conflicts with an existing column, for example one with the same name but a different type.
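To make the merging idea concrete, here is a minimal std-only sketch of what a schema merge has to do. It is a toy model and not the arrow crate's implementation: `ColumnType`, `ToySchema`, `merge_schemas` and `schema` are hypothetical names invented for this illustration, and the rule "same name with a different type is an error" is an assumption about one reasonable policy.

```rust
use std::collections::BTreeMap;

/// Toy column types; a hypothetical stand-in for a real type system.
#[derive(Debug, Clone, PartialEq)]
pub enum ColumnType {
    Int64,
    Float64,
    Utf8,
}

/// A toy schema mapping column name -> column type (not arrow's `Schema`).
pub type ToySchema = BTreeMap<String, ColumnType>;

/// Merge `incoming` into `current`.
/// - Columns only present in `incoming` are added.
/// - Columns present in both with the same type are kept as-is.
/// - Columns present in both with different types produce an error
///   (assumed policy; a real system might widen or cast instead).
pub fn merge_schemas(current: &ToySchema, incoming: &ToySchema) -> Result<ToySchema, String> {
    let mut merged = current.clone();
    for (name, ty) in incoming {
        match merged.get(name) {
            Some(existing) if existing != ty => {
                return Err(format!(
                    "column '{}' has conflicting types: {:?} vs {:?}",
                    name, existing, ty
                ));
            }
            Some(_) => {} // same name and same type: nothing to do
            None => {
                merged.insert(name.clone(), ty.clone());
            }
        }
    }
    Ok(merged)
}

/// Helper to build a toy schema from (name, type) pairs.
pub fn schema(cols: &[(&str, ColumnType)]) -> ToySchema {
    cols.iter().map(|(n, t)| (n.to_string(), t.clone())).collect()
}
```

For example, merging a schema with columns `ts: Int64, msg: Utf8` with a later one containing `ts: Int64, latency_ms: Float64` yields three columns, while a later record that reports `ts` as `Utf8` is rejected. That rejected case is exactly the detail a log writer has to decide how to handle.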
