Alexx-G opened a new issue #746: URL: https://github.com/apache/arrow-rs/issues/746
**Which part is this question about**

Library API and documentation

**Describe your question**

I'd like to extend a log forwarder by adding support for writing log events in `parquet` format. The overall flow is rather simple: collect and process log events, batch and buffer the processed events, then write the batches as parquet files somewhere (e.g. AWS S3).

There are a few problems that I'm not sure how to solve:

- the same stream of log events may not have a consistent shape (e.g. some log events are missing a field), so a "schema-less" approach is highly desirable;
- some structured logs may have nested data structures (e.g. a request event with headers, which is a dictionary);
- what is the best approach to buffering log events and serializing a batch to a parquet file?

I have little to moderate Rust experience and no parquet experience, so it's a bit challenging to get a clear picture of the way forward. I'd highly appreciate some pointers or examples on how to approach this problem.

**Additional context**

I'm collecting info on how the parquet format can be integrated into Vector (a log forwarder written in Rust): https://github.com/timberio/vector/issues/1374

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
