kou commented on code in PR #474:
URL: https://github.com/apache/arrow-site/pull/474#discussion_r1485483982

##########
powered_by.md:
##########
@@ -129,6 +129,10 @@ short description of your use case.
   natural language processing, and tabular tasks. Dataset objects are wrappers around Arrow Tables and memory-mapped from disk to support out-of-core parallel processing for machine learning workflows.
+* **[iceburst][53]:** A real-time data lake for monitoring and security built
+  directly on top of Amazon S3. Our approach is simple: ingest the OpenTelemetry data in an S3 bucket as
+  Parquet files in Iceberg table format and query them using DuckDB with milliseond retrieval and zero egress cost.
+  Parquet is converted to Arrow format in-memory enhancing both speed and efficiency.

Review Comment:
   Is this done by DuckDB or iceburst?
   If you mean that DuckDB does it, it may be wrong. I think that DuckDB doesn't use Apache Arrow as its internal data format.
