Re: [PR] Add iceburst to powered by list [arrow-site]

via GitHub Sat, 10 Feb 2024 22:07:18 -0800


kou commented on code in PR #474:
URL: https://github.com/apache/arrow-site/pull/474#discussion_r1485483982



##########
powered_by.md:
##########
@@ -129,6 +129,10 @@ short description of your use case.
   natural language processing, and tabular tasks. Dataset objects are wrappers around
   Arrow Tables and memory-mapped from disk to support out-of-core parallel processing
   for machine learning workflows.
+* **[iceburst][53]:** A real-time data lake for monitoring and security built
+  directly on top of Amazon S3. Our approach is simple: ingest the OpenTelemetry data in an S3 bucket as
+  Parquet files in Iceberg table format and query them using DuckDB with milliseond retrieval and zero egress cost.
+  Parquet is converted to Arrow format in-memory enhancing both speed and efficiency.

Review Comment:
   Is this conversion done by DuckDB or by iceburst? If you mean that DuckDB does it, that may be incorrect. I think that DuckDB doesn't use Apache Arrow as its internal data format.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
