This is an automated email from the ASF dual-hosted git repository.

alamb pushed a commit to branch site/tpch_data_generator
in repository https://gitbox.apache.org/repos/asf/datafusion-site.git

commit 3dbdc0a13ee482c90ad8246cf7a76d175fd7fc4f
Author: Andrew Lamb <[email protected]>
AuthorDate: Fri Apr 4 15:57:34 2025 -0400

    images + updates
---
 content/blog/2025-04-10-fastest-tpch-generator.md  |  29 ++++++++++++++++-----
 .../images/fastest-tpch-generator/lamb-theory.png  | Bin 0 -> 300479 bytes
 .../fastest-tpch-generator/parquet-performance.png | Bin 0 -> 61946 bytes
 .../fastest-tpch-generator/tbl-performance.png     | Bin 0 -> 49477 bytes
 4 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/content/blog/2025-04-10-fastest-tpch-generator.md b/content/blog/2025-04-10-fastest-tpch-generator.md
index ddb3f22..7244528 100644
--- a/content/blog/2025-04-10-fastest-tpch-generator.md
+++ b/content/blog/2025-04-10-fastest-tpch-generator.md
@@ -34,18 +34,33 @@ th, td {
 }
 </style>
 
-We used Rust and open source development to build [tpchgen-rs](https://github.com/alamb/tpchgen-rs), a fully open TPCH data generator over 10x faster than any other such generator we know of.
+We used Rust and open source development to build [tpchgen-rs], a fully open
+TPCH data generator over 10x faster than any other implementation  we know of.
 
-Authors:
-* [Andrew Lamb](https://www.linkedin.com/in/andrewalamb/) ([@alamb](https://github.com/alamb)) is a Staff Engineer at [InfluxData](https://www.influxdata.com/) and an [Apache DataFusion](https://datafusion.apache.org/) and Apache Arrow PMC member.
-* Achraf B ([@clflushopt](https://github.com/clflushopt)) is a Software Engineer at [Optable](https://optable.co/) where he works on data infrastructure.
-* [Sean Smith](https://www.linkedin.com/in/scsmithr/) ([@scsmithr](https://github.com/scsmithr)) is the founder of [GlareDB](https://glaredb.com/) focused on building a fast analytics database.
+
+About the Authors:
+- [Andrew Lamb] ([@alamb]) is a Staff Engineer at [InfluxData]) and a PMC member of [Apache DataFusion] and [Apache Arrow].
+- Achraf B ([@clflushopt]) is a Software Engineer at [Optable] where he works on data infrastructure.
+- [Sean Smith] ([@scsmithr]) is the founder of  focused on building a fast analytics database.
 
 It is now possible to create the TPCH SF=100 dataset in 72.23 seconds (1.4 GB/s
 😎) on a Macbook Air M3 with 16GB of memory, compared to the classic `dbgen`
 which takes 30 minutes[^1] (0.05GB/sec). On the same machine, it takes less than
-2 minutes to create all 3.6 GB of SF=100 in [Apache
-Parquet](https://parquet.apache.org/) format.
+2 minutes to create all 3.6 GB of SF=100 in [Apache Parquet] format.
+
+[tpchgen-rs]: https://github.com/alamb/tpchgen-rs
+
+[Andrew Lamb]: https://www.linkedin.com/in/andrewalamb/
+[@alamb]: https://github.com/alamb
+[InfluxData]: https://www.influxdata.com/
+[Apache DataFusion]: https://datafusion.apache.org/
+[Apache Arrow]: https://arrow.apache.org/
+[@clflushopt]: https://github.com/clflushopt
+[Optable]: https://optable.co/
+[Sean Smith]: https://www.linkedin.com/in/scsmithr/
+[@scsmithr]: https://github.com/scsmithr
+[GlareDB]: https://glaredb.com/
+[Apache Parquet]: https://parquet.apache.org/
 
 Finally, it is convenient and efficient to run TPCH queries locally when testing
 analytical engines such as DataFusion.
diff --git a/content/images/fastest-tpch-generator/lamb-theory.png b/content/images/fastest-tpch-generator/lamb-theory.png
new file mode 100644
index 0000000..2551ffa
Binary files /dev/null and b/content/images/fastest-tpch-generator/lamb-theory.png differ
diff --git a/content/images/fastest-tpch-generator/parquet-performance.png b/content/images/fastest-tpch-generator/parquet-performance.png
new file mode 100644
index 0000000..462c995
Binary files /dev/null and b/content/images/fastest-tpch-generator/parquet-performance.png differ
diff --git a/content/images/fastest-tpch-generator/tbl-performance.png b/content/images/fastest-tpch-generator/tbl-performance.png
new file mode 100644
index 0000000..2e64f11
Binary files /dev/null and b/content/images/fastest-tpch-generator/tbl-performance.png differ

