This is an automated email from the ASF dual-hosted git repository.

alamb pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/datafusion-site.git


The following commit(s) were added to refs/heads/main by this push:
     new 12d5ee2  Site/gene.bordegaray/2025/12/consecutive repartitions blog post title (#129)
12d5ee2 is described below

commit 12d5ee2e39b8113643f07ddc48f8dedc4c12e2d8
Author: Gene Bordegaray <[email protected]>
AuthorDate: Thu Dec 18 04:02:48 2025 -0800

    Site/gene.bordegaray/2025/12/consecutive repartitions blog post title (#129)
    
    * initial blog post
    
    * better images and formatting
    
    * realigned some images
    
    * added links for Nga and Andrew's github
    
    * added links for Nga and Andrew's github
    
    * fixed to DataFusion and some word selection
    
    * reformatted some images for clarity and minor changes to punctuation
    
    * Update file name to match publish date
    
    * updated images
    
    * fix title
    
    ---------
    
    Co-authored-by: Andrew Lamb <[email protected]>
---
 content/blog/2025-12-15-avoid-consecutive-repartitions.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/content/blog/2025-12-15-avoid-consecutive-repartitions.md b/content/blog/2025-12-15-avoid-consecutive-repartitions.md
index a236977..c3d3c4d 100644
--- a/content/blog/2025-12-15-avoid-consecutive-repartitions.md
+++ b/content/blog/2025-12-15-avoid-consecutive-repartitions.md
@@ -1,6 +1,6 @@
 ---
 layout: post
-title: Optimizing Repartitions in DataFusion: How I Went From Database Nood to Core Contribution
+title: Optimizing Repartitions in DataFusion: How I Went From Database Noob to Core Contribution
 date: 2025-12-15
 author: Gene Bordegaray
 categories: [tutorial]
@@ -198,7 +198,7 @@ SELECT a, SUM(b) FROM data.parquet GROUP BY a;
 
 Repartitions would appear back-to-back in query plans, specifically a round-robin followed by a hash repartition.
 
-Why is this such a big deal? Well, repartitions do not process the data; their purpose is to redistribute it in ways that enable more efficient computation for other operators. Having consecutive repartitions is counterintuitive because we are redistributing data, then immediately redistributing it again, making the first repartition pointless. While this didn't create extreme overhead for queries, since round-robin repartitioning does not copy data, just the pointers to batches, the beh [...]
+Why is this such a big deal? Well, repartitions do not process the data; their purpose is to redistribute it in ways that enable more efficient computation for other operators. Having consecutive repartitions is counterintuitive because we are redistributing data, then immediately redistributing it again, making the first repartition pointless. While this didn't create extreme overhead for queries, since round-robin repartitioning does not copy data, just the pointers to batches, the beh [...]
 
 <div class="text-center">
 <img

