Hello,
I have a plan that looks like this:
HashAggregateTransformer(keys=[country_code#0], functions=[sum(latest_trade_data#29L), avg(latest_industrial_data#28L)], isStreamingAgg=false, output=[country_code#0, sum(latest_trade_data)#95L, avg(latest_industrial_data)#96])
+- AQEShuffleRead coalesced
   +- ShuffleQueryStage 0
      +- Exchange hashpartitioning(country_code#0, 5), ENSURE_REQUIREMENTS, [plan_id=668]
         +- VeloxColumnarToRow
            +- ^(1) FlushableHashAggregateTransformer(keys=[country_code#0], functions=[partial_sum(latest_trade_data#29L), partial_avg(latest_industrial_data#28L)], isStreamingAgg=false, output=[country_code#0, sum#107L, sum#108, count#109L])
               +- ^(1) ProjectExecTransformer [country_code#0, latest_industrial_data#28L, latest_trade_data#29L]
                  +- ^(1) FilterExecTransformer (trim(short_name#1, None) = Low income)
                     +- ^(1) InputIteratorTransformer[columns...]
                        +- RowToVeloxColumnar
                           +- *(1) ColumnarToRow
                              +- BatchScan country_summary[columns...] Reading table [bigquery-public-data.world_bank_intl_debt.country_summary]
I see two transition nodes next to each other in the plan:
1. ColumnarToRow
2. RowToVeloxColumnar
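To make the pair easy to spot, here is a minimal Python sketch that scans
the text of a plan for a RowToVeloxColumnar whose direct child is a
ColumnarToRow. The helper names are hypothetical and not part of Spark or
Gluten, and the sketch assumes a linear plan (every node has exactly one
child), as in this case.

```python
import re


def node_names(plan_text):
    """Return the operator name of each '+-' line, top to bottom.

    Hypothetical helper: skips an optional stage marker such as
    '^(1)' or '*(1)' before the operator name.
    """
    names = []
    for line in plan_text.splitlines():
        match = re.search(r"\+-\s*(?:[\^*]\(\d+\)\s*)?(\w+)", line)
        if match:
            names.append(match.group(1))
    return names


def adjacent_transitions(plan_text,
                         parent="RowToVeloxColumnar",
                         child="ColumnarToRow"):
    """Return (parent, child) pairs that appear on consecutive plan lines.

    Assumes a linear plan, so consecutive lines are parent and child.
    """
    names = node_names(plan_text)
    return [(a, b) for a, b in zip(names, names[1:])
            if a == parent and b == child]
```

Passing the plan text to adjacent_transitions() returns one tuple per
back-to-back pair. VeloxColumnarToRow is not matched, because node names
are compared exactly.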
This seems redundant. Can someone help me understand why these
transitions are being added? I found that ColumnarToRow is being added
because the Shuffle does not output columnar data, but I also saw that
the Gluten code removes that rule intermittently while adding transitions.
Any hints would help.
Thanks
Abbas