> Now I know this class, i.e. *ApplyColumnarRulesAndInsertTransitions*, doesn't matter in the context of my question, since Gluten calls *RemoveTransitions* later anyway
Yes, Gluten calls RemoveTransitions right away when columnar query optimization starts, so the subsequent columnar rules see a cleaner query plan that is easier to optimize. After the columnar rules have run, all the needed transitions are added back in one go by `InsertTransitions`[1]. The C2R2C transition you mentioned is added by this rule as well.

Hongze

[1] https://github.com/apache/incubator-gluten/blob/eda660b572c78a8aaf5ea0f9d217e5d0ca6340c7/backends-velox/src/main/scala/org/apache/gluten/backendsapi/velox/VeloxRuleApi.scala#L106

On Tue, Jun 3, 2025 at 3:51 PM Abbas Gadhia <[email protected]> wrote:
>
> Hi Hongze,
>
> > Spark-to-Velox C2C is not supported yet
>
> Thanks for clarifying this. Makes sense now :)
>
> > I don't clearly get the issue here. Would you give an example?
>
> Apologies if I wasn't clear. I was referring to the Spark
> *ApplyColumnarRulesAndInsertTransitions* rule that adds a *ColumnarToRow*
> node by looking at the "supportsColumnar" boolean of the Shuffle node.
> Now I know this class, i.e. *ApplyColumnarRulesAndInsertTransitions*, doesn't
> matter in the context of my question, since Gluten calls
> *RemoveTransitions* later anyway
>
> Thanks much for the clarification again!
> Regds
> Abbas
>
> On Tue, Jun 3, 2025 at 7:14 PM Hongze Zhang <[email protected]> wrote:
>
> > Hi Abbas,
> >
> > > This seems a little redundant apparently?
> >
> > This is actually a C2R2C transition used to convert from vanilla
> > Spark's columnar format to Velox's. It's necessary because
> > Spark-to-Velox C2C is not supported yet.
> >
> > > I found out that ColumnarToRow is being added since the Shuffle does not
> > > output a columnar output, but I also saw that Gluten code removes that rule
> > > intermittently while adding transitions.
> >
> > I don't clearly get the issue here. Would you give an example?
> >
> > Best,
> > Hongze
> >
> > On Tue, Jun 3, 2025 at 11:52 AM Abbas Gadhia
> > <[email protected]> wrote:
> > >
> > > Hello,
> > > I have a plan that looks like this
> > >
> > > HashAggregateTransformer(keys=[country_code#0], functions=[sum(latest_trade_data#29L), avg(latest_industrial_data#28L)], isStreamingAgg=false, output=[country_code#0, sum(latest_trade_data)#95L, avg(latest_industrial_data)#96])
> > > +- AQEShuffleRead coalesced
> > > +- ShuffleQueryStage 0
> > > +- Exchange hashpartitioning(country_code#0, 5), ENSURE_REQUIREMENTS, [plan_id=668]
> > > +- VeloxColumnarToRow
> > > +- ^(1) FlushableHashAggregateTransformer(keys=[country_code#0], functions=[partial_sum(latest_trade_data#29L), partial_avg(latest_industrial_data#28L)], isStreamingAgg=false, output=[country_code#0, sum#107L, sum#108, count#109L])
> > > +- ^(1) ProjectExecTransformer [country_code#0, latest_industrial_data#28L, latest_trade_data#29L]
> > > +- ^(1) FilterExecTransformer (trim(short_name#1, None) = Low income)
> > > +- ^(1) InputIteratorTransformer[columns...]
> > > * +- RowToVeloxColumnar
> > > +- *(1) ColumnarToRow*
> > > +- BatchScan country_summary[columns...] Reading table [bigquery-public-data.world_bank_intl_debt.country_summary]
> > >
> > > I see 2 plan nodes together
> > > 1. ColumnarToRow
> > > 2. RowToVeloxColumnar
> > >
> > > This seems a little redundant apparently? Can someone help me why these
> > > transitions are being added? I found out that ColumnarToRow is being added
> > > since the Shuffle does not output a columnar output, but I also saw that
> > > Gluten code removes that rule intermittently while adding transitions.
> > >
> > > Any hints would help.
> > > Thanks
> > > Abbas
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
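
For anyone reading this thread later, the remove-then-reinsert behaviour described in the reply can be sketched with a toy plan model. This is only an illustration under stated assumptions: the `Node` class, the format constants, the `CONVERTERS` table and both functions are hypothetical names made up for this sketch, not Gluten's or Spark's actual classes or API. It only shows why a vanilla-columnar scan feeding a Velox operator ends up with a C2R followed by an R2C when no direct C2C converter is available.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Toy batch formats. Hypothetical labels, not Gluten/Spark types.
ROW, VANILLA, VELOX = "row", "vanilla-columnar", "velox-columnar"


@dataclass(frozen=True)
class Node:
    name: str
    output: str                      # format this node produces
    requires: Optional[str] = None   # format it needs from its child
    child: Optional["Node"] = None
    is_transition: bool = False


# Assumed conversion table for the sketch: there is deliberately no
# direct (VANILLA, VELOX) entry, mirroring "C2C is not supported yet".
CONVERTERS = {
    (VANILLA, ROW): "ColumnarToRow",
    (ROW, VELOX): "RowToVeloxColumnar",
    (VELOX, ROW): "VeloxColumnarToRow",
}


def remove_transitions(plan: Optional[Node]) -> Optional[Node]:
    """Drop every transition node so later rules see a clean plan."""
    if plan is None:
        return None
    child = remove_transitions(plan.child)
    return child if plan.is_transition else replace(plan, child=child)


def convert(child: Node, target: str) -> Node:
    """Wrap `child` so that it produces `target`, going through rows if needed."""
    if child.output == target:
        return child
    name = CONVERTERS.get((child.output, target))
    if name is not None:
        return Node(name, target, child.output, child, is_transition=True)
    if ROW in (child.output, target):
        raise ValueError(f"no converter from {child.output} to {target}")
    # No direct columnar-to-columnar converter: fall back to C2R then R2C.
    return convert(convert(child, ROW), target)


def insert_transitions(plan: Node) -> Node:
    """Add all needed transitions back in one bottom-up pass."""
    if plan.child is None:
        return plan
    child = insert_transitions(plan.child)
    return replace(plan, child=convert(child, plan.requires))


def names(plan: Optional[Node]) -> list:
    """Node names from the top of the plan down to the leaf."""
    out = []
    while plan is not None:
        out.append(plan.name)
        plan = plan.child
    return out


# A three-node plan loosely modelled on the one in this thread.
plan = Node("Exchange", ROW, ROW,
            Node("FilterExecTransformer", VELOX, VELOX,
                 Node("BatchScan", VANILLA)))

with_transitions = insert_transitions(plan)
print(names(with_transitions))
# ['Exchange', 'VeloxColumnarToRow', 'FilterExecTransformer',
#  'RowToVeloxColumnar', 'ColumnarToRow', 'BatchScan']
```

Running `remove_transitions(with_transitions)` gives back the original three-node plan, which is the "cleaner query plan" the columnar rules get to work on before the transitions are inserted again.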
