Thanks for your helpful response! It seems that disabling flattening will affect at least some of the optimizer rules, so it might not be a minor change.

-Rui

On Wed, Sep 5, 2018 at 4:54 AM Stamatis Zampetakis <[email protected]> wrote:

> Hi Rui,
>
> Disabling flattening in some cases seems reasonable.
>
> If I am not mistaken, even in the existing code it is not used all the
> time, so it makes sense for it to become configurable.
> For example, Calcite prepared statements (CalcitePrepareImpl) use the
> flattener only for DDL operations that create materialized views (and this
> is because this code at some point passes through the PlannerImpl).
> On the other hand, any query that uses the Planner will also pass through
> the flattener.
>
> Disabling the flattener does not mean that all rules will work without
> problems. The Javadoc of the RelStructuredTypeFlattener at some point says
> "This approach has the benefit that real optimizer and codegen rules never
> have to deal with structured types." Because of this, it is very likely
> that some rules were written on the assumption that there are no
> structured types.
>
> Best,
> Stamatis
>
>
> On Wed, Sep 5, 2018 at 9:48 AM, Julian Hyde <[email protected]> wrote:
>
> > Flattening was introduced mainly because the original engine used flat
> > column-oriented storage. Now we have several ways of executing,
> > including generating Java code.
> >
> > Adding a mode to disable flattening might make sense.
> >
> > On Tue, Sep 4, 2018 at 12:52 PM Rui Wang <[email protected]> wrote:
> > >
> > > Hi Community,
> > >
> > > While trying to support the Row type in Apache Beam SQL on top of
> > > Calcite, I realized that the Row flattening logic causes the structure
> > > information of a Row to be lost after projections. There is a use case
> > > where users want to mix the Beam programming model with Beam SQL to
> > > process a dataset. The following is an example of the use case:
> > >
> > > dataset.apply(something user defined)
> > >        .apply(SELECT ...)
> > >        .apply(something user defined)
> > >
> > > As you can see, after the SQL statement is applied, the data structure
> > > should be preserved for further processing.
> > >
> > > The most straightforward way to me is to make Struct flattening
> > > optional, so I could choose to disable it and the Row structure would
> > > be preserved. Can I ask if it is feasible to make that happen? What
> > > could happen if Calcite just doesn't flatten Structs in the flattener?
> > > (I tried to disable it but got exceptions in the optimizer. I wasn't
> > > sure if that was some minor thing to fix, or if Struct flattening was
> > > a design choice and so the impact of the change would be huge.)
> > >
> > > Additionally, if there is a way to keep the information that I can use
> > > to reconstruct the Row after projections, that might be OK as well.
> > > Does this idea exist in Calcite? If it does not exist, how does this
> > > idea compare with disabling Struct flattening?
> > >
> > > Thanks,
> > > Rui
> > >
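
To make the two options in the original question concrete, here is a toy model in plain Python of what flattening does to a nested row, and of the alternative of keeping enough path information to rebuild it afterwards. It is only a sketch under stated assumptions: it does not use any Calcite or Beam API, the names `flatten_row` and `rebuild_row` are hypothetical, and it does not describe how RelStructuredTypeFlattener is actually implemented.

```python
from typing import Any, Dict, List, Tuple

# A path is the chain of field names leading to a leaf value,
# e.g. ("address", "city"). Hypothetical representation, for illustration only.
Path = Tuple[str, ...]


def flatten_row(row: Dict[str, Any], prefix: Path = ()) -> List[Tuple[Path, Any]]:
    """Flatten a nested dict into (path, value) pairs, depth first."""
    flat: List[Tuple[Path, Any]] = []
    for name, value in row.items():
        path = prefix + (name,)
        if isinstance(value, dict):
            # Nested struct: recurse and splice its leaves into the flat list.
            flat.extend(flatten_row(value, path))
        else:
            flat.append((path, value))
    return flat


def rebuild_row(flat: List[Tuple[Path, Any]]) -> Dict[str, Any]:
    """Rebuild the nested dict from (path, value) pairs."""
    row: Dict[str, Any] = {}
    for path, value in flat:
        node = row
        for name in path[:-1]:
            node = node.setdefault(name, {})
        node[path[-1]] = value
    return row


row = {"id": 1, "address": {"city": "Paris", "zip": "75001"}}
flat = flatten_row(row)

# What a purely flat engine would see: three columns, no nesting.
values_only = [value for _, value in flat]

# With the paths kept alongside the values, the structure is recoverable.
rebuilt = rebuild_row(flat)
```

With only `values_only`, a downstream step cannot tell that the second and third columns belonged to one struct; with the paths retained, `rebuild_row` restores the original shape. Whether such path information could survive projections inside the planner is exactly the open question in the thread.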
