Flattening was introduced mainly because the original engine used flat
column-oriented storage. Now we have several ways of executing queries,
including generating Java code.

Adding a mode to disable flattening might make sense.

On Tue, Sep 4, 2018 at 12:52 PM Rui Wang <ruw...@google.com.invalid> wrote:
>
> Hi Community,
>
> While trying to support the Row type in Apache Beam SQL on top of Calcite, I
> realized that the Row-flattening logic loses the Row's structure information
> after projections. There is a use case where users want to mix the Beam
> programming model with Beam SQL to process a dataset. The
> following is an example of the use case:
>
> dataset.apply(something user defined)
>             .apply(SELECT ...)
>             .apply(something user defined)
>
> As you can see, after the SQL statement is applied, the data structure
> should be preserved for further processing.
>
> The most straightforward way to me is to make Struct flattening optional so
> I could choose to disable it and preserve the Row structure. Can I ask
> if it is feasible to make it happen? What could happen if Calcite just
> doesn't flatten Struct in the flattener? (I tried to disable it but got
> exceptions in the optimizer. I wasn't sure whether that was a minor thing
> to fix, or whether Struct flattening was a design choice and the impact of
> the change would be huge.)
>
> Additionally, if there is a way to keep the information that I can use to
> reconstruct the Row after projections, it might be ok as well. Does this
> idea exist in Calcite? If it does not exist, how does this idea compare with
> disabling Struct flattening?
>
> Thanks,
> Rui
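
For readers following the thread, the second option Rui raises (keeping enough information to rebuild the Row after projection) can be sketched roughly as below. This is only an illustration under stated assumptions: plain Python dicts stand in for Rows, and `flatten` and `unflatten` are hypothetical helper names, not part of Calcite's or Beam's API. It shows that flattening alone leaves only a flat list of column values, while carrying the field path alongside each value makes the nesting recoverable.

```python
from typing import Any, Dict, List, Tuple

# A field path such as ("address", "city") records where a value sat
# in the original nested row. (Hypothetical representation.)
Path = Tuple[str, ...]


def flatten(row: Dict[str, Any], prefix: Path = ()) -> List[Tuple[Path, Any]]:
    """Expand nested structs into a flat list of (field path, value) pairs."""
    flat: List[Tuple[Path, Any]] = []
    for name, value in row.items():
        path = prefix + (name,)
        if isinstance(value, dict):
            # Nested struct: recurse, so its fields become top-level columns.
            flat.extend(flatten(value, path))
        else:
            flat.append((path, value))
    return flat


def unflatten(flat: List[Tuple[Path, Any]]) -> Dict[str, Any]:
    """Rebuild the nested row from (field path, value) pairs."""
    row: Dict[str, Any] = {}
    for path, value in flat:
        node = row
        for name in path[:-1]:
            # Create intermediate structs on demand.
            node = node.setdefault(name, {})
        node[path[-1]] = value
    return row


row = {"id": 1, "address": {"city": "Oslo", "zip": "0150"}}
flat = flatten(row)

# Without the paths, only the flat column values remain and the
# nesting can no longer be recovered from the data alone.
values_only = [value for _, value in flat]

# With the paths kept alongside, the original structure round-trips.
rebuilt = unflatten(flat)
```

Note that an empty nested struct produces no columns in this sketch and so would not survive the round trip; a real implementation would need to keep the schema itself, not just the paths of leaf values.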
