RE: Implementing EXPLODE operator

mbudiu Sat, 03 Jun 2023 10:56:37 -0700

This seems to be the same as UNNEST WITH ORDINALITY, in Calcite since 2016.
https://issues.apache.org/jira/browse/CALCITE-854


Mihai

-----Original Message-----
From: Soumyadeep Mukhopadhyay 
Sent: Saturday, June 03, 2023 10:42 AM
To: [email protected]
Subject: Re: Implementing EXPLODE operator

My apologies for not providing the context. The following links may provide a 
better background :
Vertica
https://docs.vertica.com/12.0.x/en/sql-reference/functions/data-type-specific-functions/collection-functions/explode/
and Spark
https://spark.apache.org/docs/3.1.3/api/python/reference/api/pyspark.sql.functions.explode.html
use the function EXPLODE and the way it is supposed to work is convert a single 
row with an array or set to multiple rows, the number of rows shall be equal to 
the number of items in the array / set (and also hashmaps in the case of Spark).
For Snowflake the equivalent would be
https://docs.snowflake.com/en/sql-reference/functions/flatten.
They are variations of the existing UNNEST. So if I have to implement them 
shall I inherit from the existing SqlUnnestOperator?

Thanks again for your inputs!

Soumyadeep.

On Sat, 3 Jun 2023 at 10:43 PM, Julian Hyde <[email protected]> wrote:

> I don’t know what EXPLODE is and you didn’t define it. UNNEST is in 
> the SQL standard.
>
> Julian
>
> > On Jun 3, 2023, at 09:41, Soumyadeep Mukhopadhyay 
> > <[email protected]>
> wrote:
> >
> > Actually I am hoping to contribute, but only if it doesn’t seem
> redundant.
> > That’s why I wanted to understand if the effort would even make sense.
> Was
> > EXPLODE not included by design? Would you recommend against adding it?
> >
> > Soumyadeep.
> >
> >
> >
> >> On Sat, 3 Jun 2023 at 9:51 PM, Julian Hyde <[email protected]>
> wrote:
> >>
> >> I can’t tell whether you are intending to contribute EXPLODE or do 
> >> it
> on a
> >> private branch. You don’t make a case for what EXPLODE could do, so 
> >> I presume the latter.
> >>
> >> UNNEST is a unique function. Its implementation is (unfortunately 
> >> but
> >> necessarily) spread over many files. Copy-pasting it to make a new
> function
> >> seems like a bad idea, because you would be multiplying a mess. I 
> >> don’t know what you’re trying to achieve but I would lean on the 
> >> existing facilities (UDFs, table functions, adding syntactic sugar if 
> >> necessary).
> >>
> >> Julian
> >>
> >>> On Jun 3, 2023, at 2:45 AM, Soumyadeep Mukhopadhyay <
> >> [email protected]> wrote:
> >>>
> >>> Hello all,
> >>>
> >>> I was wondering if there's any virtue in creating an operator like
> UNNEST
> >>> by having a similar implementation like that of SqlUnnestOperator.
> >>> (SqlExplodeOperator maybe)
> >>>
> >>> The functionalities are close but there are nuances that I have 
> >>> not discovered yet.
> >>>
> >>> My approach was kind of a hack :
> >>> Clone the class SqlUnnestOperator -> rename the 
> >>> SqlFunctionalOperator
> >> name
> >>> as "EXPLODE" and rename return builder.add("$unnest", 
> >>> SqlTypeName.ANY)
> >>> .nullable(true).build() in the inferReturnType to "$explode" -> 
> >>> then
> >> check
> >>> if call.getKind() within unparseCall is UNNEST -> if yes then SqlUtil.
> >>> unparseFunctionSyntax(SqlStdOperatorTable.EXPLODE, writer, call,
> false).
> >>> NOTE: I have also tried creating a new kind EXPLODE and then 
> >>> followed
> the
> >>> above step, both of them work.
> >>>
> >>> As Julian had pointed out about semantics and how everything 
> >>> within
> >> Calcite
> >>> should be in Calcite's dialect, I feel there could be antipatterns 
> >>> in
> my
> >>> approach.
> >>> Could any of you please point me to where I might be going wrong? 
> >>> How
> >> could
> >>> I make this more robust?
> >>>
> >>> (If any of you are wondering why I am doing this then this is what 
> >>> I
> want
> >>> to achieve - "SELECT * FROM UNNEST(ARRAY['1', '2'])" can be 
> >>> written as "SELECT * FROM EXPLODE(ARRAY['1', '2'])" in an 
> >>> appropriate dialect,
> like
> >>> Spark for example)
> >>>
> >>> Thank you for your time!
> >>>
> >>> Best,
> >>> Soumyadeep.
> >>
>

RE: Implementing EXPLODE operator

Reply via email to