Got it, thanks! This is very helpful. We generated the form Weston
suggested (from a Python producer) and got it working.
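
For reference, here is a minimal sketch of how such a nested reference
could be built as a plain Python dict and dumped to JSON. It is an
illustration only, not the producer code mentioned above:
`nested_field_reference` is a hypothetical helper (not part of pyarrow or
Substrait), and the key names simply mirror the JSON form quoted below.

```python
import json


def nested_field_reference(path):
    """Build a Substrait-style field reference dict for a path of field indices.

    Hypothetical helper for illustration; the key names mirror the JSON
    form quoted in this thread.
    """
    if not path:
        raise ValueError("path must contain at least one field index")
    # Build from the innermost field outward, nesting each segment as "child".
    segment = None
    for index in reversed(path):
        struct_field = {"field": index}
        if segment is not None:
            struct_field["child"] = segment
        segment = {"struct_field": struct_field}
    return {
        "selection": {
            "direct_reference": segment,
            "root_reference": {},
        }
    }


# "Select struct field 0 from field 2 of the original table".
expression = nested_field_reference([2, 0])
print(json.dumps(expression, indent=2))
```

A plain top-level column is the one-element case, e.g.
`nested_field_reference([1])` for column index 1.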

On Mon, Aug 7, 2023 at 2:45 PM Weston Pace <weston.p...@gmail.com> wrote:

> > But I can't figure out how to express "select struct field 0 from field 2
> > of the original table where field 2 is a struct column"
> >
> > Any idea what the Substrait message should look like for the above?
>
> I believe it would be:
>
> ```
> "expression": {
>   "selection": {
>     "direct_reference": {
>       "struct_field": {
>         "field": 2,
>         "child": {
>           "struct_field": { "field": 0 }
>         }
>       }
>     },
>     "root_reference": { }
>   }
> }
> ```
>
> To get the above I used the following Python. It requires [1], which
> could use a review, and you need some way to convert the binary Substrait
> to JSON (I used a script I have lying around):
>
> ```
> >>> import pyarrow as pa
> >>> import pyarrow.compute as pc
> >>> schema = pa.schema([
> ...     pa.field("points", pa.struct([
> ...         pa.field("x", pa.float64()),
> ...         pa.field("y", pa.float64()),
> ...     ])),
> ... ])
> >>> expr = pc.field(("points", "x"))
> >>> expr.to_substrait(schema)
> <pyarrow.Buffer address=0x5602249c9970 size=92 is_cpu=True is_mutable=False>
> ```
>
> [1] https://github.com/apache/arrow/pull/34834
>
> On Tue, Aug 1, 2023 at 1:45 PM Li Jin <ice.xell...@gmail.com> wrote:
>
> > Hi,
> >
> > I have recently been trying to:
> > (1) assign a struct type column s<v1, v2>
> > (2) flatten the struct column (by assigning v1=s[v1], v2=s[v2] and
> > dropping the s column)
> >
> > via Substrait and Acero.
> >
> > However, I ran into the problem where I don't know the proper substrait
> > message to encode this (for (2))
> >
> > Normally, if I select a column from the original table, it would look
> > like this (e.g., select column index 1 from the original table):
> >
> > selection {
> >   direct_reference {
> >     struct_field {
> >       field: 1
> >     }
> >   }
> > }
> >
> > But I can't figure out how to express "select struct field 0 from field 2
> > of the original table where field 2 is a struct column"
> >
> > Any idea what the Substrait message should look like for the above?
> >
>
