I may be wrong, but I think that Calcite's Enumerable convention
requires that SQL STRUCT and ARRAY types are mapped to a Java List.
The item function expects that, and the adapter should comply. If the
adapter is providing an array, that is wrong.

On Sun, Sep 27, 2020 at 12:23 PM Alessandro Solimando
<[email protected]> wrote:
>
> Hello all,
> at the time I wrote the unit tests for the extended support for the missing
> Cassandra data types, I have disabled the one at
> CassandraAdapterDataTypesTest.java:192
> <https://github.com/apache/calcite/blob/master/cassandra/src/test/java/org/apache/calcite/test/CassandraAdapterDataTypesTest.java#L192
> > since
> accessing a tuple element was returning a null value in place of the actual
> non-null element.
>
> I recently had a chance to dig a bit to see why that's happening, and I
> found what I consider the immediate cause for that (from
> SqlFunctions.java:2511
> <https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java#L2511>
> ):
>
>   /** Implements the {@code [ ... ]} operator on an object whose type is not
> >    * known until runtime.
> >    */
>
> public static Object item(Object object, Object index) {
> >     if (object instanceof Map) {
> >       return mapItem((Map) object, index);
> >     }
> >     if (object instanceof List && index instanceof Number) {
> >       return arrayItem((List) object, ((Number) index).intValue());
> >     }
> >     return null;
> >   }
>
>
> This method ends up being called, and since the backing structure for
> struct is an array (see CassandraEnumerator.java:114
> <https://github.com/apache/calcite/blob/master/cassandra/src/main/java/org/apache/calcite/adapter/cassandra/CassandraEnumerator.java#L114>),
> none of the two if conditions match, and null is returned.
>
> This of course can be easily fixed, in what follows an example:
>
> public static Object item(Object object, Object index) {
> >   if (object instanceof List && index instanceof Number) {
> >      return arrayItem((List) object, ((Number) index).intValue());
> >    }
> >    if (object instanceof Object[]) { // it guarantees also that object !=
> > null
> >      if (index instanceof Number) {
> >        return arrayItem(Arrays.asList(object), ((Number)
> > index).intValue());
> >      } else if (index instanceof String) {
> >        Object[] array = (Object[]) object;
> >        return mapItem(IntStream.range(0, array.length).boxed()
> >           .collect(Collectors.toMap(i -> Integer.toString(i + 1), i ->
> > array[i])), index);
> >      }
> >    }
> >    return null;
> >   }
>
>
> Question is, is there anything which should be done differently on the
> adapter end to prevent this from "item" to be called in the first place? Or
> this is a legitimate situation and the "fix" is actually covering an
> unhandled legal case?
>
> I have tried to look up in the codebase for something similar, but without
> much luck, and I'd appreciate some guidance here.
>
> In what follows my analysis and a bit of context behind the status quo for
> tuple/struct handling in Cassandra adapter:
>
> 1) I am accessing the struct elements via the "item" operator as this seems
> the right way to do so according to the codebase examples I have seen
> (JdbcTest for instance). Given that the "SqlTypeName" of the column of
> "f_field" is "STRUCTURED", it ends up treated like a "MAP", rather than an
> "ARRAY".
>
> "MAP" requires a "string" identifier and does not accept an integer to be
> used with the "item" operator ("f_tuple"['1'] for instance). The identifier
> it's the 1-based index of the element inside the structure (as dictated by
> CassandraSchema.java:225
> <https://github.com/apache/calcite/blob/master/cassandra/src/main/java/org/apache/calcite/adapter/cassandra/CassandraSchema.java#L225>)
> since, unlike other structs, there is no field name here coming from the
> datastax driver, hence we are left with the index inside the structure,
> which seems reasonable.
>
> At first I tried to use an integer index, that's the error returned:
>
> java.sql.SQLException: Error while executing SQL "select x[1], x['2'],
> > x['3'] from (select "f_tuple" from "test_collections") as T(x)": From line
> > 1, column 8 to line 1, column 11: Cannot apply 'ITEM' to arguments of type
> > 'ITEM(<RECORDTYPE(BIGINT 1, VARBINARY 2, TIMESTAMP(0) 3)>, <INTEGER>)'.
> > Supported form(s): <ARRAY>[<INTEGER>]
> > <MAP>[<VALUE>]
>
> (omitting the full stack trace as I don't think it's adding much value).
>
> 2) I also had second thoughts about having mapped Cassandra tuples to
> struct, but there seems to be no alternatives allowing for an indexed
> collection with heterogeneous types. What's your opinion, is it correct or
> there is another way I could take?
>
> Finally, I'd either open a JIRA ticket for adding the extra behavior on
> "item", or one for fixing the Cassandra adapter for tuples if in the end
> that's the root cause of the issue.
>
> <https://github.com/asolimando/calcite/actions/runs/274415659>
> In the first case I would already have a branch
> <https://github.com/asolimando/calcite/tree/struct-values-index-access> which
> might be ready to become a PR, for which tests are passing
> <https://github.com/asolimando/calcite/actions/runs/274415659> already (CI
> on forks are really great, thanks for introducing that!).
>
> Looking forward to hearing your thoughts.
>
> Best regards,
> Alessandro

Reply via email to