Hello all,

When I wrote the unit tests for the extended support for the missing Cassandra data types, I disabled the one at CassandraAdapterDataTypesTest.java:192 <https://github.com/apache/calcite/blob/master/cassandra/src/test/java/org/apache/calcite/test/CassandraAdapterDataTypesTest.java#L192>, since accessing a tuple element was returning null in place of the actual non-null element.

I recently had a chance to dig into why that happens, and I found what I consider the immediate cause (from SqlFunctions.java:2511 <https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java#L2511>):

> /** Implements the {@code [ ... ]} operator on an object whose type is not
>  * known until runtime.
>  */
> public static Object item(Object object, Object index) {
>   if (object instanceof Map) {
>     return mapItem((Map) object, index);
>   }
>   if (object instanceof List && index instanceof Number) {
>     return arrayItem((List) object, ((Number) index).intValue());
>   }
>   return null;
> }

This method ends up being called, and since the backing structure for a struct is an array (see CassandraEnumerator.java:114 <https://github.com/apache/calcite/blob/master/cassandra/src/main/java/org/apache/calcite/adapter/cassandra/CassandraEnumerator.java#L114>), neither of the two if conditions matches, and null is returned.

This can be easily fixed; here is an example:

> // needs java.util.Arrays, java.util.stream.Collectors, java.util.stream.IntStream
> public static Object item(Object object, Object index) {
>   if (object instanceof Map) {
>     return mapItem((Map) object, index);
>   }
>   if (object instanceof List && index instanceof Number) {
>     return arrayItem((List) object, ((Number) index).intValue());
>   }
>   if (object instanceof Object[]) { // this also guarantees that object != null
>     Object[] array = (Object[]) object;
>     if (index instanceof Number) {
>       // pass the cast array: Arrays.asList(object) would wrap it as a single element
>       return arrayItem(Arrays.asList(array), ((Number) index).intValue());
>     } else if (index instanceof String) {
>       return mapItem(IntStream.range(0, array.length).boxed()
>           .collect(Collectors.toMap(i -> Integer.toString(i + 1), i -> array[i])), index);
>     }
>   }
>   return null;
> }

The question is: is there anything that should be done differently on the adapter side to prevent "item" from being called in the first place? Or is this a legitimate situation, with the "fix" actually covering an unhandled legal case?

I have looked through the codebase for something similar, but without much luck, so I'd appreciate some guidance here.

Below is my analysis, with a bit of context on the status quo for tuple/struct handling in the Cassandra adapter:

1) I am accessing the struct elements via the "item" operator, as this seems the right way to do so according to the examples I have seen in the codebase (JdbcTest for instance). Given that the "SqlTypeName" of the column of "f_field" is "STRUCTURED", it ends up being treated like a "MAP" rather than an "ARRAY". "MAP" requires a "string" identifier and does not accept an integer with the "item" operator ("f_tuple"['1'] for instance). The identifier is the 1-based index of the element inside the structure (as dictated by CassandraSchema.java:225 <https://github.com/apache/calcite/blob/master/cassandra/src/main/java/org/apache/calcite/adapter/cassandra/CassandraSchema.java#L225>) since, unlike for other structs, there is no field name coming from the datastax driver, so we are left with the index inside the structure, which seems reasonable.

At first I tried to use an integer index, and this is the error returned:

> java.sql.SQLException: Error while executing SQL "select x[1], x['2'],
> x['3'] from (select "f_tuple" from "test_collections") as T(x)": From line
> 1, column 8 to line 1, column 11: Cannot apply 'ITEM' to arguments of type
> 'ITEM(<RECORDTYPE(BIGINT 1, VARBINARY 2, TIMESTAMP(0) 3)>, <INTEGER>)'.
> Supported form(s): <ARRAY>[<INTEGER>]
> <MAP>[<VALUE>]

(I am omitting the full stack trace as I don't think it adds much value.)

2) I also had second thoughts about having mapped Cassandra tuples to struct, but there seem to be no alternatives allowing for an indexed collection with heterogeneous types. What's your opinion: is this correct, or is there another way I could take?

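For anyone who wants to reproduce the behavior outside Calcite, the following is a minimal standalone sketch, not the actual Calcite code. The class ItemSketch and the methods itemOriginal and itemExtended are hypothetical names, and arrayItem and mapItem are simplified stand-ins that assume a 1-based list lookup and a plain map lookup respectively.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Standalone sketch; every name here is hypothetical, not Calcite's API. */
final class ItemSketch {
  private ItemSketch() {}

  /** Stand-in for arrayItem: assumed 1-based, null when out of range. */
  public static Object arrayItem(List<?> list, int item) {
    if (item < 1 || item > list.size()) {
      return null;
    }
    return list.get(item - 1);
  }

  /** Stand-in for mapItem: assumed to be a plain map lookup. */
  public static Object mapItem(Map<?, ?> map, Object key) {
    return map.get(key);
  }

  /** Dispatch on Map and List only: an Object[] matches neither branch. */
  public static Object itemOriginal(Object object, Object index) {
    if (object instanceof Map) {
      return mapItem((Map<?, ?>) object, index);
    }
    if (object instanceof List && index instanceof Number) {
      return arrayItem((List<?>) object, ((Number) index).intValue());
    }
    return null;
  }

  /** Extended dispatch: also accepts an Object[] with a numeric or string index. */
  public static Object itemExtended(Object object, Object index) {
    if (object instanceof Object[]) {
      Object[] array = (Object[]) object;
      if (index instanceof Number) {
        return arrayItem(Arrays.asList(array), ((Number) index).intValue());
      }
      if (index instanceof String) {
        // Key each element by its 1-based position, rendered as a string.
        Map<String, Object> byPosition = new LinkedHashMap<>();
        for (int i = 0; i < array.length; i++) {
          byPosition.put(Integer.toString(i + 1), array[i]);
        }
        return mapItem(byPosition, index);
      }
      return null;
    }
    return itemOriginal(object, index);
  }

  public static void main(String[] args) {
    Object[] tuple = {1L, "two", null};
    System.out.println(itemOriginal(tuple, "1")); // null: Object[] is not handled
    System.out.println(itemExtended(tuple, "1")); // 1
    System.out.println(itemExtended(tuple, 2));   // two
    System.out.println(itemExtended(tuple, "3")); // null: the element itself is null
  }
}
```

The sketch fills the position-keyed map with a plain loop rather than Collectors.toMap, because toMap throws a NullPointerException on null values and I assume tuple elements can be null.
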
Finally, I'd open a JIRA ticket either for adding the extra behavior to "item", or for fixing the Cassandra adapter for tuples if that turns out to be the root cause of the issue. In the first case I already have a branch <https://github.com/asolimando/calcite/tree/struct-values-index-access> which might be ready to become a PR, and for which tests are already passing <https://github.com/asolimando/calcite/actions/runs/274415659> (CI on forks is really great, thanks for introducing that!).

Looking forward to hearing your thoughts.

Best regards,
Alessandro
