Re: Array element access and nested rows

Shuyi Chen Wed, 21 Mar 2018 18:09:43 -0700

Thanks a lot, Anton. This seems to be a bug in Calcite. When the statement
involving record types, sql validation seems to work, but the rel plan
generated might be wrong.  I can also reproduce your case:


MockTable deptNestedTable =
    MockTable.create(this, salesSchema, "DEPT_NESTED", false, 4);
deptNestedTable.addColumn("DEPTNO", f.intType, true);
deptNestedTable.addColumn("NAME", f.varchar10Type);
deptNestedTable.addColumn("SKILLRECORD", f.skillRecordType);
deptNestedTable.addColumn("EMPLOYEES", f.empListType);
registerTable(deptNestedTable);

Run the following test:

@Test public void testArrayOfRecord() {
  sql("select skillrecord, employees from dept_nested").ok();
}

yield:
LogicalProject(SKILLRECORD=[$0], EMPLOYEES=[$1])
  LogicalProject(SKILLRECORD=[$2], SKILLRECORD1=[$3], EMPLOYEES=[$4])
    LogicalProject(DEPTNO=[$0], NAME=[$1], TYPE=[$2.TYPE], DESC=[$2.DESC],
EMPLOYEES=[$3])
      LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])

Sometimes, it works:

@Test public void testArrayOfRecord() {
sql("select name, employees from dept_nested").ok();
}

yield:

LogicalProject(NAME=[$1], EMPLOYEES=[$4])
  LogicalProject(DEPTNO=[$0], NAME=[$1], TYPE=[$2.TYPE], DESC=[$2.DESC],
EMPLOYEES=[$3])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT_NESTED]])

I can take a deeper look.

Shuyi

On Wed, Mar 21, 2018 at 11:06 AM, Anton Kedin <[email protected]>
wrote:

> Hi,
>
> I have an issue I am not sure how to handle, would appreciate any pointers.
>
> I have a table with row type:
> RecordType(
>     INTEGER orderId,
>     RecordType(VARCHAR name, INTEGER personId)
>         person,
>     RecordType(VARCHAR sku, INTEGER price, VARCHAR currency, VARCHAR ARRAY
> tags)
>         ARRAY items
> )
>
> With this row type I am trying to model a JSON object which looks like
> this:
> { "orderId" : 1,
>   "person" : { "name" : "john", "personId" : 12, },
>   "items": [
>     { "sku" : "aaa01", "price" : 12, "currency" : "USD", "tags" : ["blue",
> "book"] }
>   ]}
>
> When selecting the whole items array I get the following plan:
> SELECT items FROM PCOLLECTION
>
> LogicalProject(items=[$3])
>   LogicalProject(orderId=[$0], name=[$1.name], personId=[$1.personId],
> items
> =[$2])
>     LogicalTableScan(table=[[PCOLLECTION]])
>
> Which looks correct and it works. One thing to note here is that Calcite
> flattens the person row, and makes the input ref for the items field as $3,
> as expected.
>
> But when I want to get a specific element from that array I get the
> following:
> SELECT items[0] FROM PCOLLECTION
>
> LogicalProject(EXPR$0$0=[ITEM($2, 0).sku], EXPR$0$1=[ITEM($2, 0).price],
> EXPR$0$2=[ITEM($2, 0).currency], EXPR$0$3=[ITEM($2, 0).tags])
>   LogicalProject(orderId=[$0], name=[$1.name], personId=[$1.personId],
> items
> =[$2])
>     LogicalTableScan(table=[[PCOLLECTION]])
>
> The first project looks the same. Flattened person row, items array, all
> looks similar to the above.
> But the outer project calls ITEM($2, i). I would expect it to be
> ITEM($3, i) instead,
> to adjust for the flattened person row, but it keeps the index as $2, which
> would have been the correct index if the row was not flattened, but it
> should be $3 for flattened row, similar to the previous example.
>
> Is there something I am missing or is it a bug and Calcite should adjust
> the input ref index to account for flattened rows in this case as well?
>
> Thank you,
> Anton
>



-- 
"So you have to trust that the dots will somehow connect in your future."

Re: Array element access and nested rows

Reply via email to