I can confirm that this is reproducible:
orders/111.json:
{
"tax" : 10,
"id" : 111,
"cust_id" : 333,
"total" : 12,
"demo" :10
}
orders/222.json:
{
"cool": 20,
"id" : 222,
"cust_id" : 111,
"total" : 12
}
1st query:
0: jdbc:drill:zk=sen11:5181,sen12:5181> SELECT customers.id, orders.cool
. . . . . . . . . . . . . . . . . . . > FROM `maprfs.cmatta`.`test/customers/*.json` customers,
. . . . . . . . . . . . . . . . . . . > `maprfs.cmatta`.`test/orders/*.json` orders
. . . . . . . . . . . . . . . . . . . > WHERE customers.id = orders.cust_id
. . . . . . . . . . . . . . . . . . . > AND customers.country = 'FRANCE';
+------+-------+
| id   | cool  |
+------+-------+
| 333  | null  |
+------+-------+
1 row selected (0.258 seconds)
Now move the cool field from orders/222.json into orders/111.json, so that
orders/111.json becomes:
{
"cool": 20,
"tax" : 10,
"id" : 111,
"cust_id" : 333,
"total" : 12,
"demo" :10
}
With cool removed, orders/222.json becomes:
{
"id" : 222,
"cust_id" : 111,
"total" : 12
}
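If you would rather script that edit than change the files by hand, something
like the following should work. It is only a sketch: move_field is a name I
made up, it assumes each file holds a single JSON object, and it works on
local paths (so copies of the files, or a mounted view of them).

```python
import json
from pathlib import Path


def move_field(src, dst, field):
    """Remove `field` from the JSON object in `src` and add it to `dst`."""
    src_path, dst_path = Path(src), Path(dst)
    src_doc = json.loads(src_path.read_text())
    dst_doc = json.loads(dst_path.read_text())

    value = src_doc.pop(field)  # raises KeyError if the field is absent

    # Put the moved field first, mirroring the layout of the files shown above.
    # Whether key order matters for the error is not known.
    rest = {k: v for k, v in dst_doc.items() if k != field}
    dst_doc = {field: value, **rest}

    src_path.write_text(json.dumps(src_doc, indent=2) + "\n")
    dst_path.write_text(json.dumps(dst_doc, indent=2) + "\n")
```

For the case above the call would be
move_field("orders/222.json", "orders/111.json", "cool").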
Re-run the query:
0: jdbc:drill:zk=sen11:5181,sen12:5181> SELECT customers.id, orders.cool
. . . . . . . . . . . . . . . . . . . > FROM `maprfs.cmatta`.`test/customers/*.json` customers,
. . . . . . . . . . . . . . . . . . . > `maprfs.cmatta`.`test/orders/*.json` orders
. . . . . . . . . . . . . . . . . . . > WHERE customers.id = orders.cust_id
. . . . . . . . . . . . . . . . . . . > AND customers.country = 'FRANCE';
java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR:
java.lang.IllegalStateException: Failure while reading vector.
Expected vector class of
org.apache.drill.exec.vector.NullableIntVector but was holding vector
class org.apache.drill.exec.vector.NullableVarCharVector.
Fragment 0:0
[Error Id: 04e231ee-8bad-4ad2-aff3-6c0273befd2f on se-node11.se.lab:31010]
at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73)
at sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87)
at sqlline.TableOutputFormat.print(TableOutputFormat.java:118)
at sqlline.SqlLine.print(SqlLine.java:1583)
at sqlline.Commands.execute(Commands.java:852)
at sqlline.Commands.sql(Commands.java:751)
at sqlline.SqlLine.dispatch(SqlLine.java:738)
at sqlline.SqlLine.begin(SqlLine.java:612)
at sqlline.SqlLine.start(SqlLine.java:366)
at sqlline.SqlLine.main(SqlLine.java:259)
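
For anyone trying to narrow this down, it may help to see which fields are not
present in every file matched by the glob. Below is a rough Python sketch,
not anything from Drill itself: field_types and inconsistent_fields are
hypothetical helper names, and it assumes each file holds a single top-level
JSON object and is reachable on a local path.

```python
import json
from pathlib import Path


def field_types(directory):
    """Map each top-level field name to {file name: Python type name}."""
    report = {}
    for path in sorted(Path(directory).glob("*.json")):
        with open(path) as handle:
            doc = json.load(handle)
        for key, value in doc.items():
            report.setdefault(key, {})[path.name] = type(value).__name__
    return report


def inconsistent_fields(directory):
    """Return fields that are missing from some files or differ in type."""
    files = sorted(p.name for p in Path(directory).glob("*.json"))
    problems = {}
    for key, seen in field_types(directory).items():
        missing = [name for name in files if name not in seen]
        if missing or len(set(seen.values())) > 1:
            problems[key] = {"types": seen, "missing_from": missing}
    return problems


# Example: inconsistent_fields("orders") for a local copy of the orders directory.
```

On the two order files above this flags tax, demo and cool, since each of
them appears in only one of the files. It only describes the input; it says
nothing about how Drill itself resolves the types.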
Chris Matta
[email protected]
215-701-3146
On Mon, Jun 22, 2015 at 10:13 AM, Tugdual Grall <[email protected]> wrote:
> Yes.
>
> On Mon, Jun 22, 2015 at 4:12 PM, Christopher Matta <[email protected]>
> wrote:
>
>> Just to clarify, you run the *exact same query* once and it works, then
>> you remove say the “cool” field from orders/222.json and put it in
>> orders/111.json and the next time the same query returns that error?
>>
>>
>> Chris Matta
>> [email protected]
>> 215-701-3146
>>
>> On Mon, Jun 22, 2015 at 9:59 AM, Tugdual Grall <[email protected]> wrote:
>>
>>> Hello,
>>>
>>> In my use case I have several JSON documents that I need to query using a
>>> join.
>>> The structure of each document can vary a lot (some fields are present in
>>> some documents and absent from others).
>>>
>>> Sometimes the following exception is raised:
>>> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
>>> java.lang.IllegalStateException: Failure while reading vector. Expected
>>> vector class of org.apache.drill.exec.vector.NullableIntVector but was
>>> holding vector class org.apache.drill.exec.vector.NullableVarCharVector.
>>> Fragment 0:0 [Error Id: 35c751bd-3ca0-4e4a-bbac-ad5823ce582f on
>>> 192.168.99.13:31010]
>>>
>>> The queries:
>>>
>>> Following query works:
>>> -----
>>> SELECT customers.id, orders.demo
>>> FROM dfs.`/Users/tgrall/working/customers/*.json` customers,
>>> dfs.`/Users/tgrall/working/orders/*.json` orders
>>> WHERE customers.id = orders.cust_id
>>> AND customers.country = 'FRANCE'
>>> -----
>>>
>>> Following query FAILS:
>>> -----
>>> SELECT customers.id, orders.cool
>>> FROM dfs.`/Users/tgrall/working/customers/*.json` customers,
>>> dfs.`/Users/tgrall/working/orders/*.json` orders
>>> WHERE customers.id = orders.cust_id
>>> AND customers.country = 'FRANCE'
>>> -----
>>>
>>>
>>> The documents:
>>>
>>> Here are the files:
>>>
>>> ./customers/333.json
>>> {
>>> "id" : 333,
>>> "name" : "Dave Smith",
>>> "country" : "FRANCE"
>>> }
>>>
>>>
>>> ./orders/111.json
>>> {
>>> "tax" : 10,
>>> "id" : 111,
>>> "cust_id" : 333,
>>> "total" : 12,
>>> "demo" :10
>>> }
>>>
>>> ./orders/222.json
>>> {
>>> "cool":20,
>>> "id" : 222,
>>> "cust_id" : 111,
>>> "total" : 12
>>> }
>>>
>>>
>>> To reproduce the bug you may have to change the documents (add or remove
>>> the cool and tax fields).
>>>
>>> It looks like the schema is not "updated" on the fly in some cases.
>>>
>>> Any idea how to work around this? Is this a bug?
>>>
>>> Regards
>>> Tug
>>>
>>
>>
>