Hello,
In my use case I have several JSON documents that I need to query using a
join.
The structure of each document can vary a lot (some fields are present in
some documents and absent from others).
Sometimes the following exception is raised:
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
java.lang.IllegalStateException: Failure while reading vector. Expected
vector class of org.apache.drill.exec.vector.NullableIntVector but was
holding vector class org.apache.drill.exec.vector.NullableVarCharVector.
Fragment 0:0 [Error Id: 35c751bd-3ca0-4e4a-bbac-ad5823ce582f on
192.168.99.13:31010]
The queries:
The following query works:
-----
SELECT customers.id, orders.demo
FROM dfs.`/Users/tgrall/working/customers/*.json` customers,
dfs.`/Users/tgrall/working/orders/*.json` orders
WHERE customers.id = orders.cust_id
AND customers.country = 'FRANCE'
-----
The following query FAILS:
-----
SELECT customers.id, orders.cool
FROM dfs.`/Users/tgrall/working/customers/*.json` customers,
dfs.`/Users/tgrall/working/orders/*.json` orders
WHERE customers.id = orders.cust_id
AND customers.country = 'FRANCE'
-----
The documents:
Here are the files:
./customers/333.json
{
"id" : 333,
"name" : "Dave Smith",
"country" : "FRANCE"
}
./orders/111.json
{
"tax" : 10,
"id" : 111,
"cust_id" : 333,
"total" : 12,
"demo" :10
}
./orders/222.json
{
"cool":20,
"id" : 222,
"cust_id" : 111,
"total" : 12
}
To reproduce the bug you may have to change the documents (add or remove the
cool and tax fields).
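To make it easier to try different combinations, the three files can be regenerated with a small script. This is only a sketch, assuming Python 3: the helper name write_fixtures and the include_cool / include_tax flags are hypothetical names chosen for illustration and have nothing to do with Drill itself. The field order of the files listed above is preserved, in case it matters.

```python
import json
import tempfile
from pathlib import Path


def write_fixtures(root, include_cool=True, include_tax=True):
    """Write the customers/ and orders/ JSON files under `root`.

    Hypothetical helper: the two flags toggle the optional fields so that
    the schema differences between the order files can be varied.
    Returns the sorted list of relative file names that were written.
    """
    root = Path(root)
    customer = {"id": 333, "name": "Dave Smith", "country": "FRANCE"}

    # Optional fields come first, as in the files listed above.
    order_111 = {}
    if include_tax:
        order_111["tax"] = 10
    order_111.update({"id": 111, "cust_id": 333, "total": 12, "demo": 10})

    order_222 = {}
    if include_cool:
        order_222["cool"] = 20
    order_222.update({"id": 222, "cust_id": 111, "total": 12})

    files = {
        "customers/333.json": customer,
        "orders/111.json": order_111,
        "orders/222.json": order_222,
    }
    for rel_path, doc in files.items():
        path = root / rel_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(doc, indent=2) + "\n")
    return sorted(files)


if __name__ == "__main__":
    # Write into a fresh temporary directory and print where the files are.
    out_dir = tempfile.mkdtemp(prefix="drill-repro-")
    print(out_dir, write_fixtures(out_dir))
```

Calling write_fixtures(some_dir, include_cool=False), for example, writes the same files without the cool field in orders/222.json; the dfs paths in the queries then need to point at that directory.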
It looks like the schema is not "updated" on the fly in some cases.
Any idea how to work around this? Is this a bug?
Regards
Tug