[ 
https://issues.apache.org/jira/browse/DRILL-5480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16016659#comment-16016659
 ] 

Jinfeng Ni edited comment on DRILL-5480 at 5/18/17 11:37 PM:
-------------------------------------------------------------

We may argue that, in the above repro sample data, the HBase table has a schema 
change, since different regions have different columns; a schema change 
exception therefore seems reasonable. But the incorrect query result is still a bug.

Here is a case where there is no schema change across the HBase regions: all 3 
regions have 'row_key' and 'orders.id'. Nevertheless, the query fails 
intermittently with schema change conflicts.

{code}
create 'customer', 'orders', {SPLITS => ['3', '6']}
put 'customer', '1', 'orders:id', '100'
put 'customer', '5', 'orders:id', '200'
put 'customer', '7', 'orders:id', '300'

alter session set `planner.enable_broadcast_join`=false;
select * from hbase.customer c, cp.`tpch/orders.parquet` o where 
cast(c.orders.id as bigint) = o.o_orderkey and c.orders.id <= '200';
{code}

In the following, the first run hit a schema conflict in UnorderedReceiver, 
while the second run succeeded.

{code}
select * from hbase.customer c, cp.`tpch/orders.parquet` o where 
cast(c.orders.id as bigint) = o.o_orderkey and c.orders.id <= '200';
Error: SYSTEM ERROR: IllegalStateException: Failure while reading vector.  Expected vector class of org.apache.drill.exec.vector.NullableIntVector but was holding vector class org.apache.drill.exec.vector.complex.MapVector, field= orders(MAP:REQUIRED) [id(VARBINARY:OPTIONAL) [$bits$(UINT1:REQUIRED), id(VARBINARY:OPTIONAL) [$offsets$(UINT4:REQUIRED)]]]

Fragment 2:0

[Error Id: 4a249fc4-0323-4c25-82bf-2313e05a3d82 on drlvm-245.qa.lab:31010] 
(state=,code=0)

select * from hbase.customer c, cp.`tpch/orders.parquet` o where 
cast(c.orders.id as bigint) = o.o_orderkey and c.orders.id <= '200';
+--------------+----------------+-------------+------------+----------------+---------------+--------------+------------------+------------------+-----------------+-----------------------------------------------+
|   row_key    |     orders     | o_orderkey  | o_custkey  | o_orderstatus  | o_totalprice  | o_orderdate  | o_orderpriority  |     o_clerk      | o_shippriority  |                   o_comment                   |
+--------------+----------------+-------------+------------+----------------+---------------+--------------+------------------+------------------+-----------------+-----------------------------------------------+
| [B@12c7d5cf  | {"id":"MTAw"}  | 100         | 1471       | O              | 198978.27     | 1998-02-28   | 4-NOT SPECIFIED  | Clerk#000000577  | 0               | heodolites detect slyly alongside of the ent  |
+--------------+----------------+-------------+------------+----------------+---------------+--------------+------------------+------------------+-----------------+-----------------------------------------------+
{code}

The cause of the schema change: with filter pushdown, the HBase API returns 
nothing for the third region. This forces Drill to inject a nullable-int 
column, which causes the schema conflict in the downstream operator 
(UnorderedReceiver).
{code}
04-03                        Scan(groupscan=[HBaseGroupScan [HBaseScanSpec=HBaseScanSpec [tableName=customer, startRow=, stopRow=, filter=SingleColumnValueFilter (orders, id, LESS_OR_EQUAL, 200)], columns=[`*`]]]) : rowType = RecordType(ANY row_key, (VARCHAR(65535), ANY) MAP orders): rowcount = 1572864.0, cumulative cost = {1572864.0 rows, 3145728.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 33866
{code}
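The mechanism above can be sketched outside of Drill. The following is a purely hypothetical Python model (none of these names come from Drill's code base): a reader that defaults a never-seen column to nullable INT produces a different schema for a region that returned zero rows than for a region that returned data, so a downstream receiver merging both minor fragments sees conflicting schemas.

```python
# Hypothetical model of per-region schema inference; NOT Drill's actual API.
def schema_for_region(rows, projected):
    """Infer a per-region schema; columns never observed default to nullable-int,
    mimicking Drill's placeholder type for unknown columns."""
    schema = {}
    for col in projected:
        if any(col in row for row in rows):
            schema[col] = "VARBINARY"     # HBase cell values arrive as bytes
        else:
            schema[col] = "NULLABLE_INT"  # placeholder for a column never seen
    return schema

# Region 1 matched the pushed-down filter; region 3 was filtered to empty.
region1 = [{"row_key": b"1", "orders.id": b"100"}]
region3 = []

s1 = schema_for_region(region1, ["row_key", "orders.id"])
s3 = schema_for_region(region3, ["row_key", "orders.id"])
assert s1 != s3  # the receiver would see two conflicting schemas
```

Under this model, the fix direction is clear: an empty batch should either carry the table's projected schema or be dropped before the receiver merges fragments.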



> Empty batch returning from HBase may cause SchemaChangeException or incorrect 
> query result
> -----------------------------------------------------------------------------------------
>
>                 Key: DRILL-5480
>                 URL: https://issues.apache.org/jira/browse/DRILL-5480
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Jinfeng Ni
>            Assignee: Jinfeng Ni
>
> The following repro was provided by [~haozhu].
> 1. Create an HBase table with 4 regions:
> {code}
> create 'myhbase', 'cf1','cf2', {SPLITS => ['a', 'b', 'c']}
> put 'myhbase','a','cf1:col1','somedata'
> put 'myhbase','b','cf1:col2','somedata'
> put 'myhbase','c','cf2:col1','somedata'
> {code}
> One region has 'cf1:col1'. One region has column family 'cf1', but does not 
> have 'col1' under 'cf1'. One region has only column family 'cf2'. And the 
> last region is completely empty.
> 2. Prepare a csv file.
> {code}
> select * from dfs.tmp.`joinhbase.csv`;
> +-------------------+
> |      columns      |
> +-------------------+
> | ["1","somedata"]  |
> | ["2","somedata"]  |
> | ["3","somedata"]  |
> {code}
> Now run the following query on drill 1.11.0-SNAPSHOT:
> {code}
> select cast(H.row_key as varchar(10)) as keyCol, CONVERT_FROM(H.cf1.col1, 
> 'UTF8') as col1
> from 
> hbase.myhbase H JOIN dfs.tmp.`joinhbase.csv` C
> ON CONVERT_FROM(H.cf1.col1, 'UTF8')= C.columns[1]
> ;
> {code}
> The correct query result should be:
> {code}
> +---------+-----------+
> | keyCol  |   col1    |
> +---------+-----------+
> | a       | somedata  |
> | a       | somedata  |
> | a       | somedata  |
> +---------+-----------+
> {code}
> Turn off broadcast join, and we will see either a SchemaChangeException or an 
> incorrect result, randomly. By 'randomly', we mean that in the same session 
> the same query may hit a SchemaChangeException in one run and return an 
> incorrect result in a second run.
> {code}
> alter session set `planner.enable_broadcast_join`=false;
> {code}
> {code}
> select cast(H.row_key as varchar(10)) as keyCol,
>        CONVERT_FROM(H.cf1.col1, 'UTF8') as col1
> from hbase.myhbase H JOIN dfs.tmp.`joinhbase.csv` C
> ON CONVERT_FROM(H.cf1.col1, 'UTF8') = C.columns[1];
> Error: SYSTEM ERROR: SchemaChangeException: Hash join does not support schema 
> changes
> {code}
> {code}
> +---------+-------+
> | keyCol  | col1  |
> +---------+-------+
> +---------+-------+
> No rows selected (0.302 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
