[
https://issues.apache.org/jira/browse/DRILL-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14384497#comment-14384497
]
Khurram Faraaz commented on DRILL-2608:
---------------------------------------
Physical plan for the failing query
00-00    Screen : rowType = RecordType(ANY key): rowcount = 343.0, cumulative cost = {1122.3 rows, 1122.3 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12812
00-01      UnionAll(all=[true]) : rowType = RecordType(ANY key): rowcount = 343.0, cumulative cost = {1088.0 rows, 1088.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12811
00-03        UnionAll(all=[true]) : rowType = RecordType(ANY key): rowcount = 138.0, cumulative cost = {540.0 rows, 540.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12809
00-05          UnionAll(all=[true]) : rowType = RecordType(ANY key): rowcount = 105.0, cumulative cost = {369.0 rows, 369.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12807
00-07            UnionAll(all=[true]) : rowType = RecordType(ANY key): rowcount = 101.0, cumulative cost = {260.0 rows, 260.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12805
00-09              UnionAll(all=[true]) : rowType = RecordType(ANY key): rowcount = 58.0, cumulative cost = {116.0 rows, 116.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12803
00-11                Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/charData.json, numFiles=1, columns=[`key`], files=[maprfs:/tmp/charData.json]]]) : rowType = RecordType(ANY key): rowcount = 18.0, cumulative cost = {18.0 rows, 18.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12801
00-10                Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/dateData.json, numFiles=1, columns=[`key`], files=[maprfs:/tmp/dateData.json]]]) : rowType = RecordType(ANY key): rowcount = 40.0, cumulative cost = {40.0 rows, 40.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12802
00-08              Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/doubleData.json, numFiles=1, columns=[`key`], files=[maprfs:/tmp/doubleData.json]]]) : rowType = RecordType(ANY key): rowcount = 43.0, cumulative cost = {43.0 rows, 43.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12804
00-06            Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/intData.json, numFiles=1, columns=[`key`], files=[maprfs:/tmp/intData.json]]]) : rowType = RecordType(ANY key): rowcount = 4.0, cumulative cost = {4.0 rows, 4.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12806
00-04          Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/timeStmpData.json, numFiles=1, columns=[`key`], files=[maprfs:/tmp/timeStmpData.json]]]) : rowType = RecordType(ANY key): rowcount = 33.0, cumulative cost = {33.0 rows, 33.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12808
00-02        Scan(groupscan=[EasyGroupScan [selectionRoot=/tmp/vrChrData.json, numFiles=1, columns=[`key`], files=[maprfs:/tmp/vrChrData.json]]]) : rowType = RecordType(ANY key): rowcount = 205.0, cumulative cost = {205.0 rows, 205.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, id = 12810
> Union all query fails when json.all_text_mode=false
> ---------------------------------------------------
>
> Key: DRILL-2608
> URL: https://issues.apache.org/jira/browse/DRILL-2608
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 0.9.0
> Environment: | 9d92b8e319f2d46e8659d903d355450e15946533 | DRILL-2580:
> Exit early from HashJoinBatch if build side is empty | 26.03.2015 @ 16:13:53
> EDT | Unknown | 26.03.2015 @ 16:53:21 EDT |
> Reporter: Khurram Faraaz
> Assignee: Sean Hsuan-Yi Chu
>
> A union all query over JSON data files fails when store.json.all_text_mode is
> set to false; the same query returns correct results when
> store.json.all_text_mode is set to true. Each JSON data file contains only one
> type of object, {"key":<value>}, and all values within a given file are of
> the same datatype. The test was executed on a 4-node cluster.
> {code}
> 0: jdbc:drill:> select key from `charData.json` union all select key from
> `dateData.json` union all select key from `doubleData.json` union all select
> key from `intData.json` union all select key from `timeStmpData.json` union
> all select key from `vrChrData.json`;
> Query failed: RemoteRpcException: Failure while running fragment., For input
> string: "itzVxYBb" [ f1f81073-161c-4f24-89e5-37379413b01b on
> centos-04.qa.lab:31010 ]
> [ f1f81073-161c-4f24-89e5-37379413b01b on centos-04.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
> {code}
> I then ran: alter session set `store.json.all_text_mode`=true;
> With store.json.all_text_mode set to true, the union all query returned
> correct results.
> {code}
> 0: jdbc:drill:> select key from `charData.json` union all select key from
> `dateData.json` union all select key from `doubleData.json` union all select
> key from `intData.json` union all select key from `timeStmpData.json` union
> all select key from `vrChrData.json`;
> ...
> +------------+
> 7,194 rows selected (0.462 seconds)
> {code}
> Resetting store.json.all_text_mode to false reproduces the same exception:
> {code}
> 0: jdbc:drill:> alter session set `store.json.all_text_mode`=false;
> +------------+------------+
> | ok | summary |
> +------------+------------+
> | true | store.json.all_text_mode updated. |
> +------------+------------+
> 1 row selected (0.049 seconds)
> 0: jdbc:drill:> select key from `charData.json` union all select key from
> `dateData.json` union all select key from `doubleData.json` union all select
> key from `intData.json` union all select key from `timeStmpData.json` union
> all select key from `vrChrData.json`;
> Query failed: RemoteRpcException: Failure while running fragment., For input
> string: "itzVxYBb" [ 412eda0e-cc22-43ae-b763-5e40a0326551 on
> centos-04.qa.lab:31010 ]
> [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
> {code}
> Stack trace from drillbit.log
> {code}
> 2015-03-27 18:30:56,620 [2aea5e1e-88b9-3e4e-07b5-d7e46b29756f:frag:0:0] ERROR o.a.drill.exec.work.foreman.Foreman - Error b9cb90bd-7d89-4061-8595-4c5ad983f3f3: RemoteRpcException: Failure while running fragment., For input string: "itzVxYBb" [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
> [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
> org.apache.drill.exec.rpc.RemoteRpcException: Failure while running fragment., For input string: "itzVxYBb" [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
> [ 412eda0e-cc22-43ae-b763-5e40a0326551 on centos-04.qa.lab:31010 ]
>     at org.apache.drill.exec.work.foreman.QueryManager.statusUpdate(QueryManager.java:163) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>     at org.apache.drill.exec.work.foreman.QueryManager$RootStatusReporter.statusChange(QueryManager.java:281) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>     at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:114) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>     at org.apache.drill.exec.work.fragment.AbstractStatusReporter.fail(AbstractStatusReporter.java:110) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.internalFail(FragmentExecutor.java:230) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>     at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:182) [drill-java-exec-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>     at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-0.9.0-SNAPSHOT-rebuffed.jar:0.9.0-SNAPSHOT]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
>     at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
> {code}
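A note on the error text: the wording `For input string: "itzVxYBb"` matches the message the JDK attaches to a NumberFormatException when a non-numeric string is handed to a numeric parser. The trace above does not include the root cause, so this is only a guess at where the message comes from, not a confirmed diagnosis. The sketch below is plain JDK code, not Drill code; the class name `NumberFormatDemo` and the helper `parseFailureMessage` are made up for illustration.

```java
public class NumberFormatDemo {

    /**
     * Tries to parse the given text as a double.
     * Returns the exception message if parsing fails, or null if it succeeds.
     * (Hypothetical helper, for illustration only.)
     */
    public static String parseFailureMessage(String text) {
        try {
            Double.parseDouble(text);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // "itzVxYBb" is the value quoted in the failure message of this issue.
        System.out.println(parseFailureMessage("itzVxYBb"));
        // A numeric string parses cleanly, so there is no message to report.
        System.out.println(parseFailureMessage("3.14"));
    }
}
```

If that guess is right, some operator in the plan is treating `key` as numeric when it meets character data. That would fit the plan above, where every input is typed as `ANY key`, and with the query succeeding once all values are read as text.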