Abhishek Girish created DRILL-5304:
--------------------------------------
Summary: Queries fail intermittently when there is skew in data distribution
Key: DRILL-5304
URL: https://issues.apache.org/jira/browse/DRILL-5304
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.10.0
Reporter: Abhishek Girish
Assignee: Padma Penumarthy
In a distributed environment, we have observed certain queries failing
intermittently with an assignment logic issue when the underlying data is
unevenly distributed across the nodes.
For example, TPC-H query 7 failed with the following error:
{code}
java.sql.SQLException: SYSTEM ERROR: IllegalArgumentException: MinorFragmentId
105 has no read entries assigned
...
(org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception
during fragment initialization: MinorFragmentId 105 has no read entries assigned
org.apache.drill.exec.work.foreman.Foreman.run():281
java.util.concurrent.ThreadPoolExecutor.runWorker():1145
java.util.concurrent.ThreadPoolExecutor$Worker.run():615
java.lang.Thread.run():744
Caused By (java.lang.IllegalArgumentException) MinorFragmentId 105 has no
read entries assigned
{code}
A log containing the full stack trace is attached.
For this query, the underlying TPC-H SF100 Parquet dataset was observed to
reside mostly on just 2-3 nodes of an 8-node DFS environment. The data
distribution skew on this cluster is most likely the trigger for this failure,
since the same query on the same dataset does not fail on a different test
cluster (with possibly different data distribution).
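To make the suspected failure mode concrete, here is a minimal sketch of how a purely locality-driven assignment could leave some minor fragments without work when data sits on only a few nodes. This is an illustration under stated assumptions, not Drill's actual assignment code: the class `SkewedAssignmentSketch`, the method `fragmentsWithoutWork`, the host names, and the rule "assign each read entry only to a fragment on the host that stores it" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical illustration only; NOT Drill's real assignment logic.
 * Each read entry (for example a Parquet row group) is handed to a minor
 * fragment running on the host that stores the entry locally.
 */
public class SkewedAssignmentSketch {

    /** Returns the ids of minor fragments that end up with no read entries. */
    public static List<Integer> fragmentsWithoutWork(String[] fragmentHosts, String[] entryHosts) {
        // Group minor fragment ids by the host they run on.
        Map<String, List<Integer>> fragmentsByHost = new HashMap<>();
        for (int id = 0; id < fragmentHosts.length; id++) {
            fragmentsByHost.computeIfAbsent(fragmentHosts[id], h -> new ArrayList<>()).add(id);
        }

        int[] assignedCount = new int[fragmentHosts.length];
        Map<String, Integer> seenPerHost = new HashMap<>();
        for (String host : entryHosts) {
            List<Integer> local = fragmentsByHost.get(host);
            if (local == null) {
                continue; // no fragment on that host; ignored in this sketch
            }
            // Round-robin among the fragments that are local to this host.
            int index = seenPerHost.merge(host, 1, Integer::sum) - 1;
            assignedCount[local.get(index % local.size())]++;
        }

        List<Integer> empty = new ArrayList<>();
        for (int id = 0; id < assignedCount.length; id++) {
            if (assignedCount[id] == 0) {
                empty.add(id);
            }
        }
        return empty;
    }

    public static void main(String[] args) {
        // Assumed layout: 8 nodes with one minor fragment each, all data on 3 nodes.
        String[] fragmentHosts = {"node0", "node1", "node2", "node3", "node4", "node5", "node6", "node7"};
        String[] entryHosts = {"node0", "node1", "node2", "node0", "node1", "node0"};
        System.out.println(fragmentsWithoutWork(fragmentHosts, entryHosts)); // prints [3, 4, 5, 6, 7]
    }
}
```

In this toy layout, five of the eight fragments receive nothing. A check that requires every minor fragment to have at least one read entry would then reject them, which matches the shape of the error above.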
Another
[query](https://github.com/mapr/drill-test-framework/blob/master/framework/resources/Functional/limit0/window_functions/bugs/data/drill-3700.sql)
failed with a similar error when the slice target was set to 1:
{code}
Failed with exception
java.sql.SQLException: SYSTEM ERROR: IllegalArgumentException: MinorFragmentId
66 has no read entries assigned
...
(org.apache.drill.exec.work.foreman.ForemanException) Unexpected exception
during fragment initialization: MinorFragmentId 66 has no read entries assigned
{code}
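One possible reading of why a slice target of 1 exposes the problem is that a very low slice target pushes parallelization to its cap, so there can be more minor fragments than read entries to hand out. The sketch below only illustrates that arithmetic. The width rule, the class `SliceTargetSketch`, and all numbers are assumptions chosen for the example, not values taken from the failing run or from Drill's planner code.

```java
/**
 * Hypothetical illustration only; the width rule below is an assumed
 * simplification, not Drill's actual parallelizer.
 */
public class SliceTargetSketch {

    /** Assumed rule: one minor fragment per sliceTarget rows, capped at maxWidth. */
    public static int width(long estimatedRows, long sliceTarget, int maxWidth) {
        long wanted = Math.max(1, (estimatedRows + sliceTarget - 1) / sliceTarget);
        return (int) Math.min(wanted, maxWidth);
    }

    /** Fragments that cannot receive a read entry because there are too few entries. */
    public static int fragmentsLeftEmpty(int width, int readEntries) {
        return Math.max(0, width - readEntries);
    }

    public static void main(String[] args) {
        long estimatedRows = 50_000L; // assumed row estimate for a small scan
        int maxWidth = 100;           // assumed cap on parallelization
        int readEntries = 4;          // assumed number of row groups to scan

        int wide = width(estimatedRows, 1L, maxWidth);
        int narrow = width(estimatedRows, 100_000L, maxWidth);

        System.out.println("slice target 100000: width=" + narrow
                + ", empty=" + fragmentsLeftEmpty(narrow, readEntries));
        System.out.println("slice target 1: width=" + wide
                + ", empty=" + fragmentsLeftEmpty(wide, readEntries));
    }
}
```

Under these assumed numbers, the large slice target yields a single fragment with work to do, while a slice target of 1 yields 100 fragments competing for 4 read entries.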