[
https://issues.apache.org/jira/browse/DRILL-4884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15603615#comment-15603615
]
Ted Dunning commented on DRILL-4884:
------------------------------------
I just did an experiment (which must be flawed) to try to recreate this problem.
First, I created a table:
{code}
drop table maprfs.ted.`q1.parquet`;
create table maprfs.ted.`q1.parquet` as
with x1(a,b) as (values (1, rand()-0.5), (1, rand()-0.5)),
x2 as (select t1.a as a, t1.b + t2.b + t3.b + t4.b as b from x1 t1, x1 t2, x1
t3, x1 t4
where t1.a = t2.a and t2.a = t3.a and t3.a = t4.a),
x3 as (select t1.a as a, t1.b + t2.b + t3.b + t4.b as b from x2 t1, x2 t2, x2
t3, x2 t4
where t1.a = t2.a and t2.a = t3.a and t3.a = t4.a) ,
x4 as (select t1.a as a, t1.b + t2.b + t3.b + t4.b as b from x1 t1, x1 t2, x1
t3, x3 t4
where t1.a = t2.a and t2.a = t3.a and t3.a = t4.a)
select * from x4;
{code}
This table has about half a million rows (x1 has 2 rows, x2 has 2^4 = 16, x3 has
16^4 = 65536, and x4 has 2 * 2 * 2 * 65536 = 524288):
{code}
0: jdbc:drill:> select count(*) from maprfs.ted.`q1.parquet`;
+---------+
| EXPR$0 |
+---------+
| 524288 |
+---------+
{code}
Unfortunately, I can't get Drill to fail using a limit of 65536±1, or 100,000,
or 200,000.
Is the phrase "non batched scanner" somehow magical here? Or do I need to have
multiple files in a directory?
> Drill produced IOB exception while querying data of 65536 limitation using
> non batched reader
> ---------------------------------------------------------------------------------------------
>
> Key: DRILL-4884
> URL: https://issues.apache.org/jira/browse/DRILL-4884
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning & Optimization
> Affects Versions: 1.8.0
> Environment: CentOS 6.5 / JAVA 8
> Reporter: Hongze Zhang
> Assignee: Jinfeng Ni
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Drill produces an IOB exception when using a non-batched scanner and limiting
> the SQL result at the 65536 boundary.
> SQL:
> {noformat}
> select id from xx limit 1 offset 65535
> {noformat}
> Result:
> {noformat}
>   at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) ~[classes/:na]
>   at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:324) [classes/:na]
>   at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:184) [classes/:na]
>   at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:290) [classes/:na]
>   at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [classes/:na]
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
>   at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
> Caused by: java.lang.IndexOutOfBoundsException: index: 131072, length: 2 (expected: range(0, 131072))
>   at io.netty.buffer.DrillBuf.checkIndexD(DrillBuf.java:175) ~[classes/:4.0.27.Final]
>   at io.netty.buffer.DrillBuf.chk(DrillBuf.java:197) ~[classes/:4.0.27.Final]
>   at io.netty.buffer.DrillBuf.setChar(DrillBuf.java:517) ~[classes/:4.0.27.Final]
>   at org.apache.drill.exec.record.selection.SelectionVector2.setIndex(SelectionVector2.java:79) ~[classes/:na]
>   at org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.limitWithNoSV(LimitRecordBatch.java:167) ~[classes/:na]
>   at org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.doWork(LimitRecordBatch.java:145) ~[classes/:na]
>   at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:93) ~[classes/:na]
>   at org.apache.drill.exec.physical.impl.limit.LimitRecordBatch.innerNext(LimitRecordBatch.java:115) ~[classes/:na]
>   at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[classes/:na]
>   at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) ~[classes/:na]
>   at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[classes/:na]
>   at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[classes/:na]
>   at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[classes/:na]
>   at org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext(RemovingRecordBatch.java:94) ~[classes/:na]
>   at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[classes/:na]
>   at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) ~[classes/:na]
>   at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[classes/:na]
>   at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[classes/:na]
>   at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[classes/:na]
>   at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:132) ~[classes/:na]
>   at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162) ~[classes/:na]
>   at org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:215) ~[classes/:na]
>   at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) ~[classes/:na]
>   at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:81) ~[classes/:na]
>   at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) ~[classes/:na]
>   at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:256) ~[classes/:na]
>   at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:250) ~[classes/:na]
>   at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_101]
>   at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_101]
>   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) ~[hadoop-common-2.7.1.jar:na]
>   at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:250) [classes/:na]
> ... 4 common frames omitted
> {noformat}
> Code from IteratorValidatorBatchIterator.java shows that an incoming batch
> returning exactly 65536 records is accepted (the check is strictly greater
> than):
> {noformat}
> if (incoming.getRecordCount() > MAX_BATCH_SIZE) { // MAX_BATCH_SIZE == 65536
>   throw new IllegalStateException(
>       String.format(
>           "Incoming batch [#%d, %s] has size %d, which is beyond the"
>               + " limit of %d",
>           instNum, batchTypeName, incoming.getRecordCount(), MAX_BATCH_SIZE));
> }
> {noformat}
> Code from LimitRecordBatch.java shows that the copy loop does not terminate
> as expected when the incoming batch returns 65536 records:
> {noformat}
> private void limitWithNoSV(int recordCount) {
>   final int offset = Math.max(0, Math.min(recordCount - 1, recordsToSkip));
>   recordsToSkip -= offset;
>   int fetch;
>   if (noEndLimit) {
>     fetch = recordCount;
>   } else {
>     fetch = Math.min(recordCount, offset + recordsLeft);
>     recordsLeft -= Math.max(0, fetch - offset);
>   }
>   int svIndex = 0;
>   // Since fetch == recordCount == 65536, i is incremented from 65535 to
>   // 65536, wraps to 0 because of the char type's range limit, and the loop
>   // abnormally continues.
>   for (char i = (char) offset; i < fetch; svIndex++, i++) {
>     outgoingSv.setIndex(svIndex, i);
>   }
>   outgoingSv.setRecordCount(svIndex);
> }
> {noformat}
> The IllegalStateException should be thrown when the incoming batch returns
> more than 65535 records, rather than more than 65536.
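
The two quoted snippets interact: the validator's strict `>` comparison admits a batch of exactly 65536 records, and the `char`-typed loop counter then wraps from 65535 back to 0 and never reaches the exit condition. The following standalone sketch (hypothetical class and method names, not Drill code) reproduces both behaviors in isolation; the demo loop bails out once the index exceeds the record count instead of overrunning a buffer as Drill does:

```java
// Standalone sketch of the two interacting bugs (hypothetical names, not Drill code).
public class Drill4884Demo {
    static final int MAX_BATCH_SIZE = 65536;

    // Mirrors the validator's strict '>' comparison: exactly 65536 passes.
    static boolean validatorRejects(int recordCount) {
        return recordCount > MAX_BATCH_SIZE;
    }

    // Counts iterations of the buggy char-typed loop. Returns -1 if the
    // counter wraps (65535 -> 0) and the loop fails to terminate on its own.
    static int loopIterations(int offset, int fetch) {
        int svIndex = 0;
        for (char i = (char) offset; i < fetch; svIndex++, i++) {
            if (svIndex > fetch) {
                return -1; // i wrapped around and can never reach fetch
            }
        }
        return svIndex;
    }

    public static void main(String[] args) {
        System.out.println(validatorRejects(65536));  // false: the batch passes validation
        System.out.println(loopIterations(0, 65535)); // 65535: loop terminates normally
        System.out.println(loopIterations(0, 65536)); // -1: char counter wraps, loop runs away
    }
}
```

With an `int` loop counter (or a validator threshold of 65535), both failure modes disappear, which matches the fix the description proposes.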
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)