[
https://issues.apache.org/jira/browse/DRILL-5470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15999712#comment-15999712
]
Paul Rogers commented on DRILL-5470:
------------------------------------
To illustrate the CSV data corruption, I created a CSV file, test4.csv, of the
following form:
{code}
h,u
abc,def
ghi
{code}
Then, I created a simple test using the "cluster fixture" framework:
{code}
@Test
public void readerTest() throws Exception {
  FixtureBuilder builder = ClusterFixture.builder()
      .maxParallelization(1);
  try (ClusterFixture cluster = builder.build();
       ClientFixture client = cluster.clientFixture()) {
    TextFormatConfig csvFormat = new TextFormatConfig();
    csvFormat.fieldDelimiter = ',';
    csvFormat.skipFirstLine = false;
    csvFormat.extractHeader = true;
    cluster.defineWorkspace("dfs", "data", "/tmp/data", "csv", csvFormat);
    String sql = "SELECT * FROM `dfs.data`.`csv/test4.csv` LIMIT 10";
    client.queryBuilder().sql(sql).printCsv();
  }
}
{code}
The results show we've got a problem:
{code}
Exception (no rows returned):
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
IllegalArgumentException: length: -3 (expected: >= 0)
{code}
If the last line were:
{code}
efg,
{code}
Then the offset vector should look like this:
{code}
[0, 3, 3]
{code}
Very likely we have an offset vector that looks like this instead:
{code}
[0, 3, 0]
{code}
When we compute the length of the second column's value in the second row, we should get:
{code}
length = offset[2] - offset[1] = 3 - 3 = 0
{code}
Instead we get:
{code}
length = offset[2] - offset[1] = 0 - 3 = -3
{code}
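The arithmetic above can be reproduced in a few lines of plain Java. This is only a sketch: the class and method names are hypothetical, and a bare int array stands in for the offset vector of one variable-width column. It does not use Drill's actual vector classes.

```java
import java.util.Arrays;

// Hypothetical stand-in for a variable-width column's offset vector;
// this is not Drill's VarCharVector API.
final class OffsetLengthSketch {

  // Length of the value at `row`: the difference between adjacent offsets.
  static int valueLength(int[] offsets, int row) {
    return offsets[row + 1] - offsets[row];
  }

  public static void main(String[] args) {
    int[] expected = {0, 3, 3};   // end offset of the empty value was written
    int[] suspected = {0, 3, 0};  // end offset of the last value was left at 0

    // Second row (index 1) of the column in each case.
    System.out.println(Arrays.toString(expected) + " -> length "
        + valueLength(expected, 1));
    System.out.println(Arrays.toString(suspected) + " -> length "
        + valueLength(suspected, 1));
  }
}
```

With the suspected vector, the computed length for the second row is negative, which matches the "length: -3" in the exception message.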
Somehow, in the user's scenario, the numbers are far larger and the value has
wrapped around to the bogus length shown.
The summary is that a premature EOF appears to cause the "missing" columns to
be skipped; they are not filled with a blank value to "bump" the offset vectors
to fill in the last row. Instead, they are left at 0, causing havoc downstream
in the query.
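As a rough illustration of the "bump" described here, the sketch below carries the previous end offset forward for every row whose value was never written, so each missing value reads as an empty string. The names are hypothetical and the int array again stands in for an offset vector; this is an assumption about the shape of a fix, not the actual Drill change.

```java
// Hypothetical sketch of back-filling offsets for values that a truncated
// row never wrote; not taken from Drill's code.
final class FillEmptiesSketch {

  // `offsets` holds rowCount + 1 entries. `lastWrittenRow` is the index of
  // the last row whose end offset was actually set (-1 if none were).
  static void fillEmpties(int[] offsets, int lastWrittenRow, int rowCount) {
    for (int row = lastWrittenRow + 1; row < rowCount; row++) {
      // An empty value ends where it starts.
      offsets[row + 1] = offsets[row];
    }
  }

  public static void main(String[] args) {
    int[] offsets = {0, 3, 0};    // second row's end offset left at 0
    fillEmpties(offsets, 0, 2);   // only row 0 was written; 2 rows total
    System.out.println(java.util.Arrays.toString(offsets));
  }
}
```

After the call, every adjacent pair of offsets is non-decreasing, so no downstream length computation can go negative.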
> CSV reader data corruption on truncated lines
> ---------------------------------------------
>
> Key: DRILL-5470
> URL: https://issues.apache.org/jira/browse/DRILL-5470
> Project: Apache Drill
> Issue Type: Bug
> Components: Server
> Affects Versions: 1.10.0
> Environment: - ubuntu 14.04
> - r3.8xl (32 CPU/240GB Mem)
> - openjdk version "1.8.0_111"
> - drill 1.10.0 with 8656c83b00f8ab09fb6817e4e9943b2211772541 cherry-picked
> Reporter: Nathan Butler
> Assignee: Paul Rogers
>
> Per the mailing list discussion and Rahul's and Paul's suggestion I'm filing
> this Jira issue. Drill seems to be running out of memory when doing an
> External Sort. Per Zelaine's suggestion I enabled
> sort.external.disable_managed in drill-override.conf and in the sqlline
> session. This caused the query to run for longer but it still would fail with
> the same message.
> Per Paul's suggestion, I enabled debug logging for the
> org.apache.drill.exec.physical.impl.xsort.managed package and re-ran the
> query.
> Here's the initial DEBUG line for ExternalSortBatch for our query:
> bq. 2017-05-03 12:02:56,095 [26f600f1-17b3-d649-51be-2ca0c9bf7606:frag:2:15]
> DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Config: memory limit = 10737418240,
> spill file size = 268435456, spill batch size = 8388608, merge limit =
> 2147483647, merge batch size = 16777216
> And here's the last DEBUG line before the stack trace:
> bq. 2017-05-03 12:37:44,249 [26f600f1-17b3-d649-51be-2ca0c9bf7606:frag:2:4]
> DEBUG o.a.d.e.p.i.x.m.ExternalSortBatch - Available memory: 10737418240,
> buffer memory = 10719535268, merge memory = 10707140978
> And the stacktrace:
> {quote}
> 2017-05-03 12:38:02,927 [26f600f1-17b3-d649-51be-2ca0c9bf7606:frag:2:6] INFO
> o.a.d.e.p.i.x.m.ExternalSortBatch - User Error Occurred: External Sort
> encountered an error while spilling to disk (Unable to
> allocate buffer of size 268435456 due to memory limit. Current
> allocation: 10579849472)
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: External
> Sort encountered an error while spilling to disk
> [Error Id: 5d53c677-0cd9-4c01-a664-c02089670a1c ]
> at
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544)
> ~[drill-common-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.doMergeAndSpill(ExternalSortBatch.java:1447)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.mergeAndSpill(ExternalSortBatch.java:1376)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.spillFromMemory(ExternalSortBatch.java:1339)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.processBatch(ExternalSortBatch.java:831)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.loadBatch(ExternalSortBatch.java:618)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load(ExternalSortBatch.java:660)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext(ExternalSortBatch.java:559)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.innerNext(StreamingAggBatch.java:137)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:162)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.physical.impl.partitionsender.PartitionSenderRootExec.innerNext(PartitionSenderRootExec.java:144)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:232)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:226)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at java.security.AccessController.doPrivileged(Native Method)
> [na:1.8.0_111]
> at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_111]
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> [hadoop-common-2.7.1.jar:na]
> at
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:226)
> [drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
> [drill-common-1.10.0.jar:1.10.0]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [na:1.8.0_111]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_111]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_111]
> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: Unable to
> allocate buffer of size 268435456 due to memory limit. Current allocation:
> 10579849472
> at
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:220)
> ~[drill-memory-base-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:195)
> ~[drill-memory-base-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.vector.VarCharVector.reAlloc(VarCharVector.java:425)
> ~[vector-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.vector.VarCharVector.copyFromSafe(VarCharVector.java:278)
> ~[vector-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.vector.NullableVarCharVector.copyFromSafe(NullableVarCharVector.java:379)
> ~[vector-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.test.generated.PriorityQueueCopierGen140.doCopy(PriorityQueueCopierTemplate.java:22)
> ~[na:na]
> at
> org.apache.drill.exec.test.generated.PriorityQueueCopierGen140.next(PriorityQueueCopierTemplate.java:76)
> ~[na:na]
> at
> org.apache.drill.exec.physical.impl.xsort.managed.CopierHolder$BatchMerger.next(CopierHolder.java:234)
> ~[drill-java-exec-1.10.0.jar:1.10.0]
> at
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.doMergeAndSpill(ExternalSortBatch.java:1408)
> [drill-java-exec-1.10.0.jar:1.10.0]
> ... 24 common frames omitted
> {quote}
> I'm in communication with Paul and will send him the full log file.
> Thanks,
> Nathan