Hi,
Thanks for the reply. It looks like the error ultimately comes down to running out of
direct memory. Here is the full error message:
Error: DATA_READ ERROR: Error processing input: , line=6026, char=7284350.
Content parsed: [ ]
Failure while reading file s3a://<bucket/file>.gz. Happened at or shortly
before byte position 929686.
Fragment 1:171
[Error Id: ce3d41af-5ee2-448a-97ee-206b601acd25 on <host>:31010]
(com.univocity.parsers.common.TextParsingException) Error processing input: ,
line=6026, char=7284350. Content parsed: [ ]
org.apache.drill.exec.store.easy.text.compliant.TextReader.handleException():480
org.apache.drill.exec.store.easy.text.compliant.TextReader.parseNext():389
org.apache.drill.exec.store.easy.text.compliant.CompliantTextRecordReader.next():196
org.apache.drill.exec.physical.impl.ScanBatch.next():191
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext():91
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.physical.impl.BaseRootExec.next():104
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
org.apache.drill.exec.physical.impl.BaseRootExec.next():94
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():257
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():251
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1657
org.apache.drill.exec.work.fragment.FragmentExecutor.run():251
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745
Caused By (org.apache.drill.exec.exception.OutOfMemoryException) Failure
allocating buffer.
io.netty.buffer.PooledByteBufAllocatorL.allocate():64
org.apache.drill.exec.memory.AllocationManager.<init>():80
org.apache.drill.exec.memory.BaseAllocator.bufferWithoutReservation():239
org.apache.drill.exec.memory.BaseAllocator.buffer():221
org.apache.drill.exec.memory.BaseAllocator.buffer():191
org.apache.drill.exec.vector.UInt4Vector.reAlloc():217
org.apache.drill.exec.store.easy.text.compliant.RepeatedVarCharOutput.expandVarCharOffsets():212
org.apache.drill.exec.store.easy.text.compliant.RepeatedVarCharOutput.endField():255
org.apache.drill.exec.store.easy.text.compliant.TextReader.parseField():325
org.apache.drill.exec.store.easy.text.compliant.TextReader.parseRecord():141
org.apache.drill.exec.store.easy.text.compliant.TextReader.parseNext():370
org.apache.drill.exec.store.easy.text.compliant.CompliantTextRecordReader.next():196
org.apache.drill.exec.physical.impl.ScanBatch.next():191
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext():91
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.physical.impl.BaseRootExec.next():104
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
org.apache.drill.exec.physical.impl.BaseRootExec.next():94
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():257
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():251
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1657
org.apache.drill.exec.work.fragment.FragmentExecutor.run():251
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745
Caused By (java.lang.OutOfMemoryError) Direct buffer memory
java.nio.Bits.reserveMemory():693
java.nio.DirectByteBuffer.<init>():123
java.nio.ByteBuffer.allocateDirect():311
io.netty.buffer.PoolArena$DirectArena.newChunk():437
io.netty.buffer.PoolArena.allocateNormal():179
io.netty.buffer.PoolArena.allocate():168
io.netty.buffer.PoolArena.allocate():98
io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.newDirectBufferL():165
io.netty.buffer.PooledByteBufAllocatorL$InnerAllocator.directBuffer():195
io.netty.buffer.PooledByteBufAllocatorL.allocate():62
org.apache.drill.exec.memory.AllocationManager.<init>():80
org.apache.drill.exec.memory.BaseAllocator.bufferWithoutReservation():239
org.apache.drill.exec.memory.BaseAllocator.buffer():221
org.apache.drill.exec.memory.BaseAllocator.buffer():191
org.apache.drill.exec.vector.UInt4Vector.reAlloc():217
org.apache.drill.exec.store.easy.text.compliant.RepeatedVarCharOutput.expandVarCharOffsets():212
org.apache.drill.exec.store.easy.text.compliant.RepeatedVarCharOutput.endField():255
org.apache.drill.exec.store.easy.text.compliant.TextReader.parseField():325
org.apache.drill.exec.store.easy.text.compliant.TextReader.parseRecord():141
org.apache.drill.exec.store.easy.text.compliant.TextReader.parseNext():370
org.apache.drill.exec.store.easy.text.compliant.CompliantTextRecordReader.next():196
org.apache.drill.exec.physical.impl.ScanBatch.next():191
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():129
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():109
org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext():91
org.apache.drill.exec.record.AbstractRecordBatch.next():162
org.apache.drill.exec.physical.impl.BaseRootExec.next():104
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext():92
org.apache.drill.exec.physical.impl.BaseRootExec.next():94
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():257
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():251
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1657
org.apache.drill.exec.work.fragment.FragmentExecutor.run():251
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745 (state=,code=0)
I have MAX_DIRECT_MEMORY set to 128G and MAX_HEAP_MEMORY set to 32G, which I
assume applies per node. I have also set planner.memory.max_query_memory_per_node
to a very high value. However, the web console reports the Maximum Direct
Memory as 8,589,934,592 bytes (8 GB), which is far lower than what I configured.
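For reference, this is roughly what I have been running from sqlline to check what
each drillbit actually sees (just a quick sketch; I am assuming the sys.memory and
sys.options system tables report the effective per-node values):
-- per-drillbit heap and direct memory limits (the *_max columns are what I
-- compare against my settings)
select * from sys.memory;
-- confirm that the per-query memory option actually took effect
select * from sys.options where name = 'planner.memory.max_query_memory_per_node';
If direct_max there also comes back around 8 GB on every node, I am guessing my
MAX_DIRECT_MEMORY override simply is not being picked up by the drillbits.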
Tanmay
On Thursday, 16 June 2016 2:22 PM, Jinfeng Ni <[email protected]> wrote:
Sounds like the data at line 2026 is either not well formatted, or it hit a
bug in Drill.
Can you try the following?
1) Turn on verbose error messages and run the query again, to see if
the verbose error message tells us more:
alter session set `exec.errors.verbose` = true;
2) If possible, check line 2026 in your input file to see if there is
anything suspicious.
On Thu, Jun 16, 2016 at 1:49 PM, Tanmay Solanki
<[email protected]> wrote:
> Hello,
> I am currently running Apache Drill on a 20-node cluster and am running into
> some errors that I was hoping you might be able to help me with.
>
> I am attempting to run the following query to create a Parquet table in a new
> S3 bucket from another table that is in TSV format:
> create table s3_output.tmp.`<output file>` as select
> columns[0], columns[1], columns[2], columns[3], columns[4], columns[5],
> columns[6], columns[7], columns[8], columns[9],
> columns[10], columns[11], columns[12], columns[13], columns[14], columns[15],
> columns[16], columns[17], columns[18], columns[19],
> columns[20], columns[21], columns[22], columns[23], columns[24], columns[25],
> columns[26], columns[27], columns[28], columns[29],
> columns[30], columns[31], columns[32], columns[33], columns[34], columns[35],
> columns[36], columns[37], columns[38], columns[39],
> columns[40], columns[41], columns[42], columns[43], columns[44], columns[45],
> columns[46], columns[47], columns[48], columns[49],
> columns[50], columns[51], columns[52], columns[53], columns[54], columns[55],
> columns[56], columns[57], columns[58], columns[59],
> columns[60], columns[61], columns[62], columns[63], columns[64], columns[65],
> columns[66], columns[67], columns[68], columns[69],
> columns[70], columns[71], columns[72], columns[73], columns[74], columns[75],
> columns[76], columns[77], columns[78], columns[79],
> columns[80], columns[81], columns[82], columns[83], columns[84], columns[85],
> columns[86], columns[87], columns[88], columns[89],
> columns[90], columns[91], columns[92], columns[93], columns[94], columns[95],
> columns[96], columns[97], columns[98], columns[99],
> columns[100], columns[101], columns[102], columns[103], columns[104],
> columns[105], columns[106], columns[107], columns[108], columns[109],
> columns[110], columns[111], columns[112], columns[113], columns[114],
> columns[115], columns[116], columns[117], columns[118], columns[119],
> columns[120], columns[121], columns[122], columns[123], columns[124],
> columns[125], columns[126], columns[127], columns[128], columns[129],
> columns[130], columns[131], columns[132], columns[133], columns[134],
> columns[135], columns[136], columns[137], columns[138], columns[139],
> columns[140], columns[141], columns[142], columns[143], columns[144],
> columns[145], columns[146], columns[147], columns[148], columns[149],
> columns[150], columns[151], columns[152], columns[153], columns[154],
> columns[155], columns[156], columns[157], columns[158], columns[159],
> columns[160], columns[161], columns[162], columns[163], columns[164],
> columns[165], columns[166], columns[167], columns[168], columns[169],
> columns[170], columns[171], columns[172], columns[173] from s3input.`<input
> path>*.gz`;
> This is the error output I get while running this query.
> Error: DATA_READ ERROR: Error processing input: , line=2026, char=2449781.
> Content parsed: [ ]
>
> Failure while reading file s3a://<input bucket/file>.gz. Happened at or
> shortly before byte position 329719.
> Fragment 1:19
>
> [Error Id: fe289e19-c7b7-4739-9960-c15b8a62af3b on <node 6>:31010]
> (state=,code=0)
> Do you have any idea how I can go about trying to solve this issue?
> Thanks for any help!
> Tanmay Solanki