irisshainsky opened a new issue #9408: Druid 0.17 native parallel batch 
ingestion with orc files fails
URL: https://github.com/apache/druid/issues/9408
 
 
   I'm trying to run a native parallel batch ingestion from S3 into Druid. The 
inputFormat is orc.
    I deployed a Druid 0.17 cluster and enabled druid-orc-extensions.
   When I run the ingestion task, the sub task logs the following 
exception: 
    org.apache.hadoop.util.VersionInfo - Could not read 
'common-version-info.properties', java.io.IOException: Resource not found
   
   It looks as if something is wrong with how hadoop-common is loaded, but the 
log shows hadoop-common-2.8.5.jar among the jars listed when the extension was 
loaded. Is this a known issue?
   Thanks
   
   [druid-orc-extensions], jars: hadoop-auth-2.8.5.jar, 
druid-orc-extensions-0.17.0.jar, commons-digester-1.8.jar, orc-shims-1.5.6.jar, 
hadoop-hdfs-client-2.8.5.jar, commons-configuration-1.6.jar, 
protobuf-java-3.11.0.jar, hadoop-annotations-2.8.5.jar, 
jackson-core-asl-1.9.13.jar, hadoop-common-2.8.5.jar, orc-mapreduce-1.5.6.jar, 
jackson-mapper-asl-1.9.13.jar, hadoop-mapreduce-client-core-2.8.5.jar, 
hive-storage-api-2.6.0.jar, htrace-core4-4.0.1-incubating.jar, 
orc-core-1.5.6.jar, aircompressor-0.10.jar
   2020-02-26T11:22:58,090 INFO [main] 
org.apache.druid.initialization.Initialization - Loading extension 
[druid-s3-extensions], jars: druid-s3-extensions-0.17.0.jar
   2020-02-26T11:22:58,091 INFO [main] 
org.apache.druid.initialization.Initialization - Loading extension 
[postgresql-metadata-storage], jars: postgresql-metadata-storage-0.17.0.jar, 
postgresql-42.2.8.jar
   2020-02-26T11:22:58,093 INFO [main] 
org.apache.druid.initialization.Initialization - Loading extension 
[statsd-emitter], jars: jnr-unixsocket-0.18.jar, jnr-ffi-2.1.4.jar, 
jffi-1.2.15.jar, java-dogstatsd-client-2.6.1.jar, jnr-posix-3.0.35.jar, 
asm-util-5.0.3.jar, jnr-constants-0.9.8.jar, statsd-emitter-0.17.0.jar, 
asm-tree-5.0.3.jar, jffi-1.2.15-native.jar, jnr-enxio-0.16.jar, 
jnr-x86asm-1.0.2.jar, asm-analysis-5.0.3.jar, asm-7.1.jar, asm-commons-7.1.jar
   
   
   2020-02-26T11:23:04,836 WARN [task-runner-0-priority-0] 
org.apache.hadoop.util.VersionInfo - Could not read 
'common-version-info.properties', java.io.IOException: Resource not found
   java.io.IOException: Resource not found
        at org.apache.hadoop.util.VersionInfo.<init>(VersionInfo.java:49) ~[?:?]
        at org.apache.hadoop.util.VersionInfo.<clinit>(VersionInfo.java:99) 
~[?:?]
        at 
org.apache.orc.impl.HadoopShimsFactory.get(HadoopShimsFactory.java:52) ~[?:?]
        at 
org.apache.orc.impl.RecordReaderUtils.<clinit>(RecordReaderUtils.java:47) ~[?:?]
        at 
org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:257) ~[?:?]
        at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:649) ~[?:?]
        at 
org.apache.druid.data.input.orc.OrcReader.intermediateRowIterator(OrcReader.java:98)
 ~[?:?]
        at 
org.apache.druid.data.input.IntermediateRowParsingReader.read(IntermediateRowParsingReader.java:43)
 ~[druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.data.input.impl.InputEntityIteratingReader.lambda$read$0(InputEntityIteratingReader.java:78)
 ~[druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIeteratorIfNecessary(CloseableIterator.java:83)
 [druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.java.util.common.parsers.CloseableIterator$2.<init>(CloseableIterator.java:69)
 [druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.java.util.common.parsers.CloseableIterator.flatMap(CloseableIterator.java:67)
 [druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.data.input.impl.InputEntityIteratingReader.createIterator(InputEntityIteratingReader.java:103)
 [druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.data.input.impl.InputEntityIteratingReader.read(InputEntityIteratingReader.java:74)
 [druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.segment.transform.TransformingInputSourceReader.read(TransformingInputSourceReader.java:43)
 [druid-processing-0.17.0.jar:0.17.0]
        at 
org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:442)
 [druid-indexing-service-0.17.0.jar:0.17.0]
        at 
org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:225)
 [druid-indexing-service-0.17.0.jar:0.17.0]
        at 
org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:138)
 [druid-indexing-service-0.17.0.jar:0.17.0]
        at 
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419)
 [druid-indexing-service-0.17.0.jar:0.17.0]
        at 
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391)
 [druid-indexing-service-0.17.0.jar:0.17.0]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[?:1.8.0_242]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_242]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_242]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
   2020-02-26T11:23:04,847 ERROR [task-runner-0-priority-0] 
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Uncaught 
Throwable while running 
task[AbstractTask{id='single_phase_sub_task_tasks_v4_eedbneak_2020-02-26T11:22:55.339Z',
 groupId='index_parallel_tasks_v4_ofddmnmk_2020-02-26T11:22:17.567Z', 
taskResource=TaskResource{availabilityGroup='single_phase_sub_task_tasks_v4_eedbneak_2020-02-26T11:22:55.339Z',
 requiredCapacity=1}, dataSource='tasks_v4', context={forceTimeChunkLock=true}}]
   java.lang.ExceptionInInitializerError: null
        at 
org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:257) ~[?:?]
        at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:649) ~[?:?]
        at 
org.apache.druid.data.input.orc.OrcReader.intermediateRowIterator(OrcReader.java:98)
 ~[?:?]
        at 
org.apache.druid.data.input.IntermediateRowParsingReader.read(IntermediateRowParsingReader.java:43)
 ~[druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.data.input.impl.InputEntityIteratingReader.lambda$read$0(InputEntityIteratingReader.java:78)
 ~[druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIeteratorIfNecessary(CloseableIterator.java:83)
 ~[druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.java.util.common.parsers.CloseableIterator$2.<init>(CloseableIterator.java:69)
 ~[druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.java.util.common.parsers.CloseableIterator.flatMap(CloseableIterator.java:67)
 ~[druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.data.input.impl.InputEntityIteratingReader.createIterator(InputEntityIteratingReader.java:103)
 ~[druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.data.input.impl.InputEntityIteratingReader.read(InputEntityIteratingReader.java:74)
 ~[druid-core-0.17.0.jar:0.17.0]
        at 
org.apache.druid.segment.transform.TransformingInputSourceReader.read(TransformingInputSourceReader.java:43)
 ~[druid-processing-0.17.0.jar:0.17.0]
        at 
org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:442)
 ~[druid-indexing-service-0.17.0.jar:0.17.0]
        at 
org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:225)
 ~[druid-indexing-service-0.17.0.jar:0.17.0]
        at 
org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:138)
 ~[druid-indexing-service-0.17.0.jar:0.17.0]
        at 
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419)
 [druid-indexing-service-0.17.0.jar:0.17.0]
        at 
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391)
 [druid-indexing-service-0.17.0.jar:0.17.0]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
[?:1.8.0_242]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_242]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_242]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
   Caused by: java.lang.NumberFormatException: For input string: "Unknown"
        at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) 
~[?:1.8.0_242]
        at java.lang.Integer.parseInt(Integer.java:580) ~[?:1.8.0_242]
        at java.lang.Integer.parseInt(Integer.java:615) ~[?:1.8.0_242]
        at 
org.apache.orc.impl.HadoopShimsFactory.get(HadoopShimsFactory.java:53) ~[?:?]
        at 
org.apache.orc.impl.RecordReaderUtils.<clinit>(RecordReaderUtils.java:47) ~[?:?]
        ... 20 more
   Error!
   java.lang.RuntimeException: java.util.concurrent.ExecutionException: 
java.lang.ExceptionInInitializerError
        at 
org.apache.druid.indexing.worker.executor.ExecutorLifecycle.join(ExecutorLifecycle.java:215)
        at org.apache.druid.cli.CliPeon.run(CliPeon.java:288)
        at org.apache.druid.cli.Main.main(Main.java:113)
   Caused by: java.util.concurrent.ExecutionException: 
java.lang.ExceptionInInitializerError
        at 
com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
        at 
com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
        at 
com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
        at 
org.apache.druid.indexing.worker.executor.ExecutorLifecycle.join(ExecutorLifecycle.java:212)
        ... 2 more
   Caused by: java.lang.ExceptionInInitializerError
        at 
org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:257)
        at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:649)
        at 
org.apache.druid.data.input.orc.OrcReader.intermediateRowIterator(OrcReader.java:98)
        at 
org.apache.druid.data.input.IntermediateRowParsingReader.read(IntermediateRowParsingReader.java:43)
        at 
org.apache.druid.data.input.impl.InputEntityIteratingReader.lambda$read$0(InputEntityIteratingReader.java:78)
        at 
org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIeteratorIfNecessary(CloseableIterator.java:83)
        at 
org.apache.druid.java.util.common.parsers.CloseableIterator$2.<init>(CloseableIterator.java:69)
        at 
org.apache.druid.java.util.common.parsers.CloseableIterator.flatMap(CloseableIterator.java:67)
        at 
org.apache.druid.data.input.impl.InputEntityIteratingReader.createIterator(InputEntityIteratingReader.java:103)
        at 
org.apache.druid.data.input.impl.InputEntityIteratingReader.read(InputEntityIteratingReader.java:74)
        at 
org.apache.druid.segment.transform.TransformingInputSourceReader.read(TransformingInputSourceReader.java:43)
        at 
org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:442)
        at 
org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:225)
        at 
org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:138)
        at 
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419)
        at 
org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.NumberFormatException: For input string: "Unknown"
        at 
java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:580)
        at java.lang.Integer.parseInt(Integer.java:615)
        at 
org.apache.orc.impl.HadoopShimsFactory.get(HadoopShimsFactory.java:53)
        at 
org.apache.orc.impl.RecordReaderUtils.<clinit>(RecordReaderUtils.java:47)
        ... 20 more
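
   The two failures in the log appear to be linked: the WARN shows the version 
properties resource was not found, and the later NumberFormatException shows 
the string "Unknown" being passed to Integer.parseInt. The sketch below is a 
minimal, self-contained illustration of that chain. It does not use Hadoop or 
ORC code; the class name VersionLookupSketch and its two methods are 
hypothetical, and the assumption that the lookup goes through the thread's 
context class loader is my reading of the symptom, not something the log 
confirms.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Hypothetical sketch, not Hadoop or ORC source code.
public class VersionLookupSketch {

    // Looks up a properties resource through the given class loader and
    // falls back to "Unknown" when the resource is missing, mirroring the
    // WARN line in the log above.
    static String readVersion(ClassLoader loader, String resource) {
        Properties props = new Properties();
        try (InputStream in = loader.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("Resource not found");
            }
            props.load(in);
        } catch (IOException e) {
            System.out.println("Could not read '" + resource + "', " + e);
        }
        return props.getProperty("version", "Unknown");
    }

    // Parses the major version from a dotted version string. With the
    // fallback value "Unknown" this throws NumberFormatException.
    static int majorVersion(String version) {
        return Integer.parseInt(version.split("\\.")[0]);
    }

    public static void main(String[] args) {
        // Assumption: the lookup uses the thread's context class loader,
        // which may not be the loader that holds the extension jars.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        String version = readVersion(loader, "common-version-info.properties");
        System.out.println("version = " + version);
        try {
            System.out.println("major = " + majorVersion(version));
        } catch (NumberFormatException e) {
            System.out.println("parse failed: " + e.getMessage());
        }
    }
}
```

   If this reading is right, the jar being listed at extension load time and 
the resource being invisible at task run time are not contradictory: the 
resource only needs to be unreachable from the class loader used for the 
lookup.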
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
