irisshainsky opened a new issue #9408: Druid 0.17 native parallel batch ingestion with orc files fails
URL: https://github.com/apache/druid/issues/9408

I'm trying to run a native parallel batch ingestion from S3 into Druid, with the inputFormat set to orc. I deployed a Druid 0.17 cluster and enabled the druid-orc-extensions extension. When the ingestion task runs, its sub-tasks fail with the following exception:

    org.apache.hadoop.util.VersionInfo - Could not read 'common-version-info.properties', java.io.IOException: Resource not found

It looks as if something is wrong with how hadoop-common is loaded, yet the task log shows its jar being loaded together with the extension (see below). Is this a known issue? Thanks. (A minimal sketch of the ingestion spec shape I'm using is at the end of this report.)

Extension loading, from the task log:

    Loading extension [druid-orc-extensions], jars: hadoop-auth-2.8.5.jar, druid-orc-extensions-0.17.0.jar, commons-digester-1.8.jar, orc-shims-1.5.6.jar, hadoop-hdfs-client-2.8.5.jar, commons-configuration-1.6.jar, protobuf-java-3.11.0.jar, hadoop-annotations-2.8.5.jar, jackson-core-asl-1.9.13.jar, hadoop-common-2.8.5.jar, orc-mapreduce-1.5.6.jar, jackson-mapper-asl-1.9.13.jar, hadoop-mapreduce-client-core-2.8.5.jar, hive-storage-api-2.6.0.jar, htrace-core4-4.0.1-incubating.jar, orc-core-1.5.6.jar, aircompressor-0.10.jar
    2020-02-26T11:22:58,090 INFO [main] org.apache.druid.initialization.Initialization - Loading extension [druid-s3-extensions], jars: druid-s3-extensions-0.17.0.jar
    2020-02-26T11:22:58,091 INFO [main] org.apache.druid.initialization.Initialization - Loading extension [postgresql-metadata-storage], jars: postgresql-metadata-storage-0.17.0.jar, postgresql-42.2.8.jar
    2020-02-26T11:22:58,093 INFO [main] org.apache.druid.initialization.Initialization - Loading extension [statsd-emitter], jars: jnr-unixsocket-0.18.jar, jnr-ffi-2.1.4.jar, jffi-1.2.15.jar, java-dogstatsd-client-2.6.1.jar, jnr-posix-3.0.35.jar, asm-util-5.0.3.jar, jnr-constants-0.9.8.jar, statsd-emitter-0.17.0.jar, asm-tree-5.0.3.jar, jffi-1.2.15-native.jar, jnr-enxio-0.16.jar, jnr-x86asm-1.0.2.jar, asm-analysis-5.0.3.jar, asm-7.1.jar, asm-commons-7.1.jar

Failure, from the same task log:

    2020-02-26T11:23:04,836 WARN [task-runner-0-priority-0] org.apache.hadoop.util.VersionInfo - Could not read 'common-version-info.properties', java.io.IOException: Resource not found
    java.io.IOException: Resource not found
        at org.apache.hadoop.util.VersionInfo.<init>(VersionInfo.java:49) ~[?:?]
        at org.apache.hadoop.util.VersionInfo.<clinit>(VersionInfo.java:99) ~[?:?]
        at org.apache.orc.impl.HadoopShimsFactory.get(HadoopShimsFactory.java:52) ~[?:?]
        at org.apache.orc.impl.RecordReaderUtils.<clinit>(RecordReaderUtils.java:47) ~[?:?]
        at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:257) ~[?:?]
        at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:649) ~[?:?]
        at org.apache.druid.data.input.orc.OrcReader.intermediateRowIterator(OrcReader.java:98) ~[?:?]
        at org.apache.druid.data.input.IntermediateRowParsingReader.read(IntermediateRowParsingReader.java:43) ~[druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.data.input.impl.InputEntityIteratingReader.lambda$read$0(InputEntityIteratingReader.java:78) ~[druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIeteratorIfNecessary(CloseableIterator.java:83) [druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.java.util.common.parsers.CloseableIterator$2.<init>(CloseableIterator.java:69) [druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.java.util.common.parsers.CloseableIterator.flatMap(CloseableIterator.java:67) [druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.data.input.impl.InputEntityIteratingReader.createIterator(InputEntityIteratingReader.java:103) [druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.data.input.impl.InputEntityIteratingReader.read(InputEntityIteratingReader.java:74) [druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.segment.transform.TransformingInputSourceReader.read(TransformingInputSourceReader.java:43) [druid-processing-0.17.0.jar:0.17.0]
        at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:442) [druid-indexing-service-0.17.0.jar:0.17.0]
        at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:225) [druid-indexing-service-0.17.0.jar:0.17.0]
        at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:138) [druid-indexing-service-0.17.0.jar:0.17.0]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419) [druid-indexing-service-0.17.0.jar:0.17.0]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391) [druid-indexing-service-0.17.0.jar:0.17.0]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
    2020-02-26T11:23:04,847 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Uncaught Throwable while running task[AbstractTask{id='single_phase_sub_task_tasks_v4_eedbneak_2020-02-26T11:22:55.339Z', groupId='index_parallel_tasks_v4_ofddmnmk_2020-02-26T11:22:17.567Z', taskResource=TaskResource{availabilityGroup='single_phase_sub_task_tasks_v4_eedbneak_2020-02-26T11:22:55.339Z', requiredCapacity=1}, dataSource='tasks_v4', context={forceTimeChunkLock=true}}]
    java.lang.ExceptionInInitializerError: null
        at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:257) ~[?:?]
        at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:649) ~[?:?]
        at org.apache.druid.data.input.orc.OrcReader.intermediateRowIterator(OrcReader.java:98) ~[?:?]
        at org.apache.druid.data.input.IntermediateRowParsingReader.read(IntermediateRowParsingReader.java:43) ~[druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.data.input.impl.InputEntityIteratingReader.lambda$read$0(InputEntityIteratingReader.java:78) ~[druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIeteratorIfNecessary(CloseableIterator.java:83) ~[druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.java.util.common.parsers.CloseableIterator$2.<init>(CloseableIterator.java:69) ~[druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.java.util.common.parsers.CloseableIterator.flatMap(CloseableIterator.java:67) ~[druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.data.input.impl.InputEntityIteratingReader.createIterator(InputEntityIteratingReader.java:103) ~[druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.data.input.impl.InputEntityIteratingReader.read(InputEntityIteratingReader.java:74) ~[druid-core-0.17.0.jar:0.17.0]
        at org.apache.druid.segment.transform.TransformingInputSourceReader.read(TransformingInputSourceReader.java:43) ~[druid-processing-0.17.0.jar:0.17.0]
        at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:442) ~[druid-indexing-service-0.17.0.jar:0.17.0]
        at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:225) ~[druid-indexing-service-0.17.0.jar:0.17.0]
        at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:138) ~[druid-indexing-service-0.17.0.jar:0.17.0]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419) [druid-indexing-service-0.17.0.jar:0.17.0]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391) [druid-indexing-service-0.17.0.jar:0.17.0]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
    Caused by: java.lang.NumberFormatException: For input string: "Unknown"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_242]
        at java.lang.Integer.parseInt(Integer.java:580) ~[?:1.8.0_242]
        at java.lang.Integer.parseInt(Integer.java:615) ~[?:1.8.0_242]
        at org.apache.orc.impl.HadoopShimsFactory.get(HadoopShimsFactory.java:53) ~[?:?]
        at org.apache.orc.impl.RecordReaderUtils.<clinit>(RecordReaderUtils.java:47) ~[?:?]
        ... 20 more
    Error! java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.ExceptionInInitializerError
        at org.apache.druid.indexing.worker.executor.ExecutorLifecycle.join(ExecutorLifecycle.java:215)
        at org.apache.druid.cli.CliPeon.run(CliPeon.java:288)
        at org.apache.druid.cli.Main.main(Main.java:113)
    Caused by: java.util.concurrent.ExecutionException: java.lang.ExceptionInInitializerError
        at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:299)
        at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:286)
        at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
        at org.apache.druid.indexing.worker.executor.ExecutorLifecycle.join(ExecutorLifecycle.java:212)
        ... 2 more
    Caused by: java.lang.ExceptionInInitializerError
        at org.apache.orc.impl.RecordReaderImpl.<init>(RecordReaderImpl.java:257)
        at org.apache.orc.impl.ReaderImpl.rows(ReaderImpl.java:649)
        at org.apache.druid.data.input.orc.OrcReader.intermediateRowIterator(OrcReader.java:98)
        at org.apache.druid.data.input.IntermediateRowParsingReader.read(IntermediateRowParsingReader.java:43)
        at org.apache.druid.data.input.impl.InputEntityIteratingReader.lambda$read$0(InputEntityIteratingReader.java:78)
        at org.apache.druid.java.util.common.parsers.CloseableIterator$2.findNextIeteratorIfNecessary(CloseableIterator.java:83)
        at org.apache.druid.java.util.common.parsers.CloseableIterator$2.<init>(CloseableIterator.java:69)
        at org.apache.druid.java.util.common.parsers.CloseableIterator.flatMap(CloseableIterator.java:67)
        at org.apache.druid.data.input.impl.InputEntityIteratingReader.createIterator(InputEntityIteratingReader.java:103)
        at org.apache.druid.data.input.impl.InputEntityIteratingReader.read(InputEntityIteratingReader.java:74)
        at org.apache.druid.segment.transform.TransformingInputSourceReader.read(TransformingInputSourceReader.java:43)
        at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.generateAndPushSegments(SinglePhaseSubTask.java:442)
        at org.apache.druid.indexing.common.task.batch.parallel.SinglePhaseSubTask.runTask(SinglePhaseSubTask.java:225)
        at org.apache.druid.indexing.common.task.AbstractBatchIndexTask.run(AbstractBatchIndexTask.java:138)
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:419)
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:391)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    Caused by: java.lang.NumberFormatException: For input string: "Unknown"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:580)
        at java.lang.Integer.parseInt(Integer.java:615)
        at org.apache.orc.impl.HadoopShimsFactory.get(HadoopShimsFactory.java:53)
        at org.apache.orc.impl.RecordReaderUtils.<clinit>(RecordReaderUtils.java:47)
        ... 20 more
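The two traces appear connected: hadoop-common's VersionInfo falls back to the string "Unknown" when it cannot read common-version-info.properties, and ORC's HadoopShimsFactory then tries to parse that string as a numeric Hadoop version, producing the NumberFormatException above. Below is a minimal sketch of that interaction, paraphrased from what the stack traces show; the class bodies are simplified illustrations, not the exact upstream code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

// Sketch of org.apache.hadoop.util.VersionInfo's behavior: it loads
// <component>-version-info.properties via the thread context classloader
// and falls back to "Unknown" when the resource cannot be read.
class VersionInfoSketch {
  private final Properties info = new Properties();

  VersionInfoSketch(String component) {
    String resource = component + "-version-info.properties";
    try (InputStream is = Thread.currentThread().getContextClassLoader()
        .getResourceAsStream(resource)) {
      if (is == null) {
        // Matches the WARN in the task log:
        // "Could not read 'common-version-info.properties', java.io.IOException: Resource not found"
        throw new IOException("Resource not found");
      }
      info.load(is);
    } catch (IOException e) {
      // hadoop-common logs the warning and continues with an empty Properties object
    }
  }

  String getVersion() {
    return info.getProperty("version", "Unknown");
  }
}

// Sketch of org.apache.orc.impl.HadoopShimsFactory.get(): it selects a shim
// by parsing the major/minor parts of the Hadoop version string.
class HadoopShimsFactorySketch {
  public static void main(String[] args) {
    String[] parts = new VersionInfoSketch("common").getVersion().split("\\.");
    // With the resource missing, parts[0] is "Unknown", so this throws
    // java.lang.NumberFormatException: For input string: "Unknown"
    int major = Integer.parseInt(parts[0]);
    System.out.println("Hadoop major version: " + major);
  }
}
```

If that reading is right, the resource lookup fails because it goes through the thread context classloader rather than the extension classloader that holds hadoop-common-2.8.5.jar, which would explain why the jar appears in the extension-loading log yet the properties file is not found at read time. That is an assumption on my part, though.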
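For completeness, the ingestion spec shape referenced above looks like the following. This is a hypothetical minimal version: the bucket, prefix, timestamp column, and schema fields are placeholders, not the real values from my cluster (only the tasks_v4 datasource name comes from the task log).

```json
{
  "type": "index_parallel",
  "spec": {
    "ioConfig": {
      "type": "index_parallel",
      "inputSource": {
        "type": "s3",
        "prefixes": ["s3://example-bucket/example-prefix/"]
      },
      "inputFormat": { "type": "orc" }
    },
    "dataSchema": {
      "dataSource": "tasks_v4",
      "timestampSpec": { "column": "timestamp", "format": "auto" },
      "dimensionsSpec": { "dimensions": [] },
      "granularitySpec": { "type": "uniform", "segmentGranularity": "day" }
    },
    "tuningConfig": {
      "type": "index_parallel",
      "maxNumConcurrentSubTasks": 2
    }
  }
}
```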
