egor-ryashin opened a new issue #7438: druid-orc-extensions hadoop-common 
dependency is broken
URL: https://github.com/apache/incubator-druid/issues/7438
 
 
   The `hadoop-common` dependency of `druid-orc-extensions` is broken, or 
perhaps the extension isn't properly documented.
   
   ### Affected Version
   
   0.13.0-incubating
   
   ### Description
   
   Using these modules:
   `druid.extensions.loadList=["mysql-metadata-storage", 
"druid-kafka-indexing-service", "druid-orc-extensions", "druid-hdfs-storage"]`
   ```
   du extensions/*
   20032   extensions/druid-cassandra-storage
   46928   extensions/druid-hdfs-storage
   4240    extensions/druid-kafka-indexing-service
   168     extensions/druid-lookups-cached-global
   56640   extensions/druid-orc-extensions
   136     extensions/druid-s3-extensions
   1968    extensions/mysql-metadata-storage
   ```
   Posting this task:
   ```json
   {
     "type": "index_parallel",
     "spec": {
       "dataSchema": {
         "dataSource": "my_orc_test",
         "metricsSpec": [
           {
             "type": "count",
             "name": "count"
           }
         ],
         "granularitySpec": {
           "segmentGranularity": "DAY",
           "queryGranularity": "second",
           "intervals": ["2018-07-10/2018-07-11"]
         },
         "parser": {
           "type": "orc",
           "parseSpec": {
             "format": "timeAndDims",
             "timestampSpec": {
               "column": "time",
               "format": "auto"
             },
             "dimensionsSpec": {
               "dimensions": [
                 "tag"
               ],
               "dimensionExclusions": [],
               "spatialDimensions": []
             }
           },
           "typeString": "struct<time:string,tag:string>",
           "mapFieldNameFormat": "<PARENT>_<CHILD>"
         }
       },
       "ioConfig": {
         "type": "index_parallel",
         "firehose": {
           "type": "local",
           "baseDir": "./",
           "filter": "*.orc"
         }
       }
     }
   }
   ```
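
For anyone reproducing this, the task can be submitted to the Overlord's `POST /druid/indexer/v1/task` endpoint. The sketch below is only an illustration: the host and port (`localhost:8090`) and the helper names `build_task_request` and `submit_task` are assumptions, not part of the original report.

```python
import json
import urllib.request

# Assumption: the Overlord listens on localhost:8090; adjust for your cluster.
OVERLORD_URL = "http://localhost:8090/druid/indexer/v1/task"


def build_task_request(spec, url=OVERLORD_URL):
    """Build (but do not send) the POST request that submits a task spec."""
    body = json.dumps(spec).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit_task(spec, url=OVERLORD_URL):
    """Send the task spec and return the Overlord's JSON reply as a dict."""
    with urllib.request.urlopen(build_task_request(spec, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the spec above saved to a file (say, a hypothetical `task.json`), calling `submit_task(json.load(open("task.json")))` would post it.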
   The spawned subtask failed with this error:
   ```
   2019-04-10T22:16:10,413 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner - Uncaught Throwable while running task[AbstractTask{id='index_sub_my_orc_test_2019-04-10T22:16:02.661Z', groupId='index_parallel_my_orc_test_2019-04-10T22:15:55.541Z', taskResource=TaskResource{availabilityGroup='index_sub_my_orc_test_2019-04-10T22:16:02.661Z', requiredCapacity=1}, dataSource='my_orc_test', context={}}]
   java.lang.NoClassDefFoundError: org/apache/hadoop/io/Writable
   ```
   The log also shows that the required dependency jar was loaded beforehand:
   ```
   2019-04-10T22:16:04,921 INFO [main] org.apache.druid.initialization.Initialization - added URL[file:/Users/egorryashin/a-druid-0.13-i/extensions/druid-hdfs-storage/hadoop-common-2.8.3.jar] for extension[druid-hdfs-storage]
   ```
   The task fails both with `druid-hdfs-storage` loaded and without it.
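
One way to narrow this down is to check which extension directories actually ship a jar containing `org/apache/hadoop/io/Writable.class`. As far as I understand, Druid loads each extension with its own classloader, so a jar under `druid-hdfs-storage` may not be visible to `druid-orc-extensions`. The following is a minimal sketch, not a Druid tool; the function name `jars_containing` is made up for illustration, and it only inspects top-level `*.jar` files in each extension directory.

```python
import zipfile
from pathlib import Path

# Class file whose absence produces the NoClassDefFoundError above.
TARGET = "org/apache/hadoop/io/Writable.class"


def jars_containing(extensions_dir, class_entry=TARGET):
    """Map each extension directory name to the jars in it that contain class_entry."""
    hits = {}
    for ext in sorted(Path(extensions_dir).iterdir()):
        if not ext.is_dir():
            continue
        found = []
        for jar in sorted(ext.glob("*.jar")):
            try:
                with zipfile.ZipFile(jar) as zf:
                    if class_entry in zf.namelist():
                        found.append(jar.name)
            except zipfile.BadZipFile:
                # Skip files that are not readable zip archives.
                continue
        hits[ext.name] = found
    return hits
```

Running `jars_containing("extensions")` from the Druid install directory returns a dict; an empty list for `druid-orc-extensions` would suggest that the extension does not bundle the class itself.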
   
   I spotted this while investigating 
https://github.com/apache/incubator-druid/issues/6925
   
