cswarth opened a new issue #11003:
URL: https://github.com/apache/druid/issues/11003


   The [Protobuf extension 
documentation](https://druid.apache.org/docs/latest/development/extensions-core/protobuf.html)
 demonstrates use of the extension to decode Kafka events.
   Can the protobuf extension also be used to decode files, or is it only 
suitable for streaming input?
   
   I tried to write an example "index_parallel" task definition that uses 
protobuf, but it is rejected:
   ```
   {"error":"Cannot construct instance of 
`org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexIngestionSpec`,
     problem: Cannot use parser and inputSource together. Try using inputFormat 
instead of parser.
    at [Source: (org.eclipse.jetty.server.HttpInputOverHTTP); line: 77, column: 
1] 
   (through reference chain: 
org.apache.druid.indexing.common.task.batch.parallel.ParallelIndexSupervisorTask[\"spec\"])"
   }
   ```
   
   Task definition:
   ```
   curl -v http://localhost:8888/druid/indexer/v1/task -H 'Content-Type: 
application/json' -d '
   {
     "type": "index_parallel",
     "spec": {
       "ioConfig": {
         "type": "index_parallel",
         "inputSource": {
           "type": "local",
           "filter": "metrics.bin",
           "baseDir": "./"
         }
       },
       "tuningConfig": {
         "type": "index_parallel",
         "partitionsSpec": {
           "type": "dynamic"
         }
       },
       "dataSchema": {
         "dataSource": "metrics",
         "parser": {
           "type": "protobuf",
           "descriptor": "file:///tmp/metrics.desc",
           "protoMessageType": "Metrics",
           "parseSpec": {
             "format": "json",
             "timestampSpec": {
               "column": "timestamp",
               "format": "auto"
             },
             "dimensionsSpec": {
               "dimensions": [
                 "unit",
                 "http_method",
                 "http_code",
                 "page",
                 "metricType",
                 "server"
               ],
               "dimensionExclusions": [
                 "timestamp",
                 "value"
               ]
             }
           }
         },
         "metricsSpec": [
           {
             "name": "count",
             "type": "count"
           },
           {
             "name": "value_sum",
             "fieldName": "value",
             "type": "doubleSum"
           },
           {
             "name": "value_min",
             "fieldName": "value",
             "type": "doubleMin"
           },
           {
             "name": "value_max",
             "fieldName": "value",
             "type": "doubleMax"
           }
         ],
         "granularitySpec": {
           "type": "uniform",
           "segmentGranularity": "HOUR",
           "queryGranularity": "NONE"
         }
       }
     }
   }
   '
   ```
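
   The error suggests using `inputFormat` instead of `parser`. I imagine the 
intended replacement would be something along these lines (I am guessing at 
the `inputFormat` shape here; the `protoBytesDecoder` object is my assumption 
based on the extension docs, and the `timestampSpec`/`dimensionsSpec` would 
presumably move to the top level of `dataSchema`):
   ```
   "ioConfig": {
     "type": "index_parallel",
     "inputSource": {
       "type": "local",
       "filter": "metrics.bin",
       "baseDir": "./"
     },
     "inputFormat": {
       "type": "protobuf",
       "protoBytesDecoder": {
         "type": "file",
         "descriptor": "file:///tmp/metrics.desc",
         "protoMessageType": "Metrics"
       }
     }
   }
   ```
   Is something like this supported for batch ingestion, or is the protobuf 
decoder wired up only for the streaming (Kafka) path?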

