internaulte commented on issue #10866:
URL: https://github.com/apache/druid/issues/10866#issuecomment-858687921
Hello there!
Sorry for the slow response, I had to work on other things.
As I said, we randomly get a similar error (the logs are not exactly the
same, but they look quite similar to me) when indexing data into Druid from a
Kafka stream. We are on Druid 0.21.0, but we had the same problem on 0.20.1.
When we create data in Druid, we actually create 4 supervisors, all following
a similar schema and pointing to slightly different datasources.
Example of a supervisor spec:
```
[
  {
    "type": "kafka",
    "spec": {
      "dataSchema": {
        "dataSource": "testdatasource",
        "timestampSpec": {
          "column": "date",
          "format": "auto"
        },
        "dimensionsSpec": {
          "dimensions": [
            "number",
            {
              "type": "long",
              "name": "duration"
            },
            "string",
            "otherstring",
            "boolean"
          ]
        },
        "metricsSpec": [],
        "granularitySpec": {
          "type": "uniform",
          "segmentGranularity": "YEAR",
          "queryGranularity": "NONE",
          "rollup": false
        }
      },
      "ioConfig": {
        "type": "kafka",
        "topic": "testdatasource",
        "consumerProperties": {
          "bootstrap.servers": "kafka-broker:00001"
        },
        "inputFormat": {
          "type": "json"
        },
        "useEarliestOffset": true
      },
      "tuningConfig": {
        "type": "kafka",
        "logParseExceptions": true,
        "intermediateHandoffPeriod": "PT30M"
      }
    }
  }
]
```
Example of data in the Kafka topic:
```
{
  "date": "2010-10-10T08:05:00.000Z",
  "number": "8000",
  "duration": 300000,
  "string": "3",
  "otherstring": "b3701aefc6797a9c883067ef71341db1",
  "boolean": true
}
```
Usually everything is fine, but randomly one of the 4 ingestion tasks fails
(4 tasks because we created 4 supervisors). It seems to happen only when
ingesting a few rows (let's say fewer than 10).
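For anyone who wants to try reproducing this with a small batch, a minimal sketch like the one below could generate a handful of records shaped like the sample above. The function name `make_records` and the generated values are made up for illustration; only the field names and types come from our sample message.

```python
import json
from datetime import datetime, timedelta, timezone


def make_records(count, start=datetime(2010, 10, 10, 8, 5, tzinfo=timezone.utc)):
    """Build `count` records with the same fields as the sample message.

    Hypothetical helper: the values are synthetic, only the shape matches.
    """
    records = []
    for i in range(count):
        ts = start + timedelta(minutes=5 * i)
        records.append({
            # ISO-8601 with millisecond precision and a trailing "Z", like the sample
            "date": ts.strftime("%Y-%m-%dT%H:%M:%S.") + f"{ts.microsecond // 1000:03d}Z",
            "number": str(8000 + i),      # string dimension
            "duration": 300000,           # long dimension
            "string": str(i % 10),
            "otherstring": f"{i:032x}",   # 32 hex characters, like the sample hash
            "boolean": i % 2 == 0,
        })
    return records


if __name__ == "__main__":
    # One JSON object per line, e.g. to feed to a Kafka console producer.
    for record in make_records(5):
        print(json.dumps(record))
```

Keeping the count below 10 matches the batch size at which we seem to see the failure.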
Here is the stack trace we get on Druid 0.20.1:
[ingestion_fails-log-druid-0.20.1.txt](https://github.com/apache/druid/files/6632308/ingestion_fails-log-druid-0.20.1.txt)
Here is the stack trace we get on Druid 0.21.0:
[ingestion_fails-log-druid-0.21.0.txt](https://github.com/apache/druid/files/6632311/ingestion_fails-log-druid-0.21.0.txt)