Raven888888 opened a new issue #11906:
URL: https://github.com/apache/druid/issues/11906


   ### Affected Version
   
   Druid 0.21.0
   
   ### Description
   
   I have a standalone Druid setup using the `small` `single-server` configs.
   
   I have the Kafka indexing service (KIS) running with segment granularity 
`hour` and query granularity `none`. So far so good: late data is ingested 
correctly into the corresponding segments. I then manually compact older 
segments, which results in one big segment.
   
   For instance, I manually compacted the segments for old data in the 
interval [2021-09-27/2021-10-04] to segment granularity `week`. Query 
granularity is not set, so it falls back to the original one.
   ```
   {
     "type": "compact",
     "dataSource": "device-health-check-log",
     "ioConfig": {
       "type": "compact",
       "inputSpec": {
         "type": "interval",
         "interval": "2021-09-27/2021-10-04"
       }
     },
     "segmentGranularity": "week"
   }
   ```
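
For anyone trying to reproduce this, here is a minimal Python sketch of how the compaction task above could be built and submitted. It assumes the Overlord task endpoint `POST /druid/indexer/v1/task` is reachable at `http://localhost:8081`; that address is an assumption on my part, so adjust it to your deployment. The helper names `build_compaction_task` and `submit_task` are hypothetical and not part of Druid.

```python
import json
import urllib.request


def build_compaction_task(data_source, interval, segment_granularity):
    """Build the same compaction task payload as the spec in this report."""
    return {
        "type": "compact",
        "dataSource": data_source,
        "ioConfig": {
            "type": "compact",
            "inputSpec": {"type": "interval", "interval": interval},
        },
        "segmentGranularity": segment_granularity,
    }


def submit_task(task, overlord_url="http://localhost:8081"):
    """POST the task to the Overlord. The default URL is an assumption."""
    request = urllib.request.Request(
        overlord_url + "/druid/indexer/v1/task",
        data=json.dumps(task).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


task = build_compaction_task(
    "device-health-check-log", "2021-09-27/2021-10-04", "week"
)
print(json.dumps(task, indent=2))
# To actually submit it (needs a running Overlord):
# print(submit_task(task))
```

Only the payload is built and printed by default; the submit call is left commented out so the snippet runs without a cluster.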
   
![image](https://user-images.githubusercontent.com/58241952/141226869-45ca6d45-af2b-4c40-a468-0187951d9cc0.png)
   
   KIS tasks start to fail (and keep failing in a loop) when old data within 
the compacted interval arrives. Below is the log:
   
   ```
   2021-11-10T12:43:52,113 WARN [task-runner-0-priority-0] org.apache.druid.segment.realtime.appenderator.BaseAppenderatorDriver - Cannot allocate segment for timestamp[2021-10-02T09:45:03.000Z], sequenceName[index_kafka_device-health-check-log_aed5bc375a9096d_0].
   2021-11-10T12:43:52,113 ERROR [task-runner-0-priority-0] org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner - Encountered exception in run() before persisting.
   org.apache.druid.java.util.common.ISE: Could not allocate segment for row with timestamp[2021-10-02T09:45:03.000Z]
        at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.runInternal(SeekableStreamIndexTaskRunner.java:670) [druid-indexing-service-0.21.0.jar:0.21.0]
        at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskRunner.run(SeekableStreamIndexTaskRunner.java:268) [druid-indexing-service-0.21.0.jar:0.21.0]
        at org.apache.druid.indexing.seekablestream.SeekableStreamIndexTask.run(SeekableStreamIndexTask.java:146) [druid-indexing-service-0.21.0.jar:0.21.0]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:451) [druid-indexing-service-0.21.0.jar:0.21.0]
        at org.apache.druid.indexing.overlord.SingleTaskBackgroundRunner$SingleTaskBackgroundRunnerCallable.call(SingleTaskBackgroundRunner.java:423) [druid-indexing-service-0.21.0.jar:0.21.0]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_292]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_292]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_292]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
   ```
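
As a sanity check, the rejected timestamp from the log can be compared against the compacted interval with plain date arithmetic. The sketch below is only that: it does not reproduce Druid's segment allocation logic, and the helper names (`parse_utc`, `hour_bucket`, `in_interval`) are hypothetical. It computes the hourly bucket a task with segment granularity `hour` would be expected to ask for, and checks whether that bucket lies inside the week-long compacted interval.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(timestamp):
    """Parse a timestamp in the format printed in the Druid log above."""
    parsed = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def hour_bucket(moment):
    """Return the [start, end) hour that contains the given moment."""
    start = moment.replace(minute=0, second=0, microsecond=0)
    return start, start + timedelta(hours=1)


def in_interval(start, end, interval):
    """Check whether [start, end) lies fully inside a 'date/date' interval."""
    lower_text, upper_text = interval.split("/")
    lower = datetime.strptime(lower_text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    upper = datetime.strptime(upper_text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return lower <= start and end <= upper


row_time = parse_utc("2021-10-02T09:45:03.000Z")
bucket_start, bucket_end = hour_bucket(row_time)
inside = in_interval(bucket_start, bucket_end, "2021-09-27/2021-10-04")
print(bucket_start.isoformat(), bucket_end.isoformat(), inside)
```

For the row in the log, the hourly bucket falls entirely within the compacted week. That is consistent with the failure only showing up for data in the compacted interval, though it does not by itself explain the cause.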
   
   I was unable to reproduce the issue on another Druid server running the 
`micro-quickstart` `single-server` configs, version 0.21.1.
   On that server, behaviour is as expected: late (realtime) data correctly 
creates a new segment, and the KIS task succeeds.
   
![image](https://user-images.githubusercontent.com/58241952/141227833-56e1fa07-6846-4008-aea0-0cc0c56890c9.png)
   Both servers use the exact same KIS specs.
   
   #### Similar issue
   #9386
   I tried re-compacting the segment, but it does not help. I had to hard 
reset the supervisor to skip over the problematic old data, which is 
undesirable because the late data is lost.
   
   Any pointer is greatly appreciated! Thanks.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


