[GitHub] [incubator-druid] teeram commented on issue #8663: Kafka indexing service duplicate entry exception in druid_pendingSegments
teeram commented on issue #8663: Kafka indexing service duplicate entry exception in druid_pendingSegments
URL: https://github.com/apache/incubator-druid/issues/8663#issuecomment-544534256

Based on that info, I suspect there might be a bug in the third step. It seems I would not receive a duplicate entry exception if the segment ID had its partition id increased by 1. The entry in conflict always appears to use the `current max partition id` rather than that value incremented by 1.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services

To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org
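To pin down what I mean by the expected behavior: the sketch below (illustrative only, not Druid's actual allocator code) shows the allocation I would expect for a time chunk that already has pending segments, where a newly allocated pending segment takes the current max partition id plus one. The conflict described above looks as if the allocator reused the max itself.

```python
# Hedged sketch, not Druid's implementation: the partition id I would
# expect a newly allocated pending segment to receive for a time chunk.
def next_partition_id(existing_partition_ids):
    """One past the current maximum partition id, or 0 if the
    time chunk has no pending segments yet."""
    if not existing_partition_ids:
        return 0
    return max(existing_partition_ids) + 1

# With pending partition ids {0, 1, 2, 3}, a new segment should get 4.
# The duplicate entry exception looks like the conflicting row was
# written with 3 (the current max) instead.
```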
teeram commented on issue #8663: Kafka indexing service duplicate entry exception in druid_pendingSegments
URL: https://github.com/apache/incubator-druid/issues/8663#issuecomment-542288337

Hey @jihoonson, sorry for the delay in responding. Hmm, I haven't turned on the minor compaction features mentioned in that issue, so I don't suspect that to be the culprit either.

Unfortunately, I can't share the exact segment ids that are in conflict, but I am looking through those tables and can describe some of the oddities I am seeing. One thing I have noticed in the `druid_pendingSegments` table is that the id in conflict has a `created_date` that is typically from the previous day. For instance, I am getting an id conflict today (2019-10-15) for an entry in `druid_pendingSegments` with a `created_date` of 2019-10-14. It seems odd to me that the resumed Kafka indexing task would not place an entry in `druid_pendingSegments` with an updated value for the `created_date` field. There are other entries in the `druid_pendingSegments` table with a `created_date` from today (2019-10-15).

I also ran a quick query against the `druid_segments` table to see which segments were available for the time chunk in conflict. All the segments in that table had been unpublished (`used = false`). This was the behavior I expected, since I drop old segments before new ones are created.

Thanks again for all your assistance!
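For reference, the check against `druid_segments` looked roughly like the sketch below. It uses a local sqlite stand-in with the column names from the standard Druid metadata schema (the real store is typically MySQL or PostgreSQL); the data source name, segment id, and interval are hypothetical placeholders, not my real values.

```python
import sqlite3

# Local stand-in for the metadata store, illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE druid_segments (
        id TEXT PRIMARY KEY,
        dataSource TEXT,
        created_date TEXT,
        start TEXT,
        "end" TEXT,
        used INTEGER
    )""")

# A hypothetical segment for the conflicting time chunk, already
# marked unused (used = 0) by the drop step.
conn.execute(
    "INSERT INTO druid_segments VALUES (?, ?, ?, ?, ?, ?)",
    ("my_ds_2019-10-14_2019-10-15_v1_0", "my_ds",
     "2019-10-14T03:12:00.000Z",
     "2019-10-14T00:00:00.000Z", "2019-10-15T00:00:00.000Z", 0))

# Which segments exist for the conflicting time chunk, and are any
# of them still marked used?
rows = conn.execute("""
    SELECT id, created_date, used FROM druid_segments
    WHERE dataSource = 'my_ds'
      AND start >= '2019-10-14T00:00:00.000Z'
      AND "end" <= '2019-10-15T00:00:00.000Z'
""").fetchall()
# Every row coming back with used = 0 matches the behavior described
# above: all segments for the chunk were unpublished.
```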
teeram commented on issue #8663: Kafka indexing service duplicate entry exception in druid_pendingSegments
URL: https://github.com/apache/incubator-druid/issues/8663#issuecomment-541202721

@jihoonson, thank you so much for looking into this.

1. I would say this error occurs roughly 33% of the time I repeat the process described above.
2. I'm not explicitly updating the `druid_pendingSegments` or `druid_segments` tables between steps 2 and 4. I generally don't touch anything in metadata storage and always try to work within the standard Druid APIs; I don't trust myself not to break something :) But I do occasionally peek in there when debugging.

So I wrote a small utility that suspends a supervisor and drops various segments for a given time period from that data source before resuming the supervisor. It waits for tasks to finish and segments to be fully unloaded before continuing on to the next steps. I updated this utility to delete any entries for a given data source in the `druid_pendingSegments` table before resuming the supervisor, and this seems to prevent the duplicate key error from occurring.

Any chance there were changes between 0.14.1 and 0.16.0 to the way the `druid_pendingSegments` table is manipulated? I upgraded straight from 0.14.1 to 0.16.0 and skipped the 0.15.x releases; maybe that was a bad move on my part. I don't believe I had this issue in 0.14.1, so I'm guessing something changed between that release and 0.16.0.

Thanks again for all your help!
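The shape of that utility is roughly the sketch below. The endpoint paths follow the Druid HTTP APIs as I understand them around 0.16 (supervisor suspend/resume on the overlord, the coordinator's interval delete to mark segments unused, and the overlord's pendingSegments delete); the host names and the `my_ds` data source are hypothetical, and paths may differ between versions, so treat this as a sketch rather than a drop-in tool.

```python
import urllib.request

# Hypothetical service locations; not my real cluster.
OVERLORD = "http://overlord:8090"
COORDINATOR = "http://coordinator:8081"

def suspend_url(supervisor_id):
    # Step 1: stop ingestion so tasks can finish and hand off.
    return f"{OVERLORD}/druid/indexer/v1/supervisor/{supervisor_id}/suspend"

def drop_interval_url(datasource, interval):
    # Step 2: mark segments in the interval unused (used = false),
    # interval given as start_end, e.g. "2019-10-14_2019-10-15".
    return (f"{COORDINATOR}/druid/coordinator/v1/datasources/"
            f"{datasource}/intervals/{interval}")

def delete_pending_url(datasource):
    # The workaround described above: clear druid_pendingSegments for
    # the data source before resuming the supervisor.
    return f"{OVERLORD}/druid/indexer/v1/pendingSegments/{datasource}"

def resume_url(supervisor_id):
    # Final step: resume ingestion.
    return f"{OVERLORD}/druid/indexer/v1/supervisor/{supervisor_id}/resume"

def http(method, url):
    # Thin helper; real code should also poll task/load status between
    # steps, as the utility waits for unload to complete.
    req = urllib.request.Request(url, method=method, data=b"")
    return urllib.request.urlopen(req)

# Intended sequence (not executed here):
#   http("POST", suspend_url("my_ds_supervisor"))
#   http("DELETE", drop_interval_url("my_ds", "2019-10-14_2019-10-15"))
#   http("DELETE", delete_pending_url("my_ds"))
#   http("POST", resume_url("my_ds_supervisor"))
```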