[https://issues.apache.org/jira/browse/BEAM-10002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17150171#comment-17150171]
Corvin Deboeser edited comment on BEAM-10002 at 7/2/20, 10:32 AM:
------------------------------------------------------------------
Thanks for the input [~yichi]. Valid point, I did not think of that...
The problem occurred when I had many work items that take very long (worst case
up to 20 minutes). That virtually stalls the pipeline, but the processing was
required, so it is not really blocked. Because all workers were busy with slow
items, no new items were pulled from Mongo, which made at least one cursor
time out. Technically, a retry of the cursor should not be needed, since the
slow item has already entered the pipeline. Does that make sense?
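For illustration, here is a rough pymongo sketch of how that shows up and one possible mitigation (this is not the connector code; the URI, database/collection names and the process() helper are placeholders):
{code:python}
import time
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')  # placeholder URI
coll = client['mydb']['mycol']                      # placeholder db/collection

def process(doc):
    # Stand-in for a slow work item (worst case ~20 minutes).
    time.sleep(20 * 60)

# The server drops idle cursors after ~10 minutes by default, so with work
# items this slow the next fetch after a batch is exhausted raises
# pymongo.errors.CursorNotFound.
#
# One possible mitigation: ask the server not to time the cursor out, and
# close it explicitly so it does not leak.
cursor = coll.find({}, batch_size=10, no_cursor_timeout=True)
try:
    for doc in cursor:
        process(doc)
finally:
    cursor.close()
{code}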
Another problem I encountered with very small splits is that the split vector
can potentially become larger than 16 MB, which exceeds the max BSON size (see
BEAM-9960 - I haven't found a good solution yet, but I am happy to discuss
ideas). So the user may face that issue when setting the chunk size too small.
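To make the size concern concrete, this is roughly the kind of splitVector call used when planning splits (database, namespace and bounds below are made-up placeholders); with a very small maxChunkSize the reply itself can blow past the 16 MB BSON document limit:
{code:python}
from pymongo import MongoClient

client = MongoClient('mongodb://localhost:27017')  # placeholder URI
db = client['mydb']                                 # placeholder database

# splitVector asks the server for split points over the _id key. The whole
# reply is a single BSON document, so a tiny maxChunkSize (in MB) on a large
# collection can push the splitKeys array past the 16 MB limit.
reply = db.command(
    'splitVector',
    'mydb.mycol',              # namespace "<db>.<collection>" (placeholder)
    keyPattern={'_id': 1},
    min={'_id': 0},            # placeholder range bounds
    max={'_id': 10 ** 9},
    maxChunkSize=1)
split_keys = reply['splitKeys']
{code}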
Considering that, I would tend to re-query with the last claimed start index,
but I was not sure whether this will work without causing duplicates in the
pipeline. From what I understand, the range tracker is thread-safe, so there
should be no duplicates. Is that correct?
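In case it helps the discussion, here is a hedged sketch of the re-query idea in plain pymongo (not the connector API; read_range, start_id and end_id are made-up names): remember the last _id handed downstream and, if the cursor has timed out, rebuild it with a strict $gt filter so nothing is emitted twice.
{code:python}
from pymongo import ASCENDING, MongoClient
from pymongo.errors import CursorNotFound

client = MongoClient('mongodb://localhost:27017')  # placeholder URI
coll = client['mydb']['mycol']                      # placeholder db/collection

def read_range(collection, start_id, end_id):
    """Yield documents with _id in [start_id, end_id), resuming after timeouts."""
    query = {'_id': {'$gte': start_id, '$lt': end_id}}
    cursor = collection.find(query, sort=[('_id', ASCENDING)])
    last_id = None
    while True:
        try:
            for doc in cursor:
                yield doc
                last_id = doc['_id']
            return
        except CursorNotFound:
            # The server dropped the cursor; resume strictly after the last
            # document already yielded, so no duplicates enter the pipeline.
            resume = (
                {'_id': {'$gt': last_id, '$lt': end_id}}
                if last_id is not None else query)
            cursor = collection.find(resume, sort=[('_id', ASCENDING)])

docs = list(read_range(coll, 0, 10 ** 9))  # placeholder bounds
{code}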
Thanks for your support! There are still some things in Beam that I don't fully
understand ;)
> Mongo cursor timeout leads to CursorNotFound error
> --------------------------------------------------
>
> Key: BEAM-10002
> URL: https://issues.apache.org/jira/browse/BEAM-10002
> Project: Beam
> Issue Type: Bug
> Components: io-py-mongodb
> Affects Versions: 2.20.0
> Reporter: Corvin Deboeser
> Assignee: Corvin Deboeser
> Priority: P2
>
> If some work items take a lot of processing time and the cursor of a bundle
> is not queried for too long, then MongoDB will time out the cursor, which
> results in
> {code:python}
> pymongo.errors.CursorNotFound: cursor id ... not found
> {code}
>