[
https://issues.apache.org/jira/browse/BEAM-10002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17151146#comment-17151146
]
Yichi Zhang commented on BEAM-10002:
------------------------------------
I see, it sounds more like a resource issue: while the read source is holding
the cursor, the downstream steps are not pulling elements. Your solution makes
sense. Have you also evaluated the option of disabling the cursor timeout?
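
For reference, disabling the cursor timeout could look roughly like the sketch
below. It relies on pymongo's `no_cursor_timeout` option to `find()`; the helper
name `iter_documents` and its parameters are made up for illustration and are
not part of mongodbio. Depending on the server version, session idle timeouts
may still apply, so this is not guaranteed to remove the error entirely.

```python
from contextlib import closing


def iter_documents(collection, query=None):
    """Yield documents from a cursor the server should not expire while idle.

    `collection` is assumed to be a pymongo Collection, or any object whose
    find() accepts the same arguments (hypothetical helper, illustration only).
    """
    # no_cursor_timeout=True asks the server not to close the idle cursor.
    cursor = collection.find(query or {}, no_cursor_timeout=True)
    # Such a cursor is not cleaned up by the idle timeout, so close it
    # explicitly, even if the consumer stops early or raises.
    with closing(cursor):
        for doc in cursor:
            yield doc
```

The explicit close matters here: a cursor opened this way that is never
exhausted or closed keeps holding resources on the server.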
I think the issue in BEAM-9960 comes more from the unreliability of MongoDB's
splitVector command. We can only use a heuristic to work around it, e.g. if we
detect the error, we increase the desired bundle size by a fraction, or double
it, until the query succeeds. This may cause imperfect liquid sharding on
Dataflow, but that is generally still acceptable.
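
A minimal sketch of that heuristic, assuming the split query is wrapped in a
callable that takes the bundle size in MB and raises when the command fails.
The names `split_with_backoff` and `run_split`, the default cap, and the choice
of exceptions to retry on are all assumptions for illustration, not the
existing mongodbio code.

```python
import math


def split_with_backoff(run_split, bundle_size_mb, max_bundle_size_mb=1024,
                       growth_factor=2.0, retry_on=(Exception,)):
    """Call run_split(size_mb), growing size_mb after each failure.

    run_split is a hypothetical callable wrapping the split query.
    Returns (result, size_mb_used). Re-raises the last error once the next
    size would exceed max_bundle_size_mb.
    """
    if growth_factor <= 1:
        raise ValueError("growth_factor must be greater than 1")
    if bundle_size_mb < 1:
        raise ValueError("bundle_size_mb must be at least 1")
    size = int(bundle_size_mb)
    while True:
        try:
            return run_split(size), size
        except retry_on:
            # Round up so that fractional growth always makes progress.
            next_size = math.ceil(size * growth_factor)
            if next_size > max_bundle_size_mb:
                raise  # give up and surface the original error
            size = next_size
```

Passing `growth_factor=1.5` gives the "by a fraction" variant; the default
doubles the size. Narrowing `retry_on` to the specific error raised by the
failing command would avoid retrying on unrelated failures.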
> Mongo cursor timeout leads to CursorNotFound error
> --------------------------------------------------
>
> Key: BEAM-10002
> URL: https://issues.apache.org/jira/browse/BEAM-10002
> Project: Beam
> Issue Type: Bug
> Components: io-py-mongodb
> Affects Versions: 2.20.0
> Reporter: Corvin Deboeser
> Assignee: Corvin Deboeser
> Priority: P2
>
> If some work items take a lot of processing time and the cursor of a bundle
> is not queried for too long, then MongoDB will time out the cursor, which
> results in
> {code:java}
> pymongo.errors.CursorNotFound: cursor id ... not found
> {code}
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)