[ https://issues.apache.org/jira/browse/BEAM-1018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15683974#comment-15683974 ]
ASF GitHub Bot commented on BEAM-1018:
--------------------------------------
GitHub user crcsmnky opened a pull request:
https://github.com/apache/incubator-beam/pull/1394
BEAM-1018: updated getEstimatedSizeBytes to use Number.longValue()
Updated BoundedMongoDbSource.getEstimatedSizeBytes to read `size` as the more
generic `Number` class and then return `longValue()`. For smaller collections,
`size` is returned as a Long, but for larger collections it can be
returned in scientific notation.
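
A minimal sketch of the approach described above, not the actual patch. It
assumes the `collStats` result can be treated as a `Map<String, Object>` whose
`size` entry may be an Integer, Long, or Double depending on the collection.
The class and method names (`SizeEstimate`, `estimatedSizeBytes`) and the
sample values are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class SizeEstimate {
    // Hypothetical helper: reads "size" as a Number so that Integer, Long,
    // and Double values are all accepted, then narrows the value to long.
    static long estimatedSizeBytes(Map<String, Object> stats) {
        Object size = stats.get("size");
        if (!(size instanceof Number)) {
            throw new IllegalArgumentException("size is missing or not numeric: " + size);
        }
        return ((Number) size).longValue();
    }

    public static void main(String[] args) {
        // Illustrative stand-ins for the stats document, not real server output.
        Map<String, Object> small = new HashMap<>();
        small.put("size", 4096L);
        Map<String, Object> large = new HashMap<>();
        large.put("size", 2.5E10); // a Double, which prints in scientific notation

        System.out.println(estimatedSizeBytes(small));
        System.out.println(estimatedSizeBytes(large));
    }
}
```

Going through `Number` avoids depending on which concrete numeric type is
returned for a given collection.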
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/crcsmnky/incubator-beam BEAM-1018
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-beam/pull/1394.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1394
----
commit a9e3e2928bb05672d8af950237e9fe4d96acbbf5
Author: Sandeep Parikh <[email protected]>
Date: 2016-11-21T16:05:36Z
BEAM-1018: updated getEstimatedSizeBytes to use Number.longValue()
----
> getEstimatedSizeBytes fails with large MongoDB collection sizes
> ---------------------------------------------------------------
>
> Key: BEAM-1018
> URL: https://issues.apache.org/jira/browse/BEAM-1018
> Project: Beam
> Issue Type: Bug
> Affects Versions: 0.4.0-incubating
> Reporter: Sandeep Parikh
> Assignee: Jean-Baptiste Onofré
>
> When running against large collections (20M+ documents), MongoDbIO
> fails to correctly parse the {{size}} element in the document returned by
> {code:javascript}
> db.runCommand({'collStats': 'collectionName'})
> {code}
> As collection sizes grow larger, the returned value is in scientific
> notation, which cannot be parsed as a Long.
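
The failure mode quoted above can be reproduced in isolation. This is a
sketch under the assumption that the size value is turned into text and then
parsed as a long. The class name `ScientificNotationParse`, the helper
`parsesAsLong`, and the sample values are hypothetical.

```java
public class ScientificNotationParse {
    // Returns true if the text is accepted by Long.parseLong.
    static boolean parsesAsLong(String text) {
        try {
            Long.parseLong(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String small = String.valueOf(4096L);  // plain digits
        String large = String.valueOf(2.5E10); // Double text form uses an exponent

        System.out.println(small + " -> " + parsesAsLong(small));
        System.out.println(large + " -> " + parsesAsLong(large));
    }
}
```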
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)