[
https://issues.apache.org/jira/browse/BEAM-991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16004420#comment-16004420
]
ASF GitHub Bot commented on BEAM-991:
-------------------------------------
GitHub user cph6 opened a pull request:
https://github.com/apache/beam/pull/3043
[BEAM-991] Comply with byte limit for Datastore Commit in python SDK
This is the equivalent of https://github.com/apache/beam/pull/2948 for the
Python SDK. RPCs are limited both by overall size and by the number of entities
they contain, to stay within the Datastore API limits
(https://cloud.google.com/datastore/docs/concepts/limits).
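The batching logic can be sketched as follows. This is an illustrative outline, not the actual SDK code from the pull request: the constant values reflect the documented Datastore limits (500 mutations per Commit, 10 MB per request, with headroom), but the function and parameter names here are hypothetical.

```python
# Hypothetical sketch of flushing a Datastore Commit batch early when
# either limit would be exceeded. Names are illustrative, not the SDK's.

MAX_ENTITIES_PER_COMMIT = 500           # Datastore mutation-count limit
MAX_BYTES_PER_COMMIT = 9 * 1024 * 1024  # headroom under the 10 MB request limit

def batch_mutations(mutations, size_of):
    """Yield lists of mutations, each within both Commit limits.

    `mutations` is any iterable; `size_of(m)` returns the serialized
    size of a mutation in bytes (e.g. its protobuf ByteSize()).
    """
    batch, batch_bytes = [], 0
    for m in mutations:
        m_bytes = size_of(m)
        # Flush the current batch if adding this mutation would exceed
        # either the entity-count limit or the byte limit.
        if batch and (len(batch) >= MAX_ENTITIES_PER_COMMIT
                      or batch_bytes + m_bytes > MAX_BYTES_PER_COMMIT):
            yield batch
            batch, batch_bytes = [], 0
        batch.append(m)
        batch_bytes += m_bytes
    if batch:
        yield batch
```

With small mutations the 500-entity cap triggers the flush; with large ones (e.g. the >20KB entities described in the issue) the byte cap triggers first, which is the case the previous fixed-count batching missed.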
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cph6/beam datastore_request_size_limit_py
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/beam/pull/3043.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3043
----
commit a5b2ce0341b5e80653a92f8ebda1b6b6a496d52e
Author: Colin Phipps <[email protected]>
Date: 2017-05-10T09:50:56Z
Comply with byte limit for Datastore Commit.
----
> DatastoreIO Write should flush early for large batches
> ------------------------------------------------------
>
> Key: BEAM-991
> URL: https://issues.apache.org/jira/browse/BEAM-991
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-gcp
> Reporter: Vikas Kedigehalli
> Assignee: Vikas Kedigehalli
>
> If entities are large (avg size > 20KB), then a single batched write (500
> entities) would exceed the Datastore size limit of 10MB per request, from
> https://cloud.google.com/datastore/docs/concepts/limits.
> First reported in:
> http://stackoverflow.com/questions/40156400/why-does-dataflow-erratically-fail-in-datastore-access
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)