[
https://issues.apache.org/jira/browse/BEAM-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17132437#comment-17132437
]
Beam JIRA Bot commented on BEAM-1800:
-------------------------------------
This issue is assigned but has not received an update in 30 days so it has been
labeled "stale-assigned". If you are still working on the issue, please give an
update and remove the label. If you are no longer working on the issue, please
unassign so someone else may work on it. In 7 days the issue will be
automatically unassigned.
> Can't save datastore objects
> ----------------------------
>
> Key: BEAM-1800
> URL: https://issues.apache.org/jira/browse/BEAM-1800
> Project: Beam
> Issue Type: Bug
> Components: sdk-py-core
> Reporter: Mike Lambert
> Assignee: Vikas Kedigehalli
> Priority: P2
> Labels: stale-assigned
>
> I can't seem to save my database objects using {{WriteToDatastore}}, as it
> errors out on a strange unicode issue when trying to write a batch.
> Stacktrace follows:
> {noformat}
> File "apache_beam/runners/common.py", line 195, in
> apache_beam.runners.common.DoFnRunner.receive
> (apache_beam/runners/common.c:5142)
> self.process(windowed_value)
> File "apache_beam/runners/common.py", line 267, in
> apache_beam.runners.common.DoFnRunner.process
> (apache_beam/runners/common.c:7201)
> self.reraise_augmented(exn)
> File "apache_beam/runners/common.py", line 279, in
> apache_beam.runners.common.DoFnRunner.reraise_augmented
> (apache_beam/runners/common.c:7590)
> raise type(exn), args, sys.exc_info()[2]
> File "apache_beam/runners/common.py", line 263, in
> apache_beam.runners.common.DoFnRunner.process
> (apache_beam/runners/common.c:7090)
> self._dofn_simple_invoker(element)
> File "apache_beam/runners/common.py", line 198, in
> apache_beam.runners.common.DoFnRunner._dofn_simple_invoker
> (apache_beam/runners/common.c:5262)
> self._process_outputs(element, self.dofn_process(element.value))
> File
> "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/datastore/v1/datastoreio.py",
> line 354, in process
> self._flush_batch()
> File
> "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/datastore/v1/datastoreio.py",
> line 363, in _flush_batch
> helper.write_mutations(self._datastore, self._project, self._mutations)
> File
> "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/datastore/v1/helper.py",
> line 187, in write_mutations
> commit(commit_request)
> File "/usr/local/lib/python2.7/dist-packages/apache_beam/utils/retry.py",
> line 174, in wrapper
> return fun(*args, **kwargs)
> File
> "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/datastore/v1/helper.py",
> line 185, in commit
> datastore.commit(req)
> File "/usr/local/lib/python2.7/dist-packages/googledatastore/connection.py",
> line 140, in commit
> datastore_pb2.CommitResponse)
> File "/usr/local/lib/python2.7/dist-packages/googledatastore/connection.py",
> line 199, in _call_method
> method='POST', body=payload, headers=headers)
> File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line
> 631, in new_request
> redirections, connection_type)
> File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line
> 1609, in request
> (response, content) = self._request(conn, authority, uri, request_uri,
> method, body, headers, redirections, cachekey)
> File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line
> 1351, in _request
> (response, content) = self._conn_request(conn, request_uri, method, body,
> headers)
> File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line
> 1273, in _conn_request
> conn.request(method, request_uri, body, headers)
> File "/usr/lib/python2.7/httplib.py", line 1039, in request
> self._send_request(method, url, body, headers)
> File "/usr/lib/python2.7/httplib.py", line 1073, in _send_request
> self.endheaders(body)
> File "/usr/lib/python2.7/httplib.py", line 1035, in endheaders
> self._send_output(message_body)
> File "/usr/lib/python2.7/httplib.py", line 877, in _send_output
> msg += message_body
> TypeError: must be str, not unicode
> [while running 'write to datastore/Convert to Mutation']
> {noformat}
> My code is basically:
> {noformat}
> | 'convert from entity' >> beam.Map(ConvertFromEntity)
> | 'write to datastore' >> WriteToDatastore(client.project)
> {noformat}
> Where {{ConvertFromEntity}} converts from a google.cloud.datastore object
> (which has a nice API/interface) into the underlying protobuf (which is what
> the beam gcp/datastore library expects):
> {noformat}
> from google.cloud.datastore import helpers
>
> def ConvertFromEntity(entity):
>     # Convert the high-level Entity into its underlying protobuf message.
>     return helpers.entity_to_protobuf(entity)
> {noformat}
> I assume entity_to_protobuf works fine/normally, since it's also what is used
> by {{google/cloud/datastore/batch.py}} to write a bunch of
> {{entity_pb2.Entity}} objects into the
> {{datastore_pb2.CommitRequest.mutations[n].upsert}}:
> In batch.py: {{put() -> _assign_entity_to_pb() -> entity_to_protobuf()}}.
> In datastoreio.py:
> {{WriteToDatastore->DatastoreWriteFn.to_upsert_mutation->_Mutate.DatastoreWriteFn->helper.write_mutations}}
> Any idea what's going on here and why this doesn't work? Yes, I may have some
> unicode in my objects, but it works in my App Engine DB/NDB usage. I will
> attempt to skip WriteToDatastore and just put unbatched entities using the
> datastore library and see if that goes any better for me.
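
Since the report mentions that some objects may contain unicode, one way to narrow things down is to list which properties actually hold non-ASCII text before they are converted and written. The following is only a diagnostic sketch, not a fix and not part of Beam or the datastore library: `find_non_ascii` and the sample data are hypothetical, the code is written for Python 3 (where `str` is text), and it assumes the entity can be treated as a plain dict of properties.

```python
def find_non_ascii(value, path="entity"):
    """Yield (path, text) for every text value containing non-ASCII characters.

    Hypothetical helper: walks nested dicts, lists and tuples and reports
    where non-ASCII text lives. It does not modify the data.
    """
    if isinstance(value, str):
        if any(ord(ch) > 127 for ch in value):
            yield path, value
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from find_non_ascii(item, "%s[%r]" % (path, key))
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            yield from find_non_ascii(item, "%s[%d]" % (path, index))


# Made-up sample standing in for an entity's properties.
sample = {"name": "caf\u00e9", "tags": ["plain", "na\u00efve"], "count": 3}
hits = list(find_non_ascii(sample))
for location, text in hits:
    print(location, repr(text))
```

Running this against a handful of failing and passing entities would show whether the failures correlate with non-ASCII property values at all, which is an assumption the report itself does not confirm.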