[ 
https://issues.apache.org/jira/browse/BEAM-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ahmet Altay reassigned BEAM-1800:
---------------------------------

    Assignee: Vikas Kedigehalli  (was: Ahmet Altay)

> Can't save datastore objects
> ----------------------------
>
>                 Key: BEAM-1800
>                 URL: https://issues.apache.org/jira/browse/BEAM-1800
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py
>            Reporter: Mike Lambert
>            Assignee: Vikas Kedigehalli
>
> I can't save my Datastore entities using {{WriteToDatastore}}: it fails 
> with a strange unicode-related TypeError when trying to write a batch. 
> Stacktrace follows:
> {noformat}
> File "apache_beam/runners/common.py", line 195, in 
> apache_beam.runners.common.DoFnRunner.receive 
> (apache_beam/runners/common.c:5142)
>   self.process(windowed_value) 
> File "apache_beam/runners/common.py", line 267, in 
> apache_beam.runners.common.DoFnRunner.process 
> (apache_beam/runners/common.c:7201)
>   self.reraise_augmented(exn) 
> File "apache_beam/runners/common.py", line 279, in 
> apache_beam.runners.common.DoFnRunner.reraise_augmented 
> (apache_beam/runners/common.c:7590)
>   raise type(exn), args, sys.exc_info()[2] 
> File "apache_beam/runners/common.py", line 263, in 
> apache_beam.runners.common.DoFnRunner.process 
> (apache_beam/runners/common.c:7090)
>   self._dofn_simple_invoker(element) 
> File "apache_beam/runners/common.py", line 198, in 
> apache_beam.runners.common.DoFnRunner._dofn_simple_invoker 
> (apache_beam/runners/common.c:5262)
>   self._process_outputs(element, self.dofn_process(element.value)) 
> File 
> "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/datastore/v1/datastoreio.py",
>  line 354, in process
>   self._flush_batch() 
> File 
> "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/datastore/v1/datastoreio.py",
>  line 363, in _flush_batch
>   helper.write_mutations(self._datastore, self._project, self._mutations) 
> File 
> "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/datastore/v1/helper.py",
>  line 187, in write_mutations
>   commit(commit_request) 
> File "/usr/local/lib/python2.7/dist-packages/apache_beam/utils/retry.py", 
> line 174, in wrapper
>   return fun(*args, **kwargs) 
> File 
> "/usr/local/lib/python2.7/dist-packages/apache_beam/io/gcp/datastore/v1/helper.py",
>  line 185, in commit
>   datastore.commit(req) 
> File "/usr/local/lib/python2.7/dist-packages/googledatastore/connection.py", 
> line 140, in commit
>   datastore_pb2.CommitResponse) 
> File "/usr/local/lib/python2.7/dist-packages/googledatastore/connection.py", 
> line 199, in _call_method
>   method='POST', body=payload, headers=headers) 
> File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 
> 631, in new_request
>   redirections, connection_type) 
> File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 
> 1609, in request
>   (response, content) = self._request(conn, authority, uri, request_uri, 
> method, body, headers, redirections, cachekey) 
> File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 
> 1351, in _request
>   (response, content) = self._conn_request(conn, request_uri, method, body, 
> headers) 
> File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 
> 1273, in _conn_request
>   conn.request(method, request_uri, body, headers) 
> File "/usr/lib/python2.7/httplib.py", line 1039, in request
>   self._send_request(method, url, body, headers)
> File "/usr/lib/python2.7/httplib.py", line 1073, in _send_request
>   self.endheaders(body) 
> File "/usr/lib/python2.7/httplib.py", line 1035, in endheaders
>   self._send_output(message_body) 
> File "/usr/lib/python2.7/httplib.py", line 877, in _send_output
>   msg += message_body
> TypeError: must be str, not unicode
> [while running 'write to datastore/Convert to Mutation']
> {noformat}
> My code is basically:
> {noformat}
>         | 'convert from entity' >> beam.Map(ConvertFromEntity)
>         | 'write to datastore' >> WriteToDatastore(client.project)
> {noformat}
> Where {{ConvertFromEntity}} converts from a google.cloud.datastore object 
> (which has a nice API/interface) into the underlying protobuf (which is what 
> the beam gcp/datastore library expects):
> {noformat}
> from google.cloud.datastore import helpers
> def ConvertFromEntity(entity):
>     return helpers.entity_to_protobuf(entity)
> {noformat}
> I assume entity_to_protobuf works fine/normally, since it's also what is used 
> by {{google/cloud/datastore/batch.py}} to write a bunch of 
> {{entity_pb2.Entity}} objects into the 
> {{datastore_pb2.CommitRequest.mutations[n].upsert}}:
> In batch.py: {{put() -> _assign_entity_to_pb() -> entity_to_protobuf()}}.
> In datastoreio.py: 
> {{WriteToDatastore -> DatastoreWriteFn.to_upsert_mutation -> _Mutate.DatastoreWriteFn -> helper.write_mutations}}
> Any idea what's going on here and why this doesn't work? Yes, my entities 
> may contain some unicode, but the same data works in my App Engine DB/NDB 
> usage. I will try skipping WriteToDatastore and putting unbatched entities 
> directly with the datastore library to see whether that goes any better.
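
The unbatched fallback mentioned at the end of the report could look roughly like the sketch below. It assumes `client` is a `google.cloud.datastore.Client` (only its `put()` method is used) and that `entities` are the original `google.cloud.datastore` entities, not the protobufs produced by `ConvertFromEntity`. The helper name `put_unbatched` is hypothetical and is not part of Beam or the datastore library. Collecting failures instead of raising is a choice made here so that the entity that triggers the error can be identified.

```python
def put_unbatched(client, entities):
    """Write entities one at a time and return the ones that failed.

    `client` is assumed to expose put(entity), as
    google.cloud.datastore.Client does. This helper is a hypothetical
    diagnostic aid, not an API of Beam or the datastore library.
    """
    failures = []
    for entity in entities:
        try:
            # One request per entity: slower than a batch, but a failure
            # can be traced back to the entity that caused it.
            client.put(entity)
        except Exception as exc:
            failures.append((entity, exc))
    return failures
```

If every entity is written successfully this way, the problem is more likely in the batched write path than in the entity contents; if some fail, the returned list shows which entities to inspect.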



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
