[ 
https://issues.apache.org/jira/browse/BEAM-11979?focusedWorklogId=566583&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-566583
 ]

ASF GitHub Bot logged work on BEAM-11979:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Mar/21 22:15
            Start Date: 15/Mar/21 22:15
    Worklog Time Spent: 10m 
      Work Description: y1chi commented on a change in pull request #14237:
URL: https://github.com/apache/beam/pull/14237#discussion_r594720288



##########
File path: sdks/python/apache_beam/io/mongodbio.py
##########
@@ -264,9 +264,13 @@ def display_data(self):
     res['uri'] = _mask_uri_password(self.uri)
     res['database'] = self.db
     res['collection'] = self.coll
-    res['filter'] = json.dumps(self.filter)
+    res['filter'] = json.dumps(
+        self.filter, default=lambda x: 'not_serializable(%s)' % str(x))
     res['projection'] = str(self.projection)
-    res['mongo_client_spec'] = json.dumps(_mask_spec_password(self.spec))
+    res['mongo_client_spec'] = json.dumps(
+        _mask_spec_password(self.spec),
+        default=lambda x: 'not_serializable('
+        '%s)' % str(x))

Review comment:
       I think the spec is probably not that meaningful as display data, so I 
removed it altogether. If users report that they need to display additional 
metadata, we could then give them an option to pass a dict to the constructor.
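
For reference, the fallback used in the diff above can be tried outside Beam. The following is a minimal sketch using only the standard library; the helper name dumps_for_display and the sample query are assumptions made for illustration and are not part of mongodbio.py.

```python
import datetime
import json


def dumps_for_display(value):
    """Hypothetical helper: serialize to JSON, replacing any object that
    json cannot encode with a 'not_serializable(...)' marker string."""
    return json.dumps(
        value, default=lambda x: 'not_serializable(%s)' % str(x))


# Sample filter with a fixed datetime so the output is reproducible.
query = {'created_at': {'$gte': datetime.datetime(2021, 3, 15, 22, 15)}}
rendered = dumps_for_display(query)
print(rendered)
# {"created_at": {"$gte": "not_serializable(2021-03-15 22:15:00)"}}
```

The marker keeps display data readable without raising, at the cost of no longer being a faithful encoding of the filter, which is acceptable only because the string is used for display.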




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 566583)
    Time Spent: 2h  (was: 1h 50m)

> Can't use ReadFromMongoDB with a datetime in filter
> ---------------------------------------------------
>
>                 Key: BEAM-11979
>                 URL: https://issues.apache.org/jira/browse/BEAM-11979
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-mongodb
>    Affects Versions: 2.28.0
>            Reporter: Gaël
>            Assignee: Yichi Zhang
>            Priority: P2
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> I'm having an issue while using a filter containing a datetime.
> This filter works directly in pymongo but not in ReadFromMongoDB.
> _BoundedMongoSource.display_data() seems to be the source of the issue.
>  
> A possible fix is to replace line 267:
> {code:java}
> res['filter'] = json.dumps(self.filter){code}
> with:
>  
> {code:java}
> # from bson import json_util
> res['filter'] = json.dumps(self.filter, default=json_util.default){code}
> Here is an example of my code:
> {code:java}
> import apache_beam as beam
> from apache_beam.io import ReadFromMongoDB
> import datetime
> inputs_query = {"created_at": { "$gte": datetime.datetime.now() } } 
> with beam.Pipeline() as p:
>     p_inputs = (
>         p
>         | 'Read Mongo Inputs' >> ReadFromMongoDB(
>             uri=mongo_db_uri, db=db, coll=input_coll, filter=inputs_query)
>         | 'Count all Inputs' >> beam.combiners.Count.Globally()
>         | 'Print Inputs' >> beam.Map(print)
>     )
> {code}
> I get the following error:
> {code:java}
> mongomicrotest.py:19: FutureWarning: ReadFromMongoDB is experimental.
>   | 'Print Inputs' >> beam.Map(print)
> Traceback (most recent call last):
>   File "mongomicrotest.py", line 19, in <module>
>     | 'Print Inputs' >> beam.Map(print)
>   File "/Users/gael/venv/venv36/lib/python3.6/site-packages/apache_beam/transforms/ptransform.py", line 1058, in __ror__
>     return self.transform.__ror__(pvalueish, self.label)
>   File "/Users/gael/venv/venv36/lib/python3.6/site-packages/apache_beam/transforms/ptransform.py", line 573, in __ror__
>     result = p.apply(self, pvalueish, label)
>   File "/Users/gael/venv/venv36/lib/python3.6/site-packages/apache_beam/pipeline.py", line 646, in apply
>     return self.apply(transform, pvalueish)
>   File "/Users/gael/venv/venv36/lib/python3.6/site-packages/apache_beam/pipeline.py", line 689, in apply
>     pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
>   File "/Users/gael/venv/venv36/lib/python3.6/site-packages/apache_beam/runners/runner.py", line 188, in apply
>     return m(transform, input, options)
>   File "/Users/gael/venv/venv36/lib/python3.6/site-packages/apache_beam/runners/runner.py", line 218, in apply_PTransform
>     return transform.expand(input)
>   File "/Users/gael/venv/venv36/lib/python3.6/site-packages/apache_beam/io/mongodbio.py", line 163, in expand
>     return pcoll | iobase.Read(self._mongo_source)
>   File "/Users/gael/venv/venv36/lib/python3.6/site-packages/apache_beam/pvalue.py", line 141, in __or__
>     return self.pipeline.apply(ptransform, self)
>   File "/Users/gael/venv/venv36/lib/python3.6/site-packages/apache_beam/pipeline.py", line 689, in apply
>     pvalueish_result = self.runner.apply(transform, pvalueish, self._options)
>   File "/Users/gael/venv/venv36/lib/python3.6/site-packages/apache_beam/runners/runner.py", line 188, in apply
>     return m(transform, input, options)
>   File "/Users/gael/venv/venv36/lib/python3.6/site-packages/apache_beam/runners/runner.py", line 218, in apply_PTransform
>     return transform.expand(input)
>   File "/Users/gael/venv/venv36/lib/python3.6/site-packages/apache_beam/io/iobase.py", line 894, in expand
>     display_data = self.source.display_data() or {}
>   File "/Users/gael/venv/venv36/lib/python3.6/site-packages/apache_beam/io/mongodbio.py", line 267, in display_data
>     res['filter'] = json.dumps(self.filter)
>   File "/Users/gael/.pyenv/versions/3.6.11/lib/python3.6/json/__init__.py", line 231, in dumps
>     return _default_encoder.encode(obj)
>   File "/Users/gael/.pyenv/versions/3.6.11/lib/python3.6/json/encoder.py", line 199, in encode
>     chunks = self.iterencode(o, _one_shot=True)
>   File "/Users/gael/.pyenv/versions/3.6.11/lib/python3.6/json/encoder.py", line 257, in iterencode
>     return _iterencode(o, 0)
>   File "/Users/gael/.pyenv/versions/3.6.11/lib/python3.6/json/encoder.py", line 180, in default
>     o.__class__.__name__)
> TypeError: Object of type 'datetime' is not JSON serializable
> {code}
> Maybe there is a way to correctly pass a datetime in the filter?
> This is a blocker for our company's project.
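
The failure reported above does not depend on MongoDB or Beam and can be reproduced with the standard library alone. The sketch below assumes a fixed datetime in place of datetime.datetime.now(), and uses a hypothetical encode_datetime function as a stand-in for the json_util.default hook proposed in the issue (which requires the bson package); it is an illustration, not the fix that was merged.

```python
import datetime
import json


def encode_datetime(obj):
    """Hypothetical `default` hook: encode dates and datetimes as ISO 8601
    strings, and reject everything else like the stock encoder does."""
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    raise TypeError(
        'Object of type %s is not JSON serializable' % type(obj).__name__)


inputs_query = {'created_at': {'$gte': datetime.datetime(2021, 3, 15, 22, 15)}}

# Plain json.dumps raises TypeError, as in the traceback above.
try:
    json.dumps(inputs_query)
    failed = False
except TypeError as exc:
    failed = True
    print('plain json.dumps failed:', exc)

# With a `default` hook, the same filter serializes cleanly.
encoded = json.dumps(inputs_query, default=encode_datetime)
print(encoded)
# {"created_at": {"$gte": "2021-03-15T22:15:00"}}
```

Because display_data() only needs a printable representation of the filter, any hook that returns a string for unsupported types is enough to avoid the exception.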



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
