[ https://issues.apache.org/jira/browse/BEAM-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15491845#comment-15491845 ]
ASF GitHub Bot commented on BEAM-618:
-------------------------------------

Github user ajamato closed the pull request at:

    https://github.com/apache/incubator-beam/pull/947

> Python SDKs writes non RFC compliant JSON files for BQ Export
> -------------------------------------------------------------
>
>                 Key: BEAM-618
>                 URL: https://issues.apache.org/jira/browse/BEAM-618
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-py
>            Reporter: Alex Amato
>            Assignee: Frances Perry
>
> Python SDK uses the built-in json.dumps to write JSON files to GCS for the BQ
> Exporter. BigQuery can fail to parse these files when it tries to load them
> into a BQ table, because json.dumps can export JSON which does not conform to
> the JSON RFC.
> The json module's documentation lists a few cases which are not RFC compliant:
> https://docs.python.org/2/library/json.html#standard-compliance-and-interoperability
> The main issue we run into is the NAN, INF and -INF values.
> These values fail with a confusing error (and we delete the GCS files, making
> it hard to debug):
> JSON table encountered too many errors, giving up. Rows JSON parsing error in
> row starting at position
> We can set the allow_nan argument of json.dumps to False to address these
> issues. Then, when a user tries to write a file with INF, -INF or NAN,
> json.dumps raises the error shown below. We may want to catch this error to
> mention the fact that INF and NAN are not allowed.
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python2.7/json/__init__.py", line 250, in dumps
>     sort_keys=sort_keys, **kw).encode(obj)
>   File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
>     chunks = self.iterencode(o, _one_shot=True)
>   File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
>     return _iterencode(o, 0)
> ValueError: Out of range float values are not JSON compliant



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
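
For illustration, the allow_nan change proposed in the issue description might look like the minimal sketch below. The helper name dumps_strict and the wording of its error message are hypothetical and are not taken from the Beam code base; only json.dumps and its allow_nan parameter come from the standard library.

```python
import json


def dumps_strict(row):
    """Serialize a row to JSON, rejecting NaN and infinite floats.

    Hypothetical helper, not part of the Beam SDK.
    """
    try:
        # allow_nan=False makes json.dumps raise ValueError instead of
        # emitting the non-RFC tokens NaN, Infinity and -Infinity.
        return json.dumps(row, allow_nan=False)
    except ValueError:
        # Re-raise with a message that names the actual problem
        # (assumed wording, chosen for this sketch).
        raise ValueError(
            "Row contains NaN, Infinity or -Infinity, which are not "
            "valid JSON: %r" % (row,)
        )


# Default behaviour: json.dumps silently writes a non-compliant token.
print(json.dumps(float("inf")))

# Strict behaviour: ordinary rows pass, out-of-range floats are rejected.
print(dumps_strict({"x": 1.5}))
```

Raising a descriptive error at serialization time surfaces the bad value before any file reaches GCS, rather than leaving it to the BigQuery load job to fail later.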