[
https://issues.apache.org/jira/browse/BEAM-618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959280#comment-15959280
]
Ahmet Altay commented on BEAM-618:
----------------------------------
[[email protected]] is this fixed?
> Python SDK writes non-RFC-compliant JSON files for BQ Export
> ------------------------------------------------------------
>
> Key: BEAM-618
> URL: https://issues.apache.org/jira/browse/BEAM-618
> Project: Beam
> Issue Type: Bug
> Components: sdk-py
> Reporter: Alex Amato
> Assignee: Frances Perry
>
> The Python SDK uses the built-in json.dumps to write JSON files to GCS for
> the BQ exporter. BigQuery can fail to parse these files when it loads them
> into a BQ table, because json.dumps can emit JSON that does not conform to
> the IETF JSON RFC (RFC 7159).
> The json module documentation lists a few cases that are not RFC compliant:
> https://docs.python.org/2/library/json.html#standard-compliance-and-interoperability
> The main issue we run into is NAN, INF and -INF values.
> These fail with a confusing error (and we delete the GCS files, making it
> hard to debug):
> JSON table encountered too many errors, giving up. Rows JSON parsing error in
> row starting at position
> We can set the allow_nan argument of json.dumps to False to address these
> issues. Then, when a user tries to write a file with INF, -INF or NAN
> values, json.dumps raises the error shown below instead of writing a file
> that BigQuery cannot parse. We may want to catch this error and mention
> that INF and NAN are not allowed.
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python2.7/json/__init__.py", line 250, in dumps
>     sort_keys=sort_keys, **kw).encode(obj)
>   File "/usr/lib/python2.7/json/encoder.py", line 207, in encode
>     chunks = self.iterencode(o, _one_shot=True)
>   File "/usr/lib/python2.7/json/encoder.py", line 270, in iterencode
>     return _iterencode(o, 0)
> ValueError: Out of range float values are not JSON compliant
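
For reference, the approach described in the issue could look roughly like the sketch below. It assumes each row is a plain dict; the helper name encode_row is hypothetical and is not taken from the Beam code base. It only illustrates passing allow_nan=False to json.dumps and re-raising the resulting ValueError with a message that names the likely cause.

```python
import json


def encode_row(row):
    """Serialize one row (a dict) to a JSON string, rejecting NaN/Infinity.

    Hypothetical helper for illustration; not the actual Beam implementation.
    """
    try:
        # With allow_nan=False, json.dumps raises ValueError instead of
        # emitting the non-standard tokens NaN, Infinity or -Infinity.
        return json.dumps(row, allow_nan=False)
    except ValueError as e:
        # json.dumps also raises ValueError for circular references, so the
        # original message is kept alongside the hint about NaN/Infinity.
        raise ValueError(
            'Row %r could not be encoded as RFC-compliant JSON '
            '(NaN, Infinity and -Infinity are not allowed): %s' % (row, e))
```

With the default allow_nan=True, json.dumps(float('nan')) returns the bare token NaN, which is the kind of output BigQuery rejects. With the sketch above, encode_row({'x': float('inf')}) raises a ValueError that includes the offending row.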