[
https://issues.apache.org/jira/browse/BEAM-9613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17091820#comment-17091820
]
Kenneth Knowles edited comment on BEAM-9613 at 4/24/20, 6:37 PM:
-----------------------------------------------------------------
JSON is a textual format in the same way as any programming language literal.
The text is an encoding. We don't usually think about programs and values in
programs as text strings, but you can if you want... The value and type of a
JSON value is unambiguous.
- The JSON value {{\{ pi: 3.14159 \}}} has a single field "pi" and its value is
the number {{3.14159}}
- The JSON value {{\{ pi: "3.14159" \}}} has a single field "pi" and its value
is the string {{"3.14159"}}
JSON has one number type (specified to be arbitrary precision, IIRC), not
separate integer and float types.
These are distinct values. A TableRow for a BQ table with a FLOAT64 field may
choose one or the other (likely), export both in different contexts (yuck),
accept both and fuzzily do best-effort coercion on writes, etc.
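To make the distinction concrete, here is a minimal sketch using Jackson's
{{ObjectMapper}} (any JSON parser would do) showing that the two encodings
above surface as different Java types when deserialized into a generic map,
which is essentially what a {{TableRow}} is:
{code:java}
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.Map;

public class JsonNumberVsString {
  public static void main(String[] args) throws Exception {
    ObjectMapper mapper = new ObjectMapper();

    // A JSON number deserializes to java.lang.Double in a generic map.
    Map<?, ?> asNumber = mapper.readValue("{\"pi\": 3.14159}", Map.class);
    System.out.println(asNumber.get("pi").getClass()); // class java.lang.Double

    // A JSON string deserializes to java.lang.String, even though a human
    // reader may consider it "the same" value.
    Map<?, ?> asString = mapper.readValue("{\"pi\": \"3.14159\"}", Map.class);
    System.out.println(asString.get("pi").getClass()); // class java.lang.String
  }
}
{code}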
What we need is something like
[https://cloud.google.com/bigquery/docs/exporting-data#avro_export_details] for
the JSON format, and to make sure our treatment of the Avro and JSON formats is
consistent. It seems that when we read the Avro export and then convert to
TableRow, we end up with a different TableRow than we would if we did a JSON
export and read it in, or read the row directly as a TableRow.
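Whichever policy we pick, the conversion path that raises the exception below
({{BigQueryUtils.toBeamValue}}) would have to handle both encodings explicitly.
A rough illustration of the "accept both" option, as a hypothetical helper
rather than the actual Beam code:
{code:java}
// Hypothetical helper sketching the "accept both" policy for a FLOAT64 /
// DOUBLE field. This is an illustration, not the BigQueryUtils implementation.
public class DoubleCoercion {
  static Double toDoubleValue(Object jsonValue) {
    if (jsonValue == null) {
      return null;
    }
    if (jsonValue instanceof Number) {
      // JSON number, e.g. { "pi": 3.14159 }, arrives as a java.lang.Double.
      return ((Number) jsonValue).doubleValue();
    }
    if (jsonValue instanceof String) {
      // JSON string, e.g. { "pi": "3.14159" }, as some export/read paths yield.
      return Double.parseDouble((String) jsonValue);
    }
    throw new UnsupportedOperationException(
        "Cannot convert " + jsonValue.getClass().getName() + " to DOUBLE");
  }
}
{code}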
> BigQuery IO not support convert double type for beam row
> --------------------------------------------------------
>
> Key: BEAM-9613
> URL: https://issues.apache.org/jira/browse/BEAM-9613
> Project: Beam
> Issue Type: Bug
> Components: io-java-gcp
> Reporter: MAKSIM TSYGAN
> Priority: Major
>
> If I execute a query with a double column via BigQueryIO.readFrom(), I get an
> exception:
> Caused by: java.lang.UnsupportedOperationException: Converting BigQuery type
> 'class java.lang.Double' to 'FieldType{typeName=DOUBLE, nullable=true,
> logicalType=null, collectionElementType=null, mapKeyType=null,
> mapValueType=null, rowSchema=null, metadata={}}' is not supported
> at
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamValue(BigQueryUtils.java:532)
> at
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRowFieldValue(BigQueryUtils.java:483)
> at
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.lambda$toBeamRow$6(BigQueryUtils.java:469)
> at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
> at
> java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
> at
> java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
> at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
> at
> org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRow(BigQueryUtils.java:470)