Andre created BEAM-2767:
---------------------------
Summary: BigQueryIO result different for REPEATED field between
DirectRunner and DataflowRunner
Key: BEAM-2767
URL: https://issues.apache.org/jira/browse/BEAM-2767
Project: Beam
Issue Type: Bug
Components: runner-dataflow, runner-direct, sdk-java-gcp
Affects Versions: 2.0.0
Reporter: Andre
Assignee: Thomas Groh
When running a query through BigQueryIO that returns a REPEATED RECORD field,
the behavior differs between DirectRunner and DataflowRunner. The field
containing the repeated record has to be cast before its records can be
accessed, and each runner requires a different cast. The following
implementations each work on their respective runner, but I would expect both
runners to return the same type; as it stands, my pipeline runs on only one of
them.
DirectRunner:
{code:java}
ArrayList<LinkedHashMap> orderLines = (ArrayList<LinkedHashMap>)
c.element().get("RepeatedField");
{code}
DataflowRunner:
{code:java}
ImmutableList<TableRow> orderLines = (ImmutableList<TableRow>)
c.element().get("RepeatedField");
{code}
For example, when the ImmutableList implementation is used on DirectRunner, the
following exception is thrown:
{code:java}
java.lang.ClassCastException: java.util.ArrayList cannot be cast to
com.google.common.collect.ImmutableList
{code}
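A possible runner-independent workaround, sketched under the assumption that
both runners return some java.util.List whose elements implement
Map<String, Object> (this holds for the types reported above: ArrayList and
ImmutableList are both Lists, LinkedHashMap and TableRow are both Maps). The
class and method names RepeatedFieldReader and asRecords are hypothetical and
not part of Beam:
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Beam SDK.
public class RepeatedFieldReader {

    /**
     * Copies the value of a REPEATED RECORD field into a list of maps.
     * Casts only to the List and Map interfaces, so it does not depend on
     * the concrete collection class a particular runner returns.
     */
    @SuppressWarnings("unchecked")
    public static List<Map<String, Object>> asRecords(Object fieldValue) {
        List<Map<String, Object>> records = new ArrayList<>();
        if (fieldValue == null) {
            // Assumption: a missing repeated field is treated as empty.
            return records;
        }
        for (Object item : (List<?>) fieldValue) {
            records.add((Map<String, Object>) item);
        }
        return records;
    }
}
{code}
Inside the DoFn this would be called as
RepeatedFieldReader.asRecords(c.element().get("RepeatedField")). It only works
around the symptom; the runners would still return different types.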
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)