Thanks Cham, I'll keep an eye on that issue. Let me know if you want me to test anything out.
Josh

On Tue, 21 Apr 2020, at 7:41 AM, Chamikara Jayalath wrote:
> Thanks. This does sound like a bug and this code path was added recently.
> Created https://issues.apache.org/jira/browse/BEAM-9790.
>
> Thanks,
> Cham
>
> On Fri, Apr 17, 2020 at 3:03 PM Joshua Bassett <he...@joshbassett.info> wrote:
>> org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.ClassCastException: java.lang.String cannot be cast to java.util.Map
>>   at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish (DirectRunner.java:348)
>>   at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish (DirectRunner.java:318)
>>   at org.apache.beam.runners.direct.DirectRunner.run (DirectRunner.java:213)
>>   at org.apache.beam.runners.direct.DirectRunner.run (DirectRunner.java:67)
>>   at org.apache.beam.sdk.Pipeline.run (Pipeline.java:317)
>>   at org.apache.beam.sdk.Pipeline.run (Pipeline.java:303)
>>   at com.theconversation.data.TopArticlesEnriched.main (TopArticlesEnriched.java:181)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
>>   at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
>>   at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
>>   at java.lang.reflect.Method.invoke (Method.java:498)
>>   at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:282)
>>   at java.lang.Thread.run (Thread.java:748)
>> Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to java.util.Map
>>   at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.lambda$toBeamValue$12 (BigQueryUtils.java:528)
>>   at java.util.stream.ReferencePipeline$3$1.accept (ReferencePipeline.java:193)
>>   at java.util.ArrayList$ArrayListSpliterator.forEachRemaining (ArrayList.java:1382)
>>   at java.util.stream.AbstractPipeline.copyInto (AbstractPipeline.java:482)
>>   at java.util.stream.AbstractPipeline.wrapAndCopyInto (AbstractPipeline.java:472)
>>   at java.util.stream.ReduceOps$ReduceOp.evaluateSequential (ReduceOps.java:708)
>>   at java.util.stream.AbstractPipeline.evaluate (AbstractPipeline.java:234)
>>   at java.util.stream.ReferencePipeline.collect (ReferencePipeline.java:566)
>>   at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamValue (BigQueryUtils.java:530)
>>   at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRowFieldValue (BigQueryUtils.java:491)
>>   at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.lambda$toBeamRow$6 (BigQueryUtils.java:477)
>>   at java.util.stream.ReferencePipeline$3$1.accept (ReferencePipeline.java:193)
>>   at java.util.ArrayList$ArrayListSpliterator.forEachRemaining (ArrayList.java:1382)
>>   at java.util.stream.AbstractPipeline.copyInto (AbstractPipeline.java:482)
>>   at java.util.stream.AbstractPipeline.wrapAndCopyInto (AbstractPipeline.java:472)
>>   at java.util.stream.ReduceOps$ReduceOp.evaluateSequential (ReduceOps.java:708)
>>   at java.util.stream.AbstractPipeline.evaluate (AbstractPipeline.java:234)
>>   at java.util.stream.ReferencePipeline.collect (ReferencePipeline.java:566)
>>   at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.toBeamRow (BigQueryUtils.java:478)
>>   at org.apache.beam.sdk.io.gcp.bigquery.BigQueryUtils.lambda$static$9bc3d4b2$1 (BigQueryUtils.java:341)
>>   at org.apache.beam.sdk.schemas.SchemaCoder.encode (SchemaCoder.java:166)
>>   at org.apache.beam.sdk.coders.Coder.encode (Coder.java:136)
>>   at org.apache.beam.sdk.util.CoderUtils.encodeToSafeStream (CoderUtils.java:82)
>>   at org.apache.beam.sdk.util.CoderUtils.encodeToByteArray (CoderUtils.java:66)
>>   at org.apache.beam.sdk.util.CoderUtils.encodeToByteArray (CoderUtils.java:51)
>>   at org.apache.beam.sdk.util.CoderUtils.clone (CoderUtils.java:141)
>>   at org.apache.beam.sdk.util.MutationDetectors$CodedValueMutationDetector.<init> (MutationDetectors.java:115)
>>   at org.apache.beam.sdk.util.MutationDetectors.forValueWithCoder (MutationDetectors.java:46)
>>   at org.apache.beam.runners.direct.ImmutabilityCheckingBundleFactory$ImmutabilityEnforcingBundle.add (ImmutabilityCheckingBundleFactory.java:112)
>>   at org.apache.beam.runners.direct.ParDoEvaluator$BundleOutputManager.output (ParDoEvaluator.java:299)
>>   at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner.outputWindowedValue (SimpleDoFnRunner.java:258)
>>   at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner.access$700 (SimpleDoFnRunner.java:78)
>>   at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner$DoFnProcessContext.output (SimpleDoFnRunner.java:627)
>>   at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner$DoFnProcessContext.output (SimpleDoFnRunner.java:615)
>>   at org.apache.beam.sdk.io.gcp.bigquery.PassThroughThenCleanup$IdentityFn.processElement (PassThroughThenCleanup.java:83)
>>
>> On Sat, 18 Apr 2020, at 12:59 AM, Chamikara Jayalath wrote:
>> > Do you have the full stack trace ?
>> > Also, does readTableRows() work for you (without using schemas) ?
>> >
>> > On Fri, Apr 17, 2020 at 3:44 AM Joshua Bassett <he...@joshbassett.info> wrote:
>> >> Hi there
>> >>
>> >> I'm trying to read rows from a BigQuery table that contains a repeated field into POJOs. Unfortunately, I'm running into issues and I can't figure it out.
>> >>
>> >> I have something like this:
>> >>
>> >> @DefaultSchema(JavaFieldSchema.class)
>> >> class Article implements Serializable {
>> >>   public Long id;
>> >>   public String title;
>> >>   @SchemaFieldName("author_ids")
>> >>   public Long[] authorIds;
>> >> }
>> >>
>> >> PCollection<Article> articles = pipeline
>> >>   .apply(
>> >>     BigQueryIO
>> >>       .readTableRowsWithSchema()
>> >>       .from("myproject:data_warehouse.articles")
>> >>   )
>> >>   .apply(Convert.to(Article.class));
>> >>
>> >> The schema looks like this:
>> >>
>> >> [
>> >>   {
>> >>     "mode": "NULLABLE",
>> >>     "name": "id",
>> >>     "type": "INTEGER"
>> >>   },
>> >>   {
>> >>     "mode": "NULLABLE",
>> >>     "name": "title",
>> >>     "type": "STRING"
>> >>   },
>> >>   {
>> >>     "mode": "REPEATED",
>> >>     "name": "author_ids",
>> >>     "type": "INTEGER"
>> >>   }
>> >> ]
>> >>
>> >> When I run the pipeline, I end up with the following exception:
>> >>
>> >> java.lang.ClassCastException: java.lang.String cannot be cast to java.util.Map
>> >>
>> >> Should this be possible? Strangely, when I remove the repeated field from the schema/POJO it works perfectly.
>> >>
>> >> I'm using Beam SDK 2.19.0 with the direct runner. Any help would be much appreciated.
>> >>
>> >> Josh

Kind regards

Josh
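
P.S. For anyone who finds this thread before BEAM-9790 is resolved: below is a rough, untested sketch of the non-schema path Cham asks about above, i.e. reading plain TableRows with readTableRows() and mapping the fields by hand instead of going through readTableRowsWithSchema() and Convert.to(). The Article POJO, table name, and field names are the ones from this thread; everything else, in particular how the elements of the repeated author_ids field are represented inside the TableRow, is an assumption worth verifying against your own data.

import java.util.List;

import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.TypeDescriptor;

// Reuses the Article POJO defined earlier in this thread.
PCollection<Article> articles = pipeline
    .apply(
        BigQueryIO
            .readTableRows()  // plain TableRows, no Beam schema inference involved
            .from("myproject:data_warehouse.articles")
    )
    .apply(
        MapElements.into(TypeDescriptor.of(Article.class))
            .via((TableRow row) -> {
              Article article = new Article();
              // Scalar values may arrive as Strings, so convert defensively.
              Object id = row.get("id");
              article.id = (id == null) ? null : Long.valueOf(id.toString());
              article.title = (String) row.get("title");
              // Assumption: the REPEATED field comes back as a List of plain
              // values (the exact element type depends on the read path).
              @SuppressWarnings("unchecked")
              List<Object> rawIds = (List<Object>) row.get("author_ids");
              article.authorIds = (rawIds == null)
                  ? new Long[0]
                  : rawIds.stream()
                      .map(v -> Long.valueOf(v.toString()))
                      .toArray(Long[]::new);
              return article;
            })
    );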