damccorm opened a new issue, #20718:
URL: https://github.com/apache/beam/issues/20718

   Hi Beam community,
   
   I am seeing an error when reading an array field using ParquetIO with Beam 2.25.
   The issue occurs with both the direct runner and the Spark runner. It is blocking
   my adoption of Beam, so prompt help would be appreciated.
   
   Below is the schema tree as a quick visualization. The array field is named
   "numbers" and its element type is int.
   
    
   
   root
    |-- numbers: array (nullable = true)
    |    |-- element: integer (containsNull = true)
   
    
   
   The Beam code is very simple:
   
       pipeline.apply(ParquetIO.read(avroSchema).from(parquetPath));
   
    
   
   Below is the error when running that code:
   
    
   ```
   Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to java.lang.Number
       at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:353)
       at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:321)
       at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:216)
       at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:67)
       at org.apache.beam.sdk.Pipeline.run(Pipeline.java:317)
       at org.apache.beam.sdk.Pipeline.run(Pipeline.java:303)
   Caused by: java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot be cast to java.lang.Number
       at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:156)
       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:82)
       at org.apache.avro.generic.GenericDatumWriter.writeArray(GenericDatumWriter.java:234)
       at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:136)
       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:82)
       at org.apache.avro.generic.GenericDatumWriter.writeField(GenericDatumWriter.java:206)
       at org.apache.avro.generic.GenericDatumWriter.writeRecord(GenericDatumWriter.java:195)
       at org.apache.avro.generic.GenericDatumWriter.writeWithoutConversion(GenericDatumWriter.java:130)
       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:82)
       at org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:72)
       at org.apache.beam.sdk.coders.AvroCoder.encode(AvroCoder.java:317)
       at org.apache.beam.sdk.coders.Coder.encode(Coder.java:136)
       at org.apache.beam.sdk.util.CoderUtils.encodeToSafeStream(CoderUtils.java:82)
       at org.apache.beam.sdk.util.CoderUtils.encodeToByteArray(CoderUtils.java:66)
       at org.apache.beam.sdk.util.CoderUtils.encodeToByteArray(CoderUtils.java:51)
       at org.apache.beam.sdk.util.CoderUtils.clone(CoderUtils.java:141)
       at org.apache.beam.sdk.util.MutationDetectors$CodedValueMutationDetector.<init>(MutationDetectors.java:115)
       at org.apache.beam.sdk.util.MutationDetectors.forValueWithCoder(MutationDetectors.java:46)
       at org.apache.beam.runners.direct.ImmutabilityCheckingBundleFactory$ImmutabilityEnforcingBundle.add(ImmutabilityCheckingBundleFactory.java:112)
       at org.apache.beam.runners.direct.ParDoEvaluator$BundleOutputManager.output(ParDoEvaluator.java:301)
       at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner.outputWindowedValue(SimpleDoFnRunner.java:267)
       at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner.access$900(SimpleDoFnRunner.java:79)
       at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:413)
       at org.apache.beam.repackaged.direct_java.runners.core.SimpleDoFnRunner$DoFnProcessContext.output(SimpleDoFnRunner.java:401)
       at org.apache.beam.sdk.io.parquet.ParquetIO$ReadFiles$ReadFn.processElement(ParquetIO.java:646)
   ```
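
One possible reading of the trace, offered as a sketch and not a confirmed diagnosis: the schema passed to the coder declares the elements of "numbers" as int, but the decoded array seems to hold one record per element, so the cast to `Number` while encoding the array fails. The example below imitates that mismatch using only plain Java collections. The class `ArrayElementMismatchSketch`, its methods, and the use of a `Map` with an "element" key as a stand-in for `GenericData.Record` are all hypothetical and are not part of Beam, Avro, or Parquet. Null elements are left out for brevity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class ArrayElementMismatchSketch {

    // Stand-in for the int branch of a schema-driven writer: it trusts the
    // declared element type and casts the datum to Number.
    static int encodeInt(Object datum) {
        return ((Number) datum).intValue();
    }

    // Stand-in for encoding an array whose declared element type is int.
    static List<Integer> encodeArray(List<?> elements) {
        List<Integer> out = new ArrayList<>();
        for (Object element : elements) {
            out.add(encodeInt(element));
        }
        return out;
    }

    // Hypothetical workaround: unwrap single-field wrappers (modelled here
    // as Maps keyed by "element") so the values match the declared type.
    static List<Object> unwrapElements(List<?> elements) {
        List<Object> out = new ArrayList<>();
        for (Object element : elements) {
            if (element instanceof Map) {
                out.add(((Map<?, ?>) element).get("element"));
            } else {
                out.add(element);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed shape of the decoded array: one wrapper per element.
        List<Object> decoded = new ArrayList<>();
        decoded.add(Collections.singletonMap("element", 1));
        decoded.add(Collections.singletonMap("element", 2));

        try {
            encodeArray(decoded);
        } catch (ClassCastException e) {
            System.out.println("Cast failed, similar to the reported error");
        }

        // After unwrapping, the values agree with the declared int type.
        System.out.println(encodeArray(unwrapElements(decoded)));
    }
}
```

If this reading is right, the fix would need to make the Avro schema and the decoded records agree on the array layout. Whether that is what happens inside ParquetIO here has not been verified.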
   
   
   Imported from Jira 
[BEAM-11721](https://issues.apache.org/jira/browse/BEAM-11721). Original Jira 
may contain additional context.
   Reported by: sekiforever.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

