[ 
https://issues.apache.org/jira/browse/PARQUET-952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16511397#comment-16511397
 ] 

ASF GitHub Bot commented on PARQUET-952:
----------------------------------------

michaelandrepearce commented on issue #459: PARQUET-952: Avro union with single 
type fails with 'is not a group'
URL: https://github.com/apache/parquet-mr/pull/459#issuecomment-397003375
 
 
   Any update on this getting merged and in a release? We have hit the same 
issue.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Avro union with single type fails with 'is not a group'
> -------------------------------------------------------
>
>                 Key: PARQUET-952
>                 URL: https://issues.apache.org/jira/browse/PARQUET-952
>             Project: Parquet
>          Issue Type: Bug
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Nandor Kollar
>            Priority: Major
>         Attachments: PARQUET-952-repro.tar.gz
>
>
> When one uses an Avro schema with a union that has only one type specified, 
> {{AvroParquetWriter}} throws an exception. See the following repro test case:
> {code}
>   @Test
>   public void reproCase() throws Exception {
>     System.out.println("Parquet version: " + Version.FULL_VERSION);
>     // Schema with a single field 'value' whose type is a union that has a single item (=string)
>     Schema avroSchema = Schema.parse("{" +
>       "\"type\": \"record\", " +
>       "\"name\": \"RandomRecord\", " +
>       "\"fields\": [" +
>       "{\"name\": \"value\", \"type\": [\"string\"] }" +
>       "]" +
>       "}");
>     // Parquet writer
>     ParquetWriter parquetWriter = AvroParquetWriter.builder(path)
>       .withSchema(avroSchema)
>       .withConf(new Configuration())
>       .build();
>     GenericRecord record = new GenericRecordBuilder(avroSchema)
>       .set("value", "Surprise!")
>       .build();
>     parquetWriter.write(record);
>   }
> {code}
> This results in:
> {code}
> Parquet version: parquet-mr version 1.9.0 (build 38262e2c80015d0935dad20f8e18f2d6f9fbd03c)
> java.lang.ClassCastException: required binary value (UTF8) is not a group
>       at org.apache.parquet.schema.Type.asGroupType(Type.java:202)
>       at org.apache.parquet.avro.AvroWriteSupport.writeValueWithoutConversion(AvroWriteSupport.java:357)
>       at org.apache.parquet.avro.AvroWriteSupport.writeValue(AvroWriteSupport.java:274)
>       at org.apache.parquet.avro.AvroWriteSupport.writeRecordFields(AvroWriteSupport.java:187)
>       at org.apache.parquet.avro.AvroWriteSupport.write(AvroWriteSupport.java:161)
>       at org.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:123)
>       at org.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:292)
>       at net.jarcec.AvroParquet.reproCase(AvroParquet.java:49)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>       at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>       at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:51)
>       at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:237)
>       at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70)
> {code}
> I'm attaching a small Maven project with all the dependencies to make it 
> easier to reproduce locally.
> Trying to isolate the problem further, it seems that the 
> [AvroSchemaConverter|https://git-wip-us.apache.org/repos/asf?p=parquet-mr.git;a=blob;f=parquet-avro/src/main/java/org/apache/parquet/avro/AvroSchemaConverter.java;h=70b6525f6059889889dd5fc321322d045613a8cd;hb=refs/heads/master#l111]
>  converts the Avro schema to just {{required binary value (UTF8);}} (i.e. a 
> primitive type). The writer, however, follows the Avro schema (which 
> is a union) and [tries to call asGroupType() on the primitive 
> type|https://git-wip-us.apache.org/repos/asf?p=parquet-mr.git;a=blob;f=parquet-avro/src/main/java/org/apache/parquet/avro/AvroWriteSupport.java;h=460565bb01d27c46d342e635860a81cbf45f7b09;hb=refs/heads/master#l357].
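
The mismatch described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the parquet-mr code: every class and method name here (SingleTypeUnionSketch, PType, Primitive, Group, convertUnion, writeUnionUnchecked, writeUnionChecked) is hypothetical, and the "checked" writer is just one possible way to guard the cast, not necessarily what the fix in the pull request does. The converter collapses a single-branch union to a primitive, while the unchecked writer assumes every union is a group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: hypothetical stand-ins, not the parquet-mr or Avro classes.
final class SingleTypeUnionSketch {

    /** Stand-in for a converted Parquet type. */
    static abstract class PType {
        final String name;
        PType(String name) { this.name = name; }

        /** Fails for anything that is not a group, like the call in the trace. */
        Group asGroupType() {
            throw new ClassCastException(this + " is not a group");
        }
    }

    static final class Primitive extends PType {
        final String primitive;
        Primitive(String name, String primitive) { super(name); this.primitive = primitive; }
        @Override public String toString() { return "required " + primitive + " " + name; }
    }

    static final class Group extends PType {
        final List<PType> members;
        Group(String name, List<PType> members) { super(name); this.members = members; }
        @Override Group asGroupType() { return this; }
        @Override public String toString() { return "group " + name; }
    }

    /** Converter side: a union with exactly one branch collapses to that branch. */
    static PType convertUnion(String fieldName, List<String> branches) {
        if (branches.size() == 1) {
            return new Primitive(fieldName, branches.get(0));
        }
        List<PType> members = new ArrayList<PType>();
        for (int i = 0; i < branches.size(); i++) {
            members.add(new Primitive("member" + i, branches.get(i)));
        }
        return new Group(fieldName, members);
    }

    /** Writer side that trusts the Avro schema: every union is assumed to be a group. */
    static String writeUnionUnchecked(PType type, int branchIndex, Object value) {
        Group group = type.asGroupType(); // throws for a collapsed single-branch union
        return group.name + "." + group.members.get(branchIndex).name + "=" + value;
    }

    /** Writer side that inspects the converted type before treating it as a group. */
    static String writeUnionChecked(PType type, int branchIndex, Object value) {
        if (type instanceof Primitive) {
            return type.name + "=" + value;
        }
        return writeUnionUnchecked(type, branchIndex, value);
    }

    public static void main(String[] args) {
        PType single = convertUnion("value", Arrays.asList("binary"));
        System.out.println(writeUnionChecked(single, 0, "Surprise!"));
        try {
            writeUnionUnchecked(single, 0, "Surprise!");
        } catch (ClassCastException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the two sides disagree only for the single-branch case; unions with two or more branches take the group path in both writers.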



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
