[ https://issues.apache.org/jira/browse/HUDI-826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gary Li updated HUDI-826:
-------------------------
    Fix Version/s:     (was: 0.8.0)
                   0.9.0

> Spark to avro schema in 0.6 incompatible with 0.5 for fixed types
> -----------------------------------------------------------------
>
>                 Key: HUDI-826
>                 URL: https://issues.apache.org/jira/browse/HUDI-826
>             Project: Apache Hudi
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Alexander Filipchik
>            Priority: Major
>             Fix For: 0.9.0
>
>
> Let's say we had a dataset created with the SQL transformer using a query:
> {code:java}
> // select CAST(bla AS DECIMAL(20, 9)) bla
> {code}
> In 0.5, the spark->avro converter (Databricks) would generate something like:
>
> {code:java}
> // {
>   "name": "bla",
>   "type": [
>     "string",
>     "null"
>   ]
> },
> {code}
> In 0.6 (Spark):
>
>
> {code:java}
> // {
>   "name": "bla",
>   "type": [
>     {
>       "type": "fixed",
>       "name": "order_subtotal",
>       "namespace": "",
>       "size": 16,
>       "logicalType": "decimal",
>       "precision": 38,
>       "scale": 17
>     }, "null"
>   ]
> },
> {code}
> The types are very different in that case.
> During the merge, the reader would fail with:
> {code:java}
> // at org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:251)
>     at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:132)
>     at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:136)
>     at org.apache.hudi.utilities.TestCss.testParquetWithSchema(TestCss.java:270)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>     at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>     at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>     at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>     at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>     at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>     at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>     at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>     at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>     at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>     at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>     at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>     at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>     at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>     at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>     at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
>     at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
>     at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>     at java.lang.System.arraycopy(Native Method)
>     at org.apache.avro.generic.GenericData.createFixed(GenericData.java:1168)
>     at org.apache.parquet.avro.AvroConverters$FieldFixedConverter.convert(AvroConverters.java:310
> {code}
>
>

--
This message was sent by Atlassian Jira (v8.3.4#803005)
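
The trace above ends in System.arraycopy called from GenericData.createFixed. One plausible reading, offered as an assumption rather than a confirmed diagnosis, is that the reader expects a 16-byte fixed value (the size in the 0.6 schema) but the file written under the 0.5 schema holds a shorter string value for the same column. The sketch below reproduces only that mechanism with the JDK. The class FixedCopySketch and both of its methods are hypothetical names for illustration and are not part of Hudi, Avro, or Parquet. It also shows why a decimal of precision 38 maps to a fixed of size 16.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; none of these names exist in Hudi, Avro, or Parquet.
final class FixedCopySketch {

    // Smallest byte count whose signed two's-complement range can hold
    // every unscaled decimal value with the given number of digits.
    static int minBytesForPrecision(int precision) {
        BigInteger maxUnscaled = BigInteger.TEN.pow(precision).subtract(BigInteger.ONE);
        int numBytes = 1;
        // A signed value of n bytes tops out at 2^(8n-1) - 1.
        while (BigInteger.ONE.shiftLeft(8 * numBytes - 1)
                .subtract(BigInteger.ONE)
                .compareTo(maxUnscaled) < 0) {
            numBytes++;
        }
        return numBytes;
    }

    // Copies exactly fixedSize bytes out of the stored value.
    // Assumption: this mirrors the arraycopy step seen in the stack trace.
    static byte[] copyIntoFixed(byte[] stored, int fixedSize) {
        byte[] fixed = new byte[fixedSize];
        // Throws an IndexOutOfBoundsException subtype when stored.length < fixedSize.
        System.arraycopy(stored, 0, fixed, 0, fixedSize);
        return fixed;
    }

    public static void main(String[] args) {
        int size = minBytesForPrecision(38);
        System.out.println("fixed size for precision 38: " + size);

        // A value written as a string under the old schema is usually shorter than the fixed size.
        byte[] storedAsString = "12.5".getBytes(StandardCharsets.UTF_8);
        try {
            copyIntoFixed(storedAsString, size);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("copy failed: " + e.getClass().getName());
        }
    }
}
```

If this reading is right, the failure depends on the stored value being shorter than the declared fixed size. A longer string would not throw here and would instead be silently truncated to 16 bytes, which is worth keeping in mind when checking older files.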