[ 
https://issues.apache.org/jira/browse/HUDI-826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Li updated HUDI-826:
-------------------------
    Fix Version/s:     (was: 0.8.0)
                   0.9.0

> Spark to avro schema in 0.6 incompatible with 0.5 for fixed types
> -----------------------------------------------------------------
>
>                 Key: HUDI-826
>                 URL: https://issues.apache.org/jira/browse/HUDI-826
>             Project: Apache Hudi
>          Issue Type: Bug
>    Affects Versions: 0.9.0
>            Reporter: Alexander Filipchik
>            Priority: Major
>             Fix For: 0.9.0
>
>
> Let's say we had some dataset created with SQL transformer using a query: 
> {code:java}
> // select CAST(bla AS DECIMAL(20, 9)) bla
> {code}
> In 0.5, the Spark-to-Avro converter (Databricks) would generate something like:
>  
> {code:java}
> {
>   "name": "bla",
>   "type": [
>     "string", 
>     "null"    
>   ]
> },
> {code}
> In 0.6, the converter (Spark) generates:
>  
>  
> {code:java}
> {
>   "name": "bla",
>   "type": [    
>     {
>       "type": "fixed",
>       "name": "order_subtotal",
>       "namespace": "",
>       "size": 16,
>       "logicalType": "decimal",
>       "precision": 38,
>       "scale": 17
>     }, "null"
>   ]
> },
> {code}
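
The `size` of 16 in the schema above is consistent with a decimal of precision 38 needing 16 bytes when its unscaled value is stored as a two's-complement integer. A minimal sketch of that arithmetic follows, assuming the converter picks the smallest byte count whose signed range covers 10^precision. The class and method names are hypothetical and are not Hudi, Spark, or Avro APIs.

```java
import java.math.BigInteger;

public class FixedSizeSketch {
    // Hypothetical helper: smallest byte count n such that a signed
    // n-byte integer can hold every unscaled value below 10^precision,
    // i.e. the smallest n with 2^(8n-1) >= 10^precision.
    static int minBytesForPrecision(int precision) {
        BigInteger limit = BigInteger.TEN.pow(precision);
        int numBytes = 1;
        while (BigInteger.ONE.shiftLeft(8 * numBytes - 1).compareTo(limit) < 0) {
            numBytes++;
        }
        return numBytes;
    }

    public static void main(String[] args) {
        // Precision 38 is the value shown in the 0.6 schema above.
        System.out.println("precision 38 -> " + minBytesForPrecision(38) + " bytes");
        // Precision 20 is the value used in the example query.
        System.out.println("precision 20 -> " + minBytesForPrecision(20) + " bytes");
    }
}
```

Under this assumption, the fixed width depends only on the declared precision, so a column that older files store as a variable-length string has no natural mapping onto it.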
> The types are very different in that case. During the merge, the reader would 
> fail with:
> {code:java}
> //    at 
> org.apache.parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:251)
>       at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:132)
>       at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:136)
>       at 
> org.apache.hudi.utilities.TestCss.testParquetWithSchema(TestCss.java:270)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>       at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>       at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
>       at 
> com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
>       at 
> com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
>       at 
> com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
>       at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
> Caused by: java.lang.ArrayIndexOutOfBoundsException
>       at java.lang.System.arraycopy(Native Method)
>       at 
> org.apache.avro.generic.GenericData.createFixed(GenericData.java:1168)
>       at 
> org.apache.parquet.avro.AvroConverters$FieldFixedConverter.convert(AvroConverters.java:310
> {code}
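
The `ArrayIndexOutOfBoundsException` raised from `System.arraycopy` under `GenericData.createFixed` is what one would expect if the bytes written under the old string schema are shorter than the 16 bytes the new fixed schema asks for. A minimal sketch of that failure mode using only the JDK follows, assuming the fixed value is filled by copying exactly `size` bytes from the stored value. `copyIntoFixed` is a hypothetical stand-in, not the Avro method.

```java
import java.nio.charset.StandardCharsets;

public class FixedCopySketch {
    // Hypothetical stand-in for filling a fixed-size buffer from stored bytes.
    // It copies exactly fixedSize bytes, so a shorter source array fails.
    static byte[] copyIntoFixed(byte[] stored, int fixedSize) {
        byte[] fixed = new byte[fixedSize];
        System.arraycopy(stored, 0, fixed, 0, fixedSize);
        return fixed;
    }

    public static void main(String[] args) {
        // A value written under the old schema, where the column was a string.
        byte[] storedString = "12.5".getBytes(StandardCharsets.UTF_8);
        try {
            // The new schema declares a 16-byte fixed type.
            copyIntoFixed(storedString, 16);
            System.out.println("copied");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("copy failed: " + e.getClass().getSimpleName());
        }
    }
}
```

The sketch only illustrates why a length mismatch surfaces as an index error rather than as a schema error; it does not reproduce the Parquet or Avro read path.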
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
