[ https://issues.apache.org/jira/browse/PARQUET-1656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Balajee Nagasubramaniam updated PARQUET-1656:
---------------------------------------------
    Description: 
The following exception was seen with Parquet 1.8.1 (and again with Parquet 1.12.0 when trying to reproduce it).

Exception in thread "main" java.lang.ClassCastException: optional binary phone_number (STRING) is not a group
at com.uber.komondor.shaded.org.apache.parquet.schema.Type.asGroupType(Type.java:250)
at com.uber.komondor.shaded.org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:279)
at com.uber.komondor.shaded.org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:232)
at com.uber.komondor.shaded.org.apache.parquet.avro.AvroRecordConverter.access$100(AvroRecordConverter.java:78)
at org.apache.parquet.avro.AvroRecordConverter$AvroCollectionConverter$ElementConverter.<init>(AvroRecordConverter.java:536)
at org.apache.parquet.avro.AvroRecordConverter$AvroCollectionConverter.<init>(AvroRecordConverter.java:486)
at org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:289)
at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:141)
at org.apache.parquet.avro.AvroRecordConverter.newConverter(AvroRecordConverter.java:279)
at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:141)
at org.apache.parquet.avro.AvroRecordConverter.<init>(AvroRecordConverter.java:95)
at org.apache.parquet.avro.AvroRecordMaterializer.<init>(AvroRecordMaterializer.java:33)
at org.apache.parquet.avro.AvroReadSupport.prepareForRead(AvroReadSupport.java:138)
at org.apache.parquet.hadoop.InternalParquetRecordReader.initialize(InternalParquetRecordReader.java:183)
at org.apache.parquet.hadoop.ParquetReader.initReader(ParquetReader.java:156)
at org.apache.parquet.hadoop.ParquetReader.read(ParquetReader.java:135)
at util.ParquetToAvroSchemaConverter$.convert(ParquetToAvroSchemaConverter.scala:46)
at util.ParquetToAvroSchemaConverter$.main(ParquetToAvroSchemaConverter.scala:20)
at util.ParquetToAvroSchemaConverter.main(ParquetToAvroSchemaConverter.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)


The original exception was triggered by the following schema change.
Schema before the change:
                         {
                            "default": null,
                            "name": "master_cluster",
                            "type": [
                                "null",
                                {
                                    "fields": [
                                        {
                                            "name": "uuid",
                                            "type": "string"
                                        },
                                        {
                                            "name": "namespace",
                                            "type": "string"
                                        },
                                        {
                                            "name": "version",
                                            "type": "long"
                                        }
                                    ],
                                    "name": "master_cluster",
                                    "type": "record"
                                }
                            ]
                        },

Schema after the change:
                        {
                            "default": null,
                            "name": "master_cluster",
                            "type": [
                                "null",
                                {
                                    "fields": [
                                        {
                                            "default": null,
                                            "name": "uuid",
                                            "type": [
                                                "null",
                                                "string"
                                            ]
                                        },
                                        {
                                            "default": null,
                                            "name": "namespace",
                                            "type": [
                                                "null",
                                                "string"
                                            ]
                                        },
                                        {
                                            "default": null,
                                            "name": "version",
                                            "type": [
                                                "null",
                                                "long"
                                            ]
                                        }
                                    ],
                                    "name": "VORGmaster_cluster",
                                    "type": "record"
                                }
                            ]
                        },
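As a side note (an editor's sketch, not part of the original report): making the inner fields nullable with defaults is, on its own, a backward-compatible Avro change; the record rename from "master_cluster" to "VORGmaster_cluster" is what breaks name-based schema resolution, since Avro matches a writer record against a reader record by name unless the reader declares the old name in "aliases". A minimal stdlib-Python sketch of that name/alias check, using the renamed record above:

```python
import json

# The renamed inner record from the "after" schema, with the old name
# declared as an alias -- Avro's documented mechanism for letting a
# reader record match a writer record that used a different name.
renamed = json.loads("""{
    "type": "record",
    "name": "VORGmaster_cluster",
    "aliases": ["master_cluster"],
    "fields": [
        {"name": "uuid", "type": ["null", "string"], "default": null},
        {"name": "namespace", "type": ["null", "string"], "default": null},
        {"name": "version", "type": ["null", "long"], "default": null}
    ]
}""")

def accepts_writer_name(reader: dict, writer_name: str) -> bool:
    """The name test Avro resolution performs: exact name, or a declared alias."""
    return writer_name == reader["name"] or writer_name in reader.get("aliases", [])

print(accepts_writer_name(renamed, "master_cluster"))  # True, thanks to the alias
```

Without the `"aliases"` entry, the same check against the old writer name would fail, which matches the resolution failure described in this report.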

We suspected PARQUET-1441 could be in play and tried to reproduce the issue on parquet-1.12.0, where we see the same exception.

During the repro we noticed that the issue could be with the Avro schema conversion (the record name was substituted with the generic name "array"). While we look into this further, we want to get community input on whether this is a known issue and any thoughts on a path forward.

19/09/12 22:34:37 DEBUG avro.SchemaCompatibility: Checking compatibility of
reader {"type":"record","name":"IDphones_items","fields":[{"name":"phone_number","type":["null","string"],"default":null}]}
with writer {"type":"record","name":"array","fields":[{"name":"phone_number","type":["null","string"],"default":null}]}
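The DEBUG line above shows the mismatch directly: the writer-side record reconstructed from the Parquet file is named "array" (the generic name of the repeated group in Parquet's 3-level list encoding), while the reader expects "IDphones_items". Field-wise the two records are identical; only the names differ. A small stdlib-Python sketch (an editor's illustration, not from the report) of that comparison:

```python
import json

# Reader and writer schemas copied verbatim from the DEBUG line above.
reader = json.loads('{"type":"record","name":"IDphones_items",'
                    '"fields":[{"name":"phone_number","type":["null","string"],"default":null}]}')
writer = json.loads('{"type":"record","name":"array",'
                    '"fields":[{"name":"phone_number","type":["null","string"],"default":null}]}')

# The field lists are identical...
same_fields = reader["fields"] == writer["fields"]
# ...but the record names are not, and Avro-style resolution matches
# records by name first, so the comparison fails despite matching fields.
same_name = reader["name"] == writer["name"]

print(same_fields, same_name)
```

This is consistent with the suspicion above that the Avro schema conversion, not the field contents, is where the incompatibility is introduced.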


> Schema change results in exception - java.lang.ClassCastException
> -----------------------------------------------------------------
>
>                 Key: PARQUET-1656
>                 URL: https://issues.apache.org/jira/browse/PARQUET-1656
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-avro
>    Affects Versions: 1.8.1, 1.12.0
>         Environment: Hoodie/Parquet/Avro
> Parquet-1.8.1
> Avro-1.7.6
>            Reporter: Balajee Nagasubramaniam
>            Priority: Major
>              Labels: Parquet, avro
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
