ArnavBalyan opened a new pull request, #3332:
URL: https://github.com/apache/parquet-java/pull/3332
### Rationale for this change
- Parquet `cat` fails on files whose field names contain hyphens (for example
`customer-name`), because `cat` uses the Avro reader by default and Avro has
stricter naming rules than Parquet.
- Ensure we can still read pure Parquet files by using the Parquet group reader
as a fallback.
- The existing reader is unchanged; we only read with the Parquet group reader
if the first read fails.
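
The fallback described above can be sketched roughly as follows. This is a minimal, library-free illustration of the pattern, not the code in this PR: `FallbackReadSketch` and `readWithFallback` are hypothetical names, the two suppliers stand in for the Avro read and the group read, and catching `RuntimeException` is an assumption (Avro's `SchemaParseException` is unchecked).

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper, not taken from this PR: run the primary read and
// switch to the fallback read only when the primary one throws.
final class FallbackReadSketch {

  static <T> T readWithFallback(Supplier<T> primary, Supplier<T> fallback) {
    try {
      // Unchanged default path (stands in for the Avro-based read).
      return primary.get();
    } catch (RuntimeException e) {
      // Assumption: the primary failure surfaces as an unchecked exception,
      // as org.apache.avro.SchemaParseException does.
      return fallback.get();
    }
  }

  public static void main(String[] args) {
    List<String> rows = readWithFallback(
        // Simulated primary read that rejects a hyphenated field name.
        () -> { throw new IllegalArgumentException("Illegal character in: customer-name"); },
        // Simulated fallback read that accepts the same data.
        () -> List.of("customer-name: John Smith"));
    System.out.println(rows);
  }
}
```

Because the fallback runs only after a failure, files that the default reader already handles keep their current behavior.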
Before:
```
Time elapsed: 1.170 s <<< ERROR!
org.apache.avro.SchemaParseException: Illegal character in: customer-name
at org.apache.avro.Schema.validateName(Schema.java:1625)
at org.apache.avro.Schema.access$400(Schema.java:94)
at org.apache.avro.Schema$Field.<init>(Schema.java:558)
at org.apache.avro.SchemaBuilder$FieldBuilder.completeField(SchemaBuilder.java:2258)
at org.apache.avro.SchemaBuilder$FieldBuilder.completeField(SchemaBuilder.java:2254)
at org.apache.avro.SchemaBuilder$FieldBuilder.access$5100(SchemaBuilder.java:2150)
at org.apache.avro.SchemaBuilder$GenericDefault.noDefault(SchemaBuilder.java:2557)
at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:376)
```
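
For context on why `customer-name` is rejected: the Avro specification requires names to start with `[A-Za-z_]` and continue with `[A-Za-z0-9_]`, so a hyphen is not allowed. The check below is a simplified stand-in for that rule, written for illustration only; it is not Avro's `Schema.validateName` implementation, and `AvroNameRuleSketch` is a hypothetical name.

```java
import java.util.regex.Pattern;

// Simplified illustration of the Avro specification's naming rule.
// Not the actual Avro implementation, which may differ in detail.
final class AvroNameRuleSketch {

  // First character: letter or underscore; the rest: letters, digits, underscore.
  private static final Pattern AVRO_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

  static boolean isValidAvroName(String name) {
    return name != null && AVRO_NAME.matcher(name).matches();
  }

  public static void main(String[] args) {
    // Field names taken from the example output in this description.
    String[] fields = {"order_id", "customer-name", "product-category", "sale-amount", "region"};
    for (String field : fields) {
      System.out.println(field + " -> " + isValidAvroName(field));
    }
  }
}
```

Under this rule, `order_id` and `region` pass while the three hyphenated names fail, which matches the field named in the exception above.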
After:
```
order_id: 1001
customer-name: John Smith
product-category: Electronics
sale-amount: 299.99
region: North
```
### Are these changes tested?
- Yes
### Are there any user-facing changes?
- Yes
Closes: #2836
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]