ArnavBalyan opened a new pull request, #3332:
URL: https://github.com/apache/parquet-java/pull/3332

   ### Rationale for this change
    - `parquet cat` fails on files whose field names contain hyphens, because cat uses the Avro reader by default and Avro enforces stricter naming rules.
    - Fall back to the Parquet group reader so that such pure Parquet files can still be read.
    - The existing Avro read path is unchanged; the Parquet group reader is used only if that read fails.
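
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `FallbackRead`, `readWithFallback`, and the two `Function` readers are hypothetical stand-ins for the Avro read path and the Parquet group read path, and catching `RuntimeException` is an assumption about which failures trigger the fallback.

```java
import java.util.List;
import java.util.function.Function;

public class FallbackRead {

    /**
     * Tries the primary reader first and uses the fallback only if the primary throws.
     * Both readers are hypothetical: each maps a file path to rendered records.
     */
    static List<String> readWithFallback(
            String path,
            Function<String, List<String>> primary,
            Function<String, List<String>> fallback) {
        try {
            // Unchanged default path (the Avro-based reader in the PR's description).
            return primary.apply(path);
        } catch (RuntimeException e) {
            // Assumption: a schema rejection surfaces as an unchecked exception,
            // so the same file is re-read with the more permissive reader.
            return fallback.apply(path);
        }
    }
}
```

When the primary reader succeeds, the fallback is never invoked, so files that already worked keep the same behavior.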
   
   Before:
   ```
   Time elapsed: 1.170 s <<< ERROR!
   org.apache.avro.SchemaParseException: Illegal character in: customer-name
           at org.apache.avro.Schema.validateName(Schema.java:1625)
           at org.apache.avro.Schema.access$400(Schema.java:94)
           at org.apache.avro.Schema$Field.<init>(Schema.java:558)
           at org.apache.avro.SchemaBuilder$FieldBuilder.completeField(SchemaBuilder.java:2258)
           at org.apache.avro.SchemaBuilder$FieldBuilder.completeField(SchemaBuilder.java:2254)
           at org.apache.avro.SchemaBuilder$FieldBuilder.access$5100(SchemaBuilder.java:2150)
           at org.apache.avro.SchemaBuilder$GenericDefault.noDefault(SchemaBuilder.java:2557)
           at org.apache.parquet.avro.AvroSchemaConverter.convertFields(AvroSchemaConverter.java:376)
   ```
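
For context on why `customer-name` is rejected: the Avro specification requires names to start with `[A-Za-z_]` and continue with only `[A-Za-z0-9_]`. The check below is a standalone sketch of that rule, not Avro's own `Schema.validateName`; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class AvroNameCheck {

    // Name rule as stated in the Avro specification: a letter or underscore first,
    // then letters, digits, or underscores. A hyphen is not permitted anywhere.
    private static final Pattern AVRO_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    /** Returns true if the given field name satisfies the Avro naming rule. */
    static boolean isValidAvroName(String name) {
        return name != null && AVRO_NAME.matcher(name).matches();
    }
}
```

Under this rule `order_id` and `region` pass, while `customer-name`, `product-category`, and `sale-amount` do not, which is consistent with the failure shown above.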
   
   After:
   ```
   order_id: 1001
   customer-name: John Smith
   product-category: Electronics
   sale-amount: 299.99
   region: North
   ```
   
   ### Are these changes tested?
   - Yes
   
   ### Are there any user-facing changes?
    - Yes
   
   Closes: #2836


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

