FawnD2 opened a new pull request #9649: URL: https://github.com/apache/arrow/pull/9649
`parquet-reader` can print the schema of a Parquet file in two ways: DebugPrint and JSONPrint. In JSON output, each Column is identified by its short name rather than its fully qualified name. For example, schema (1) below produces two Columns with `"Name": "key"`, which is confusing. This PR makes JSONPrint use the fully qualified name for each Column instead of the short name, as DebugPrint already does.

(1):
```
required group field_id=0 spark_schema {
  optional group field_id=1 a (Map) {
    repeated group field_id=2 key_value {
      required binary field_id=3 key (String);
      optional group field_id=4 value (Map) {
        repeated group field_id=5 key_value {
          required int32 field_id=6 key;
          required boolean field_id=7 value;
        }
      }
    }
  }
}
```
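To illustrate the naming difference described above, here is a minimal sketch in plain Python. It does not use the Arrow or Parquet APIs: the `(name, children)` tuples and the `leaf_paths` helper are hypothetical and only model schema (1). The sketch assumes that the fully qualified name is the dot-joined path from below the root group down to the leaf.

```python
# Hypothetical model of schema (1): each node is (name, children),
# and a leaf column is a node with no children.
schema = ("spark_schema", [
    ("a", [
        ("key_value", [
            ("key", []),
            ("value", [
                ("key_value", [
                    ("key", []),
                    ("value", []),
                ]),
            ]),
        ]),
    ]),
])


def leaf_paths(node, prefix=()):
    """Yield the path (tuple of names) to every leaf under node."""
    name, children = node
    path = prefix + (name,)
    if not children:
        yield path
        return
    for child in children:
        yield from leaf_paths(child, path)


# Drop the root group name; this sketch assumes column paths start below it.
paths = [p[1:] for p in leaf_paths(schema)]

short_names = [p[-1] for p in paths]       # ambiguous: "key" appears twice
full_names = [".".join(p) for p in paths]  # unique per leaf column

for short, full in zip(short_names, full_names):
    print(f"short={short!r:8} full={full!r}")
```

The short names collide for the outer and inner map keys, while the dot-joined paths stay distinct, which is why the fully qualified form is easier to read in the JSON output.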