pyckle opened a new pull request, #3365: URL: https://github.com/apache/parquet-java/pull/3365
### Rationale for this change

ParquetMetadataConverter has grown too large and needs to be broken up. SchemaElement conversion is a good starting point to refactor into an external class because:
- It is an actively developed part of the class: recent changes for variant and geographical types have changed this code.
- It's not strongly coupled to other conversion logic.
- Moving it to parquet-column will reduce code duplication in Parquet readers that want to avoid Hadoop dependencies (full disclosure: I [had to duplicate some of this code](https://github.com/Earnix/parquetforge/blob/master/parquetforge-base/src/main/java/com/earnix/parquet/columnar/reader/ParquetMetadataUtils.java) in my downstream parquet lib).

### What changes are included in this PR?

All SchemaElement logic is moved to ParquetSchemaConverter in the parquet-column project. As further cleanup, the boilerplate enum conversion logic has been moved into a separate class. Tests are also moved appropriately. Minor deduplication was done for getting LogicalTypeAnnotation from the deprecated ConvertedType enum.

### Are these changes tested?

Existing tests have been carefully moved to ensure no changes in behavior.

### Are there any user-facing changes?

- Conversion functions for SchemaElement to and from MessageType are now public.
- Existing public functions that were moved are now deprecated delegates to ensure backwards compatibility.

<!-- Please uncomment the line below and replace ${GITHUB_ISSUE_ID} with the actual Github issue id.
--> Closes #1835

Further cleanup of this class is needed, and as such, perhaps closing this issue is not the correct action. I think the next candidate to refactor out is the ColumnChunk metadata conversion.
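
For reviewers unfamiliar with the "deprecated delegate" approach mentioned under user-facing changes, here is a minimal sketch of the pattern. The class names `SchemaConverterSketch` and `LegacyMetadataConverterSketch`, the method `toLogicalTypeName`, and the string mapping are all hypothetical stand-ins invented for illustration; they are not the actual parquet-java signatures touched by this PR.

```java
// Hypothetical stand-ins for illustration only; NOT the real parquet-java classes.

/** New home of the logic (plays the role of the extracted converter class). */
final class SchemaConverterSketch {
    private SchemaConverterSketch() {}

    /** Toy mapping from a legacy type name to a logical type name. */
    static String toLogicalTypeName(String convertedTypeName) {
        switch (convertedTypeName) {
            case "UTF8":
                return "STRING";
            default:
                // Unknown names pass through unchanged in this sketch.
                return convertedTypeName;
        }
    }
}

/** Old location: keeps its public signature but only forwards to the new class. */
class LegacyMetadataConverterSketch {
    /**
     * @deprecated use {@link SchemaConverterSketch#toLogicalTypeName(String)} instead.
     */
    @Deprecated
    public String toLogicalTypeName(String convertedTypeName) {
        return SchemaConverterSketch.toLogicalTypeName(convertedTypeName);
    }
}

class DelegateDemo {
    public static void main(String[] args) {
        // Existing callers keep compiling (with a deprecation warning) and get the same result.
        String viaOld = new LegacyMetadataConverterSketch().toLogicalTypeName("UTF8");
        String viaNew = SchemaConverterSketch.toLogicalTypeName("UTF8");
        System.out.println(viaOld.equals(viaNew)); // prints "true"
    }
}
```

Because the old method contains no logic of its own, behavior cannot drift between the two entry points, and the delegate can be removed in a later release once callers have migrated.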
