pyckle opened a new pull request, #3365:
URL: https://github.com/apache/parquet-java/pull/3365

   
   ### Rationale for this change
   
   ParquetMetadataConverter has grown too large and needs to be broken up.
   SchemaElement conversion is a good starting point to refactor into an 
external class because:
   - It is an actively developed part of the class - recent changes for variant 
and geographical types have changed this code
   - It's not strongly coupled to other conversion logic
   - Moving it to parquet-column will reduce code duplication in Parquet 
readers that do not want Hadoop dependencies (full disclosure: I 
[had to duplicate some of this code](https://github.com/Earnix/parquetforge/blob/master/parquetforge-base/src/main/java/com/earnix/parquet/columnar/reader/ParquetMetadataUtils.java)
 in my downstream Parquet library)
   
   ### What changes are included in this PR?
   All SchemaElement logic is moved to ParquetSchemaConverter in the 
parquet-column project.
   Boilerplate enum conversion logic has also been moved out into a 
separate class. Tests are moved accordingly.
   Minor deduplication was done for getting LogicalTypeAnnotation from 
the deprecated ConvertedType enum.
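
To illustrate what "boilerplate enum conversion" can look like once it lives in its own class, here is a minimal sketch. It assumes two enums with identical constant names, so conversion can go through `name()` and `valueOf()`. `WireRepetition`, `ModelRepetition`, and `EnumConversionSketch` are hypothetical stand-ins, not the classes or method names used in this PR.

```java
// Hypothetical stand-in for a generated (wire format) enum.
enum WireRepetition { REQUIRED, OPTIONAL, REPEATED }

// Hypothetical stand-in for the in-memory model enum.
enum ModelRepetition { REQUIRED, OPTIONAL, REPEATED }

// Small holder class for the repetitive, name-based enum mappings.
final class EnumConversionSketch {
    private EnumConversionSketch() {}

    // Maps by constant name; throws IllegalArgumentException if the
    // target enum has no constant with the same name.
    static ModelRepetition fromWire(WireRepetition repetition) {
        return ModelRepetition.valueOf(repetition.name());
    }

    static WireRepetition toWire(ModelRepetition repetition) {
        return WireRepetition.valueOf(repetition.name());
    }
}
```

Keeping such mappings in one small class means the larger converter only calls them, and a name mismatch between the two enums fails loudly rather than silently.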
   
   ### Are these changes tested?
   Existing tests have been carefully moved to ensure no changes in behavior.
   
   ### Are there any user-facing changes?
   - Conversion functions for SchemaElement to and from MessageType are now 
public.
   - Existing public functions that were moved are now deprecated delegates to 
ensure backwards compatibility. 
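
For readers unfamiliar with the pattern, a deprecated delegate can be sketched as below. This is only an illustration under assumed names: `SchemaConverterSketch`, `MetadataConverterSketch`, and `toSchemaString` are hypothetical and do not reflect the actual signatures in parquet-java.

```java
// Hypothetical new home of the moved logic.
final class SchemaConverterSketch {
    private SchemaConverterSketch() {}

    // The real implementation now lives here and is public API in this sketch.
    static String toSchemaString(String... fieldNames) {
        return "message(" + String.join(",", fieldNames) + ")";
    }
}

// Hypothetical original class that used to own the logic.
final class MetadataConverterSketch {
    private MetadataConverterSketch() {}

    /**
     * @deprecated kept only so existing callers keep compiling;
     *             use SchemaConverterSketch.toSchemaString instead.
     */
    @Deprecated
    static String toSchemaString(String... fieldNames) {
        // Pure delegation: no logic of its own, so behavior cannot drift.
        return SchemaConverterSketch.toSchemaString(fieldNames);
    }
}
```

Existing callers of the old method get a deprecation warning at compile time but the same result at run time, which is the backwards-compatibility property described above.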
   
   Closes #1835
   Further cleanup of this class is needed, so closing this issue may not 
be the correct action. I think the next candidate to refactor out is 
the ColumnChunk metadata conversion.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

