[jira] [Commented] (PARQUET-2382) Remove the deprecated OriginalType

2023-11-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789365#comment-17789365
 ] 

ASF GitHub Bot commented on PARQUET-2382:
-

Fokko commented on PR #1194:
URL: https://github.com/apache/parquet-mr/pull/1194#issuecomment-1825293064

   Makes sense. I think it would be good to remove `OriginalType` at some 
point. Let's target this PR for 2.0 and leave it for now. I'll create another 
PR to mark `getOriginalType()` as deprecated (it was previously marked as 
private by Yetus, but I think it would be best to mark it as deprecated for 
1.14.0).




> Remove the deprecated OriginalType
> ----------------------------------
>
> Key: PARQUET-2382
> URL: https://issues.apache.org/jira/browse/PARQUET-2382
> Project: Parquet
> Issue Type: Improvement
> Reporter: Fokko Driesprong
> Assignee: Fokko Driesprong
> Priority: Major
> Fix For: 1.14.0




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (PARQUET-2382) Remove the deprecated OriginalType

2023-11-17 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17787093#comment-17787093
 ] 

ASF GitHub Bot commented on PARQUET-2382:
-

gszadovszky commented on PR #1194:
URL: https://github.com/apache/parquet-mr/pull/1194#issuecomment-1815995828

   I don't think it is a good idea to remove a deprecated API in a minor 
release. That's why we have japicmp to ensure backward compatibility.
   I think there is no harm to Parquet if downstream projects keep using the 
old `OriginalType`s. If we enforce significant code changes in minor releases, 
we would also slow down upgrades.






[jira] [Commented] (PARQUET-2382) Remove the deprecated OriginalType

2023-11-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786835#comment-17786835
 ] 

ASF GitHub Bot commented on PARQUET-2382:
-

wgtmac commented on PR #1194:
URL: https://github.com/apache/parquet-mr/pull/1194#issuecomment-1814785912

   I am fine with it, but I'd like to seek advice from @gszadovszky @shangxinli 






[jira] [Commented] (PARQUET-2382) Remove the deprecated OriginalType

2023-11-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786452#comment-17786452
 ] 

ASF GitHub Bot commented on PARQUET-2382:
-

Fokko commented on code in PR #1194:
URL: https://github.com/apache/parquet-mr/pull/1194#discussion_r1394558775


##
parquet-hadoop/src/main/java/org/apache/parquet/format/converter/ParquetMetadataConverter.java:
##
@@ -1715,15 +1729,15 @@ private void buildChildren(Types.GroupBuilder builder,
         childBuilder.as(getLogicalTypeAnnotation(schemaElement.logicalType));
       }
       if (schemaElement.isSetConverted_type()) {
-        OriginalType originalType = getLogicalTypeAnnotation(schemaElement.converted_type, schemaElement).toOriginalType();
-        OriginalType newOriginalType = (schemaElement.isSetLogicalType() && getLogicalTypeAnnotation(schemaElement.logicalType) != null) ?
-           getLogicalTypeAnnotation(schemaElement.logicalType).toOriginalType() : null;
-        if (!originalType.equals(newOriginalType)) {

Review Comment:
   I think there was a  hidden here. Because we would only compare on the 
logical type itself, and not its properties (precision/scale for decimal, or 
adjust-for-utc for time/timestamp). Tests started failing, therefore I added 
the `getAdjustToUtc` to retrieve the actual value from the Parquet structure.
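
   The gap described above can be illustrated with a small, self-contained 
sketch. The types below (`CoarseType`, `DecimalAnnotation`, 
`ComparisonSketch`) are hypothetical stand-ins and not parquet-mr classes; 
they only model the idea that projecting a parameterized annotation onto a 
bare enum value discards its properties, so a comparison on the enum alone 
treats two different decimals as equal.

```java
import java.util.Objects;

// Hypothetical stand-ins for illustration only; these are NOT parquet-mr classes.
enum CoarseType { DECIMAL, TIMESTAMP_MILLIS }

final class DecimalAnnotation {
  final int precision;
  final int scale;

  DecimalAnnotation(int precision, int scale) {
    this.precision = precision;
    this.scale = scale;
  }

  // Lossy projection: precision and scale are dropped.
  CoarseType toCoarseType() {
    return CoarseType.DECIMAL;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof DecimalAnnotation)) {
      return false;
    }
    DecimalAnnotation other = (DecimalAnnotation) o;
    return precision == other.precision && scale == other.scale;
  }

  @Override
  public int hashCode() {
    return Objects.hash(precision, scale);
  }
}

final class ComparisonSketch {
  // Compares only the coarse enum value, so the parameters are ignored.
  static boolean sameByCoarseType(DecimalAnnotation a, DecimalAnnotation b) {
    return a.toCoarseType() == b.toCoarseType();
  }

  // Compares the full annotation, including precision and scale.
  static boolean sameByAnnotation(DecimalAnnotation a, DecimalAnnotation b) {
    return a.equals(b);
  }
}
```

   With these stand-ins, a decimal(9, 2) and a decimal(18, 4) are considered 
the same by `sameByCoarseType` but different by `sameByAnnotation`, which is 
the kind of mismatch a check on the enum value alone would let through.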










[jira] [Commented] (PARQUET-2382) Remove the deprecated OriginalType

2023-11-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/PARQUET-2382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17786450#comment-17786450
 ] 

ASF GitHub Bot commented on PARQUET-2382:
-

Fokko opened a new pull request, #1194:
URL: https://github.com/apache/parquet-mr/pull/1194

   Make sure you have checked _all_ steps below.
   
   For Iceberg we're adding nanosecond timestamps. During my investigation in 
Parquet, I noticed that there are two ways of declaring logical types:
   
   1. Through the deprecated `OriginalType`
   2. Using the new LogicalTypeAnnotation API.
   
   The old API does not support nanos but is still used in downstream 
projects, such as Iceberg, where I'm working on migrating to the new API: 
https://github.com/apache/iceberg/pull/9063
   
   However, the old API was marked as deprecated five years ago in 
[PARQUET-1452](https://issues.apache.org/jira/browse/PARQUET-1452), which was 
released in Parquet 1.11.0. I would love to remove it to make sure that 
downstream engines migrate to the new API and handle nanos correctly.
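
   To make the limitation concrete, here is a minimal sketch of why an 
enum-only API cannot carry nanosecond timestamps. All names in it 
(`LegacyTimestampType`, `SketchTimeUnit`, `SketchTimestampAnnotation`) are 
hypothetical and only model the situation; they are not the parquet-mr types. 
The legacy enum has one constant per supported unit, so a unit without a 
constant has nothing to map to, and the adjusted-to-UTC flag is lost in the 
mapping as well.

```java
import java.util.Optional;

// Hypothetical model for illustration only; these are NOT the parquet-mr types.
enum LegacyTimestampType { TIMESTAMP_MILLIS, TIMESTAMP_MICROS }

enum SketchTimeUnit { MILLIS, MICROS, NANOS }

final class SketchTimestampAnnotation {
  final boolean adjustedToUtc;
  final SketchTimeUnit unit;

  SketchTimestampAnnotation(boolean adjustedToUtc, SketchTimeUnit unit) {
    this.adjustedToUtc = adjustedToUtc;
    this.unit = unit;
  }

  // Maps to the legacy enum where a counterpart exists; NANOS has none.
  // Note that adjustedToUtc cannot be represented in the result either.
  Optional<LegacyTimestampType> toLegacy() {
    switch (unit) {
      case MILLIS:
        return Optional.of(LegacyTimestampType.TIMESTAMP_MILLIS);
      case MICROS:
        return Optional.of(LegacyTimestampType.TIMESTAMP_MICROS);
      default:
        return Optional.empty();
    }
  }
}
```

   In this model, a caller that only understands the legacy enum sees an 
empty result for a nanosecond timestamp and has to either fail or treat the 
column as unannotated, which is why migrating downstream code to the richer 
annotation is a prerequisite for handling nanos.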
   
   ### Jira
   
   - [ ] My PR addresses the following [Parquet 
Jira](https://issues.apache.org/jira/browse/PARQUET/) issues and references 
them in the PR title. For example, "PARQUET-1234: My Parquet PR"
 - https://issues.apache.org/jira/browse/PARQUET-XXX
 - In case you are adding a dependency, check if the license complies with 
the [ASF 3rd Party License 
Policy](https://www.apache.org/legal/resolved.html#category-x).
   
   ### Tests
   
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   ### Commits
   
   - [ ] My commits all reference Jira issues in their subject lines. In 
addition, my commits follow the guidelines from "[How to write a good git 
commit message](http://chris.beams.io/posts/git-commit/)":
 1. Subject is separated from body by a blank line
 1. Subject is limited to 50 characters (not including Jira issue reference)
 1. Subject does not end with a period
 1. Subject uses the imperative mood ("add", not "adding")
 1. Body wraps at 72 characters
 1. Body explains "what" and "why", not "how"
   
   ### Documentation
   
   - [ ] In case of new functionality, my PR adds documentation that describes 
how to use it.
 - All the public functions and the classes in the PR contain Javadoc that 
explain what it does
   



