[ 
https://issues.apache.org/jira/browse/HIVE-26233?focusedWorklogId=774483&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-774483
 ]

ASF GitHub Bot logged work on HIVE-26233:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 25/May/22 11:12
            Start Date: 25/May/22 11:12
    Worklog Time Spent: 10m 
      Work Description: zabetak commented on code in PR #3295:
URL: https://github.com/apache/hive/pull/3295#discussion_r881514185


##########
common/src/java/org/apache/hadoop/hive/common/type/Timestamp.java:
##########
@@ -101,6 +101,16 @@ public class Timestamp implements Comparable<Timestamp> {
       // Fractional Part (Optional)
       .optionalStart().appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, 
true).optionalEnd().toFormatter();
 
+  private static final DateTimeFormatter PRINT_LENIENT_FORMATTER = new 
DateTimeFormatterBuilder()
+      // Date and Time Parts
+      .appendValue(YEAR, 4, 10, 
SignStyle.NORMAL).appendLiteral('-').appendValue(MONTH_OF_YEAR, 2, 2, 
SignStyle.NORMAL)
+      .appendLiteral('-').appendValue(DAY_OF_MONTH, 2, 2, SignStyle.NORMAL)
+      .appendLiteral(" ").appendValue(HOUR_OF_DAY, 2, 2, 
SignStyle.NORMAL).appendLiteral(':')
+      .appendValue(MINUTE_OF_HOUR, 2, 2, SignStyle.NORMAL).appendLiteral(':')
+      .appendValue(SECOND_OF_MINUTE, 2, 2, SignStyle.NORMAL)
+      // Fractional Part (Optional)
+      .optionalStart().appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, 
true).optionalEnd().toFormatter();
+

Review Comment:
   Maybe it would be better to move this formatter in `TimestampTZUtil` since 
it should be strictly used for `LEGACY` purposes and since there is another 
formatter there as well.



##########
ql/src/test/org/apache/hadoop/hive/ql/io/parquet/serde/TestParquetTimestampUtils.java:
##########
@@ -389,6 +390,26 @@ public void testIllegalInt64TimestampStrings() {
     verifyInt64TimestampValue("2262-04-11 23:47:16.854775808", 
LogicalTypeAnnotation.TimeUnit.NANOS, false);
   }
 
+  @Test
+  public void testTimestamp9999() {

Review Comment:
   What do you think of refactoring and moving the test in 
`TestParquetTimestampsHive2Compatibility` which exactly about compatibility 
with Hive 2? 



##########
common/src/java/org/apache/hadoop/hive/common/type/Timestamp.java:
##########
@@ -101,6 +101,16 @@ public class Timestamp implements Comparable<Timestamp> {
       // Fractional Part (Optional)
       .optionalStart().appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, 
true).optionalEnd().toFormatter();
 
+  private static final DateTimeFormatter PRINT_LENIENT_FORMATTER = new 
DateTimeFormatterBuilder()
+      // Date and Time Parts
+      .appendValue(YEAR, 4, 10, 
SignStyle.NORMAL).appendLiteral('-').appendValue(MONTH_OF_YEAR, 2, 2, 
SignStyle.NORMAL)
+      .appendLiteral('-').appendValue(DAY_OF_MONTH, 2, 2, SignStyle.NORMAL)
+      .appendLiteral(" ").appendValue(HOUR_OF_DAY, 2, 2, 
SignStyle.NORMAL).appendLiteral(':')
+      .appendValue(MINUTE_OF_HOUR, 2, 2, SignStyle.NORMAL).appendLiteral(':')
+      .appendValue(SECOND_OF_MINUTE, 2, 2, SignStyle.NORMAL)
+      // Fractional Part (Optional)
+      .optionalStart().appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, 
true).optionalEnd().toFormatter();
+

Review Comment:
   Also using `LENIENT` in the name is a bit misleading since it implies that 
`DateTimeFormatterBuilder#parseLenient` is in use which is not the case here.



##########
common/src/java/org/apache/hadoop/hive/common/type/Timestamp.java:
##########
@@ -128,6 +138,10 @@ public String toString() {
     return localDateTime.format(PRINT_FORMATTER);
   }
 
+  public String toStingWithLenientFormatter() {
+    return localDateTime.format(PRINT_LENIENT_FORMATTER);
+  }
+

Review Comment:
   The use of `Lenient` is a bit misleading as I wrote previously. Also there 
is a small typo in the method name `toSting` vs `toString`.
   
   Instead of adding a new method we could use `Timestamp#format` passing in 
the desired formatter. With the right naming for the formatter parameter it 
would make the intention more clear.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 774483)
    Time Spent: 40m  (was: 0.5h)

> Problems reading back PARQUET timestamps above 10000 years
> ----------------------------------------------------------
>
>                 Key: HIVE-26233
>                 URL: https://issues.apache.org/jira/browse/HIVE-26233
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Peter Vary
>            Assignee: Peter Vary
>            Priority: Major
>              Labels: backwards-compatibility, pull-request-available, 
> timestamp
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Timestamp values above year 10000 are not supported, but during the migration 
> from Hive2 to Hive3 some might appear because of TZ issues. We should be able 
> to at least read these tables before rewriting the data.
> For this we need to change the Timestamp.PRINT_FORMATTER, so no {{+}} sign is 
> appended to the timestamp if the year exceeds 4 digits.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

Reply via email to