MaxGekk commented on a change in pull request #27524: [WIP][SQL] Support
`SimpleDateFormat` and `FastDateFormat` as legacy date/timestamp formatters
URL: https://github.com/apache/spark/pull/27524#discussion_r377357685
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/TimestampFormatter.scala
##########
@@ -90,47 +91,168 @@ class FractionTimestampFormatter(zoneId: ZoneId)
override protected lazy val formatter =
DateTimeFormatterHelper.fractionFormatter
}
-class LegacyTimestampFormatter(
+/**
+ * The custom sub-class of `GregorianCalendar` is needed to get access to
+ * protected `fields` immediately after parsing. We cannot use
+ * the `get()` method because it performs normalization of the fraction
+ * part. Accordingly, the `MILLISECOND` field doesn't contain original value.
+ *
+ * Also this class allows to set raw value to the `MILLISECOND` field
+ * directly before formatting.
+ */
+class MicrosCalendar(tz: TimeZone, digitsInFraction: Int)
+ extends GregorianCalendar(tz, Locale.US) {
+ // Converts parsed `MILLISECOND` field to seconds fraction in microsecond precision.
+ // For example if the fraction pattern is `SSSS` then `digitsInFraction` = 4, and
+ // if the `MILLISECOND` field was parsed to `1234`.
+ def getMicros(): SQLTimestamp = {
+ // Append 6 zeros to the field: 1234 -> 1234000000
+ val d = fields(Calendar.MILLISECOND) * MICROS_PER_SECOND
+ // Take the first 6 digits from `d`: 1234000000 -> 123400
+ // The rest contains exactly `digitsInFraction`: `0000` = 10 ^ digitsInFraction
+ // So, the result is `(1234 * 1000000) / (10 ^ digitsInFraction)`
+ d / Decimal.POW_10(digitsInFraction)
+ }
+
+ // Converts the seconds fraction in microsecond precision to a value
+ // that can be correctly formatted according to the specified fraction pattern.
+ // The method performs operations opposite to `getMicros()`.
+ def setMicros(micros: Long): Unit = {
+ val d = micros * Decimal.POW_10(digitsInFraction)
+ fields(Calendar.MILLISECOND) = (d / MICROS_PER_SECOND).toInt
+ }
+}
+
+/**
+ * An instance of the class is aimed to re-use many times. It contains helper objects
+ * `cal` which is reused between `parse()` and `format` invokes.
+ */
+class LegacyFastDateFormat(fastDateFormat: FastDateFormat) {
Review comment:
Copied from Spark 2.4 to support an arbitrary number of second-fraction
digits and to parse/format with microsecond precision using the legacy
parser. `FastDateFormat` cannot do that by default.
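
For illustration, here is a minimal standalone sketch of the fraction arithmetic used by `getMicros()` and `setMicros()` in the hunk above, plus a small demonstration of the normalization that makes `Calendar.get()` unsuitable. It is not the PR's code: `FractionSketch`, `MicrosPerSecond`, `Pow10`, `toMicros`, `fromMicros` and `normalizedSecondAndMillis` are hypothetical names, and `MicrosPerSecond` and `Pow10` are assumed stand-ins for `MICROS_PER_SECOND` and `Decimal.POW_10`. The demonstration uses `java.text.SimpleDateFormat` rather than `FastDateFormat` so that it depends only on the JDK.

```scala
import java.text.SimpleDateFormat
import java.util.{Calendar, GregorianCalendar, Locale, TimeZone}

object FractionSketch {
  // Assumed stand-ins for Spark's MICROS_PER_SECOND and Decimal.POW_10.
  val MicrosPerSecond: Long = 1000000L
  val Pow10: Array[Long] = Array.iterate(1L, 10)(_ * 10) // 10^0 .. 10^9

  // Raw parsed fraction field -> microseconds.
  // With pattern `SSSS` (digitsInFraction = 4): 1234 -> 123400.
  def toMicros(rawField: Int, digitsInFraction: Int): Long =
    rawField * MicrosPerSecond / Pow10(digitsInFraction)

  // Microseconds -> raw field value for the given number of fraction digits.
  // The integer division truncates digits the pattern cannot represent.
  def fromMicros(micros: Long, digitsInFraction: Int): Int =
    (micros * Pow10(digitsInFraction) / MicrosPerSecond).toInt

  // Parses a timestamp with a 6-digit fraction using a lenient
  // SimpleDateFormat and returns the normalized (SECOND, MILLISECOND) fields.
  // The fraction is read as a count of milliseconds, so the excess is
  // carried over into seconds and minutes.
  def normalizedSecondAndMillis(text: String): (Int, Int) = {
    val utc = TimeZone.getTimeZone("UTC")
    val sdf = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss.SSSSSS", Locale.US)
    sdf.setTimeZone(utc)
    val cal = new GregorianCalendar(utc, Locale.US)
    cal.setTime(sdf.parse(text))
    (cal.get(Calendar.SECOND), cal.get(Calendar.MILLISECOND))
  }
}
```

For example, `FractionSketch.toMicros(1234, 4)` gives `123400` and `FractionSketch.fromMicros(123400L, 4)` gives back `1234`. Parsing `2019-10-14 09:39:07.123456` through `normalizedSecondAndMillis` treats `123456` as milliseconds, so the time shifts to `09:41:10.456` and the original fraction digits are no longer recoverable from `get()`. That is the reason for reading the raw `fields` array right after parsing.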