GWphua commented on code in PR #18477:
URL: https://github.com/apache/druid/pull/18477#discussion_r2321411252


##########
processing/src/test/java/org/apache/druid/java/util/common/IntervalsTest.java:
##########
@@ -79,6 +79,44 @@ public void testFindOverlappingInterval()
     );
   }
 
+  @Test
+  public void testValidIntervalStrings()
+  {
+    final String[] intervalStringRepresentations = new String[]{
+        // Tests that use does not fallback to Intervals.of()
+        // Zulu with millis
+        "2022-01-01T00:00:00.000Z/2022-01-02T00:00:00.000Z",
+        "2021-03-14T12:34:56.789Z/2021-03-15T12:34:56.789Z",
+
+        // Offset with colon
+        "2022-01-01T00:00:00.000+05:30/2022-01-01T01:00:00.000+05:30",
+        "2022-01-01T07:00:00.000-07:00/2022-01-01T08:00:00.000-07:00",
+
+        // Basic offset without colon
+        "2022-01-01T00:00:00.000+0530/2022-01-01T01:00:00.000+0530",
+
+        // Tests that fallback to Intervals.of()
+        // Zulu without millis
+        "2022-01-01T00:00:00Z/2022-01-02T00:00:00Z",
+        // Date-only
+        "2022-01-01/2022-01-02",
+        // start/period
+        "2022-01-01T00:00:00.000Z/P1D",
+        "2022-01-01T12:00:00Z/PT6H",
+        "2022-01-01T00:00:00Z/P2DT3H4M5S",
+        // period/end
+        "P1D/2022-01-02T00:00:00.000Z",
+        "PT6H/2022-01-01T18:00:00Z",
+        "P2DT3H4M5S/2022-01-03T03:04:05Z"

Review Comment:
   Hi @kfaraz 
   
   I attempted to make `tryOptimizedIntervalDeserialization` more inclusive 
(e.g. Work on Date-only, and Without Millis case), but this requires changing 
the `FAST_ISO_UTC_FORMATTER` DateTimeFormatter to a more flexible formatter. 
This causes performance to resemble the un-optimized versions.
   
   These 2 formatters allow formats like `"2022-01-01/2022-01-02"` or 
`"2022-01-01T00:00:00Z/2022-01-02T00:00:00Z"`, but the performance is lacking...
   
   ```
   private static final DateTimeFormatter FAST_ISO_UTC_FORMATTER =
         ISODateTimeFormat.dateOptionalTimeParser()
                          .withOffsetParsed()
                          .withChronology(ISOChronology.getInstanceUTC());
   
   Benchmark                                                          
(numValues)  Mode  Cnt   Score   Error  Units
   JodaIntervalDeserializationBenchmark.deserializeLegacy                   
20000  avgt    2  21.243          ms/op
   JodaIntervalDeserializationBenchmark.deserializeLegacyFallback           
20000  avgt    2  15.217          ms/op
   JodaIntervalDeserializationBenchmark.deserializeOptimized                
20000  avgt    2  20.605          ms/op
   JodaIntervalDeserializationBenchmark.deserializeOptimizedFallback        
20000  avgt    2  31.051          ms/op
   ```
   
   ```
   private static final DateTimeFormatter FAST_ISO_UTC_FORMATTER =
         
ISODateTimeFormat.dateTimeParser().withChronology(ISOChronology.getInstanceUTC());
   
   
   Benchmark                                                          
(numValues)  Mode  Cnt   Score   Error  Units
   JodaIntervalDeserializationBenchmark.deserializeLegacy                   
20000  avgt    3  21.502 ± 0.995  ms/op
   JodaIntervalDeserializationBenchmark.deserializeLegacyFallback           
20000  avgt    3  15.365 ± 1.227  ms/op
   JodaIntervalDeserializationBenchmark.deserializeOptimized                
20000  avgt    3  21.239 ± 1.543  ms/op
   JodaIntervalDeserializationBenchmark.deserializeOptimizedFallback        
20000  avgt    3  31.526 ± 3.083  ms/op
   ```
   
   I think we can simply take advantage of knowing how our data will look like, 
and optimize the parse process for the most frequently-used data format. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@druid.apache.org
For additional commands, e-mail: commits-h...@druid.apache.org

Reply via email to