kfaraz commented on code in PR #18477:
URL: https://github.com/apache/druid/pull/18477#discussion_r2325372696


##########
processing/src/main/java/org/apache/druid/java/util/common/Intervals.java:
##########
@@ -53,6 +57,74 @@ public static Interval of(String format, Object... 
formatArgs)
     return of(StringUtils.format(format, formatArgs));
   }
 
+  /**
+   * A performance-optimized method for parsing a Joda-Time {@link Interval} 
from a string.
+   * This method is significantly faster than the standard {@link 
Intervals#of(String)} for the following
+   * group of offsets:
+   * <ol>
+   *   <li>"2022-01-01T00:00:00.000Z/2022-01-02T00:00:00.000Z"</li>
+   *   <li>"2022-01-01T00:00:00.000+05:30/2022-01-01T01:00:00.000+05:30"</li>
+   *   <li>"2022-01-01T00:00:00.000+0530/2022-01-01T01:00:00.000+0530"</li>
+   * </ol>
+   * <p>
+   * If the input string does not match the format, it will fall back to the 
more flexible but
+   * slower {@link Intervals#of(String)} parser. If you are dealing with any 
Intervals format examples below,
+   * consider using {@link Intervals#of(String)} instead:
+   * <ol>
+   *   <li>"2022-01-01T00:00:00Z/2022-01-02T00:00:00Z" (without millis)</li>
+   *   <li>"2022-01-01/2022-01-02" (Date only)</li>
+   *   <li>"2022-01-01T12:00:00.000Z/PT6H" (Periods in start / end)</li>
+   * </ol>
+   *
+   * Currently, this method is only used in {@link 
org.apache.druid.timeline.SegmentId}.
+   * Should you find additional places where the Interval format is guaranteed 
to be compatible with this method,
+   * feel free to open a PR.

Review Comment:
   Nit: We can leave out these lines.



##########
processing/src/main/java/org/apache/druid/java/util/common/Intervals.java:
##########
@@ -25,13 +25,17 @@
 import org.joda.time.DateTime;
 import org.joda.time.Interval;
 import org.joda.time.chrono.ISOChronology;
+import org.joda.time.format.DateTimeFormatter;
+import org.joda.time.format.ISODateTimeFormat;
 
 import javax.annotation.Nullable;
 
 public final class Intervals
 {
   public static final Interval ETERNITY = utc(JodaUtils.MIN_INSTANT, 
JodaUtils.MAX_INSTANT);
   public static final ImmutableList<Interval> ONLY_ETERNITY = 
ImmutableList.of(ETERNITY);
+  private static final DateTimeFormatter FAST_ISO_UTC_FORMATTER =

Review Comment:
   Maybe also check out the other two constant formatters in that class. See if 
they can help improve performance of the new method for other date formats. But 
this can be done in a follow up PR.



##########
processing/src/main/java/org/apache/druid/java/util/common/Intervals.java:
##########
@@ -25,13 +25,17 @@
 import org.joda.time.DateTime;
 import org.joda.time.Interval;
 import org.joda.time.chrono.ISOChronology;
+import org.joda.time.format.DateTimeFormatter;
+import org.joda.time.format.ISODateTimeFormat;
 
 import javax.annotation.Nullable;
 
 public final class Intervals
 {
   public static final Interval ETERNITY = utc(JodaUtils.MIN_INSTANT, 
JodaUtils.MAX_INSTANT);
   public static final ImmutableList<Interval> ONLY_ETERNITY = 
ImmutableList.of(ETERNITY);
+  private static final DateTimeFormatter FAST_ISO_UTC_FORMATTER =

Review Comment:
   I think you can use `DateTimes.ISO_DATE_TIME` instead of this since it uses 
the same formatter internally.



##########
processing/src/main/java/org/apache/druid/java/util/common/Intervals.java:
##########
@@ -53,6 +57,74 @@ public static Interval of(String format, Object... 
formatArgs)
     return of(StringUtils.format(format, formatArgs));
   }
 
+  /**
+   * A performance-optimized method for parsing a Joda-Time {@link Interval} 
from a string.
+   * This method is significantly faster than the standard {@link 
Intervals#of(String)} for the following
+   * group of offsets:
+   * <ol>
+   *   <li>"2022-01-01T00:00:00.000Z/2022-01-02T00:00:00.000Z"</li>
+   *   <li>"2022-01-01T00:00:00.000+05:30/2022-01-01T01:00:00.000+05:30"</li>
+   *   <li>"2022-01-01T00:00:00.000+0530/2022-01-01T01:00:00.000+0530"</li>
+   * </ol>
+   * <p>
+   * If the input string does not match the format, it will fall back to the 
more flexible but
+   * slower {@link Intervals#of(String)} parser. If you are dealing with any 
Intervals format examples below,
+   * consider using {@link Intervals#of(String)} instead:
+   * <ol>
+   *   <li>"2022-01-01T00:00:00Z/2022-01-02T00:00:00Z" (without millis)</li>
+   *   <li>"2022-01-01/2022-01-02" (Date only)</li>
+   *   <li>"2022-01-01T12:00:00.000Z/PT6H" (Periods in start / end)</li>
+   * </ol>
+   *
+   * Currently, this method is only used in {@link 
org.apache.druid.timeline.SegmentId}.
+   * Should you find additional places where the Interval format is guaranteed 
to be compatible with this method,
+   * feel free to open a PR.
+   */
+  public static Interval fromString(String string)
+  {
+    Interval interval = null;
+    if (canDeserializeIntervalOptimallyFromString(string)) {
+      interval = tryOptimizedIntervalDeserialization(string);
+    }
+
+    if (interval == null) {
+      return Intervals.of(string);
+    } else {
+      return interval;
+    }

Review Comment:
   Nit:
   ```suggestion
       return interval == null ? Intervals.of(string) : interval;
   ```



##########
processing/src/main/java/org/apache/druid/timeline/DataSegment.java:
##########
@@ -196,6 +197,37 @@ public DataSegment(
       @JsonProperty("size") long size,
       @JacksonInject PruneSpecsHolder pruneSpecsHolder
   )
+  {
+    this(

Review Comment:
   Please add a comment here mentioning that we take `interval` input as String 
so that we can deserialize it optimally.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to