[ 
https://issues.apache.org/jira/browse/GOBBLIN-1708?focusedWorklogId=810219&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-810219
 ]

ASF GitHub Bot logged work on GOBBLIN-1708:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Sep/22 00:56
            Start Date: 20/Sep/22 00:56
    Worklog Time Spent: 10m 
      Work Description: Will-Lo commented on code in PR #3563:
URL: https://github.com/apache/gobblin/pull/3563#discussion_r974794192


##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/TimeAwareRecursiveCopyableDataset.java:
##########
@@ -134,9 +134,51 @@ protected List<FileStatus> getFilesAtPath(FileSystem fs, 
Path path, PathFilter f
     return recursivelyGetFilesAtDatePath(fs, path, "", fileFilter, 1, 
startDate, endDate, formatter);
   }
 
+  /**
+   * Checks if the datePath provided is in the range of the start and end 
dates.
+   * Rounds startDate and endDate to the same granularity as datePath prior to 
comparing.
+   * Returns true if the datePath provided is in the range of start and end 
dates, inclusive.
+   * @param startDate
+   * @param endDate
+   * @param datePath
+   * @param datePathFormat (This is the user set desired format)
+   * @param level
+   * @return true/false
+   */
+  public Boolean checkPathDateTimeValidity(LocalDateTime startDate, 
LocalDateTime endDate, String datePath, String datePathFormat, int level) {
+    String [] array = datePathFormat.split("/");
+    StringBuilder datePathPattern = new StringBuilder();
+
+    for (int index = 1; index < level; index++) {
+      if (index > 1) {
+        datePathPattern.append("/");
+      }
+      datePathPattern.append(array[index - 1]);
+    }
+
+    try {
+      DateTimeFormatter formatGranularity = 
DateTimeFormat.forPattern(datePathPattern.toString());
+      LocalDateTime traversedDatePathRound = 
formatGranularity.parseLocalDateTime(datePath);
+      LocalDateTime startDateRound = 
formatGranularity.parseLocalDateTime(startDate.toString(datePathPattern.toString()));
+      LocalDateTime endDateRound = 
formatGranularity.parseLocalDateTime(endDate.toString(datePathPattern.toString()));
+
+      boolean afterOrOnStartDate = 
traversedDatePathRound.isAfter(startDateRound) || 
traversedDatePathRound.isEqual(startDateRound);
+      boolean beforeOrOnEndDate = 
traversedDatePathRound.isBefore(endDateRound) || 
traversedDatePathRound.isEqual(endDateRound);
+      return afterOrOnStartDate && beforeOrOnEndDate;
+    } catch (IllegalArgumentException e) {
+      log.error("Cannot parse path " + datePath);

Review Comment:
   Include the expected format in this log message too, e.g.
   `String.format("Cannot parse path at %s, expected in format of %s", datePath, datePathPattern)`
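
   A minimal sketch of that message, pulled into a standalone helper so it
   can be checked in isolation. The class and method names here are
   hypothetical and not part of the PR; in the actual method the second
   argument would be `datePathPattern`, a StringBuilder, which `%s` renders
   via `toString()`.

```java
// Hypothetical helper (not from the PR): builds the log message suggested above.
final class PathParseMessages {

  private PathParseMessages() {
  }

  // datePath is the path segment that failed to parse; expectedPattern is the
  // date pattern it was parsed against. Any CharSequence (e.g. a StringBuilder)
  // is formatted through toString() by %s.
  static String cannotParse(String datePath, CharSequence expectedPattern) {
    return String.format("Cannot parse path at %s, expected in format of %s",
        datePath, expectedPattern);
  }
}
```

   In the catch block this would presumably be used as
   `log.error(PathParseMessages.cannotParse(datePath, datePathPattern))`, or
   simply inlined.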



##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/TimeAwareRecursiveCopyableDataset.java:
##########
@@ -134,9 +134,51 @@ protected List<FileStatus> getFilesAtPath(FileSystem fs, 
Path path, PathFilter f
     return recursivelyGetFilesAtDatePath(fs, path, "", fileFilter, 1, 
startDate, endDate, formatter);
   }
 
+  /**
+   * Checks if the datePath provided is in the range of the start and end 
dates.
+   * Rounds startDate and endDate to the same granularity as datePath prior to 
comparing.
+   * Returns true if the datePath provided is in the range of start and end 
dates, inclusive.
+   * @param startDate
+   * @param endDate
+   * @param datePath
+   * @param datePathFormat (This is the user set desired format)
+   * @param level
+   * @return true/false
+   */
+  public Boolean checkPathDateTimeValidity(LocalDateTime startDate, 
LocalDateTime endDate, String datePath, String datePathFormat, int level) {
+    String [] array = datePathFormat.split("/");
+    StringBuilder datePathPattern = new StringBuilder();
+
+    for (int index = 1; index < level; index++) {
+      if (index > 1) {
+        datePathPattern.append("/");
+      }
+      datePathPattern.append(array[index - 1]);
+    }

Review Comment:
   I think an easier way to do this is to convert the split array to a list and use
   `String.join("/", Arrays.asList(datePathFormatArray).subList(0, level));`,
   which essentially gives you the datePathFormat reconstructed up to that level.
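
   A rough sketch comparing the two approaches, assuming
   `datePathFormatArray` is just a clearer name for the `array` produced by
   `datePathFormat.split("/")`. The class and method names are hypothetical.
   Note that the loop in the diff appends `array[0]` through
   `array[level - 2]`, i.e. `level - 1` components, so the end index passed
   to `subList` may need to be `level - 1` to preserve the current behavior.

```java
import java.util.Arrays;

// Hypothetical helper (not from the PR) contrasting the loop with String.join.
final class DatePathPatterns {

  private DatePathPatterns() {
  }

  // Mirrors the loop in the diff: joins the first (level - 1) components.
  static String loopPrefix(String datePathFormat, int level) {
    String[] array = datePathFormat.split("/");
    StringBuilder datePathPattern = new StringBuilder();
    for (int index = 1; index < level; index++) {
      if (index > 1) {
        datePathPattern.append("/");
      }
      datePathPattern.append(array[index - 1]);
    }
    return datePathPattern.toString();
  }

  // The suggested alternative: joins the first `count` components.
  // subList throws IndexOutOfBoundsException if count exceeds the array length.
  static String joinPrefix(String datePathFormat, int count) {
    String[] datePathFormatArray = datePathFormat.split("/");
    return String.join("/", Arrays.asList(datePathFormatArray).subList(0, count));
  }
}
```

   With these definitions, `loopPrefix(format, level)` and
   `joinPrefix(format, level - 1)` should agree for any `level` between 1
   and the number of components plus one.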





Issue Time Tracking
-------------------

    Worklog Id:     (was: 810219)
    Time Spent: 3h 10m  (was: 3h)

> Improve TimeAwareRecursiveCopyableDataset to lookback only into datefolders 
> that match range
> --------------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-1708
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1708
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Andy Jiang
>            Priority: Major
>          Time Spent: 3h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
