[ 
https://issues.apache.org/jira/browse/GOBBLIN-1708?focusedWorklogId=809623&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-809623
 ]

ASF GitHub Bot logged work on GOBBLIN-1708:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 16/Sep/22 18:44
            Start Date: 16/Sep/22 18:44
    Worklog Time Spent: 10m 
      Work Description: Will-Lo commented on code in PR #3563:
URL: https://github.com/apache/gobblin/pull/3563#discussion_r973300712


##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/TimeAwareRecursiveCopyableDataset.java:
##########
@@ -134,9 +135,37 @@ protected List<FileStatus> getFilesAtPath(FileSystem fs, Path path, PathFilter f
     return recursivelyGetFilesAtDatePath(fs, path, "", fileFilter, 1, startDate, endDate, formatter);
   }
 
+  public Boolean checkPathDateTimeValidity(LocalDateTime startDate, LocalDateTime endDate, String traversedDatePath) {
+    int[] startDateSplit = new int[] { startDate.getYear(), startDate.getMonthOfYear(), startDate.getDayOfMonth(),
+        startDate.getHourOfDay(), startDate.getMinuteOfHour(), startDate.getSecondOfMinute(), startDate.getMillisOfSecond() };
+    int[] endDateSplit = new int[] { endDate.getYear(), endDate.getMonthOfYear(), endDate.getDayOfMonth(),
+        endDate.getHourOfDay(), endDate.getMinuteOfHour(), endDate.getSecondOfMinute(), endDate.getMillisOfSecond() };
+
+    String[] traversedDatePathSplit = traversedDatePath.split("/");
+
+    // Only check the number of parameters that the traversedDatePath has traversed through so far
+    for (int index = 0; index < traversedDatePathSplit.length; index++) {
+      try {
+        if (Integer.parseInt(traversedDatePathSplit[index]) < startDateSplit[index] ||
+            Integer.parseInt(traversedDatePathSplit[index]) > endDateSplit[index]) {
+          return false;
+        }
+      } catch (Exception e) {

Review Comment:
   Which exception do you expect to be thrown here? We should avoid a broad
catch that silently returns false.
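
   To illustrate the point above, here is a minimal sketch of a narrower
alternative. It assumes the only expected failure is Integer.parseInt
rejecting a non-numeric path segment (NumberFormatException); a path with
more segments than the bounds arrays would otherwise surface as an
ArrayIndexOutOfBoundsException, so it is guarded explicitly. The class and
method names are hypothetical, plain int arrays stand in for the
LocalDateTime fields, and the per-segment comparison simply mirrors the
diff.

```java
import java.util.Optional;

public class PathSegmentCheck {

  // Hypothetical helper: parse one path segment, catching only the
  // exception that Integer.parseInt is documented to throw.
  static Optional<Integer> parseSegment(String segment) {
    try {
      return Optional.of(Integer.parseInt(segment));
    } catch (NumberFormatException e) {
      // Real code would log the offending segment here instead of
      // dropping the failure silently.
      return Optional.empty();
    }
  }

  // Mirrors the per-segment comparison from the diff. lower and upper
  // stand in for startDateSplit and endDateSplit.
  static boolean withinBounds(String traversedDatePath, int[] lower, int[] upper) {
    String[] parts = traversedDatePath.split("/");
    // Explicit guard instead of relying on a caught
    // ArrayIndexOutOfBoundsException.
    if (parts.length > lower.length || parts.length > upper.length) {
      return false;
    }
    for (int i = 0; i < parts.length; i++) {
      Optional<Integer> value = parseSegment(parts[i]);
      if (!value.isPresent()) {
        return false;
      }
      if (value.get() < lower[i] || value.get() > upper[i]) {
        return false;
      }
    }
    return true;
  }
}
```

   With this shape, any other runtime exception propagates to the caller
rather than being reported as "path out of range".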



##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/TimeAwareRecursiveCopyableDataset.java:
##########
@@ -134,9 +135,37 @@ protected List<FileStatus> getFilesAtPath(FileSystem fs, Path path, PathFilter f
     return recursivelyGetFilesAtDatePath(fs, path, "", fileFilter, 1, startDate, endDate, formatter);
   }
 
+  public Boolean checkPathDateTimeValidity(LocalDateTime startDate, LocalDateTime endDate, String traversedDatePath) {
+    int[] startDateSplit = new int[] { startDate.getYear(), startDate.getMonthOfYear(), startDate.getDayOfMonth(),
+        startDate.getHourOfDay(), startDate.getMinuteOfHour(), startDate.getSecondOfMinute(), startDate.getMillisOfSecond() };
+    int[] endDateSplit = new int[] { endDate.getYear(), endDate.getMonthOfYear(), endDate.getDayOfMonth(),
+        endDate.getHourOfDay(), endDate.getMinuteOfHour(), endDate.getSecondOfMinute(), endDate.getMillisOfSecond() };
+
+    String[] traversedDatePathSplit = traversedDatePath.split("/");
+
+    // Only check the number of parameters that the traversedDatePath has traversed through so far
+    for (int index = 0; index < traversedDatePathSplit.length; index++) {
+      try {
+        if (Integer.parseInt(traversedDatePathSplit[index]) < startDateSplit[index] ||
+            Integer.parseInt(traversedDatePathSplit[index]) > endDateSplit[index]) {
+          return false;
+        }
+      } catch (Exception e) {
+        return false;
+      }
+    }
+    return true;
+  }
+
   private List<FileStatus> recursivelyGetFilesAtDatePath(FileSystem fs, Path path, String traversedDatePath, PathFilter fileFilter,
       int level,  LocalDateTime startDate, LocalDateTime endDate, DateTimeFormatter formatter) throws IOException {
     List<FileStatus> fileStatuses = Lists.newArrayList();
+    if (!Objects.equals(traversedDatePath, "")) {

Review Comment:
   You can use traversedDatePath.equals("") here, unless you think it can be null?
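
   For reference, a small sketch of how the two checks differ when the
argument is null. The class and method names are hypothetical; only
java.util.Objects.equals and String.equals are real API. Note that
Objects.equals(null, "") is false, so the null-safe form treats a null
path as "not empty" and would let it into the guarded branch.

```java
import java.util.Objects;

public class EmptyPathCheck {

  // Null-safe form used in the diff: never throws, but a null argument
  // counts as "not equal to the empty string".
  static boolean nullSafeNonEmpty(String traversedDatePath) {
    return !Objects.equals(traversedDatePath, "");
  }

  // Plain form suggested in the review: shorter, but throws
  // NullPointerException if the argument is null.
  static boolean plainNonEmpty(String traversedDatePath) {
    return !traversedDatePath.equals("");
  }
}
```

   If the path can never be null, the two forms behave identically.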





Issue Time Tracking
-------------------

    Worklog Id:     (was: 809623)
    Time Spent: 20m  (was: 10m)

> Improve TimeAwareRecursiveCopyableDataset to lookback only into datefolders 
> that match range
> --------------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-1708
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1708
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Andy Jiang
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)
