[ 
https://issues.apache.org/jira/browse/GOBBLIN-2147?focusedWorklogId=947174&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-947174
 ]

ASF GitHub Bot logged work on GOBBLIN-2147:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Dec/24 10:19
            Start Date: 08/Dec/24 10:19
    Worklog Time Spent: 10m 
      Work Description: Blazer-007 commented on code in PR #4044:
URL: https://github.com/apache/gobblin/pull/4044#discussion_r1874733767


##########
gobblin-core/src/main/java/org/apache/gobblin/source/PartitionAwareFileRetrieverUtils.java:
##########
@@ -52,4 +60,29 @@ public static Duration getLeadTimeDurationFromConfig(State state) {
 
     return new Duration(leadTime * leadTimeGranularity.getUnitMilliseconds());
   }
+
+  /**
+   * Calculates the lookback time duration based on the provided lookback time string.
+   *
+   * @param lookBackTime the lookback time string, which should include a numeric value followed by a time unit character.
+   *                     For example, "5d" for 5 days or "10h" for 10 hours.
+   * @return an {@link Optional} containing the {@link Duration} if the lookback time is valid, or
+   *         an empty {@link Optional} if the lookback time is invalid or cannot be parsed.
+   */
+  public static Optional<Duration> getLookbackTimeDuration(String lookBackTime) {

Review Comment:
   Updated to throw IOException
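
   For readers following the thread, here is a minimal sketch of what a lookback parser that throws IOException on bad input might look like. It is not the code from PR #4044: the class name `LookbackTimeSketch` and method name `parseLookback` are hypothetical, only the "d" and "h" units from the Javadoc example are handled, and it uses `java.time.Duration` so it runs without extra dependencies, which may differ from the `Duration` type used in the diff.

```java
import java.io.IOException;
import java.time.Duration;

/** Hypothetical helper for illustration only; not the code merged in the PR. */
final class LookbackTimeSketch {

  private LookbackTimeSketch() {}

  /**
   * Parses a lookback string such as "5d" or "10h" into a Duration.
   * Throws IOException when the value is missing, malformed, negative,
   * or uses a unit this sketch does not handle.
   */
  static Duration parseLookback(String lookBackTime) throws IOException {
    if (lookBackTime == null || lookBackTime.trim().length() < 2) {
      throw new IOException("Invalid lookback time: " + lookBackTime);
    }
    String trimmed = lookBackTime.trim();
    // The last character is the unit; everything before it is the amount.
    char unit = trimmed.charAt(trimmed.length() - 1);
    long amount;
    try {
      amount = Long.parseLong(trimmed.substring(0, trimmed.length() - 1));
    } catch (NumberFormatException e) {
      throw new IOException("Invalid lookback time: " + lookBackTime, e);
    }
    if (amount < 0) {
      throw new IOException("Lookback time must not be negative: " + lookBackTime);
    }
    switch (unit) {
      case 'd':
        return Duration.ofDays(amount);
      case 'h':
        return Duration.ofHours(amount);
      default:
        // Assumption: other units are rejected in this sketch.
        throw new IOException("Unsupported lookback time unit: " + unit);
    }
  }
}
```

   Throwing a checked exception instead of returning an empty Optional forces the caller to handle a misconfigured lookback value explicitly rather than silently falling back to another behavior.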





Issue Time Tracking
-------------------

    Worklog Id:     (was: 947174)
    Time Spent: 1h 10m  (was: 1h)

> Add lookback time property in PartitionedFileSource
> ---------------------------------------------------
>
>                 Key: GOBBLIN-2147
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-2147
>             Project: Apache Gobblin
>          Issue Type: Task
>            Reporter: Vivek Rai
>            Priority: Major
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> All FileBasedSource implementations should have config for lookback time.
>  
> Currently, FileBasedSources look for data since the time set by 
> `conversion.min.watermark`, and the time granularity is decided by the lowest 
> time denomination. That denomination in many cases, including this one, is 
> 1 second (determined by 
> |gobblin.flow.input.dataset.descriptor.partition.pattern|yyyy-MM-dd_HH_mm_ss|).
>  
> Searching from the minimum watermark at one-second granularity is an 
> extremely abusive way to find workunits.
> Let's enable these jobs to use lookback time configs like several other 
> dataset finders do.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
