[ 
https://issues.apache.org/jira/browse/GOBBLIN-2147?focusedWorklogId=932805&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-932805
 ]

ASF GitHub Bot logged work on GOBBLIN-2147:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Sep/24 16:05
            Start Date: 01/Sep/24 16:05
    Worklog Time Spent: 10m 
      Work Description: Blazer-007 commented on code in PR #4044:
URL: https://github.com/apache/gobblin/pull/4044#discussion_r1740154470


##########
gobblin-api/src/main/java/org/apache/gobblin/configuration/ConfigurationKeys.java:
##########
@@ -347,6 +347,11 @@ public class ConfigurationKeys {
    */
   public static final String WATERMARK_INTERVAL_VALUE_KEY = "watermark.interval.value";
 
+  /**
+   * DEFAULT LOOKBACK TIME KEY property
+   */
+  public static final String DEFAULT_COPY_LOOKBACK_TIME_KEY = "copy.lookbackTime";

Review Comment:
   The property is 
["gobblin.copy.recursive.lookback.time"](https://github.com/apache/gobblin/blob/master/gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/TimeAwareRecursiveCopyableDataset.java#L48).
 Wouldn't it be confusing to use one DatasetFinder's config in a different 
finder class?





Issue Time Tracking
-------------------

    Worklog Id:     (was: 932805)
    Time Spent: 0.5h  (was: 20m)

> Add lookback time property in PartitionedFileSource
> ---------------------------------------------------
>
>                 Key: GOBBLIN-2147
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-2147
>             Project: Apache Gobblin
>          Issue Type: Task
>            Reporter: Vivek Rai
>            Priority: Major
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> All FileBasedSource implementations should have config for lookback time.
>  
> Currently, FileBasedSources look for data since the time set by 
> `conversion.min.watermark`, and the time granularity is decided by the lowest 
> time denomination. That denomination in many cases, including this one, is 
> 1 second (determined by 
> `gobblin.flow.input.dataset.descriptor.partition.pattern`, which is set to 
> `yyyy-MM-dd_HH_mm_ss`).
>  
> It is an extremely abusive way to find workunits.
> Let's enable these jobs to use lookback time configs like several other 
> dataset finders do.
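
To make the cost described above concrete, here is a minimal sketch, not Gobblin code, that counts how many one-second partition candidates fall in a scan window. It uses the partition pattern quoted in the issue; the class name `LookbackSketch`, both helper methods, and the sample timestamps are hypothetical and exist only for illustration. It assumes every second between the window start and "now" is a candidate partition.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

public class LookbackSketch {
  // Partition pattern quoted in the issue; its finest unit is one second.
  static final DateTimeFormatter PATTERN =
      DateTimeFormatter.ofPattern("yyyy-MM-dd_HH_mm_ss");

  /** Number of one-second partition candidates in the closed range [start, end]. */
  static long candidateCount(LocalDateTime start, LocalDateTime end) {
    return ChronoUnit.SECONDS.between(start, end) + 1;
  }

  /** Start of the scan window when a lookback is configured (hypothetical helper). */
  static LocalDateTime lookbackStart(LocalDateTime now, Duration lookback) {
    return now.minus(lookback);
  }

  public static void main(String[] args) {
    // Sample values, assumed for illustration only.
    LocalDateTime now = LocalDateTime.parse("2024-09-01_16_05_00", PATTERN);
    LocalDateTime minWatermark = LocalDateTime.parse("2024-01-01_00_00_00", PATTERN);

    // Scanning from a fixed minimum watermark grows without bound as time passes.
    System.out.println("from min watermark: " + candidateCount(minWatermark, now));

    // Scanning with a lookback keeps the window at a constant size.
    LocalDateTime start = lookbackStart(now, Duration.ofHours(1));
    System.out.println("with 1h lookback:   " + candidateCount(start, now));
  }
}
```

With a fixed minimum watermark the candidate count keeps growing on every run, while a lookback bounds it by the configured duration regardless of when the job runs.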



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
