Vivek Rai created GOBBLIN-2147:
----------------------------------

             Summary: Add lookback time property in PartitionedFileSource
                 Key: GOBBLIN-2147
                 URL: https://issues.apache.org/jira/browse/GOBBLIN-2147
             Project: Apache Gobblin
          Issue Type: Task
            Reporter: Vivek Rai


All FileBasedSource implementations should have a config for lookback time.

 

Currently, FileBasedSources look for data since the time set by 
`conversion.min.watermark`, and the time granularity is decided by the lowest 
time denomination in the partition pattern. That denomination in many cases, 
including this one, is 1 second (determined by 
|gobblin.flow.input.dataset.descriptor.partition.pattern|yyyy-MM-dd_HH_mm_ss|).

 

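This is an extremely expensive way to find workunits: with 1-second 
granularity, every second between the minimum watermark and now is a 
candidate partition.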

Let's enable these jobs to use lookback time configs like several other dataset 
finders do.
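
To give a rough sense of the difference, here is a minimal sketch in plain 
Java (not Gobblin code) that counts candidate partitions at 1-second 
granularity, first from a fixed minimum watermark and then from a lookback 
window. The class name, the method names, the example timestamps and the 
1-day lookback are all hypothetical; this issue does not name the lookback 
property, and the sketch assumes one candidate partition per granularity 
step.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class LookbackSketch {
    // Partition pattern quoted in this issue (1-second granularity).
    static final DateTimeFormatter PATTERN =
        DateTimeFormatter.ofPattern("yyyy-MM-dd_HH_mm_ss");

    // Counts candidate partitions from start to now (both inclusive),
    // assuming one candidate per granularity step.
    static long candidatePartitions(LocalDateTime start, LocalDateTime now,
                                    Duration granularity) {
        if (start.isAfter(now)) {
            return 0;
        }
        long spanSeconds = Duration.between(start, now).getSeconds();
        return spanSeconds / granularity.getSeconds() + 1;
    }

    // Hypothetical lookback: start scanning at (now - lookback) instead of
    // at a fixed minimum watermark.
    static LocalDateTime lookbackStart(LocalDateTime now, Duration lookback) {
        return now.minus(lookback);
    }

    public static void main(String[] args) {
        // Example timestamps are made up for illustration.
        LocalDateTime now = LocalDateTime.parse("2024-09-30_00_00_00", PATTERN);
        LocalDateTime minWatermark =
            LocalDateTime.parse("2024-09-01_00_00_00", PATTERN);
        Duration oneSecond = Duration.ofSeconds(1);

        long fromWatermark = candidatePartitions(minWatermark, now, oneSecond);
        long fromLookback = candidatePartitions(
            lookbackStart(now, Duration.ofDays(1)), now, oneSecond);

        System.out.println("From min watermark: " + fromWatermark);
        System.out.println("From 1-day lookback: " + fromLookback);
    }
}
```

With a fixed watermark the candidate count keeps growing as time passes, 
while with a lookback it stays bounded by the window size.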



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
