zxcware commented on issue #2846: [GOBBLIN-1001] Implement 
TimePartitionGlobFinder
URL: 
https://github.com/apache/incubator-gobblin/pull/2846#issuecomment-564740742
 
 
   @autumnust Yeah, `yesterdayPartition` is really specific. I'm thinking about 
generalizing it to `enforcePreviousN` partitions (better name suggestions are 
welcome). Its main responsibility is to create an `EmptyFileSystemDataset` if 
any of the previous N partitions doesn't exist, signaling quiet time. In 
addition, it focuses on time partitions and, unlike the vanilla 
`DefaultFileSystemGlobFinder`, supports different time formats (not limited to 
`yyyy/MM/dd`). (I'm adding comments about its usage.)
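
   To make the previous-N check concrete, here is a minimal sketch of the idea. 
It is not the code in this PR: it uses `java.nio.file` and `java.time` rather 
than the Hadoop `FileSystem` API, it assumes daily partitions, and the class 
and method names (`PreviousPartitionCheck`, `missingPreviousPartitions`) are 
made up for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Gobblin: lists which of the previous N
// daily partitions are absent under a dataset root.
class PreviousPartitionCheck {

    // root    : dataset root directory
    // pattern : partition layout as a DateTimeFormatter pattern, e.g. "yyyy/MM/dd"
    // today   : reference date; partitions for today-1 .. today-n are checked
    // n       : number of previous partitions that must exist
    static List<Path> missingPreviousPartitions(Path root, String pattern,
                                                LocalDate today, int n) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern);
        List<Path> missing = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            // Assumption: one partition per day; other granularities would
            // step by hours or months instead.
            Path partition = root.resolve(today.minusDays(i).format(formatter));
            if (!Files.isDirectory(partition)) {
                missing.add(partition);
            }
        }
        return missing;
    }
}
```

   In this sketch, each path returned is a partition for which the finder 
would emit an `EmptyFileSystemDataset` as the quiet time signal.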
   
   With `enforcePreviousN`, it's tied even less to company requirements, which 
makes it more justifiable to open-source. In our use case, we capture the quiet 
time signal to publish the compaction watermark. Others can capture it to do 
different operations. 
   
   Another consideration is that we would have to make internal copies of the 
open source compaction constructs (`MRTask`, `Verifier`, `CompactionAction`) if 
`EmptyFileSystemDataset` were made internal. Compared to making 
`EmptyFileSystemDataset` a first-class citizen of the open source compaction 
flow, the implementation and maintenance cost of internalization is high, given 
that most of our pipelines use open source compaction constructs.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
