HeartSaVioR commented on pull request #28841:
URL: https://github.com/apache/spark/pull/28841#issuecomment-653966315


   I still feel the date option can be ranged via two symmetric options; while 
this won't help on SS case, this can help filtering out files in the range of 
date which are out of interest - while this patch only filters on the left 
side, we can do that for the right side as well. There was similar review 
comment from other contributor.
   
   For `filesModifiedBefore` vs `fromModifiedDate`, I think former sounds more 
natural and even `modifiedBefore` sounds enough (as we know it deals with 
files), but let's hear others' opinions as well.
   
   Before looking at the code diff, some points on PR description (note that PR 
title and description become commit message):
   
   > It's also compatible with structured streaming requiring just a handful of 
small additions to move forward there. Looking to complete that in a separate 
PR.
   
   The part is better to be just left to PR comment rather than the part of the 
commit message. Let's drop it from there.
   
   > This PR introduces an option that can be used with Spark file data sources 
similar to the latestFirst option in structured streaming.
   
   latestFirst option doesn't work like this, as you may already know. (If 
latestFirst works like this I won't have a headache.) It'd be better to explain 
the new option there.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to