[ https://issues.apache.org/jira/browse/CRUNCH-678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Wills resolved CRUNCH-678.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.0.0

Thank you [~noslowerdna]!

> Avoid unnecessary retrieval of last modified time
> -------------------------------------------------
>
>                 Key: CRUNCH-678
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-678
>             Project: Crunch
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Andrew Olson
>            Assignee: Josh Wills
>            Priority: Major
>             Fix For: 1.0.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> There is no assurance that the last modified time can be retrieved
> efficiently for all file systems. In particular, with object stores and large
> data sets it could be very slow. Since this information is actually not
> always needed, we should only retrieve it when necessary (i.e. when the write
> mode is checkpoint) for sources and targets.
> CRUNCH-658 expressed similar concerns for the getSize method. This would be a
> simpler and safer optimization to make.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
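
For readers of the archive, the idea in the issue description (only pay for the last-modified lookup when the write mode is checkpoint) can be illustrated with a minimal sketch. This is not the committed Crunch patch: the class LazyLastModified, the UNKNOWN sentinel, the local WriteMode enum, and the LongSupplier standing in for a file system call are all hypothetical names chosen for illustration.

```java
import java.util.function.LongSupplier;

public class LazyLastModified {
    // Hypothetical stand-in for a pipeline's write modes; only CHECKPOINT matters here.
    enum WriteMode { DEFAULT, OVERWRITE, APPEND, CHECKPOINT }

    // Hypothetical sentinel meaning "timestamp was not looked up".
    static final long UNKNOWN = -1L;

    // Runs the (possibly slow) lookup only when the write mode is CHECKPOINT.
    // The LongSupplier is an assumption standing in for a file system query.
    static long lastModifiedIfNeeded(WriteMode mode, LongSupplier lookup) {
        if (mode != WriteMode.CHECKPOINT) {
            return UNKNOWN; // skip the file system call entirely
        }
        return lookup.getAsLong();
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated expensive lookup that counts how often it is invoked.
        LongSupplier slowLookup = () -> { calls[0]++; return 1_500_000_000_000L; };

        long skipped = lastModifiedIfNeeded(WriteMode.OVERWRITE, slowLookup);
        long fetched = lastModifiedIfNeeded(WriteMode.CHECKPOINT, slowLookup);

        System.out.println(skipped + " " + fetched + " calls=" + calls[0]);
    }
}
```

Passing the lookup as a supplier keeps the cost decision in one place: callers that never checkpoint never touch the file system for timestamps, which is the case the issue calls out for object stores and large data sets.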