[
https://issues.apache.org/jira/browse/BEAM-5910?focusedWorklogId=161697&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-161697
]
ASF GitHub Bot logged work on BEAM-5910:
----------------------------------------
Author: ASF GitHub Bot
Created on: 01/Nov/18 18:53
Start Date: 01/Nov/18 18:53
Worklog Time Spent: 10m
Work Description: jklukas opened a new pull request #6914: [BEAM-5910]
Add lastModified field to MatchResult.Metadata
URL: https://github.com/apache/beam/pull/6914
In the Java SDK, the `FileSystems.match` facilities are aimed primarily at
listing file names and collect very little additional metadata from the
filesystem (`sizeBytes` and `isReadSeekEfficient`). This PR adds a new
`lastModified` field to `MatchResult.Metadata`.
This could be a basis for a future improvement to
FileIO.match(...).continuously(...) where we could let the user opt to poll not
just for new file names, but also for existing file names if their content has
been updated.
In the near term, the addition of `lastModified` to `Metadata` will allow users
to implement their own polling logic on top of `FileSystems.match` to detect and
download new or updated files from any of the supported filesystems.
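
As a rough illustration of that near-term use case, user-side polling could
compare the `lastModified` values in each listing against those seen on the
previous poll. The sketch below is self-contained and does not call Beam:
`FileMetadata` is a hypothetical stand-in for `MatchResult.Metadata` with the
proposed field, and `ModifiedFilePoller`, `PollingSketch`, and the
`gs://example-bucket/...` paths are illustrative names, not part of this PR.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for MatchResult.Metadata with the proposed lastModified field. */
final class FileMetadata {
  final String resourceId;
  final long sizeBytes;
  final Instant lastModified;

  FileMetadata(String resourceId, long sizeBytes, Instant lastModified) {
    this.resourceId = resourceId;
    this.sizeBytes = sizeBytes;
    this.lastModified = lastModified;
  }
}

/** Remembers the last-seen modification time per file and reports new or updated files. */
final class ModifiedFilePoller {
  private final Map<String, Instant> seen = new HashMap<>();

  /** Returns the entries of the listing that are new or have a later lastModified than before. */
  List<FileMetadata> newOrUpdated(List<FileMetadata> listing) {
    List<FileMetadata> changed = new ArrayList<>();
    for (FileMetadata m : listing) {
      Instant previous = seen.get(m.resourceId);
      if (previous == null || m.lastModified.isAfter(previous)) {
        seen.put(m.resourceId, m.lastModified);
        changed.add(m);
      }
    }
    return changed;
  }
}

final class PollingSketch {
  public static void main(String[] args) {
    ModifiedFilePoller poller = new ModifiedFilePoller();

    // First poll: one file, so it is reported as new.
    List<FileMetadata> first = new ArrayList<>();
    first.add(new FileMetadata("gs://example-bucket/a.csv", 10L, Instant.ofEpochMilli(1_000L)));
    System.out.println(poller.newOrUpdated(first).size()); // prints 1

    // Second poll: a.csv was rewritten and b.csv appeared, so both are reported.
    List<FileMetadata> second = new ArrayList<>();
    second.add(new FileMetadata("gs://example-bucket/a.csv", 12L, Instant.ofEpochMilli(2_000L)));
    second.add(new FileMetadata("gs://example-bucket/b.csv", 5L, Instant.ofEpochMilli(2_000L)));
    System.out.println(poller.newOrUpdated(second).size()); // prints 2

    // Third poll: nothing changed since the second listing.
    System.out.println(poller.newOrUpdated(second).size()); // prints 0
  }
}
```

In an actual pipeline the listing would presumably come from
`FileSystems.match`, and the in-memory map would need to be persisted or
bounded; both are left out of this sketch.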
------------------------
Follow this checklist to help us incorporate your contribution quickly and
easily:
- [x] Format the pull request title like `[BEAM-XXX] Fixes bug in
ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA
issue, if applicable. This will automatically link the pull request to the
issue.
- [x] If this contribution is large, please file an Apache [Individual
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
It will help us expedite review of your Pull Request if you tag someone
(e.g. `@username`) to look at it.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 161697)
Time Spent: 10m
Remaining Estimate: 0h
> FileSystems should retrieve lastModified time
> ---------------------------------------------
>
> Key: BEAM-5910
> URL: https://issues.apache.org/jira/browse/BEAM-5910
> Project: Beam
> Issue Type: Improvement
> Components: sdk-java-core
> Reporter: Jeff Klukas
> Assignee: Jeff Klukas
> Priority: Minor
> Time Spent: 10m
> Remaining Estimate: 0h
>
> In the Java SDK, the FileSystems.match facilities are aimed at listing file
> names and collect very little additional metadata from the filesystem
> (sizeBytes and isReadSeekEfficient). I propose adding a new field for
> lastModified time to MatchResult.Metadata that each FileSystem would populate
> when listing files.
> This would be a basis for a future improvement to
> FileIO.match(...).continuously(...) where we could let the user opt to poll
> not just for new file names, but also for existing file names if their
> content has been updated.
> In the near term, the addition of lastModified to Metadata would allow users
> to implement their own polling logic on top of FileSystems.match to detect
> and download new or updated files from any of the supported filesystems.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)