[ https://issues.apache.org/jira/browse/BEAM-5910?focusedWorklogId=186372&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-186372 ]
ASF GitHub Bot logged work on BEAM-5910:
----------------------------------------
Author: ASF GitHub Bot
Created on: 17/Jan/19 15:57
Start Date: 17/Jan/19 15:57
Worklog Time Spent: 10m
Work Description: jklukas commented on pull request #6914: [BEAM-5910]
Add lastModified field to MatchResult.Metadata
URL: https://github.com/apache/beam/pull/6914#discussion_r248729108
##########
File path: sdks/java/core/src/main/java/org/apache/beam/sdk/io/fs/MetadataCoder.java
##########
@@ -46,14 +54,18 @@ public void encode(Metadata value, OutputStream os) throws IOException {
   @Override
   public Metadata decode(InputStream is) throws IOException {
+    return decodeBuilder(is).build();
+  }
+
+  Metadata.Builder decodeBuilder(InputStream is) throws IOException {
     ResourceId resourceId = RESOURCE_ID_CODER.decode(is);
     boolean isReadSeekEfficient = INT_CODER.decode(is) == 1;
     long sizeBytes = LONG_CODER.decode(is);
     return Metadata.builder()
         .setResourceId(resourceId)
         .setIsReadSeekEfficient(isReadSeekEfficient)
         .setSizeBytes(sizeBytes)
-        .build();
+        .setLastModifiedMillis(UNKNOWN_LAST_MODIFIED_MILLIS);
Review comment:
The idea here is that `lastModifiedMillis` must have some default value. I've
chosen `-1` so that `lastModifiedMillis` doesn't have to be nullable, though
we can discuss whether null makes more sense here.

AutoValue requires that values be set for all members. To define a default
value, the AutoValue docs suggest calling setters before returning the
builder. That's what's going on here.

I don't see a backwards incompatibility here. MetadataCoder still only
writes and reads the three existing values (resource id, int, and long). When
encoding, it throws away `lastModifiedMillis`; when decoding, it provides the
default `-1` value.

Do you have thoughts on whether `null` would be preferred as the default vs.
`-1`?
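
To illustrate the two points above (a default set on the builder before it is returned, and a coder that still reads and writes only three values), here is a minimal standalone sketch. It does not use Beam or AutoValue: `MetadataDefaultSketch`, the hand-written `Builder`, the `String` resource id, and the `DataOutputStream`-based wire format are all assumptions made for illustration, not the PR's actual code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class MetadataDefaultSketch {
  static final long UNKNOWN_LAST_MODIFIED_MILLIS = -1L;

  /** Hypothetical stand-in for the AutoValue-generated Metadata class. */
  static final class Metadata {
    final String resourceId;
    final boolean isReadSeekEfficient;
    final long sizeBytes;
    final long lastModifiedMillis;

    private Metadata(String id, boolean seekable, long size, long lastModified) {
      this.resourceId = id;
      this.isReadSeekEfficient = seekable;
      this.sizeBytes = size;
      this.lastModifiedMillis = lastModified;
    }

    static Builder builder() {
      return new Builder();
    }
  }

  /** Like an AutoValue builder, build() fails if any property was never set. */
  static final class Builder {
    private String resourceId;
    private Boolean isReadSeekEfficient;
    private Long sizeBytes;
    private Long lastModifiedMillis;

    Builder setResourceId(String v) { resourceId = v; return this; }
    Builder setIsReadSeekEfficient(boolean v) { isReadSeekEfficient = v; return this; }
    Builder setSizeBytes(long v) { sizeBytes = v; return this; }
    Builder setLastModifiedMillis(long v) { lastModifiedMillis = v; return this; }

    Metadata build() {
      if (resourceId == null || isReadSeekEfficient == null
          || sizeBytes == null || lastModifiedMillis == null) {
        throw new IllegalStateException("Missing required properties");
      }
      return new Metadata(resourceId, isReadSeekEfficient, sizeBytes, lastModifiedMillis);
    }
  }

  /** Writes only the three original fields; lastModifiedMillis is dropped. */
  static byte[] encode(Metadata value) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(bytes);
      out.writeUTF(value.resourceId);
      out.writeInt(value.isReadSeekEfficient ? 1 : 0);
      out.writeLong(value.sizeBytes);
      out.flush();
      return bytes.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Reads the three fields and sets the default before returning the builder. */
  static Builder decodeBuilder(byte[] encoded) {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(encoded));
      return Metadata.builder()
          .setResourceId(in.readUTF())
          .setIsReadSeekEfficient(in.readInt() == 1)
          .setSizeBytes(in.readLong())
          .setLastModifiedMillis(UNKNOWN_LAST_MODIFIED_MILLIS);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  static Metadata decode(byte[] encoded) {
    return decodeBuilder(encoded).build();
  }

  public static void main(String[] args) {
    Metadata original = Metadata.builder()
        .setResourceId("/tmp/example.txt")
        .setIsReadSeekEfficient(true)
        .setSizeBytes(42L)
        .setLastModifiedMillis(1547740620000L)
        .build();
    Metadata roundTripped = decode(encode(original));
    System.out.println(roundTripped.sizeBytes);          // 42
    System.out.println(roundTripped.lastModifiedMillis); // -1
  }
}
```

In this sketch the encoded bytes are the same whether or not a timestamp was set, which is why the wire format stays compatible. The cost is that a round trip through the coder loses the timestamp, so callers would need to treat `-1` as "unknown".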
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 186372)
Time Spent: 3h 20m (was: 3h 10m)
> FileSystems should retrieve lastModified time
> ---------------------------------------------
>
> Key: BEAM-5910
> URL: https://issues.apache.org/jira/browse/BEAM-5910
> Project: Beam
> Issue Type: Improvement
> Components: sdk-java-core
> Reporter: Jeff Klukas
> Assignee: Jeff Klukas
> Priority: Minor
> Time Spent: 3h 20m
> Remaining Estimate: 0h
>
> In the Java SDK, the FileSystems.match facilities are aimed at listing file
> names and collect only very limited additional metadata from the filesystem
> (sizeBytes and isReadSeekEfficient). I propose adding a new lastModified
> field to MatchResult.Metadata that each FileSystem would populate
> when listing files.
> This would be a basis for a future improvement to
> FileIO.match(...).continuously(...) where we could let the user opt to poll
> not just for new file names, but also for existing file names whose
> content has been updated.
> In the near term, the addition of lastModified to Metadata would allow users
> to implement their own polling logic on top of FileSystems.match to detect
> and download new files from any of the supported filesystems.
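
The user-side polling logic described in the issue could look roughly like the sketch below. It is kept independent of Beam: each poll receives a plain map from file name to lastModified milliseconds, standing in for whatever a match call would return. `ModifiedFilePollerSketch` and `poll` are hypothetical names introduced only for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ModifiedFilePollerSketch {
  /** Last-seen modification time for each file name. */
  private final Map<String, Long> seen = new HashMap<>();

  /**
   * Takes one listing (file name to lastModified millis) and returns, in sorted
   * order, the names that are new or whose timestamp differs from the last poll.
   */
  List<String> poll(Map<String, Long> listing) {
    List<String> changed = new ArrayList<>();
    // TreeMap gives a deterministic iteration order for the result.
    for (Map.Entry<String, Long> entry : new TreeMap<>(listing).entrySet()) {
      Long previous = seen.put(entry.getKey(), entry.getValue());
      if (previous == null || !previous.equals(entry.getValue())) {
        changed.add(entry.getKey());
      }
    }
    return changed;
  }

  public static void main(String[] args) {
    ModifiedFilePollerSketch poller = new ModifiedFilePollerSketch();

    Map<String, Long> first = new HashMap<>();
    first.put("a.txt", 100L);
    first.put("b.txt", 200L);
    System.out.println(poller.poll(first)); // [a.txt, b.txt]

    Map<String, Long> second = new HashMap<>();
    second.put("a.txt", 100L); // unchanged
    second.put("b.txt", 250L); // modified
    second.put("c.txt", 300L); // new
    System.out.println(poller.poll(second)); // [b.txt, c.txt]
  }
}
```

One limitation of this sketch: a filesystem that cannot report a modification time would return the same placeholder value on every poll, so updates to existing files on that filesystem would go undetected.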