[ 
https://issues.apache.org/jira/browse/BEAM-14314?focusedWorklogId=758117&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-758117
 ]

ASF GitHub Bot logged work on BEAM-14314:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 18/Apr/22 20:57
            Start Date: 18/Apr/22 20:57
    Worklog Time Spent: 10m 
      Work Description: Abacn commented on code in PR #17380:
URL: https://github.com/apache/beam/pull/17380#discussion_r852409977


##########
sdks/python/apache_beam/io/hadoopfilesystem.py:
##########
@@ -399,6 +407,26 @@ def checksum(self, url):
         file_checksum[_FILE_CHECKSUM_BYTES],
     )
 
+  def metadata(self, url):
+    """Fetch metadata fields of a file on the FileSystem.
+
+    Args:
+      url: string url of a file.
+
+    Returns:
+      :class:`~apache_beam.io.filesystem.FileMetadata`.
+      Note: last_updated field is not supported yet.
+
+    Raises:
+      ``BeamIOError``: if url doesn't exist.
+    """
+    _, path = self._parse_url(url)
+    status = self._hdfs_client.status(path, strict=False)
+    print(status)
+    if status is None:
+      raise BeamIOError('File not found: %s' % url)
+    return FileMetadata(url, status[_FILE_STATUS_LENGTH])

Review Comment:
   I believe so. According to 
http://hadoop.apache.org/docs/r1.0.4/webhdfs.html#FileStatus, the returned JSON 
should include a 'modificationTime' field. Implementing this also involves 
updating the FakeFile class in the unit tests.
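
A minimal sketch of how the field could be populated, assuming the status dict follows the WebHDFS FileStatus schema, in which 'modificationTime' is given in milliseconds since the epoch. The `FileMetadata` stand-in, the `_FILE_STATUS_UPDATED` constant, and the `metadata_from_status` helper are hypothetical names used for illustration only; they are not taken from the PR.

```python
from collections import namedtuple

# Hypothetical stand-in for apache_beam.io.filesystem.FileMetadata,
# extended with the field name proposed in BEAM-14314.
FileMetadata = namedtuple(
    'FileMetadata', ['path', 'size_in_bytes', 'last_updated_in_seconds'])

# Keys of the WebHDFS FileStatus JSON object. The constant names mirror
# the style used in hadoopfilesystem.py; _FILE_STATUS_UPDATED is assumed.
_FILE_STATUS_LENGTH = 'length'
_FILE_STATUS_UPDATED = 'modificationTime'


def metadata_from_status(url, status):
  """Builds a FileMetadata from a FileStatus-like dict (illustrative helper)."""
  if status is None:
    # The real code raises BeamIOError; IOError keeps this sketch standalone.
    raise IOError('File not found: %s' % url)
  return FileMetadata(
      url,
      status[_FILE_STATUS_LENGTH],
      # WebHDFS reports milliseconds since the epoch; convert to seconds.
      status[_FILE_STATUS_UPDATED] / 1000.0)


# Made-up status dict shaped like a WebHDFS FileStatus response.
example_status = {'length': 24930, 'modificationTime': 1320171722771}
result = metadata_from_status('hdfs://example/path', example_status)
```

Dividing by 1000.0 keeps sub-second precision as a float; whether the real field should be a float or an integer is left open here.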





Issue Time Tracking
-------------------

    Worklog Id:     (was: 758117)
    Time Spent: 1h 10m  (was: 1h)

> Add last_updated field in filesystem.FileMetaData
> -------------------------------------------------
>
>                 Key: BEAM-14314
>                 URL: https://issues.apache.org/jira/browse/BEAM-14314
>             Project: Beam
>          Issue Type: New Feature
>          Components: io-py-common
>            Reporter: Yi Hu
>            Assignee: Yi Hu
>            Priority: P2
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> This will be the Python counterpart of BEAM-5910.
> Per Python naming convention, the field will be named 
> "last_updated_in_seconds".



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
