[ https://issues.apache.org/jira/browse/BEAM-14314?focusedWorklogId=763074&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-763074 ]
ASF GitHub Bot logged work on BEAM-14314:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 27/Apr/22 17:55
Start Date: 27/Apr/22 17:55
Worklog Time Spent: 10m
Work Description: johnjcasey commented on code in PR #17380:
URL: https://github.com/apache/beam/pull/17380#discussion_r860099004
##########
sdks/python/apache_beam/io/hadoopfilesystem_test.py:
##########
@@ -538,6 +539,14 @@ def test_checksum(self):
     self.assertEqual(
         'fake_algo-5-checksum_byte_sequence', self.fs.checksum(url))
+  def test_last_updated(self):
+    url = self.fs.join(self.tmpdir, 'f1')
+    with self.fs.create(url) as f:
+      f.write(b'Hello')
+    tolerance = 60  # 1 min
+    result = self.fs.last_updated(url)
+    self.assertAlmostEqual(result, time.time(), delta=tolerance)
Review Comment:
A 1-minute window is reasonable, but this test still looks like it could be
flaky. Is there a way to make it deterministic?
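One possible way to address the question above is to pin the clock, or to pass the timestamp in explicitly, so the assertion can be exact rather than tolerance-based. The following is a minimal sketch under stated assumptions, not the code from the PR: `FakeFile` here is a simplified stand-in for the test double, and `last_updated` is a hypothetical helper that converts the stored milliseconds to seconds.

```python
import time
import unittest
from unittest import mock


class FakeFile:
  """Simplified, hypothetical stand-in for the test double in the PR."""
  def __init__(self, path, time_ms=None):
    if time_ms is None:
      # Falls back to the wall clock, as the diff under review does.
      time_ms = int(time.time() * 1000)
    self.path = path
    self.time_ms = time_ms


def last_updated(fake_file):
  """Hypothetical helper: last-modified time in seconds."""
  return fake_file.time_ms / 1000.0


class LastUpdatedTest(unittest.TestCase):
  def test_with_frozen_clock(self):
    frozen = 1651082100.0  # arbitrary fixed epoch seconds
    # Patch time.time so the creation timestamp is known exactly.
    with mock.patch('time.time', return_value=frozen):
      f = FakeFile('/tmp/f1')
    self.assertEqual(last_updated(f), frozen)

  def test_with_explicit_timestamp(self):
    # Alternative: bypass the clock by supplying time_ms directly.
    f = FakeFile('/tmp/f1', time_ms=1651082100000)
    self.assertEqual(last_updated(f), 1651082100.0)
```

Either approach removes the dependency on how long the test takes to run; the explicit `time_ms` argument has the advantage of not requiring any patching.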
##########
sdks/python/apache_beam/io/hadoopfilesystem_test.py:
##########
@@ -36,17 +37,16 @@ class FakeFile(io.BytesIO):
   """File object for FakeHdfs"""
   __hash__ = None  # type: ignore[assignment]
-  def __init__(self, path, mode='', type='FILE'):
+  def __init__(self, path, mode='', type='FILE', time_ms=None):
     io.BytesIO.__init__(self)
-
-    self.stat = {
-        'path': path,
-        'mode': mode,
-        'type': type,
-    }
+    if time_ms is None:
+      time_ms = int(time.time() * 1000)
+    self.time_ms = time_ms
+    self.stat = {'path': path, 'mode': mode, 'type': type}
     self.saved_data = None
   def __eq__(self, other):
+    """Equality of two files. Timestamp not included in comparison"""
Review Comment:
Should timestamp be included?
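To make the trade-off behind this question concrete, here is a small sketch of both behaviors. It is an illustration only: the class is a pared-down version of the test double, and the `compare_time` flag is a hypothetical switch that does not exist in the PR.

```python
import io


class FakeFile(io.BytesIO):
  """Pared-down sketch of the test double; not the actual Beam class."""
  __hash__ = None  # compared by value and mutable, so left unhashable

  def __init__(self, path, time_ms=0, compare_time=False):
    super().__init__()
    self.stat = {'path': path}
    self.time_ms = time_ms
    self.compare_time = compare_time  # hypothetical flag for illustration

  def __eq__(self, other):
    if not isinstance(other, FakeFile):
      return NotImplemented
    same = self.stat == other.stat and self.getvalue() == other.getvalue()
    if self.compare_time:
      # Stricter variant: files written at different times are unequal.
      same = same and self.time_ms == other.time_ms
    return same


# Same path and contents, different timestamps.
a = FakeFile('/f1', time_ms=1000)
b = FakeFile('/f1', time_ms=2000)
print(a == b)  # True: timestamp ignored

c = FakeFile('/f1', time_ms=1000, compare_time=True)
d = FakeFile('/f1', time_ms=2000, compare_time=True)
print(c == d)  # False: timestamp compared
```

Excluding the timestamp keeps existing equality-based assertions stable when files are created at slightly different moments; including it would only be safe if tests always supply the timestamp explicitly.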
Issue Time Tracking
-------------------
Worklog Id: (was: 763074)
Time Spent: 3h 10m (was: 3h)
> Add last_updated field in filesystem.FileMetaData
> -------------------------------------------------
>
> Key: BEAM-14314
> URL: https://issues.apache.org/jira/browse/BEAM-14314
> Project: Beam
> Issue Type: New Feature
> Components: io-py-common
> Reporter: Yi Hu
> Assignee: Yi Hu
> Priority: P2
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> This will be the Python counterpart of BEAM-5910.
> Per Python naming convention, the field will be named
> "last_updated_in_seconds".
--
This message was sent by Atlassian Jira
(v8.20.7#820007)