[
https://issues.apache.org/jira/browse/GOBBLIN-1677?focusedWorklogId=799104&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-799104
]
ASF GitHub Bot logged work on GOBBLIN-1677:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 08/Aug/22 20:32
Start Date: 08/Aug/22 20:32
Worklog Time Spent: 10m
Work Description: phet commented on code in PR #3535:
URL: https://github.com/apache/gobblin/pull/3535#discussion_r940634518
##########
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/TimeAwareRecursiveCopyableDataset.java:
##########
@@ -68,7 +68,7 @@ public TimeAwareRecursiveCopyableDataset(FileSystem fs, Path
rootPath, Propertie
this.datePattern = properties.getProperty(DATE_PATTERN_KEY);
this.currentTime = properties.containsKey(DATE_PATTERN_TIMEZONE_KEY) ?
LocalDateTime.now(
- DateTimeZone.forID(DATE_PATTERN_TIMEZONE_KEY))
+ DateTimeZone.forID(properties.getProperty(DATE_PATTERN_TIMEZONE_KEY)))
: LocalDateTime.now(DateTimeZone.forID(DEFAULT_DATE_PATTERN_TIMEZONE));
Review Comment:
I find it written most succinctly in -
https://github.com/apache/gobblin/blob/d9ae5353c74fdcd385835fca9b586b3fdb90971b/gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/UnixTimestampRecursiveCopyableDataset.java#L76
BTW, these two could also be DRYed up:
https://github.com/apache/gobblin/blob/d9ae5353c74fdcd385835fca9b586b3fdb90971b/gobblin-data-management/src/main/java/org/apache/gobblin/data/management/version/finder/DatePartitionHiveVersionFinder.java#L69
https://github.com/apache/gobblin/blob/d9ae5353c74fdcd385835fca9b586b3fdb90971b/gobblin-data-management/src/main/java/org/apache/gobblin/data/management/version/finder/DateTimeDatasetVersionFinder.java#L86
##########
gobblin-data-management/src/test/java/org/apache/gobblin/data/management/copy/TimeAwareRecursiveCopyableDatasetTest.java:
##########
@@ -260,6 +260,44 @@ public void testGetFilesAtPath() throws IOException {
}
}
+ @Test
+ public void testTimezoneProperty() throws IOException {
+ // Test in UTC instead of default time
+ String datePattern = "yyyy/MM/dd/HH";
+ DateTimeFormatter formatter = DateTimeFormat.forPattern(datePattern);
+ // Ensure that the files are created in UTC time
+ LocalDateTime endDate = LocalDateTime.now(DateTimeZone.forID("UTC"));
+
+ Set<String> candidateFiles = new HashSet<>();
+ for (int i = 0; i < MAX_NUM_HOURLY_DIRS; i++) {
+ String startDate = endDate.minusHours(i).toString(formatter);
+ Path subDirPath = new Path(baseDir1, new Path(startDate));
+ fs.mkdirs(subDirPath);
+ Path filePath = new Path(subDirPath, i + ".avro");
+ fs.create(filePath);
+ if (i < (NUM_LOOKBACK_HOURS + 1)) {
+ candidateFiles.add(filePath.toString());
+ }
+ }
+
+ //Lookback time = "4h"
+ Properties properties = new Properties();
+
properties.setProperty(TimeAwareRecursiveCopyableDataset.LOOKBACK_TIME_KEY,
NUM_LOOKBACK_HOURS_STR);
+ properties.setProperty(TimeAwareRecursiveCopyableDataset.DATE_PATTERN_KEY,
"yyyy/MM/dd/HH");
+
properties.setProperty(TimeAwareRecursiveCopyableDataset.DATE_PATTERN_TIMEZONE_KEY,
"UTC");
Review Comment:
not major, but right now the impl should throw `IllegalArgExc` when the
timezone is unrecognized. ...but at first I wondered whether UTC wouldn't be a
default it silently uses.
to demonstrate that test success couldn't be due merely to a default of UTC,
would it be worth verifying that `IllegalArgE` gets thrown on a bogus value of
this property?
Issue Time Tracking
-------------------
Worklog Id: (was: 799104)
Time Spent: 0.5h (was: 20m)
> Fix timezone property in TimeAwareRecursiveCopyableDataset
> ----------------------------------------------------------
>
> Key: GOBBLIN-1677
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1677
> Project: Apache Gobblin
> Issue Type: Bug
> Components: gobblin-core
> Reporter: William Lo
> Assignee: Abhishek Tiwari
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Fix bug where TimeAwareRecursiveCopyableDataset could not read user
> configured timezones due to not reading the key correctly.
> Line 72 : {{DateTimeZone.forID(DATE_PATTERN_TIMEZONE_KEY))}} ===> should be
> {{DateTimeZone.forID(properties.get(DATE_PATTERN_TIMEZONE_KEY)))}}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)