[ https://issues.apache.org/jira/browse/BEAM-5180?focusedWorklogId=136435&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-136435 ]
ASF GitHub Bot logged work on BEAM-5180:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 21/Aug/18 09:26
            Start Date: 21/Aug/18 09:26
    Worklog Time Spent: 10m
      Work Description: JozoVilcek closed pull request #6251: [BEAM-5180] Relax back restriction on parsing file scheme
URL: https://github.com/apache/beam/pull/6251

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java
index be89c9ec099..7de41c1174a 100644
--- a/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java
+++ b/sdks/java/core/src/main/java/org/apache/beam/sdk/io/FileSystems.java
@@ -69,7 +69,7 @@
   public static final String DEFAULT_SCHEME = "file";
   private static final Pattern FILE_SCHEME_PATTERN =
-      Pattern.compile("(?<scheme>[a-zA-Z][-a-zA-Z0-9+.]*)://.*");
+      Pattern.compile("(?<scheme>[a-zA-Z][-a-zA-Z0-9+.]*):/.*");
   private static final Pattern GLOB_PATTERN = Pattern.compile("[*?{}]");
   private static final AtomicReference<Map<String, FileSystem>> SCHEME_TO_FILESYSTEM =
diff --git a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/FileSystemsTest.java b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/FileSystemsTest.java
index 22f71f6e09f..0fbeb71325d 100644
--- a/sdks/java/core/src/test/java/org/apache/beam/sdk/io/FileSystemsTest.java
+++ b/sdks/java/core/src/test/java/org/apache/beam/sdk/io/FileSystemsTest.java
@@ -196,6 +196,7 @@ public void testValidMatchNewResourceForLocalFileSystem() {
   @Test(expected = IllegalArgumentException.class)
   public void testInvalidSchemaMatchNewResource() {
     assertEquals("file", FileSystems.matchNewResource("invalidschema://tmp/f1", false));
+    assertEquals("file", FileSystems.matchNewResource("c:/tmp/f1", false));
   }

   private List<ResourceId> toResourceIds(List<Path> paths, final boolean isDirectory) {

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 136435)
    Time Spent: 1h  (was: 50m)

> Broken FileResultCoder via parseSchema change
> ---------------------------------------------
>
>                 Key: BEAM-5180
>                 URL: https://issues.apache.org/jira/browse/BEAM-5180
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>    Affects Versions: 2.6.0
>            Reporter: Jozef Vilcek
>            Assignee: Kenneth Knowles
>            Priority: Blocker
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Recently this commit
> [https://github.com/apache/beam/commit/3fff58c21f94415f3397e185377e36d3df662384]
> introduced stricter scheme parsing, which breaks the contract between
> _FileResultCoder_ and _FileSystems.matchNewResource()_.
> The coder takes a _ResourceId_, serializes it via its `_toString_` method, and
> then relies on the filesystem being able to parse it back again. Requiring a
> strict _scheme://_ breaks this at least for the Hadoop filesystem, which uses
> _URI_ for _ResourceId_ and produces _toString()_ output of the form
> `_hdfs:/some/path_`.
> I guess the _ResourceIdCoder_ suffers from the same problem.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
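
For readers following the diff in this message, the effect of the one-character regex change can be sketched in isolation. The two patterns below are copied from the diff. The class name `SchemeParsingSketch` and the helper `parseScheme` are hypothetical and only assume that an unmatched spec falls back to the default "file" scheme; this is not the actual body of `FileSystems`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SchemeParsingSketch {
  // Pattern before the fix (from the diff): requires "scheme://".
  public static final Pattern STRICT =
      Pattern.compile("(?<scheme>[a-zA-Z][-a-zA-Z0-9+.]*)://.*");

  // Pattern after the fix (from the diff): accepts "scheme:/" as well.
  public static final Pattern RELAXED =
      Pattern.compile("(?<scheme>[a-zA-Z][-a-zA-Z0-9+.]*):/.*");

  // Hypothetical helper (an assumption, not Beam's code): return the matched
  // scheme, or the default "file" scheme when the spec does not match.
  public static String parseScheme(Pattern pattern, String spec) {
    Matcher matcher = pattern.matcher(spec);
    return matcher.matches() ? matcher.group("scheme").toLowerCase() : "file";
  }

  public static void main(String[] args) {
    // The form the issue says a Hadoop ResourceId produces from toString().
    String hadoopSpec = "hdfs:/some/path";
    System.out.println("strict:  " + parseScheme(STRICT, hadoopSpec));
    System.out.println("relaxed: " + parseScheme(RELAXED, hadoopSpec));
    // The spec added to the test in the diff; "c" is read as a scheme.
    System.out.println("relaxed: " + parseScheme(RELAXED, "c:/tmp/f1"));
  }
}
```

Under this sketch the strict pattern treats `hdfs:/some/path` as a plain local file spec, while the relaxed pattern recovers `hdfs`. The same relaxation makes `c:/tmp/f1` parse with scheme `c`, which is consistent with the test in the diff expecting an `IllegalArgumentException` for that input.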