[
https://issues.apache.org/jira/browse/CALCITE-2495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595491#comment-16595491
]
Vladimir Sitnikov commented on CALCITE-2495:
--------------------------------------------
I made a first pass of removal {{URL.getPath()}} from
{{Source#append(Source)}}, and it required to rework
{{org.apache.calcite.util.Source#path}}.
The method is strange. There's no javadoc, and it is not clear if the returned
path should be encoded or decoded.
Old behavior was:
1) return decoded filename if File part exists (that is file path without
{{file:}} prefix)
2) Otherwise return textual representation of URL (that is an URL with
{{http:}} prefix)
How is that useful? I'm puzzled.
I think it makes sense to return the same kind of path, so I made {{#path()}}
to return DECODED path component of the URL.
For instance, for
{{[http://en.wikipedia.org/wiki/List_of_United_States_cities_by_population]}}
it would be
{{//en.wikipedia.org/wiki/List_of_United_States_cities_by_population}} (URL
without {{http:}} prefix)
That is symmetric to the file case which returns file path without {{file:}}
prefix.
It caused the following test to fail:
{code:java}
@Test public void testFileReaderMalUrl() throws FileReaderException {
@Test public void testFileReaderMalUrl() throws FileReaderException {
try {
final Source badSource = Sources.url("bad" + CITIES_SOURCE.path()); //
<-- !!! here
fail("expected exception, got " + badSource);
} catch (RuntimeException e) {
assertThat(e.getCause(), instanceOf(MalformedURLException.class));
assertThat(e.getCause().getMessage(), is("unknown protocol: badhttp"));
}
}
{code}
I'm inclined to believe that the test is a misuse of the API, so I could just
alter the test.
Note: I do not see a reason why the test creates invalid URL via string
concatenation. Why doesn't it use pre-defined string?
The tests seems to exercise multiple actions at once, and best practices
describe that the test should verify single behaviour only.
> Rework URL->File conversion in tests
> ------------------------------------
>
> Key: CALCITE-2495
> URL: https://issues.apache.org/jira/browse/CALCITE-2495
> Project: Calcite
> Issue Type: Bug
> Reporter: Vladimir Sitnikov
> Assignee: Julian Hyde
> Priority: Major
>
> {{URL.getPath()}} produces %20 when path contains spaces.
> I suggest to rework all the uses of {{getResource()...}} to use
> {{Sources.of(URL)}} so there's single -point of failure- way to convert URL
> to File.
> This resolves Apache CI which happens to have a space in folder name.
> For the record:
> 1) {{URL.getPath()}} produces %20, so it is added to forbidden signatures
> 2) {{Paths.get(url.toURI()).toFile()}} almost works, however it fails with
> URL is not hierarchical for {{new URL("file:test.java")}}
> 3) {{new File(URL.toURI()}} is worse than #2
> 4) {{URLDecoder}} must not be used to decode %20, since it will convert {{+}}
> to spaces as well, thus it will corrupt {{test.c++}}
> 5) It looks like {{url.toURI().getSchemeSpecificPart())}} properly handles
> "opaque" URIs (which are relative {{file:test.java}} kind of URLs)
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)