Github user steveloughran commented on the issue:
https://github.com/apache/spark/pull/19885
@vanzin it's too late for this, but I don't see any reason why
`FileSystem.getCanonicalUri` should be kept protected. If someone wants to
volunteer with the spec changes to filesystem.md & contract tests, they'll get
support.
Looking at what HDFS does there, it calls out HA support as special: you
can't do DNS resolution on a logical URI
```java
protected URI canonicalizeUri(URI uri) {
  if (HAUtilClient.isLogicalUri(getConf(), uri)) {
    // Don't try to DNS-resolve logical URIs, since the 'authority'
    // portion isn't a proper hostname
    return uri;
  } else {
    return NetUtils.getCanonicalUri(uri, getDefaultPort());
  }
}
```
where `NetUtils.getCanonicalUri()` does a DNS lookup, caching
previously canonicalized hosts via `SecurityUtil.getByName`. `SecurityUtil` is
tagged as `@Public`; `NetUtils` isn't, but that could be relaxed while nobody is
looking. That still doesn't address the big issue: different filesystems clearly
have different rules about "canonical", and you don't want to try to work them
out and replicate them, as they're a moving maintenance target.
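To make the host/port qualification concrete, here's a simplified, self-contained sketch of what that kind of canonicalization does. This is not the Hadoop implementation: the `canonicalize` helper is made up for illustration, and the real code resolves the canonical hostname via DNS (`SecurityUtil.getByName`), where here we only lower-case it.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class CanonicalUriSketch {

  /**
   * Rough shape of URI canonicalization: qualify the host (lower-casing
   * stands in for the DNS lookup the real code performs) and fill in the
   * filesystem's default port when the URI omits one.
   */
  static URI canonicalize(URI uri, int defaultPort) {
    String host = uri.getHost();
    if (host == null) {
      return uri; // no authority to canonicalize, e.g. file:///
    }
    String canonicalHost = host.toLowerCase();
    int port = uri.getPort() == -1 ? defaultPort : uri.getPort();
    try {
      return new URI(uri.getScheme(), uri.getUserInfo(), canonicalHost,
          port, uri.getPath(), uri.getQuery(), uri.getFragment());
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException(e);
    }
  }
}
```

The per-filesystem part is exactly the default port and any scheme-specific rules, which is why pulling this logic out of each `FileSystem` subclass is the moving target described above.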
I'm stuck at this point. Created
[HADOOP-15094](https://issues.apache.org/jira/browse/HADOOP-15094).
Looking at `FileSystem.CACHE`: it compares on (scheme, authority,
ugi), so it will actually return different FS instances for unqualified and
qualified hosts. Maybe for this specific problem it's simplest to say "if you
do that, don't expect things to work".
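A minimal sketch of why the cache misses (assumed field names; this is a stand-in for `FileSystem.Cache.Key`, with a plain user string standing in for the UGI): authorities that differ only in qualification compare unequal, so each gets its own FS instance.

```java
import java.net.URI;
import java.util.Objects;

public class FsCacheKeySketch {

  /** Simplified stand-in for FileSystem.Cache.Key: (scheme, authority, user). */
  static final class Key {
    final String scheme;
    final String authority;
    final String user; // stands in for the UGI

    Key(URI uri, String user) {
      this.scheme = uri.getScheme() == null ? "" : uri.getScheme().toLowerCase();
      this.authority = uri.getAuthority() == null ? "" : uri.getAuthority().toLowerCase();
      this.user = user;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Key)) return false;
      Key k = (Key) o;
      return scheme.equals(k.scheme)
          && authority.equals(k.authority)
          && user.equals(k.user);
    }

    @Override
    public int hashCode() {
      return Objects.hash(scheme, authority, user);
    }
  }
}
```

With keys like these, `hdfs://nn/` and `hdfs://nn.example.com/` land in different cache slots even when they name the same cluster, which is the behaviour described above.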