[ 
https://issues.apache.org/jira/browse/HADOOP-14132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15892067#comment-15892067
 ] 

Steve Loughran commented on HADOOP-14132:
-----------------------------------------

Short term, 

* I'm about to submit a patch to remove s3a from discovery, as it's the perf 
killer, and it turns out to be registered in core-default anyway
* John Zhuge has put HADOOP-14123 on hold
* we could do a lightweight service declaration and tell everyone they must move to it.

Here's what I'm thinking:
{code}
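abstract class FileSystemInfo {

  abstract String getSchema();
  abstract String getFilesystemImplementation();
  abstract String getAbstractFilesystemImplementation();
  abstract String getHomePage();
  abstract String getResourceFile();
  abstract int getVersion(); // internally we can return the hadoop version

  // schema + filesystem + version + homepage, for use in diagnostics
  @Override
  public String toString() {
    return getSchema() + " " + getFilesystemImplementation()
        + " version " + getVersion() + " from " + getHomePage();
  }
}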

{code}

# the toString() value can be printed in any diagnostics on failure to instantiate
# the getResourceFile() call will name a new XML resource to load 
automatically, which allows filesystems to declare all their defaults. That's the 
feature which will encourage people to adopt this mechanism.
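
To make the idea concrete, here is a minimal sketch of how such descriptors might be used. It is not the patch itself. The S3AInfo descriptor, the FileSystemRegistry class, the "s3a-default.xml" resource name, the homepage URL and the version number are all assumptions made up for illustration. In Hadoop the descriptors would presumably be found through java.util.ServiceLoader; here one is registered by hand so the example stays self-contained. The point is that registration and lookup touch only strings, and the implementation class is loaded only when a schema is actually used.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Lightweight descriptor with the methods proposed in the comment. */
abstract class FileSystemInfo {
  abstract String getSchema();
  abstract String getFilesystemImplementation();
  abstract String getAbstractFilesystemImplementation();
  abstract String getHomePage();
  abstract String getResourceFile();
  abstract int getVersion();

  @Override
  public String toString() {
    return getSchema() + " " + getFilesystemImplementation()
        + " version " + getVersion() + " from " + getHomePage();
  }
}

/** Hypothetical s3a descriptor: only strings, so loading it pulls in no AWS classes. */
final class S3AInfo extends FileSystemInfo {
  String getSchema() { return "s3a"; }
  String getFilesystemImplementation() { return "org.apache.hadoop.fs.s3a.S3AFileSystem"; }
  String getAbstractFilesystemImplementation() { return "org.apache.hadoop.fs.s3a.S3A"; }
  String getHomePage() { return "https://hadoop.apache.org/"; }
  String getResourceFile() { return "s3a-default.xml"; } // assumed resource name
  int getVersion() { return 3; }                         // placeholder value
}

/** Hypothetical registry keyed by schema; stands in for ServiceLoader-based discovery. */
final class FileSystemRegistry {
  private final Map<String, FileSystemInfo> bySchema = new LinkedHashMap<>();

  void register(FileSystemInfo info) {
    bySchema.put(info.getSchema(), info);
  }

  Optional<FileSystemInfo> lookup(String schema) {
    return Optional.ofNullable(bySchema.get(schema));
  }

  /** Loads the implementation class lazily, only when the schema is requested. */
  Class<?> loadImplementation(String schema) throws ClassNotFoundException {
    FileSystemInfo info = bySchema.get(schema);
    if (info == null) {
      throw new IllegalArgumentException("No filesystem registered for schema " + schema);
    }
    try {
      return Class.forName(info.getFilesystemImplementation());
    } catch (ClassNotFoundException | LinkageError e) {
      // The descriptor's toString() goes into the diagnostics.
      throw new ClassNotFoundException("Failed to load " + info, e);
    }
  }
}

final class FileSystemInfoSketch {
  public static void main(String[] args) {
    FileSystemRegistry registry = new FileSystemRegistry();
    registry.register(new S3AInfo());
    // Printing the descriptor does not load the filesystem class.
    System.out.println(registry.lookup("s3a").get());
    try {
      registry.loadImplementation("s3a");
    } catch (ClassNotFoundException e) {
      // Reached when the hadoop-aws classes are not on the classpath.
      System.out.println(e.getMessage());
    }
  }
}
```

The getResourceFile() value is not consumed in this sketch; how and when the named XML resource would be loaded is left open.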




> Filesystem discovery to stop loading implementation classes
> -----------------------------------------------------------
>
>                 Key: HADOOP-14132
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14132
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs, fs/adl, fs/azure, fs/oss, fs/s3, fs/swift
>    Affects Versions: 2.7.3
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>
> Integration testing of Hadoop with HADOOP-14040 has shown that the 
> move to a shaded AWS JAR is slowing all Hadoop client code down.
> I believe this is due to how we use service discovery to identify FS 
> implementations: the implementation classes themselves are instantiated.
> This has known problems today with classloading, but clearly impacts 
> performance too, especially with complex transitive dependencies unique to 
> the loaded class.
> Proposed: have lightweight service declaration classes which implement an 
> interface declaring
> # schema
> # classname of FileSystem impl
> # classname of AbstractFS impl
> # homepage (for third party code, support, etc)
> These are what we register and scan in the FS to look for services.
> This leaves the question of what to do for existing filesystems. I 
> think we'll need to retain the old code for external ones, while moving the 
> hadoop modules to the new ones.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
