Steve Loughran created HADOOP-14132:
---------------------------------------
Summary: Filesystem discovery to stop loading implementation
classes
Key: HADOOP-14132
URL: https://issues.apache.org/jira/browse/HADOOP-14132
Project: Hadoop Common
Issue Type: Improvement
Components: fs, fs/adl, fs/azure, fs/oss, fs/s3, fs/swift
Affects Versions: 2.7.3
Reporter: Steve Loughran
Assignee: Steve Loughran
Integration testing of Hadoop with HADOOP-14040 has shown that the move
to a shaded AWS JAR is slowing down all Hadoop client code.
I believe this is due to how we use service discovery to identify FS
implementations: the implementation classes themselves are instantiated.
This has known problems today with classloading, but clearly impacts
performance too, especially with complex transitive dependencies unique to the
loaded class.
Proposed: have lightweight service declaration classes which implement an
interface declaring
# scheme
# classname of FileSystem impl
# classname of AbstractFS impl
# homepage (for third party code, support, etc)
These are what we register and scan in the FS to look for services.
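
As a rough illustration of the proposal above, here is a minimal sketch of what such a declaration and its lookup might look like. Every name in it (FileSystemDeclaration, ExampleFsDeclaration, DeclarationRegistry, the "example" scheme and the com.example class names) is hypothetical and not existing Hadoop API. The point it tries to show is that the declaration carries only strings, so scanning declarations never loads a filesystem implementation class or its transitive dependencies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical lightweight declaration: strings only, no FS classes referenced. */
interface FileSystemDeclaration {
    /** URI scheme served by the filesystem, e.g. "example". */
    String getScheme();

    /** Fully qualified class name of the FileSystem implementation. */
    String getFileSystemClassName();

    /** Fully qualified class name of the AbstractFS implementation. */
    String getAbstractFileSystemClassName();

    /** Homepage for third-party code, support, etc. */
    String getHomepage();
}

/** Illustrative declaration for an imaginary "example" filesystem. */
final class ExampleFsDeclaration implements FileSystemDeclaration {
    public String getScheme() { return "example"; }
    public String getFileSystemClassName() { return "com.example.fs.ExampleFileSystem"; }
    public String getAbstractFileSystemClassName() { return "com.example.fs.ExampleFs"; }
    public String getHomepage() { return "https://example.com/fs"; }
}

/** Hypothetical registry: indexes declarations by scheme without loading any FS class. */
final class DeclarationRegistry {
    private final Map<String, FileSystemDeclaration> byScheme = new HashMap<>();

    DeclarationRegistry(Iterable<FileSystemDeclaration> declarations) {
        for (FileSystemDeclaration d : declarations) {
            // First declaration for a scheme wins; later duplicates are ignored.
            byScheme.putIfAbsent(d.getScheme(), d);
        }
    }

    /** Returns the implementation class name for a scheme, if one is declared. */
    Optional<String> fileSystemClassNameFor(String scheme) {
        return Optional.ofNullable(byScheme.get(scheme))
                .map(FileSystemDeclaration::getFileSystemClassName);
    }

    /** Loads the implementation class only when the scheme is actually requested. */
    Class<?> loadFileSystemClass(String scheme, ClassLoader loader)
            throws ClassNotFoundException {
        String name = fileSystemClassNameFor(scheme)
                .orElseThrow(() -> new ClassNotFoundException("No filesystem for scheme: " + scheme));
        // initialize=false defers static initializers until the class is first used.
        return Class.forName(name, false, loader);
    }
}
```

Because java.util.ServiceLoader implements Iterable, the registry could in principle be built from ServiceLoader.load(FileSystemDeclaration.class). Only the small declaration classes would then be instantiated during discovery.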
This leaves the question of what to do about existing filesystems. I think
we'll need to retain the old code for external ones, while moving the Hadoop
modules to the new mechanism.