[
https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527282#comment-13527282
]
Ted Yu commented on HBASE-7205:
-------------------------------
Thanks for the reminder, Adrian.
After poking around a little, I want to discuss the following assertion:
{code}
for (Map.Entry<HRegion, Set<ClassLoader>> regionCP :
    regionsActiveClassLoaders.entrySet()) {
  assertTrue("Some CP classloaders for region " + regionCP.getKey()
      + " are not cached."
      + " ClassLoader Cache:" + allClassLoaders
      + " Region ClassLoaders:" + regionCP.getValue(),
      allClassLoaders.containsAll(regionCP.getValue()));
}
{code}
The above assertion failed because the classloader for jarFileOnHDFS2 was
removed from classLoadersCache in the middle of the test, as a result of the
attempt to load the cpNameInvalid class.
I think the above assertion places an extra constraint on how
CoprocessorHost.load() handles ClassNotFoundException. It assumes that the
classloader used in an attempt to load an invalid class name (more strictly, a
class name and jar file mismatch) is retained in the cache.
For this particular scenario, I can see two possible root causes:
1. The user puts the correct jar file on HDFS but specifies the wrong class name.
2. The user specifies the correct class name but uploads the wrong / stale jar
file onto HDFS.
From the occurrence of ClassNotFoundException alone, we don't have enough
evidence to tell which of the above is the root cause.
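To illustrate with plain JDK classes (this is not HBase code), here is a minimal sketch. The names AmbiguousCnfe and describeFailure are hypothetical, and a local directory stands in for the jar on HDFS. The exception reports only that the lookup failed for the requested name; nothing in it says whether the name or the jar is at fault.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration, not HBase code.
final class AmbiguousCnfe {
  /**
   * Tries to load className from the given jar or directory.
   * Returns a description of the failure, or null if the class was found.
   */
  static String describeFailure(Path jarOrDir, String className) {
    // Parent is null, so only bootstrap classes and the given location are visible.
    try (URLClassLoader loader =
        new URLClassLoader(new URL[] { jarOrDir.toUri().toURL() }, null)) {
      loader.loadClass(className);
      return null;
    } catch (ClassNotFoundException e) {
      // Same exception type whether the class name or the jar is the wrong one.
      return e.getClass().getSimpleName() + ": " + e.getMessage();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    Path location = Paths.get(System.getProperty("java.io.tmpdir"));
    System.out.println(describeFailure(location, "org.example.NoSuchCoprocessor"));
  }
}
```

A caller that sees only this failure cannot decide between cases 1 and 2 without extra information, such as inspecting the jar contents.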
So it would be nice to give CoprocessorHost.load() flexibility in how this
situation is handled.
Time permitting, I want to use the policy introduced by HBASE-4014 to fail fast
on class loading errors instead of silently logging them in the region server
log. If people think that should be tackled in a separate JIRA, I am fine with
that, provided we leave the door open as to how CoprocessorHost.load() handles
ClassNotFoundException.
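As a rough sketch of what such flexibility could look like (an assumption for discussion, not the HBASE-4014 API and not the actual CoprocessorHost.load() code), the handling of ClassNotFoundException could be driven by a policy value. CoprocessorLoadSketch and OnMissingClass are hypothetical names.

```java
import java.util.logging.Logger;

// Hypothetical sketch; these names do not exist in HBase.
final class CoprocessorLoadSketch {
  /** What to do when the coprocessor class cannot be found. */
  enum OnMissingClass { FAIL_FAST, LOG_AND_SKIP }

  private static final Logger LOG = Logger.getLogger("CoprocessorLoadSketch");

  /**
   * Loads className through loader. On ClassNotFoundException, either
   * throws (FAIL_FAST) or logs a warning and returns null (LOG_AND_SKIP).
   */
  static Class<?> load(ClassLoader loader, String className, OnMissingClass policy) {
    try {
      return loader.loadClass(className);
    } catch (ClassNotFoundException e) {
      if (policy == OnMissingClass.FAIL_FAST) {
        throw new IllegalStateException("Cannot load coprocessor " + className, e);
      }
      LOG.warning("Skipping coprocessor " + className + ": " + e);
      return null;
    }
  }
}
```

Whether the classloader stays in the cache after such a failure is a separate decision, which is why a test should not hard-code one answer.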
On top of the above assertion, we already have:
{code}
assertTrue(jarFileOnHDFS1 + " was not cached",
CoprocessorHost.classLoadersCache.containsKey(pathOnHDFS1));
{code}
So the caching behavior is still verified.
This was a nice exercise that helped me understand coprocessor classloading a
bit more deeply.
> Coprocessor classloader is replicated for all regions in the HRegionServer
> --------------------------------------------------------------------------
>
> Key: HBASE-7205
> URL: https://issues.apache.org/jira/browse/HBASE-7205
> Project: HBase
> Issue Type: Bug
> Components: Coprocessors
> Affects Versions: 0.92.2, 0.94.2
> Reporter: Adrian Muraru
> Assignee: Ted Yu
> Priority: Critical
> Fix For: 0.96.0, 0.94.4
>
> Attachments: 7205-v1.txt, 7205-v3.txt, 7205-v4.txt, 7205-v5.txt,
> 7205-v6.txt, 7205-v7.txt, 7205-v8.txt, HBASE-7205_v2.patch
>
>
> HBASE-6308 introduced a new custom CoprocessorClassLoader to load the
> coprocessor classes, and a new instance of this CL is created for every
> HRegion opened. This leads to OOME-PermGen when the number of regions goes
> above hundreds per region server.
> Having the table coprocessor jailed in a separate classloader is good; however,
> we should create only one for all regions of a table in each HRS.