[ 
https://issues.apache.org/jira/browse/HBASE-7205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527282#comment-13527282
 ] 

Ted Yu commented on HBASE-7205:
-------------------------------

Thanks for the reminder, Adrian.
After poking a little bit, I want to discuss the following assertion:
{code}
    for (Map.Entry<HRegion, Set<ClassLoader>> regionCP :
        regionsActiveClassLoaders.entrySet()) {
      assertTrue("Some CP classloaders for region " + regionCP.getKey()
            + " are not cached."
            + " ClassLoader Cache:" + allClassLoaders
            + " Region ClassLoaders:" + regionCP.getValue(),
            allClassLoaders.containsAll(regionCP.getValue()));
    }
{code}
The assertion failed because the classloader for jarFileOnHDFS2 was removed 
from classLoadersCache in the middle of the test, as a result of the attempt 
to load the cpNameInvalid class.
I think the assertion places an extra constraint on how CoprocessorHost.load() 
handles ClassNotFoundException: it assumes that the classloader used in an 
attempt to load an invalid class name (more strictly, a class name that does 
not match the jar file) is retained in the cache.

For this particular scenario, I can see two possible root causes:
1. The user puts the correct jar file on HDFS but specifies the wrong class 
name.
2. The user specifies the correct class name but uploads a wrong or stale jar 
file to HDFS.
From the ClassNotFoundException alone, we don't have enough evidence to tell 
which of the two is the root cause.
So it would be nice to give CoprocessorHost.load() flexibility in how this 
situation is handled.

Time permitting, I want to use the policy introduced by HBASE-4014 to fail 
fast on class loading instead of silently logging to the region server log. If 
people think that should be tackled in a separate JIRA, I am fine with that, 
provided we leave the door open as to how CoprocessorHost.load() handles 
ClassNotFoundException.
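
As a rough illustration of the two options (fail fast versus log and continue), the sketch below switches on a policy when a class cannot be found. Every name in it is hypothetical: LoadFailurePolicySketch and its Policy enum are not HBase classes, and the actual HBASE-4014 configuration is not reproduced here.

```java
// Hypothetical names throughout; this is not the HBase implementation.
class LoadFailurePolicySketch {
    enum Policy { LOG_AND_CONTINUE, FAIL_FAST }

    // Tries to load className through cl.
    // Returns null when the class is missing and the policy tolerates it.
    static Class<?> load(String className, ClassLoader cl, Policy policy) {
        try {
            return Class.forName(className, false, cl);
        } catch (ClassNotFoundException e) {
            if (policy == Policy.FAIL_FAST) {
                // Surface the problem to the caller immediately.
                throw new IllegalStateException(
                    "Coprocessor class not found: " + className, e);
            }
            // Otherwise record the problem and carry on without the class.
            System.err.println("WARN: skipping coprocessor " + className + ": " + e);
            return null;
        }
    }
}
```

With FAIL_FAST a wrong class name or a stale jar shows up as an exception at load time; with LOG_AND_CONTINUE the only trace is a log line.
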
Besides the containsAll assertion discussed above, the test already has:
{code}
    assertTrue(jarFileOnHDFS1 + " was not cached",
      CoprocessorHost.classLoadersCache.containsKey(pathOnHDFS1));
{code}
So the caching behavior is still verified.

This was a nice exercise that let me understand coprocessor classloading a 
bit more deeply.
                
> Coprocessor classloader is replicated for all regions in the HRegionServer
> --------------------------------------------------------------------------
>
>                 Key: HBASE-7205
>                 URL: https://issues.apache.org/jira/browse/HBASE-7205
>             Project: HBase
>          Issue Type: Bug
>          Components: Coprocessors
>    Affects Versions: 0.92.2, 0.94.2
>            Reporter: Adrian Muraru
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.96.0, 0.94.4
>
>         Attachments: 7205-v1.txt, 7205-v3.txt, 7205-v4.txt, 7205-v5.txt, 
> 7205-v6.txt, 7205-v7.txt, 7205-v8.txt, HBASE-7205_v2.patch
>
>
> HBASE-6308 introduced a new custom CoprocessorClassLoader to load the 
> coprocessor classes, and a new instance of this CL is created for every 
> HRegion opened. This leads to OOME-PermGen when the number of regions goes 
> above hundreds per region server. 
> Having the table coprocessor jailed in a separate classloader is good; 
> however, we should create only one for all regions of a table in each HRS.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
