[
https://issues.apache.org/jira/browse/HBASE-8778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720389#comment-13720389
]
Hadoop QA commented on HBASE-8778:
----------------------------------
{color:red}-1 overall{color}. Here are the results of testing the latest
attachment
http://issues.apache.org/jira/secure/attachment/12594312/HBASE-8778-v3.patch
against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author
tags.
{color:green}+1 tests included{color}. The patch appears to include 30 new
or modified tests.
{color:green}+1 hadoop1.0{color}. The patch compiles against the hadoop
1.0 profile.
{color:green}+1 hadoop2.0{color}. The patch compiles against the hadoop
2.0 profile.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any
warning messages.
{color:green}+1 javac{color}. The applied patch does not increase the
total number of javac compiler warnings.
{color:green}+1 findbugs{color}. The patch does not introduce any new
Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase
the total number of release audit warnings.
{color:green}+1 lineLengths{color}. The patch does not introduce lines
longer than 100
{color:green}+1 site{color}. The mvn site goal succeeds with this patch.
{color:red}-1 core tests{color}. The patch failed these unit tests:
Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/6476//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-protocol.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-client.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-examples.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-prefix-tree.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/6476//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/6476//console
This message is automatically generated.
> Region assignments scan table directory making them slow for huge tables
> ------------------------------------------------------------------------
>
> Key: HBASE-8778
> URL: https://issues.apache.org/jira/browse/HBASE-8778
> Project: HBase
> Issue Type: Improvement
> Reporter: Dave Latham
> Assignee: Dave Latham
> Fix For: 0.98.0, 0.95.2, 0.94.11
>
> Attachments: 8778-dirmodtime.txt, HBASE-8778-0.94.5.patch,
> HBASE-8778-0.94.5-v2.patch, HBASE-8778.patch, HBASE-8778-v2.patch,
> HBASE-8778-v3.patch
>
>
> On a table with 130k regions it takes about 3 seconds for a region server to
> open a region once it has been assigned.
> Watching the threads for a region server running 0.94.5 that is opening many
> such regions shows the thread opening the region in code like this:
> {noformat}
> "PRI IPC Server handler 4 on 60020" daemon prio=10 tid=0x00002aaac07e9000 nid=0x6566 runnable [0x000000004c46d000]
> java.lang.Thread.State: RUNNABLE
> at java.lang.String.indexOf(String.java:1521)
> at java.net.URI$Parser.scan(URI.java:2912)
> at java.net.URI$Parser.parse(URI.java:3004)
> at java.net.URI.<init>(URI.java:736)
> at org.apache.hadoop.fs.Path.initialize(Path.java:145)
> at org.apache.hadoop.fs.Path.<init>(Path.java:126)
> at org.apache.hadoop.fs.Path.<init>(Path.java:50)
> at org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(HdfsFileStatus.java:215)
> at org.apache.hadoop.hdfs.DistributedFileSystem.makeQualified(DistributedFileSystem.java:252)
> at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:311)
> at org.apache.hadoop.fs.FilterFileSystem.listStatus(FilterFileSystem.java:159)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:842)
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:867)
> at org.apache.hadoop.hbase.util.FSUtils.listStatus(FSUtils.java:1168)
> at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:269)
> at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoPath(FSTableDescriptors.java:255)
> at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableInfoModtime(FSTableDescriptors.java:368)
> at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:155)
> at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:126)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2834)
> at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2807)
> at sun.reflect.GeneratedMethodAccessor64.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
> at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
> {noformat}
> To open the region, the region server first loads the latest
> HTableDescriptor. Since HBASE-4553, HTableDescriptors are stored in the file
> system at "/hbase/<tableDir>/.tableinfo.<sequenceNum>". The file with the
> largest sequenceNum is the current descriptor. This is done so that the
> current descriptor is updated atomically. However, since the filename is not
> known in advance, FSTableDescriptors has to do a FileSystem.listStatus
> operation, which has to list all files in the directory to find it. The
> directory also contains all the region directories, so in our case it has to
> load 130k FileStatus objects. Even using a globStatus matching function
> still transfers all the objects to the client before performing the pattern
> matching. Furthermore, HDFS uses a default of transferring 1000 directory
> entries in each RPC call, so it requires 130 roundtrips to the namenode to
> fetch all the directory entries.
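
The lookup described above can be sketched in plain Java, without any Hadoop dependency. This is only an illustration, not the FSTableDescriptors code: the class name, the method name, and the zero-padded sequence numbers in the sample data are assumptions. It shows why the cost grows with the directory size: every entry, including each region directory, must be examined to find the ".tableinfo." file with the largest sequenceNum.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of HBase: picks the current descriptor file
// name out of a full directory listing.
final class LatestTableInfo {
    static final String PREFIX = ".tableinfo.";

    // Scans every entry and returns the one with the largest sequence number.
    static Optional<String> latest(List<String> dirEntries) {
        String best = null;
        long bestSeq = -1;
        for (String name : dirEntries) { // region directories are visited too
            if (!name.startsWith(PREFIX)) {
                continue;
            }
            long seq;
            try {
                seq = Long.parseLong(name.substring(PREFIX.length()));
            } catch (NumberFormatException e) {
                continue; // suffix is not a sequence number
            }
            if (seq > bestSeq) {
                bestSeq = seq;
                best = name;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        // Sample listing; the names are made up for illustration.
        List<String> entries = Arrays.asList(
            "region-a", ".tableinfo.0000000001", "region-b",
            ".tableinfo.0000000003", ".tableinfo.0000000002");
        System.out.println(latest(entries).orElse("none"));
    }
}
```

With 130k region directories next to the descriptor files, the loop body runs 130k times per region open, after the whole listing has already been transferred to the client.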
> Consequently, reassigning all the regions of a table (or a constant fraction
> thereof) requires time proportional to the square of the number of regions.
> In our case, if a region server fails with 200 such regions, it takes 10+
> minutes for them all to be reassigned, after the zk expiration and log
> splitting.
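
The figures in the description can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted above (130k directory entries, 1000 entries per RPC, 200 regions, about 3 seconds per open). The class and method names are hypothetical, and the assumption that the opens are handled one after another is mine, not stated in the report.

```java
// Hypothetical back-of-the-envelope model, not HBase code.
final class ReassignCost {
    // Namenode roundtrips needed to list one directory (ceiling division).
    static long rpcsPerOpen(long dirEntries, long entriesPerRpc) {
        return (dirEntries + entriesPerRpc - 1) / entriesPerRpc;
    }

    // Roundtrips when each region open lists the whole table directory once.
    static long totalRpcs(long regionsToOpen, long dirEntries, long entriesPerRpc) {
        return regionsToOpen * rpcsPerOpen(dirEntries, entriesPerRpc);
    }

    public static void main(String[] args) {
        long perOpen = rpcsPerOpen(130_000, 1_000);   // 130 roundtrips per open
        long total = totalRpcs(200, 130_000, 1_000);  // 26000 roundtrips in total
        long serialSeconds = 200 * 3;                 // 600 s if opens run serially
        System.out.println(perOpen + " " + total + " " + serialSeconds);
    }
}
```

Under the serial assumption, 200 opens at about 3 seconds each come to 600 seconds, which is consistent with the "10+ minutes" reported. Because the work per open is proportional to the number of regions, opening a constant fraction of the regions costs time proportional to the square of that number.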