[ https://issues.apache.org/jira/browse/HADOOP-1662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
stack updated HADOOP-1662:
--------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Resolving. This was committed a while back.

> [hbase] Make region splits faster
> ---------------------------------
>
>                 Key: HADOOP-1662
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1662
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: contrib/hbase
>            Reporter: stack
>            Assignee: stack
>         Attachments: fastsplits.patch, mapfile_split.patch, splits-2.patch,
> splits-v3.patch
>
>
> HADOOP-1644 '[hbase] Compactions should take no longer than period between
> memcache flushes' is about making compactions run faster. This issue is
> about making splits faster. Currently, splits are done by reading a map
> file as input and, record by record, writing out two new mapfiles. This is
> too slow: ~30 seconds to split 120MB. Google hints in Bigtable that
> splitting is very fast because they let the split children feed off the
> split parent. Primitive testing has splitting mapfiles using raw streams
> running 3 to 4 times faster than splitting on mapfile keys.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
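
The two approaches contrasted in the issue description can be illustrated with a minimal sketch. This is not HBase code and does not reflect the attached patches: a sorted Python list of (key, value) pairs stands in for a map file, and the names `split_by_copy`, `HalfView`, and `split_by_reference` are hypothetical. The first function rewrites every record into one of two new lists, as the per-record split does. The second gives each child a read-only view over the parent's records, in the spirit of letting the split children feed off the split parent.

```python
from bisect import bisect_left


def split_by_copy(records, mid_key):
    """Copy every record into one of two new lists (work grows with record count)."""
    lower, upper = [], []
    for key, value in records:
        (lower if key < mid_key else upper).append((key, value))
    return lower, upper


class HalfView:
    """Read-only view over a slice of the parent's sorted records; nothing is copied."""

    def __init__(self, parent, start, stop):
        self.parent = parent
        self.start = start
        self.stop = stop

    def __iter__(self):
        for i in range(self.start, self.stop):
            yield self.parent[i]

    def __len__(self):
        return self.stop - self.start


def split_by_reference(records, mid_key):
    """Locate the split point with one binary search and return two views."""
    # (mid_key,) sorts just before any (mid_key, value) pair, so bisect_left
    # returns the index of the first record whose key is >= mid_key.
    split_at = bisect_left(records, (mid_key,))
    return HalfView(records, 0, split_at), HalfView(records, split_at, len(records))


if __name__ == "__main__":
    parent = [("a", 1), ("c", 2), ("e", 3), ("g", 4)]
    bottom, top = split_by_reference(parent, "e")
    print(list(bottom), list(top))
```

In this sketch the reference-based split does a single binary search regardless of how many records the parent holds, while the copying split touches every record. The trade-off is that the views stay valid only as long as the parent data is kept around and unchanged.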