[ 
https://issues.apache.org/jira/browse/HBASE-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matteo Bertozzi updated HBASE-7253:
-----------------------------------

    Release Note: 
The CompactionTool works at file-system level, so the table should be disabled.

The compaction process reads the same hbase-site.xml configuration properties
as the server, such as "hbase.hstore.compactionThreshold".

You can compact a whole table, a single region, or a single family;
the input to the CompactionTool is a file-system path.

You can run the compaction as a MapReduce job or as a local process.
Each family can be compacted in parallel if you use the -mapred option.

To compact "TestTable" family "cf1" of region "e450da04b1a10099b618bec031e0f951":
bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool \
  hdfs:///hbase/TestTable/e450da04b1a10099b618bec031e0f951/cf1

To compact all the families of region "e450da04b1a10099b618bec031e0f951":
bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool \
  hdfs:///hbase/TestTable/e450da04b1a10099b618bec031e0f951

To compact all regions and families of the table:
bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool -mapred \
  hdfs:///hbase/TestTable
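
The three commands above differ only in how deep the input path goes
(table, region, or family). As a rough sketch, a small shell helper can
build that path. The helper name compaction_path and the
<root>/<table>/<region>/<family> layout are assumptions inferred from the
examples in this note, not part of the tool itself. The last line only
prints a command line; it does not run anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of HBase): builds a CompactionTool input path.
# Assumes the <root>/<table>[/<region>[/<family>]] layout used in the examples.
compaction_path() {
    root=${1:-}
    table=${2:-}
    region=${3:-}
    family=${4:-}

    # Drop one trailing slash from the root, then append the table name.
    path="${root%/}/$table"

    # A family only makes sense below a region, so nest the checks.
    if [ -n "$region" ]; then
        path="$path/$region"
        if [ -n "$family" ]; then
            path="$path/$family"
        fi
    fi

    printf '%s\n' "$path"
}

# Dry run: print the command line for a single family.
echo "bin/hbase org.apache.hadoop.hbase.regionserver.CompactionTool" \
     "$(compaction_path hdfs:///hbase TestTable e450da04b1a10099b618bec031e0f951 cf1)"
```

Passing only the root and table yields the table-level path, which is the
case where adding -mapred is useful.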

  was:
Tool to run compactions external to hbase:

Usage: java " + this.getClass().getName() +  [-compactOnce] [-mapred] 
[-D<property=value>]* files...
Options:
 mapred         Use MapReduce to run compaction.
 compactOnce    Execute just one compaction step. (default: while needed)
Note: -D properties will be applied to the conf used.
For example:
 To preserve input files, pass -D"+CONF_COMPLETE_COMPACTION+"=false"
 To stop delete of compacted file, pass -D"+CONF_DELETE_COMPACTED+"=false"
 To set tmp dir, pass -D"+CONF_TMP_DIR+"=ALTERNATE_DIR"

Examples:
 To compact the full 'TestTable' using MapReduce:
 $ bin/hbase " + this.getClass().getName() + " -mapred hdfs:///hbase/TestTable"
 To compact column family 'x' of the table 'TestTable' region 'abc':
 $ bin/hbase " + this.getClass().getName() + " hdfs:///hbase/TestTable/abc/x"

    Hadoop Flags:   (was: Reviewed)
    
> Compaction Tool
> ---------------
>
>                 Key: HBASE-7253
>                 URL: https://issues.apache.org/jira/browse/HBASE-7253
>             Project: HBase
>          Issue Type: New Feature
>          Components: Compaction
>    Affects Versions: 0.96.0
>            Reporter: Matteo Bertozzi
>            Assignee: Matteo Bertozzi
>            Priority: Minor
>             Fix For: 0.96.0
>
>         Attachments: HBASE-7253-v0.patch, HBASE-7253-v1.patch
>
>
> In HBASE-5616, as part of the compaction code refactor, a CompactionTool was
> added, but there are some issues:
> * The tool is under test/
> * mockito is required, so the "test" scope should be removed from the 
> pom.xml, otherwise the tool doesn't start
> * The mock used by the tool mocks HRegion.getRegionInfo(), but some code
> (Store) uses HRegion.regionInfo directly (HStore.java#L2021,
> HStore.java#L1389, HStore.java#L1402), so you end up with an NPE in the tool.
> * The mocked Store uses a dummy family, so the compacted files don't get
> the family properties specified (compression, encoding, ...)
> * At the end of the compaction (CompactionTool.java#L155), by default, the
> compaction file is removed (note that the compacted ones are already removed
> inside store.compact()), and you end up with an empty dir if you
> compact everything.
> I've fixed some stuff and added support to:
>  * Run the compaction as a MR Job
>  * Specify a Table (compact each region/family)
>  * Specify a Region (compact each family)
>  * Specify a Family (as before)
