[ 
https://issues.apache.org/jira/browse/HBASE-16789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15582923#comment-15582923
 ] 

Umesh Agashe commented on HBASE-16789:
--------------------------------------

[~busbey], here are the points we discussed:
* This is an offline Compaction Tool (CT). Without the MR option, CT compacts 
files for the input table/ region/ column family on the local node where CT 
is run.
* The current CT decides which node runs the MR jobs based on the location of 
the first block of the first file in an input directory.
* This can be improved to pick nodes based on the last known region 
assignments, falling back to the location of the first block of the first 
file in a table/ region/ column family. This will provide better locality.
* Even with the improved logic, locality cannot be guaranteed.
* So, whether to run with MR, and which node an MR job runs on, can be decided 
by code outside of CT or by a user. CT will be responsible only for compacting 
files for the input table/ region/ cf, without deciding on MR or on node 
selection for MR.
* CT may query local regions and compact only the files belonging to local 
regions. A -force option can be provided as a workaround to get the default 
behavior.
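
The selection and filtering logic in the points above could be sketched 
roughly as follows. This is only an illustration, not code from the patch: 
the class NodeSelectionSketch, its method names, and the plain Map/List inputs 
are all hypothetical stand-ins for whatever CT or its caller would actually 
query. It also assumes that "default behavior" under -force means compacting 
every input region regardless of locality.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: none of these names exist in HBase or in the patch.
final class NodeSelectionSketch {

    /**
     * Picks a preferred host for one region: the last known assignment if
     * there is one, otherwise the first host holding the first block of the
     * first file. Returns empty when neither is known, since locality
     * cannot be guaranteed.
     */
    static Optional<String> pickNode(String region,
                                     Map<String, String> lastKnownAssignments,
                                     List<String> firstBlockHosts) {
        String assigned = lastKnownAssignments.get(region);
        if (assigned != null) {
            return Optional.of(assigned);
        }
        if (!firstBlockHosts.isEmpty()) {
            return Optional.of(firstBlockHosts.get(0));
        }
        return Optional.empty();
    }

    /**
     * Keeps only regions last assigned to localHost. With force set, the
     * filter is skipped and every input region is returned (assumed meaning
     * of the -force workaround).
     */
    static List<String> regionsToCompact(List<String> regions,
                                         Map<String, String> lastKnownAssignments,
                                         String localHost,
                                         boolean force) {
        if (force) {
            return new ArrayList<>(regions);
        }
        List<String> local = new ArrayList<>();
        for (String region : regions) {
            if (localHost.equals(lastKnownAssignments.get(region))) {
                local.add(region);
            }
        }
        return local;
    }
}
```

Keeping both decisions in small functions like these, outside the compaction 
code itself, matches the idea that CT only compacts what it is handed.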

> Remove directory layout/ filesystem references from CompactionTool
> ------------------------------------------------------------------
>
>                 Key: HBASE-16789
>                 URL: https://issues.apache.org/jira/browse/HBASE-16789
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Filesystem Integration
>            Reporter: Umesh Agashe
>            Assignee: Umesh Agashe
>         Attachments: HBASE-16789-hbase-14439.v1.patch
>
>
> Remove directory layout/ filesystem references from CompactionTool and use 
> APIs provided by MasterStorage/ RegionStorage instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
