[jira] [Commented] (HBASE-22749) Distributed MOB compactions

Sean Busbey (Jira) Wed, 19 Feb 2020 13:05:31 -0800


    [ 
https://issues.apache.org/jira/browse/HBASE-22749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040436#comment-17040436
 ]


Sean Busbey commented on HBASE-22749:
-------------------------------------

Okay, I think this is ready to land.

* [qabot on the PR has a clean bill of 
health|https://github.com/apache/hbase/pull/921#issuecomment-586650488]
* I rebased the current PR on master and ran the full nightly suite ~8 times. 
I'm attaching a PDF summarizing the results and a CSV with individual 
test pass/fail status. Essentially it looks equivalent to master: the new tests 
look stable, and the failures are a random smattering of tests that fail due to 
execution environment problems. (In the last build, the one test that failed 
was due to a local FS permission issue.)

I'll rebase to current master again and aim to get the commit history to 
one commit per JIRA. I think some of the subtasks are already squashed in the 
current history, so I'll aim to get their JIRA keys in the commit message so we 
can at least grep for them.

Since this obviates the data loss in HBASE-22075, after merge to master I'm 
going to chase down the remaining subtasks so that I can then get this into 
branches-2.

> Distributed MOB compactions 
> ----------------------------
>
>                 Key: HBASE-22749
>                 URL: https://issues.apache.org/jira/browse/HBASE-22749
>             Project: HBase
>          Issue Type: New Feature
>          Components: mob
>            Reporter: Vladimir Rodionov
>            Assignee: Vladimir Rodionov
>            Priority: Major
>         Attachments: HBASE-22749-branch-2.2-v4.patch, 
> HBASE-22749-master-v1.patch, HBASE-22749-master-v2.patch, 
> HBASE-22749-master-v3.patch, HBASE-22749-master-v4.patch, 
> HBase-MOB-2.0-v3.0.pdf
>
>
> There are several drawbacks in the original MOB 1.0 (Moderate Object 
> Storage) implementation, which can limit the adoption of the MOB feature:
> # MOB compactions are executed in a Master as a chore, which limits 
> scalability because all I/O goes through a single HBase Master server. 
> # Yarn/Mapreduce framework is required to run MOB compactions in a scalable 
> way, but this won’t work in a stand-alone HBase cluster.
> # Two separate compactors for MOB and for regular store files and their 
> interactions can result in data loss (see HBASE-22075).
> The design goals for MOB 2.0 were to provide a 100% MOB 1.0-compatible 
> implementation, which is free of the above drawbacks and can be used as a 
> drop-in replacement in existing MOB deployments. So, these are the design 
> goals of MOB 2.0:
> # Make MOB compactions scalable without relying on the Yarn/Mapreduce framework.
> # Provide a unified compactor for both MOB and regular store files.
> # Make it more robust, especially w.r.t. data loss.
> # Simplify and reduce the overall MOB code.
> # Provide a 100% compatible implementation with MOB 1.0.
> # No migration of data should be required between MOB 1.0 and MOB 2.0, just 
> a software upgrade.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
