On Wed, Dec 13, 2017 at 1:14 AM, Apekshit Sharma <[email protected]> wrote:

> Hi folks,
>
> Was debugging TestTruncateTableProcedure when starting thinking about this.
> (That's one mean test! What nice fault tolerant tests!)
>
> So the specific case: If we fail after adding new regions to meta (
> TRUNCATE_TABLE_ADD_TO_META
> <https://github.com/apache/hbase/blob/master/hbase-
> server/src/main/java/org/apache/hadoop/hbase/master/procedure/
> TruncateTableProcedure.java#L127>),
> then on recovery, AM assumes those regions with null state as offline and
> begins assigning them by itself which is wrong since truncate action is not
> complete (and it'll try to assign them too on recovery, and there are locks
> to avoid simultaneous assigns etc.)
> Simple fix is, add regions with initial state as CLOSED.
>
> Then looking in other places, CreateTableProcedure seems like it should
> suffer the same fate (CREATE_TABLE_ADD_TO_META
> <https://github.com/apache/hbase/blob/677c1f2c635273eb823b91903dffdb
> 2e587f5181/hbase-server/src/main/java/org/apache/hadoop/
> hbase/master/procedure/CreateTableProcedure.java#L104>).
> Should we add region as CLOSED there too?
> (Weird part is, it's not failing, looking into it)
>
> So the main question is, shouldn't we always add new regions to meta with
> state as CLOSED?
>
>
I've been here recently. OFFLINE?



> Whatever operation is adding them will also be opening them if needed,
> right? And no operation should be relying on this weird AM assumption to
> complete it's half done job.
>
> Food for thought - Some operations adding regions are: truncate table,
> create table, modify table, clone snapshot, restore snapshot.
>
> Can you imagine a case where not adding a new region as CLOSED makes sense?
>
>
None. There is adding and then AMv2 takes control.

These tests that double-kill to ensure each step recoverable are great,
yeah, at finding dirty bugs.

I think truncate table though a little silly. If you proposed killing it,
I'd +1 it.

St.Ack


> -- Appy
>

Reply via email to