[ 
https://issues.apache.org/jira/browse/PHOENIX-3165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15417792#comment-15417792
 ] 

Andrew Purtell commented on PHOENIX-3165:
-----------------------------------------

bq. Unfortunately, that's not possible across all the features of Phoenix:

No integrity check or repair tool will handle 100% of the cases. There 
will certainly be cases where falling back to manual recovery is necessary, 
and some aspects of the metadata will be tricky or not amenable at all to 
automated repair. That said, I'm comfortable stating that long operator 
experience with 'fsck'-class tools, over the history of computing systems, 
demonstrates their utility. Take HBase fsck as an example. I know it covers 
only a subset of possible problems, but it allowed me to recover a critical 
production system in minutes. Imagine if my only recourse had been hacking 
at META table HFiles with the Hadoop fs shell! It would have been hours of 
high-profile downtime as opposed to minutes, which was serious enough. 

bq. Corruption can take many forms, though. I think it's important to 
understand the root cause of the corruption, as IMHO prevention is the best 
medicine.

It's not possible to prevent corruption. There are too many opportunities, 
too many chains of events that lead to this outcome. As I said, even with a 
recovery tool there will be cases where the tool won't help, but on the 
other hand there are cases (and with care and attention, likely the common 
ones) where a recovery tool will let the user bring their system back to an 
available state very quickly. Snapshots and backups increase the overall 
margin of safety, but they are neither a quick nor a complete solution for 
system recovery: by definition they miss the latest updates. Recovering 
from the latest state by applying a dynamically analyzed delta is faster, a 
deft surgical tool compared to the big drop-and-restore hammer.

Metadata repair tools are no different from index rebuild tools in this 
respect. The RDBMS has metadata, the system is meant for mission-critical 
operation, and such a system requires operational tools that meet that 
objective.

bq. Updating HBase metadata with every change to the SYSTEM.CATALOG would put a 
huge drag on the system.

How so? 

bq. If we're going to do something like that, better to change the design and 
keep the system-of-record in zookeeper instead.

I don't think "system of record" is a use case suitable for ZooKeeper, and I 
believe this to be a common understanding. It's certainly a frequent conclusion 
in system design discussions of which I have been a part. That is not a knock 
on ZooKeeper. It is rock solid as a coordination and consensus service.

bq. Best to have Phoenix-level APIs instead that can guarantee that the system 
catalog is kept in a valid state with commits being performed transactionally.

Sure, "Does not depend on the Phoenix client" is rephrased alternatively and 
hopefully better as "Is Phoenix code using blessed repair mechanisms that do 
not depend on the normal client code paths"
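
To make that concrete, here is a rough sketch of what such a check could 
look like when it bypasses the normal client code paths and scans 
SYSTEM.CATALOG with the plain HBase client API. The class name, the assumed 
"0" column family and TABLE_TYPE qualifier, and the notion of a "required 
column" are illustrative only, not a proposed implementation:

{code:java}
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

// Illustrative sketch only: flag SYSTEM.CATALOG table header rows that are
// missing a column every header row should carry. Assumes the usual "0"
// column family and the TABLE_TYPE qualifier.
public class CatalogHeaderCheck {
  private static final byte[] FAMILY = Bytes.toBytes("0");
  private static final byte[] TABLE_TYPE = Bytes.toBytes("TABLE_TYPE");

  public static void main(String[] args) throws Exception {
    try (Connection conn =
             ConnectionFactory.createConnection(HBaseConfiguration.create());
         Table catalog = conn.getTable(TableName.valueOf("SYSTEM.CATALOG"));
         ResultScanner scanner = catalog.getScanner(new Scan())) {
      for (Result r : scanner) {
        // Row keys are tenant \x00 schema \x00 table [\x00 column \x00 family];
        // a header row has just the first three components.
        String[] parts = Bytes.toString(r.getRow()).split("\0", -1);
        boolean headerRow = parts.length == 3;
        if (headerRow && !r.containsColumn(FAMILY, TABLE_TYPE)) {
          System.out.println("Suspect header row (no TABLE_TYPE): "
              + Bytes.toStringBinary(r.getRow()));
        }
      }
    }
  }
}
{code}

The only point is that everything above runs against the raw HBase table, 
so it still works when the Phoenix client itself cannot even connect.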

I don't think we can depend on transactional functionality to always be in a 
workable state, if you are referring to the 4.8+ transactional functionality 
that requires Tephra and its metadata to be in working order. 

> System table integrity check and repair tool
> --------------------------------------------
>
>                 Key: PHOENIX-3165
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3165
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Andrew Purtell
>            Priority: Critical
>
> When the Phoenix system tables become corrupt, recovery is a painstaking 
> process of low-level examination of table contents and manipulation of 
> same with the HBase shell. This is very difficult work providing no 
> margin of safety, and it is a critical gap in terms of usability.
> At the OS level, we have fsck.
> At the HDFS level, we have fsck (integrity checking only, though).
> At the HBase level, we have hbck. 
> At the Phoenix level, we lack a system table repair tool. 
> Implement a tool that:
> - Does not depend on the Phoenix client.
> - Supports integrity checking of SYSTEM tables. Check for the existence of 
> all required columns in entries. Check that entries exist for all Phoenix 
> managed tables (implies Phoenix should add supporting advisory-only metadata 
> to the HBase table schemas). Check that serializations are valid. 
> - Supports complete repair of SYSTEM.CATALOG and recreation, if necessary, of 
> other tables like SYSTEM.STATS which can be dropped to recover from an 
> emergency. We should be able to drop SYSTEM.CATALOG (or any other SYSTEM 
> table), run the tool, and have a completely correct recreation of 
> SYSTEM.CATALOG available at the end of its execution.
> - To the extent we have or introduce cross-system-table invariants, check 
> them and offer a repair or reconstruction option.
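
As a strawman for the "entries exist for all Phoenix managed tables" check 
in the description above, the sketch below (again illustrative only) 
cross-checks the HBase table list against SYSTEM.CATALOG using just the 
HBase client API. Without the advisory-only metadata the description 
proposes, it can only report suspects, since a flagged table may simply not 
be a Phoenix table at all:

{code:java}
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

// Illustrative sketch only: report HBase tables that have no row at all in
// SYSTEM.CATALOG. A real tool would use the proposed advisory metadata to
// distinguish Phoenix-managed tables from plain HBase tables.
public class CatalogCrossCheck {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
             ConnectionFactory.createConnection(HBaseConfiguration.create());
         Admin admin = conn.getAdmin();
         Table catalog = conn.getTable(TableName.valueOf("SYSTEM.CATALOG"))) {
      // Collect every schema.table name that has at least one catalog row.
      Set<String> known = new HashSet<>();
      try (ResultScanner scanner = catalog.getScanner(new Scan())) {
        for (Result r : scanner) {
          // Row keys are tenant \x00 schema \x00 table [\x00 ...].
          String[] parts = Bytes.toString(r.getRow()).split("\0", -1);
          if (parts.length >= 3) {
            String schema = parts[1], table = parts[2];
            known.add(schema.isEmpty() ? table : schema + "." + table);
          }
        }
      }
      // Flag HBase tables (other than the SYSTEM ones) with no catalog entry.
      for (TableName tn : admin.listTableNames()) {
        String name = tn.getNameAsString();
        if (name.startsWith("SYSTEM.") || known.contains(name)) {
          continue;
        }
        System.out.println("No SYSTEM.CATALOG entry for HBase table: " + name);
      }
    }
  }
}
{code}

The inverse direction (catalog entries whose HBase table is gone) and the 
per-row serialization checks could be layered on the same raw scan in the 
same spirit.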


