Re: roadmap: data integrity

Ryan Rawson Thu, 06 Aug 2009 23:38:23 -0700

WAL is a major issue, but another one that is coming up fast is the
SPOF that is the namenode.


Right now, namenode aside, I can do a rolling restart of my entire
cluster, including rebooting the machines if I need to. But not so with
the namenode, because if it goes AWOL, all sorts of bad things can happen.

I hope that HDFS 0.21 addresses both these issues.  Can we get
positive confirmation that this is being worked on?

-ryan

On Thu, Aug 6, 2009 at 10:25 AM, Andrew Purtell<[email protected]> wrote:
> I updated the roadmap up on the wiki:
>
>
> * Data integrity
>    * Ensure that proper append() support in HDFS actually closes the
>      WAL last block write hole
>    * HBase-FSCK (HBASE-7) -- Suggest making this a blocker for 0.21
>
> I have had several recent conversations on my travels with people in
> Fortune 100 companies (based on this list:
> http://www.wageproject.org/content/fortune/index.php).
>
> You and I know we can set up well engineered HBase 0.20 clusters that
> will be operationally solid for a wide range of use cases, but given
> those aforementioned discussions there are certain sectors which would
> say HBASE-7 is #1 before HBase is "bank ready". Not until we can say:
>
>  - Yes, when the client sees data has been committed, it actually has
> been written and replicated on spinning or solid state media in all
> cases.
>
>  - Yes, we go to great lengths to recover data if ${deity} forbid you
> crush some underprovisioned cluster with load or some bizarre bug or
> system fault happens.
>
> HBASE-1295 is also required for business continuity reasons, but this
> is already a priority item for some HBase committers.
>
> The question, I think, is whether the above aligns with project goals.
> Making HBase-FSCK a blocker will probably knock something someone
> wants for the 0.21 timeframe off the list.
>
>   - Andy
>
>
>
