Having HBase handle its own checksums doesn't directly affect HDFS. HDFS will
still maintain checksum info. The difference is at read time: HBase opens
its readers with HDFS checksum validation set to false and does the checksum
validation on its own. So using HBase-handled checksums in a cluster
should not affect other data. Does that resolve your doubt?
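
To make the idea concrete, here is a toy sketch in plain Java. It does not use
any Hadoop or HBase API; the class and method names (AppLevelChecksum, Block,
write, read) are made up for illustration, and CRC32 is only an assumed choice
of checksum. It just shows a reader that validates a checksum stored alongside
the data by itself, instead of relying on the layer below to do it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class AppLevelChecksum {

    /** Hypothetical stored block: the payload plus the checksum written with it. */
    static final class Block {
        final byte[] payload;
        final long checksum;

        Block(byte[] payload, long checksum) {
            this.payload = payload;
            this.checksum = checksum;
        }
    }

    /** Compute a CRC32 over the whole payload. */
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /** Write path: the application stores its own checksum next to the data. */
    static Block write(byte[] payload) {
        return new Block(payload.clone(), crc32(payload));
    }

    /**
     * Read path: the lower layer is assumed to skip validation, so the
     * application recomputes the checksum and compares it itself.
     */
    static byte[] read(Block block) throws IOException {
        if (crc32(block.payload) != block.checksum) {
            throw new IOException("checksum mismatch");
        }
        return block.payload;
    }

    public static void main(String[] args) throws IOException {
        Block block = write("row1".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(read(block), StandardCharsets.UTF_8));

        // Simulate corruption by flipping one bit in the stored payload.
        block.payload[0] ^= 1;
        try {
            read(block);
        } catch (IOException e) {
            System.out.println("detected: " + e.getMessage());
        }
    }
}
```

Other readers of the same storage are untouched by this; only the reader that
chose to skip the lower-level validation takes on the job of checking.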

-Anoop-

On Tue, Apr 29, 2014 at 1:58 PM, Krishna Rao <[email protected]> wrote:

> Hi Ted,
>
> I had read those, but I'm confused about how this will affect non-HBase
> HDFS data. With HDFS checksumming off, won't it affect data integrity?
>
> Krishna
>
>
> On 24 April 2014 15:54, Ted Yu <[email protected]> wrote:
>
> > Please take a look at the following:
> >
> > http://hbase.apache.org/book.html#perf.hdfs.configs.localread
> > http://hbase.apache.org/book.html#hbase.regionserver.checksum.verify
> >
> >
> > On Thu, Apr 24, 2014 at 5:55 AM, Krishna Rao <[email protected]>
> > wrote:
> >
> > > Hi all,
> > >
> > > I understand that there is a significant performance gain when turning on
> > > short circuit reads, and additionally by setting HBase to do checksums
> > > rather than HDFS.
> > >
> > > However, I'm a little confused by this. Do I need to turn off checksums
> > > within HDFS for the entire file system? We don't just use HBase on our
> > > cluster, so this would seem to be a bad idea, right?
> > >
> > >  Cheers,
> > >
> > > Krishna
> > >
> >
>
