Re: [VOTE] Merge branch HBASE-21879 (Reading HFile's Block to ByteBuffer directly) back to master.

2019-06-23 Thread OpenInx
Merged HBASE-21879 back to the master branch. Thanks, all.

On Sun, Jun 23, 2019 at 10:40 AM OpenInx  wrote:

> Now we have four +1 votes (all binding).
> I will wait a day before merging; if there are no objections, I will do the merge.
> Thanks all for reviewing and checking.
>
> On Sat, Jun 22, 2019 at 11:05 PM Anoop John  wrote:
>
>> +1 for master merge
>>
>> Anoop
>>
>> On Sat, Jun 22, 2019 at 9:47 AM ramkrishna vasudevan <
>> ramkrishna.s.vasude...@gmail.com> wrote:
>>
>> > +1 to merge to master. Great job Zheng
>> >
>> > On Sat, Jun 22, 2019, 8:41 AM Guanghao Zhang 
>> wrote:
>> >
>> > > +1 for merge this to master.
>> > >
>> > > On Fri, Jun 21, 2019 at 2:56 PM OpenInx wrote:
>> > >
>> > > > Update:
>> > > >
>> > > > The ByteBuffer pread backport is under reviewing now.
>> > > > https://github.com/apache/hadoop/pull/997
>> > > >
>> > > > As Hadoop team said,  the Hadoop 2.8 will be EOL soon, so our HDFS
>> team
>> > > > will backport this patch to
>> > > > branch-2 & branch-2.9,  we may need to upgrade the hadoop
>> dependencies
>> > > from
>> > > > 2.8.5 to 2.9.3 in future.
>> > > >
>> > > > Thanks.
>> > > >
>> > > > On Wed, Jun 19, 2019 at 10:41 PM OpenInx  wrote:
>> > > >
>> > > > > Thanks for your reviewing and flaky test checking, Duo.
>> > > > > Will file a separate issue to address your comment if necessary.
>> > > > >
>> > > > > Thanks.
>> > > > >
>> > > > > On Wed, Jun 19, 2019 at 9:55 PM 张铎(Duo Zhang) <
>> palomino...@gmail.com
>> > >
>> > > > > wrote:
>> > > > >
>> > > > >> +1 from me.
>> > > > >>
>> > > > >> Left a few comments on github PR, not big problems. And the flaky
>> > > > >> dashboard
>> > > > >> is pretty good.
>> > > > >>
>> > > > >>
>> > > > >>
>> > > >
>> > >
>> >
>> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/job/HBASE-21879/lastSuccessfulBuild/artifact/dashboard.html
>> > > > >>
>> > > > >>
>> > > > >> The TestConnectionImplementation was also failing on master, and
>> was
>> > > > fixed
>> > > > >> after merging back HBASE-21512.
>> > > > >>
>> > > > >> On Tue, Jun 18, 2019 at 9:48 PM 张铎(Duo Zhang) wrote:
>> > > > >>
>> > > > >> > Good. Will take a look soon.
>> > > > >> >
>> > > > >> > On Tue, Jun 18, 2019 at 9:41 PM OpenInx wrote:
>> > > > >> >
>> > > > >> >> > Could please open a PR, just like what I have done for
>> > > HBASE-21512,
>> > > > >> so
>> > > > >> >> that others could have a overall view on the modified code?
>> > > > >> >>
>> > > > >> >> OK,  created a PR for this:
>> > > https://github.com/apache/hbase/pull/320
>> > > > >> >> Thanks for your suggestion, Duo.
>> > > > >> >>
>> > > > >> >> On Tue, Jun 18, 2019 at 9:24 PM 张铎(Duo Zhang) <
>> > > palomino...@gmail.com
>> > > > >
>> > > > >> >> wrote:
>> > > > >> >>
>> > > > >> >> > The performance number is great.
>> > > > >> >> >
>> > > > >> >> > Could please open a PR, just like what I have done for
>> > > HBASE-21512,
>> > > > >> so
>> > > > >> >> that
>> > > > >> >> > others could have a overall view on the modified code?
>> > > > >> >> >
>> > > > >> >> > Thanks.
>> > > > >> >> >
>> > > > >> >> > On Tue, Jun 18, 2019 at 6:58 PM OpenInx wrote:
>> > > > >> >> >
>> > > > >> >> > > BTW,  when testing this branch,  we found some performance
>> > > issues
>> > > > >> >> about
>> > > > >> >> > > HDFS Client:
>> > > > >> >> > > 1.  we reduced the DFS client's heap allocation from 45%
>> to
>> > 27%
>> > > > >> >> > > in HDFS-14535 [1];
>> > > > >> >> > > 2.  we also increased get throughput by 17.8% in disabled
>> > block
>> > > > >> cache
>> > > > >> >> > case
>> > > > >> >> > > in HDFS-14541[2].
>> > > > >> >> > >  In theory, it should also helps a lot (especially
>> > > p99/p999)
>> > > > >> even
>> > > > >> >> if
>> > > > >> >> > RS
>> > > > >> >> > > has a high cacheHitRatio.
>> > > > >> >> > >
>> > > > >> >> > > I think the next HDFS 2.8 release will include those
>> patches,
>> > > > >> they're
>> > > > >> >> > very
>> > > > >> >> > > good points for our
>> > > > >> >> > > HBase performance.
>> > > > >> >> > >
>> > > > >> >> > > [1]. https://issues.apache.org/jira/browse/HDFS-14535
>> > > > >> >> > > [2].
>> > > > >> >> > >
>> > > > >> >> > >
>> > > > >> >> >
>> > > > >> >>
>> > > > >>
>> > > >
>> > >
>> >
>> https://issues.apache.org/jira/browse/HDFS-14541?focusedCommentId=16866472=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16866472
>> > > > >> >> > >
>> > > > >> >> > > Thanks.
>> > > > >> >> > >
>> > > > >> >> > > On Tue, Jun 18, 2019 at 12:05 PM OpenInx <
>> open...@gmail.com>
>> > > > >> wrote:
>> > > > >> >> > >
>> > > > >> >> > > > The HBASE-21879 has lots of changes: 123 files changed,
>> > 5833
>> > > > >> >> > > > insertions(+), 3015 deletions(-).
>> > > > >> >> > > > Currently we developed this issue based on master
>> branch,
>> > and
>> > > > >> >> expect to
>> > > > >> >> > > > release it in future HBase3.x.
>> > > > >> >> > > > Of course, if branch-2 want this feature we can do the
>> > > > backport,
>> > > > >> >> should
>> > > > >> >> > > > have some conflicts now but I
>> > > > >> >> > > > don't think it would 

Re: [VOTE] Merge branch HBASE-21879 (Reading HFile's Block to ByteBuffer directly) back to master.

2019-06-22 Thread OpenInx
Now we have four +1 votes (all binding).
I will wait a day before merging; if there are no objections, I will do the merge.
Thanks all for reviewing and checking.

On Sat, Jun 22, 2019 at 11:05 PM Anoop John  wrote:

> +1 for master merge
>
> Anoop
>
> On Sat, Jun 22, 2019 at 9:47 AM ramkrishna vasudevan <
> ramkrishna.s.vasude...@gmail.com> wrote:
>
> > +1 to merge to master. Great job Zheng
> >
> > On Sat, Jun 22, 2019, 8:41 AM Guanghao Zhang  wrote:
> >
> > > +1 for merge this to master.
> > >
> > > > On Fri, Jun 21, 2019 at 2:56 PM OpenInx wrote:
> > >
> > > > Update:
> > > >
> > > > The ByteBuffer pread backport is under reviewing now.
> > > > https://github.com/apache/hadoop/pull/997
> > > >
> > > > As Hadoop team said,  the Hadoop 2.8 will be EOL soon, so our HDFS
> team
> > > > will backport this patch to
> > > > branch-2 & branch-2.9,  we may need to upgrade the hadoop
> dependencies
> > > from
> > > > 2.8.5 to 2.9.3 in future.
> > > >
> > > > Thanks.
> > > >
> > > > On Wed, Jun 19, 2019 at 10:41 PM OpenInx  wrote:
> > > >
> > > > > Thanks for your reviewing and flaky test checking, Duo.
> > > > > Will file a separate issue to address your comment if necessary.
> > > > >
> > > > > Thanks.
> > > > >
> > > > > On Wed, Jun 19, 2019 at 9:55 PM 张铎(Duo Zhang) <
> palomino...@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > >> +1 from me.
> > > > >>
> > > > >> Left a few comments on github PR, not big problems. And the flaky
> > > > >> dashboard
> > > > >> is pretty good.
> > > > >>
> > > > >>
> > > > >>
> > > >
> > >
> >
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/job/HBASE-21879/lastSuccessfulBuild/artifact/dashboard.html
> > > > >>
> > > > >>
> > > > >> The TestConnectionImplementation was also failing on master, and
> was
> > > > fixed
> > > > >> after merging back HBASE-21512.
> > > > >>
> > > > > >> On Tue, Jun 18, 2019 at 9:48 PM 张铎(Duo Zhang) wrote:
> > > > >>
> > > > >> > Good. Will take a look soon.
> > > > >> >
> > > > > >> > On Tue, Jun 18, 2019 at 9:41 PM OpenInx wrote:
> > > > >> >
> > > > >> >> > Could please open a PR, just like what I have done for
> > > HBASE-21512,
> > > > >> so
> > > > >> >> that others could have a overall view on the modified code?
> > > > >> >>
> > > > >> >> OK,  created a PR for this:
> > > https://github.com/apache/hbase/pull/320
> > > > >> >> Thanks for your suggestion, Duo.
> > > > >> >>
> > > > >> >> On Tue, Jun 18, 2019 at 9:24 PM 张铎(Duo Zhang) <
> > > palomino...@gmail.com
> > > > >
> > > > >> >> wrote:
> > > > >> >>
> > > > >> >> > The performance number is great.
> > > > >> >> >
> > > > >> >> > Could please open a PR, just like what I have done for
> > > HBASE-21512,
> > > > >> so
> > > > >> >> that
> > > > >> >> > others could have a overall view on the modified code?
> > > > >> >> >
> > > > >> >> > Thanks.
> > > > >> >> >
> > > > > >> >> > On Tue, Jun 18, 2019 at 6:58 PM OpenInx wrote:
> > > > >> >> >
> > > > >> >> > > BTW,  when testing this branch,  we found some performance
> > > issues
> > > > >> >> about
> > > > >> >> > > HDFS Client:
> > > > >> >> > > 1.  we reduced the DFS client's heap allocation from 45% to
> > 27%
> > > > >> >> > > in HDFS-14535 [1];
> > > > >> >> > > 2.  we also increased get throughput by 17.8% in disabled
> > block
> > > > >> cache
> > > > >> >> > case
> > > > >> >> > > in HDFS-14541[2].
> > > > >> >> > >  In theory, it should also helps a lot (especially
> > > p99/p999)
> > > > >> even
> > > > >> >> if
> > > > >> >> > RS
> > > > >> >> > > has a high cacheHitRatio.
> > > > >> >> > >
> > > > >> >> > > I think the next HDFS 2.8 release will include those
> patches,
> > > > >> they're
> > > > >> >> > very
> > > > >> >> > > good points for our
> > > > >> >> > > HBase performance.
> > > > >> >> > >
> > > > >> >> > > [1]. https://issues.apache.org/jira/browse/HDFS-14535
> > > > >> >> > > [2].
> > > > >> >> > >
> > > > >> >> > >
> > > > >> >> >
> > > > >> >>
> > > > >>
> > > >
> > >
> >
> https://issues.apache.org/jira/browse/HDFS-14541?focusedCommentId=16866472=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16866472
> > > > >> >> > >
> > > > >> >> > > Thanks.
> > > > >> >> > >
> > > > >> >> > > On Tue, Jun 18, 2019 at 12:05 PM OpenInx <
> open...@gmail.com>
> > > > >> wrote:
> > > > >> >> > >
> > > > >> >> > > > The HBASE-21879 has lots of changes: 123 files changed,
> > 5833
> > > > >> >> > > > insertions(+), 3015 deletions(-).
> > > > >> >> > > > Currently we developed this issue based on master branch,
> > and
> > > > >> >> expect to
> > > > >> >> > > > release it in future HBase3.x.
> > > > >> >> > > > Of course, if branch-2 want this feature we can do the
> > > > backport,
> > > > >> >> should
> > > > >> >> > > > have some conflicts now but I
> > > > >> >> > > > don't think it would be hard to fix because I believe the
> > > > >> branch-2
> > > > >> >> > > > shouldn't have so much diff with master now
> > > > >> >> > > > (at least in read path).
> > > > >> >> > > > The first priority thing for now,   I think it would be
> > > merging
> > > > >> the
> > > 

Re: [VOTE] Merge branch HBASE-21879 (Reading HFile's Block to ByteBuffer directly) back to master.

2019-06-22 Thread Anoop John
+1 for master merge

Anoop

On Sat, Jun 22, 2019 at 9:47 AM ramkrishna vasudevan <
ramkrishna.s.vasude...@gmail.com> wrote:

> +1 to merge to master. Great job Zheng
>
> On Sat, Jun 22, 2019, 8:41 AM Guanghao Zhang  wrote:
>
> > +1 for merge this to master.
> >
> > > On Fri, Jun 21, 2019 at 2:56 PM OpenInx wrote:
> >
> > > Update:
> > >
> > > The ByteBuffer pread backport is under reviewing now.
> > > https://github.com/apache/hadoop/pull/997
> > >
> > > As Hadoop team said,  the Hadoop 2.8 will be EOL soon, so our HDFS team
> > > will backport this patch to
> > > branch-2 & branch-2.9,  we may need to upgrade the hadoop dependencies
> > from
> > > 2.8.5 to 2.9.3 in future.
> > >
> > > Thanks.
> > >
> > > On Wed, Jun 19, 2019 at 10:41 PM OpenInx  wrote:
> > >
> > > > Thanks for your reviewing and flaky test checking, Duo.
> > > > Will file a separate issue to address your comment if necessary.
> > > >
> > > > Thanks.
> > > >
> > > > On Wed, Jun 19, 2019 at 9:55 PM 张铎(Duo Zhang)  >
> > > > wrote:
> > > >
> > > >> +1 from me.
> > > >>
> > > >> Left a few comments on github PR, not big problems. And the flaky
> > > >> dashboard
> > > >> is pretty good.
> > > >>
> > > >>
> > > >>
> > >
> >
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/job/HBASE-21879/lastSuccessfulBuild/artifact/dashboard.html
> > > >>
> > > >>
> > > >> The TestConnectionImplementation was also failing on master, and was
> > > fixed
> > > >> after merging back HBASE-21512.
> > > >>
> > > > >> On Tue, Jun 18, 2019 at 9:48 PM 张铎(Duo Zhang) wrote:
> > > >>
> > > >> > Good. Will take a look soon.
> > > >> >
> > > > >> > On Tue, Jun 18, 2019 at 9:41 PM OpenInx wrote:
> > > >> >
> > > >> >> > Could please open a PR, just like what I have done for
> > HBASE-21512,
> > > >> so
> > > >> >> that others could have a overall view on the modified code?
> > > >> >>
> > > >> >> OK,  created a PR for this:
> > https://github.com/apache/hbase/pull/320
> > > >> >> Thanks for your suggestion, Duo.
> > > >> >>
> > > >> >> On Tue, Jun 18, 2019 at 9:24 PM 张铎(Duo Zhang) <
> > palomino...@gmail.com
> > > >
> > > >> >> wrote:
> > > >> >>
> > > >> >> > The performance number is great.
> > > >> >> >
> > > >> >> > Could please open a PR, just like what I have done for
> > HBASE-21512,
> > > >> so
> > > >> >> that
> > > >> >> > others could have a overall view on the modified code?
> > > >> >> >
> > > >> >> > Thanks.
> > > >> >> >
> > > > >> >> > On Tue, Jun 18, 2019 at 6:58 PM OpenInx wrote:
> > > >> >> >
> > > >> >> > > BTW,  when testing this branch,  we found some performance
> > issues
> > > >> >> about
> > > >> >> > > HDFS Client:
> > > >> >> > > 1.  we reduced the DFS client's heap allocation from 45% to
> 27%
> > > >> >> > > in HDFS-14535 [1];
> > > >> >> > > 2.  we also increased get throughput by 17.8% in disabled
> block
> > > >> cache
> > > >> >> > case
> > > >> >> > > in HDFS-14541[2].
> > > >> >> > >  In theory, it should also helps a lot (especially
> > p99/p999)
> > > >> even
> > > >> >> if
> > > >> >> > RS
> > > >> >> > > has a high cacheHitRatio.
> > > >> >> > >
> > > >> >> > > I think the next HDFS 2.8 release will include those patches,
> > > >> they're
> > > >> >> > very
> > > >> >> > > good points for our
> > > >> >> > > HBase performance.
> > > >> >> > >
> > > >> >> > > [1]. https://issues.apache.org/jira/browse/HDFS-14535
> > > >> >> > > [2].
> > > >> >> > >
> > > >> >> > >
> > > >> >> >
> > > >> >>
> > > >>
> > >
> >
> https://issues.apache.org/jira/browse/HDFS-14541?focusedCommentId=16866472=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16866472
> > > >> >> > >
> > > >> >> > > Thanks.
> > > >> >> > >
> > > >> >> > > On Tue, Jun 18, 2019 at 12:05 PM OpenInx 
> > > >> wrote:
> > > >> >> > >
> > > >> >> > > > The HBASE-21879 has lots of changes: 123 files changed,
> 5833
> > > >> >> > > > insertions(+), 3015 deletions(-).
> > > >> >> > > > Currently we developed this issue based on master branch,
> and
> > > >> >> expect to
> > > >> >> > > > release it in future HBase3.x.
> > > >> >> > > > Of course, if branch-2 want this feature we can do the
> > > backport,
> > > >> >> should
> > > >> >> > > > have some conflicts now but I
> > > >> >> > > > don't think it would be hard to fix because I believe the
> > > >> branch-2
> > > >> >> > > > shouldn't have so much diff with master now
> > > >> >> > > > (at least in read path).
> > > >> >> > > > The first priority thing for now,   I think it would be
> > merging
> > > >> the
> > > >> >> > > > HBASE-21879 branch to master branch
> > > >> >> > > > before diverging.  After that, I can do the backport.
> > > >> >> > > >
> > > >> >> > > > Thanks for your suggestion, Guanghao !
> > > >> >> > > >
> > > >> >> > > >
> > > >> >> > > >
> > > >> >> > > > On Tue, Jun 18, 2019 at 11:39 AM Guanghao Zhang <
> > > >> zghao...@gmail.com
> > > >> >> >
> > > >> >> > > > wrote:
> > > >> >> > > >
> > > >> >> > > >> This is a improvement not a new feature? So backport to
> > > >> branch-2,
> > > >> >> too?
> > > >> >> > > >>
> > > >> >> > > >> 

Re: [VOTE] Merge branch HBASE-21879 (Reading HFile's Block to ByteBuffer directly) back to master.

2019-06-21 Thread ramkrishna vasudevan
+1 to merge to master. Great job Zheng

On Sat, Jun 22, 2019, 8:41 AM Guanghao Zhang  wrote:

> +1 for merge this to master.
>
> > On Fri, Jun 21, 2019 at 2:56 PM OpenInx wrote:
>
> > Update:
> >
> > The ByteBuffer pread backport is under reviewing now.
> > https://github.com/apache/hadoop/pull/997
> >
> > As Hadoop team said,  the Hadoop 2.8 will be EOL soon, so our HDFS team
> > will backport this patch to
> > branch-2 & branch-2.9,  we may need to upgrade the hadoop dependencies
> from
> > 2.8.5 to 2.9.3 in future.
> >
> > Thanks.
> >
> > On Wed, Jun 19, 2019 at 10:41 PM OpenInx  wrote:
> >
> > > Thanks for your reviewing and flaky test checking, Duo.
> > > Will file a separate issue to address your comment if necessary.
> > >
> > > Thanks.
> > >
> > > On Wed, Jun 19, 2019 at 9:55 PM 张铎(Duo Zhang) 
> > > wrote:
> > >
> > >> +1 from me.
> > >>
> > >> Left a few comments on github PR, not big problems. And the flaky
> > >> dashboard
> > >> is pretty good.
> > >>
> > >>
> > >>
> >
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/job/HBASE-21879/lastSuccessfulBuild/artifact/dashboard.html
> > >>
> > >>
> > >> The TestConnectionImplementation was also failing on master, and was
> > fixed
> > >> after merging back HBASE-21512.
> > >>
> > > >> On Tue, Jun 18, 2019 at 9:48 PM 张铎(Duo Zhang) wrote:
> > >>
> > >> > Good. Will take a look soon.
> > >> >
> > > >> > On Tue, Jun 18, 2019 at 9:41 PM OpenInx wrote:
> > >> >
> > >> >> > Could please open a PR, just like what I have done for
> HBASE-21512,
> > >> so
> > >> >> that others could have a overall view on the modified code?
> > >> >>
> > >> >> OK,  created a PR for this:
> https://github.com/apache/hbase/pull/320
> > >> >> Thanks for your suggestion, Duo.
> > >> >>
> > >> >> On Tue, Jun 18, 2019 at 9:24 PM 张铎(Duo Zhang) <
> palomino...@gmail.com
> > >
> > >> >> wrote:
> > >> >>
> > >> >> > The performance number is great.
> > >> >> >
> > >> >> > Could please open a PR, just like what I have done for
> HBASE-21512,
> > >> so
> > >> >> that
> > >> >> > others could have a overall view on the modified code?
> > >> >> >
> > >> >> > Thanks.
> > >> >> >
> > > >> >> > On Tue, Jun 18, 2019 at 6:58 PM OpenInx wrote:
> > >> >> >
> > >> >> > > BTW,  when testing this branch,  we found some performance
> issues
> > >> >> about
> > >> >> > > HDFS Client:
> > >> >> > > 1.  we reduced the DFS client's heap allocation from 45% to 27%
> > >> >> > > in HDFS-14535 [1];
> > >> >> > > 2.  we also increased get throughput by 17.8% in disabled block
> > >> cache
> > >> >> > case
> > >> >> > > in HDFS-14541[2].
> > >> >> > >  In theory, it should also helps a lot (especially
> p99/p999)
> > >> even
> > >> >> if
> > >> >> > RS
> > >> >> > > has a high cacheHitRatio.
> > >> >> > >
> > >> >> > > I think the next HDFS 2.8 release will include those patches,
> > >> they're
> > >> >> > very
> > >> >> > > good points for our
> > >> >> > > HBase performance.
> > >> >> > >
> > >> >> > > [1]. https://issues.apache.org/jira/browse/HDFS-14535
> > >> >> > > [2].
> > >> >> > >
> > >> >> > >
> > >> >> >
> > >> >>
> > >>
> >
> https://issues.apache.org/jira/browse/HDFS-14541?focusedCommentId=16866472=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16866472
> > >> >> > >
> > >> >> > > Thanks.
> > >> >> > >
> > >> >> > > On Tue, Jun 18, 2019 at 12:05 PM OpenInx 
> > >> wrote:
> > >> >> > >
> > >> >> > > > The HBASE-21879 has lots of changes: 123 files changed, 5833
> > >> >> > > > insertions(+), 3015 deletions(-).
> > >> >> > > > Currently we developed this issue based on master branch, and
> > >> >> expect to
> > >> >> > > > release it in future HBase3.x.
> > >> >> > > > Of course, if branch-2 want this feature we can do the
> > backport,
> > >> >> should
> > >> >> > > > have some conflicts now but I
> > >> >> > > > don't think it would be hard to fix because I believe the
> > >> branch-2
> > >> >> > > > shouldn't have so much diff with master now
> > >> >> > > > (at least in read path).
> > >> >> > > > The first priority thing for now,   I think it would be
> merging
> > >> the
> > >> >> > > > HBASE-21879 branch to master branch
> > >> >> > > > before diverging.  After that, I can do the backport.
> > >> >> > > >
> > >> >> > > > Thanks for your suggestion, Guanghao !
> > >> >> > > >
> > >> >> > > >
> > >> >> > > >
> > >> >> > > > On Tue, Jun 18, 2019 at 11:39 AM Guanghao Zhang <
> > >> zghao...@gmail.com
> > >> >> >
> > >> >> > > > wrote:
> > >> >> > > >
> > >> >> > > >> This is a improvement not a new feature? So backport to
> > >> branch-2,
> > >> >> too?
> > >> >> > > >>
> > > >> >> > > >> On Mon, Jun 17, 2019 at 2:45 PM OpenInx wrote:
> > >> >> > > >>
> > >> >> > > >> > Dear HBase dev:
> > >> >> > > >> >
> > >> >> > > >> > In HBASE-21879[1], we redesigned the offheap read path:
> read
> > >> the
> > >> >> > > >> HFileBlock
> > >> >> > > >> > from HDFS to pooled offheap
> > >> >> > > >> > ByteBuffers directly, while before HBASE-21879 we just
> read
> > >> the
> > >> >> > > >> HFileBlock
> > >> >> > > >> > to heap which would still lead
> 

Re: [VOTE] Merge branch HBASE-21879 (Reading HFile's Block to ByteBuffer directly) back to master.

2019-06-21 Thread Guanghao Zhang
+1 for merging this to master.

On Fri, Jun 21, 2019 at 2:56 PM OpenInx wrote:

> Update:
>
> The ByteBuffer pread backport is under reviewing now.
> https://github.com/apache/hadoop/pull/997
>
> As Hadoop team said,  the Hadoop 2.8 will be EOL soon, so our HDFS team
> will backport this patch to
> branch-2 & branch-2.9,  we may need to upgrade the hadoop dependencies from
> 2.8.5 to 2.9.3 in future.
>
> Thanks.
>
> On Wed, Jun 19, 2019 at 10:41 PM OpenInx  wrote:
>
> > Thanks for your reviewing and flaky test checking, Duo.
> > Will file a separate issue to address your comment if necessary.
> >
> > Thanks.
> >
> > On Wed, Jun 19, 2019 at 9:55 PM 张铎(Duo Zhang) 
> > wrote:
> >
> >> +1 from me.
> >>
> >> Left a few comments on github PR, not big problems. And the flaky
> >> dashboard
> >> is pretty good.
> >>
> >>
> >>
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/job/HBASE-21879/lastSuccessfulBuild/artifact/dashboard.html
> >>
> >>
> >> The TestConnectionImplementation was also failing on master, and was
> fixed
> >> after merging back HBASE-21512.
> >>
> >> On Tue, Jun 18, 2019 at 9:48 PM 张铎(Duo Zhang) wrote:
> >>
> >> > Good. Will take a look soon.
> >> >
> >> > On Tue, Jun 18, 2019 at 9:41 PM OpenInx wrote:
> >> >
> >> >> > Could please open a PR, just like what I have done for HBASE-21512,
> >> so
> >> >> that others could have a overall view on the modified code?
> >> >>
> >> >> OK,  created a PR for this: https://github.com/apache/hbase/pull/320
> >> >> Thanks for your suggestion, Duo.
> >> >>
> >> >> On Tue, Jun 18, 2019 at 9:24 PM 张铎(Duo Zhang)  >
> >> >> wrote:
> >> >>
> >> >> > The performance number is great.
> >> >> >
> >> >> > Could please open a PR, just like what I have done for HBASE-21512,
> >> so
> >> >> that
> >> >> > others could have a overall view on the modified code?
> >> >> >
> >> >> > Thanks.
> >> >> >
> >> >> > On Tue, Jun 18, 2019 at 6:58 PM OpenInx wrote:
> >> >> >
> >> >> > > BTW,  when testing this branch,  we found some performance issues
> >> >> about
> >> >> > > HDFS Client:
> >> >> > > 1.  we reduced the DFS client's heap allocation from 45% to 27%
> >> >> > > in HDFS-14535 [1];
> >> >> > > 2.  we also increased get throughput by 17.8% in disabled block
> >> cache
> >> >> > case
> >> >> > > in HDFS-14541[2].
> >> >> > >  In theory, it should also helps a lot (especially p99/p999)
> >> even
> >> >> if
> >> >> > RS
> >> >> > > has a high cacheHitRatio.
> >> >> > >
> >> >> > > I think the next HDFS 2.8 release will include those patches,
> >> they're
> >> >> > very
> >> >> > > good points for our
> >> >> > > HBase performance.
> >> >> > >
> >> >> > > [1]. https://issues.apache.org/jira/browse/HDFS-14535
> >> >> > > [2].
> >> >> > >
> >> >> > >
> >> >> >
> >> >>
> >>
> https://issues.apache.org/jira/browse/HDFS-14541?focusedCommentId=16866472=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16866472
> >> >> > >
> >> >> > > Thanks.
> >> >> > >
> >> >> > > On Tue, Jun 18, 2019 at 12:05 PM OpenInx 
> >> wrote:
> >> >> > >
> >> >> > > > The HBASE-21879 has lots of changes: 123 files changed, 5833
> >> >> > > > insertions(+), 3015 deletions(-).
> >> >> > > > Currently we developed this issue based on master branch, and
> >> >> expect to
> >> >> > > > release it in future HBase3.x.
> >> >> > > > Of course, if branch-2 want this feature we can do the
> backport,
> >> >> should
> >> >> > > > have some conflicts now but I
> >> >> > > > don't think it would be hard to fix because I believe the
> >> branch-2
> >> >> > > > shouldn't have so much diff with master now
> >> >> > > > (at least in read path).
> >> >> > > > The first priority thing for now,   I think it would be merging
> >> the
> >> >> > > > HBASE-21879 branch to master branch
> >> >> > > > before diverging.  After that, I can do the backport.
> >> >> > > >
> >> >> > > > Thanks for your suggestion, Guanghao !
> >> >> > > >
> >> >> > > >
> >> >> > > >
> >> >> > > > On Tue, Jun 18, 2019 at 11:39 AM Guanghao Zhang <
> >> zghao...@gmail.com
> >> >> >
> >> >> > > > wrote:
> >> >> > > >
> >> >> > > >> This is a improvement not a new feature? So backport to
> >> branch-2,
> >> >> too?
> >> >> > > >>
> >> >> > > >> On Mon, Jun 17, 2019 at 2:45 PM OpenInx wrote:
> >> >> > > >>
> >> >> > > >> > Dear HBase dev:
> >> >> > > >> >
> >> >> > > >> > In HBASE-21879[1], we redesigned the offheap read path: read
> >> the
> >> >> > > >> HFileBlock
> >> >> > > >> > from HDFS to pooled offheap
> >> >> > > >> > ByteBuffers directly, while before HBASE-21879 we just read
> >> the
> >> >> > > >> HFileBlock
> >> >> > > >> > to heap which would still lead
> >> >> > > >> > to high GC pressure.
> >> >> > > >> >
> >> >> > > >> > After few months of development and testing, all subtasks
> have
> >> >> been
> >> >> > > >> > resovled now except the HBASE-21946[2]
> >> >> > > >> > (It depends on HDFS-14483[3] and our HDFS teams are working
> on
> >> >> this,
> >> >> > > we
> >> >> > > >> > expect the HDFS-14483 to be included
> >> >> > > >> > in hadoop 2.8.6 and after that the HBASE-21946 

Re: [VOTE] Merge branch HBASE-21879 (Reading HFile's Block to ByteBuffer directly) back to master.

2019-06-21 Thread OpenInx
Update:

The ByteBuffer pread backport is under review now:
https://github.com/apache/hadoop/pull/997

As the Hadoop team said, Hadoop 2.8 will be EOL soon, so our HDFS team
will backport this patch to branch-2 and branch-2.9. We may need to
upgrade the Hadoop dependency from 2.8.5 to 2.9.3 in the future.
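
For anyone not familiar with the term: a "ByteBuffer pread" is a positional
read that fills a caller-supplied ByteBuffer instead of a byte[]. Below is a
minimal sketch of the idea using plain java.nio (FileChannel) and a toy pool
of direct buffers. It is not the HDFS or HBase API; the names PreadSketch,
BufferPool, pread and demo, the temp file, and the sizes are all made up for
illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayDeque;

final class PreadSketch {

  /** Toy pool of fixed-size direct (off-heap) buffers; hypothetical, not HBase's allocator. */
  static final class BufferPool {
    private final ArrayDeque<ByteBuffer> free = new ArrayDeque<>();
    private final int bufferSize;

    BufferPool(int bufferSize) {
      this.bufferSize = bufferSize;
    }

    /** Reuse a pooled buffer if one is free, otherwise allocate a new direct buffer. */
    synchronized ByteBuffer take() {
      ByteBuffer b = free.poll();
      return b != null ? b : ByteBuffer.allocateDirect(bufferSize);
    }

    /** Reset the buffer and return it to the pool for the next reader. */
    synchronized void release(ByteBuffer b) {
      b.clear();
      free.push(b);
    }
  }

  /** Positional read: fills dst with up to length bytes starting at offset. */
  static int pread(FileChannel ch, long offset, ByteBuffer dst, int length) throws IOException {
    dst.clear();
    dst.limit(length); // assumes length <= dst.capacity()
    int total = 0;
    while (dst.hasRemaining()) {
      int n = ch.read(dst, offset + total); // does not move the channel's own position
      if (n < 0) {
        break; // end of file
      }
      total += n;
    }
    dst.flip(); // make the bytes just read available to the caller
    return total;
  }

  /** Writes a small temp file, then reads 6 bytes at offset 4 into a pooled direct buffer. */
  static String demo() {
    BufferPool pool = new BufferPool(64);
    ByteBuffer buf = pool.take();
    try {
      Path file = Files.createTempFile("block", ".bin");
      try {
        Files.write(file, "0123456789abcdef".getBytes(StandardCharsets.US_ASCII));
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
          pread(ch, 4, buf, 6);
        }
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return new String(out, StandardCharsets.US_ASCII);
      } finally {
        Files.deleteIfExists(file);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } finally {
      pool.release(buf);
    }
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints "456789"
  }
}
```

The point of the sketch is only that the bytes land in a reusable off-heap
buffer, so no per-read heap array is needed for the read itself (the byte[]
in demo() exists only to print the result).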

Thanks.

On Wed, Jun 19, 2019 at 10:41 PM OpenInx  wrote:

> Thanks for your reviewing and flaky test checking, Duo.
> Will file a separate issue to address your comment if necessary.
>
> Thanks.
>
> On Wed, Jun 19, 2019 at 9:55 PM 张铎(Duo Zhang) 
> wrote:
>
>> +1 from me.
>>
>> Left a few comments on github PR, not big problems. And the flaky
>> dashboard
>> is pretty good.
>>
>>
>> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/job/HBASE-21879/lastSuccessfulBuild/artifact/dashboard.html
>>
>>
>> The TestConnectionImplementation was also failing on master, and was fixed
>> after merging back HBASE-21512.
>>
>> On Tue, Jun 18, 2019 at 9:48 PM 张铎(Duo Zhang) wrote:
>>
>> > Good. Will take a look soon.
>> >
>> > On Tue, Jun 18, 2019 at 9:41 PM OpenInx wrote:
>> >
>> >> > Could please open a PR, just like what I have done for HBASE-21512,
>> so
>> >> that others could have a overall view on the modified code?
>> >>
>> >> OK,  created a PR for this: https://github.com/apache/hbase/pull/320
>> >> Thanks for your suggestion, Duo.
>> >>
>> >> On Tue, Jun 18, 2019 at 9:24 PM 张铎(Duo Zhang) 
>> >> wrote:
>> >>
>> >> > The performance number is great.
>> >> >
>> >> > Could please open a PR, just like what I have done for HBASE-21512,
>> so
>> >> that
>> >> > others could have a overall view on the modified code?
>> >> >
>> >> > Thanks.
>> >> >
>> >> > On Tue, Jun 18, 2019 at 6:58 PM OpenInx wrote:
>> >> >
>> >> > > BTW,  when testing this branch,  we found some performance issues
>> >> about
>> >> > > HDFS Client:
>> >> > > 1.  we reduced the DFS client's heap allocation from 45% to 27%
>> >> > > in HDFS-14535 [1];
>> >> > > 2.  we also increased get throughput by 17.8% in disabled block
>> cache
>> >> > case
>> >> > > in HDFS-14541[2].
>> >> > >  In theory, it should also helps a lot (especially p99/p999)
>> even
>> >> if
>> >> > RS
>> >> > > has a high cacheHitRatio.
>> >> > >
>> >> > > I think the next HDFS 2.8 release will include those patches,
>> they're
>> >> > very
>> >> > > good points for our
>> >> > > HBase performance.
>> >> > >
>> >> > > [1]. https://issues.apache.org/jira/browse/HDFS-14535
>> >> > > [2].
>> >> > >
>> >> > >
>> >> >
>> >>
>> https://issues.apache.org/jira/browse/HDFS-14541?focusedCommentId=16866472=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16866472
>> >> > >
>> >> > > Thanks.
>> >> > >
>> >> > > On Tue, Jun 18, 2019 at 12:05 PM OpenInx 
>> wrote:
>> >> > >
>> >> > > > The HBASE-21879 has lots of changes: 123 files changed, 5833
>> >> > > > insertions(+), 3015 deletions(-).
>> >> > > > Currently we developed this issue based on master branch, and
>> >> expect to
>> >> > > > release it in future HBase3.x.
>> >> > > > Of course, if branch-2 want this feature we can do the backport,
>> >> should
>> >> > > > have some conflicts now but I
>> >> > > > don't think it would be hard to fix because I believe the
>> branch-2
>> >> > > > shouldn't have so much diff with master now
>> >> > > > (at least in read path).
>> >> > > > The first priority thing for now,   I think it would be merging
>> the
>> >> > > > HBASE-21879 branch to master branch
>> >> > > > before diverging.  After that, I can do the backport.
>> >> > > >
>> >> > > > Thanks for your suggestion, Guanghao !
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > > On Tue, Jun 18, 2019 at 11:39 AM Guanghao Zhang <
>> zghao...@gmail.com
>> >> >
>> >> > > > wrote:
>> >> > > >
>> >> > > >> This is a improvement not a new feature? So backport to
>> branch-2,
>> >> too?
>> >> > > >>
>> >> > > >> On Mon, Jun 17, 2019 at 2:45 PM OpenInx wrote:
>> >> > > >>
>> >> > > >> > Dear HBase dev:
>> >> > > >> >
>> >> > > >> > In HBASE-21879[1], we redesigned the offheap read path: read the
>> >> > > >> > HFileBlock from HDFS to pooled offheap ByteBuffers directly, while
>> >> > > >> > before HBASE-21879 we just read the HFileBlock to heap, which would
>> >> > > >> > still lead to high GC pressure.
>> >> > > >> >
>> >> > > >> > After a few months of development and testing, all subtasks have
>> >> > > >> > been resolved now except HBASE-21946[2]
>> >> > > >> > (it depends on HDFS-14483[3] and our HDFS team is working on this;
>> >> > > >> > we expect HDFS-14483 to be included in hadoop 2.8.6, and after that
>> >> > > >> > HBASE-21946 will get resolved). We think the feature is stable
>> >> > > >> > enough now and it's time to merge branch HBASE-21879 back to master.
>> >> > > >> >
>> >> > > >> > We have designed 3 test cases to prove the performance improvement
>> >> > > >> > with HBASE-21879:
>> >> > > >> > 1. Disabled BlockCache, which 

Re: [VOTE] Merge branch HBASE-21879 (Reading HFile's Block to ByteBuffer directly) back to master.

2019-06-19 Thread OpenInx
Thanks for reviewing and checking the flaky tests, Duo.
I will file a separate issue to address your comments if necessary.

Thanks.

On Wed, Jun 19, 2019 at 9:55 PM 张铎(Duo Zhang)  wrote:

> +1 from me.
>
> Left a few comments on github PR, not big problems. And the flaky dashboard
> is pretty good.
>
>
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/job/HBASE-21879/lastSuccessfulBuild/artifact/dashboard.html
>
>
> The TestConnectionImplementation was also failing on master, and was fixed
> after merging back HBASE-21512.
>
> On Tue, Jun 18, 2019 at 9:48 PM 张铎(Duo Zhang) wrote:
>
> > Good. Will take a look soon.
> >
> > On Tue, Jun 18, 2019 at 9:41 PM OpenInx wrote:
> >
> >> > Could you please open a PR, just like what I have done for HBASE-21512,
> >> > so that others could have an overall view of the modified code?
> >>
> >> OK,  created a PR for this: https://github.com/apache/hbase/pull/320
> >> Thanks for your suggestion, Duo.
> >>
> >> On Tue, Jun 18, 2019 at 9:24 PM 张铎(Duo Zhang) 
> >> wrote:
> >>
> >> > The performance number is great.
> >> >
> >> > Could you please open a PR, just like what I have done for HBASE-21512,
> >> > so that others could have an overall view of the modified code?
> >> >
> >> > Thanks.
> >> >
> >> > OpenInx  于2019年6月18日周二 下午6:58写道:
> >> >
> >> > > BTW,  when testing this branch,  we found some performance issues
> >> about
> >> > > HDFS Client:
> >> > > 1.  we reduced the DFS client's heap allocation from 45% to 27%
> >> > > in HDFS-14535 [1];
> >> > > 2.  we also increased get throughput by 17.8% in disabled block
> cache
> >> > case
> >> > > in HDFS-14541[2].
> >> > >  In theory, it should also helps a lot (especially p99/p999)
> even
> >> if
> >> > RS
> >> > > has a high cacheHitRatio.
> >> > >
> >> > > I think the next HDFS 2.8 release will include those patches,
> they're
> >> > very
> >> > > good points for our
> >> > > HBase performance.
> >> > >
> >> > > [1]. https://issues.apache.org/jira/browse/HDFS-14535
> >> > > [2].
> >> > >
> >> > >
> >> >
> >>
> https://issues.apache.org/jira/browse/HDFS-14541?focusedCommentId=16866472=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16866472
> >> > >
> >> > > Thanks.
> >> > >
> >> > > On Tue, Jun 18, 2019 at 12:05 PM OpenInx  wrote:
> >> > >
> >> > > > The HBASE-21879 has lots of changes: 123 files changed, 5833
> >> > > > insertions(+), 3015 deletions(-).
> >> > > > Currently we developed this issue based on master branch, and
> >> expect to
> >> > > > release it in future HBase3.x.
> >> > > > Of course, if branch-2 want this feature we can do the backport,
> >> should
> >> > > > have some conflicts now but I
> >> > > > don't think it would be hard to fix because I believe the branch-2
> >> > > > shouldn't have so much diff with master now
> >> > > > (at least in read path).
> >> > > > The first priority thing for now,   I think it would be merging
> the
> >> > > > HBASE-21879 branch to master branch
> >> > > > before diverging.  After that, I can do the backport.
> >> > > >
> >> > > > Thanks for your suggestion, Guanghao !
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Tue, Jun 18, 2019 at 11:39 AM Guanghao Zhang <
> zghao...@gmail.com
> >> >
> >> > > > wrote:
> >> > > >
> >> > > >> This is a improvement not a new feature? So backport to branch-2,
> >> too?
> >> > > >>
> >> > > >> OpenInx  于2019年6月17日周一 下午2:45写道:
> >> > > >>
> >> > > >> > Dear HBase dev:
> >> > > >> >
> >> > > >> > In HBASE-21879[1], we redesigned the offheap read path: read
> the
> >> > > >> HFileBlock
> >> > > >> > from HDFS to pooled offheap
> >> > > >> > ByteBuffers directly, while before HBASE-21879 we just read the
> >> > > >> HFileBlock
> >> > > >> > to heap which would still lead
> >> > > >> > to high GC pressure.
> >> > > >> >
> >> > > >> > After few months of development and testing, all subtasks have
> >> been
> >> > > >> > resovled now except the HBASE-21946[2]
> >> > > >> > (It depends on HDFS-14483[3] and our HDFS teams are working on
> >> this,
> >> > > we
> >> > > >> > expect the HDFS-14483 to be included
> >> > > >> > in hadoop 2.8.6 and after that the HBASE-21946 will get
> >> resolved).
> >> > we
> >> > > >> think
> >> > > >> > the feature is stable enough now and it's
> >> > > >> > time to merge branch HBASE-21879 back to master now.
> >> > > >> >
> >> > > >> > We have designed 3 test cases to prove the performance
> improvment
> >> > with
> >> > > >> > HBASE-21879:
> >> > > >> > 1. Disabled BlockCache, which means the cacheHitRatio is 0%;
> >> > > >> > 2. CacheHitRatio~65%;
> >> > > >> > 3. CachehHitRatio~100%;
> >> > > >> >
> >> > > >> > In our performance results[4], we can see that: the case#1 have
> >> an
> >> > > great
> >> > > >> > performance improvement
> >> > > >> > (
> >> > > >> > *throughput increased about 17%, heap allocation decreased
> about
> >> > 95%,
> >> > > >> Young
> >> > > >> > generaion size decreased about 81.7%*), that's because after
> >> > > HBASE-21879
> >> > > >> > all reads will allocate from pooled 

Re: [VOTE] Merge branch HBASE-21879 (Reading HFile's Block to ByteBuffer directly) back to master.

2019-06-18 Thread Duo Zhang
Good. Will take a look soon.

OpenInx wrote on Tue, Jun 18, 2019 at 9:41 PM:

> > Could you please open a PR, just like I did for HBASE-21512, so that
> > others can get an overall view of the modified code?
>
> OK, I created a PR for this: https://github.com/apache/hbase/pull/320
> Thanks for your suggestion, Duo.
>

Re: [VOTE] Merge branch HBASE-21879 (Reading HFile's Block to ByteBuffer directly) back to master.

2019-06-18 Thread OpenInx
> Could you please open a PR, just like I did for HBASE-21512, so that
> others can get an overall view of the modified code?

OK, I created a PR for this: https://github.com/apache/hbase/pull/320
Thanks for your suggestion, Duo.

On Tue, Jun 18, 2019 at 9:24 PM 张铎(Duo Zhang)  wrote:

> The performance numbers are great.
>
> Could you please open a PR, just like I did for HBASE-21512, so that
> others can get an overall view of the modified code?
>
> Thanks.
>


Re: [VOTE] Merge branch HBASE-21879 (Reading HFile's Block to ByteBuffer directly) back to master.

2019-06-18 Thread Duo Zhang
The performance numbers are great.

Could you please open a PR, just like I did for HBASE-21512, so that
others can get an overall view of the modified code?

Thanks.

OpenInx wrote on Tue, Jun 18, 2019 at 6:58 PM:

> BTW, when testing this branch, we found some performance issues in the
> HDFS client:
> 1. We reduced the DFS client's heap allocation from 45% to 27%
>    in HDFS-14535 [1];
> 2. We also increased get throughput by 17.8% in the disabled block cache
>    case in HDFS-14541 [2].
>    In theory, it should also help a lot (especially p99/p999) even if the
>    RS has a high cacheHitRatio.
>
> I think the next HDFS 2.8 release will include those patches; they are
> very good for our HBase performance.
>
> [1]. https://issues.apache.org/jira/browse/HDFS-14535
> [2]. https://issues.apache.org/jira/browse/HDFS-14541?focusedCommentId=16866472=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16866472
>
> Thanks.
>


Re: [VOTE] Merge branch HBASE-21879 (Reading HFile's Block to ByteBuffer directly) back to master.

2019-06-18 Thread OpenInx
BTW, when testing this branch, we found some performance issues in the
HDFS client:
1. We reduced the DFS client's heap allocation from 45% to 27%
   in HDFS-14535 [1];
2. We also increased get throughput by 17.8% in the disabled block cache
   case in HDFS-14541 [2].
   In theory, it should also help a lot (especially p99/p999) even if the
   RS has a high cacheHitRatio.

I think the next HDFS 2.8 release will include those patches; they are
very good for our HBase performance.

[1]. https://issues.apache.org/jira/browse/HDFS-14535
[2]. https://issues.apache.org/jira/browse/HDFS-14541?focusedCommentId=16866472=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16866472

Thanks.

On Tue, Jun 18, 2019 at 12:05 PM OpenInx  wrote:

> HBASE-21879 contains a lot of changes: 123 files changed, 5833
> insertions(+), 3015 deletions(-).
> We developed it on top of the master branch, and expect to
> release it in a future HBase 3.x.
> Of course, if branch-2 wants this feature we can do the backport. There
> will be some conflicts by now, but I don't think they will be hard to fix,
> because I believe branch-2 has not diverged much from master yet
> (at least in the read path).
> The first priority for now, I think, is merging the HBASE-21879 branch
> into master before they diverge. After that, I can do the backport.
>
> Thanks for your suggestion, Guanghao!
>


Re: [VOTE] Merge branch HBASE-21879 (Reading HFile's Block to ByteBuffer directly) back to master.

2019-06-17 Thread OpenInx
HBASE-21879 contains a lot of changes: 123 files changed, 5833 insertions(+),
3015 deletions(-).
We developed it on top of the master branch, and expect to
release it in a future HBase 3.x.
Of course, if branch-2 wants this feature we can do the backport. There
will be some conflicts by now, but I don't think they will be hard to fix,
because I believe branch-2 has not diverged much from master yet
(at least in the read path).
The first priority for now, I think, is merging the HBASE-21879 branch
into master before they diverge. After that, I can do the backport.

Thanks for your suggestion, Guanghao!



On Tue, Jun 18, 2019 at 11:39 AM Guanghao Zhang  wrote:

> This is an improvement, not a new feature? So should it be backported to
> branch-2, too?
>


Re: [VOTE] Merge branch HBASE-21879 (Reading HFile's Block to ByteBuffer directly) back to master.

2019-06-17 Thread Guanghao Zhang
This is an improvement, not a new feature? So should it be backported to
branch-2, too?

OpenInx wrote on Mon, Jun 17, 2019 at 2:45 PM:

> Dear HBase dev:
>
> In HBASE-21879 [1], we redesigned the offheap read path: we now read the
> HFileBlock from HDFS directly into pooled offheap ByteBuffers, whereas
> before HBASE-21879 we read the HFileBlock onto the heap, which still led
> to high GC pressure.
>
> After a few months of development and testing, all subtasks have been
> resolved except HBASE-21946 [2]. (It depends on HDFS-14483 [3], which our
> HDFS team is working on; we expect HDFS-14483 to be included in
> Hadoop 2.8.6, after which HBASE-21946 will be resolved.) We think the
> feature is stable enough now, and it's time to merge branch HBASE-21879
> back to master.
>
> We designed 3 test cases to demonstrate the performance improvement with
> HBASE-21879:
> 1. Disabled BlockCache, which means the cacheHitRatio is 0%;
> 2. cacheHitRatio ~65%;
> 3. cacheHitRatio ~100%.
>
> In our performance results [4], we can see that case #1 shows a great
> performance improvement
> (*throughput increased by about 17%, heap allocation decreased by about
> 95%, Young generation size decreased by about 81.7%*). That's because after
> HBASE-21879 all reads allocate from pooled offheap ByteBuffers with almost
> no heap allocation, while before HBASE-21879 the read path created many
> heap allocations.
> On the other hand, from the test results of case #2 and case #3 we can
> also see that:
>
> *As the cacheHitRatio increases, the difference between before-HBASE-21879
> and after-HBASE-21879 shrinks; when the cacheHitRatio is 100%, there is
> almost no difference in either throughput or latency.*
>
> For more details please see the document [4]. Thanks very much to
> Anoop/Ram/DuoZhang/Stack/GuanghaoZhang
> for your meticulous work (suggestions, discussion, patch reviewing, doc
> reviewing, etc.).
>
> Please vote
>
> [] +1
> [] +0/-0
> [] -1 Do not merge the branch back because...
>
> Thanks. Any suggestions are welcome.
>
> [1] https://issues.apache.org/jira/browse/HBASE-21879
> [2] https://issues.apache.org/jira/browse/HBASE-21946
> [3] https://issues.apache.org/jira/browse/HDFS-14483
> [4] https://docs.google.com/document/d/1xSy9axGxafoH-Qc17zbD2Bd--rWjjI00xTWQZ8ZwI_E
>
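
For readers new to the thread, the idea described in the vote mail above (reading a block from the file system straight into a pooled offheap ByteBuffer instead of a fresh heap array) can be sketched roughly as follows. This is only an illustration under stated assumptions: the class DirectBufferPool, its method names, and the fixed buffer size are hypothetical and are not the classes used in the HBASE-21879 branch, and java.nio's FileChannel stands in for the HDFS input stream.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical pool of fixed-size direct (offheap) buffers; not an HBase class. */
final class DirectBufferPool {
  private final ConcurrentLinkedQueue<ByteBuffer> free = new ConcurrentLinkedQueue<>();
  private final int bufferSize;

  DirectBufferPool(int bufferSize) {
    this.bufferSize = bufferSize;
  }

  /** Reuses a pooled buffer when one is free, otherwise allocates a new direct buffer. */
  ByteBuffer allocate() {
    ByteBuffer buf = free.poll();
    if (buf == null) {
      buf = ByteBuffer.allocateDirect(bufferSize);
    }
    buf.clear();
    return buf;
  }

  /** Hands the buffer back so that later reads reuse it instead of creating garbage. */
  void release(ByteBuffer buf) {
    free.offer(buf);
  }

  /**
   * Positional read of up to length bytes starting at offset, directly into a
   * pooled offheap buffer. No intermediate heap byte[] is allocated here.
   * FileChannel is an assumption standing in for the HDFS stream.
   */
  ByteBuffer readBlock(FileChannel channel, long offset, int length) throws IOException {
    if (length > bufferSize) {
      throw new IllegalArgumentException("block larger than pooled buffer");
    }
    ByteBuffer buf = allocate();
    buf.limit(length);
    long pos = offset;
    while (buf.hasRemaining()) {
      int n = channel.read(buf, pos); // absolute read, does not move the channel position
      if (n < 0) {
        break; // end of file reached before the block was complete
      }
      pos += n;
    }
    buf.flip(); // make the bytes just read available to the caller
    return buf;
  }
}
```

A caller would open the file, call readBlock(channel, offset, length), consume the returned buffer, and then call release(buf). With a heap-based read path every block read produces a short-lived byte[]; with a pool like this the same direct buffers are recycled, which is the kind of effect the heap allocation and young generation numbers in the vote mail are measuring.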