bq. For me, I would say let's start the 2.2.x release line soon, so users can benefit from the change after they upgrade to 2.2.x.

Sounds good.
On Tue, Jan 15, 2019 at 11:05 AM OpenInx <[email protected]> wrote:

> b
>
> On Tue, Jan 15, 2019 at 10:54 AM 张铎(Duo Zhang) <[email protected]> wrote:
>
>> For me, I would say let's start the 2.2.x release line soon, so users
>> can benefit from the change after they upgrade to 2.2.x.
>>
>> OpenInx <[email protected]> wrote on Tue, Jan 15, 2019 at 10:21 AM:
>>
>> > Sorry, there was a typo.
>> >
>> > > but not quite sure for branch-1. Discussion is welcome (smile).
>> > but not quite sure for branch-2.1
>> >
>> > On Tue, Jan 15, 2019 at 10:17 AM OpenInx <[email protected]> wrote:
>> >
>> > > Hi:
>> > >
>> > > In HBASE-21657, I simplified the path of estimatedSerializedSize() &
>> > > estimatedSerializedSizeOfCell() by moving the general getSerializedSize()
>> > > and heapSize() from ExtendedCell to the Cell interface. It's an
>> > > incompatible change in some cases: if an upstream user has implemented
>> > > their own Cells (rare, but it can happen), their code will fail to
>> > > compile.
>> > >
>> > > We gain almost 40% throughput improvement in the 100% scan case for
>> > > branch-2 (cacheHitRatio~100%)[1], which is a good thing, but I'm not
>> > > sure whether the patch should go to branch-2.1. In [2], stack says
>> > > branch-2.0 won't need this Cell interface change (agreed; maybe the
>> > > following changes can be included, I will file an issue for that), but
>> > > I'm not quite sure for branch-1. Discussion is welcome (smile).
>> > >
>> > > Anyway, the patch can go into branch-2/master because we've not made a
>> > > release yet.
>> > >
>> > > BTW, the patch also includes some other improvements:
>> > > 1. In 99% of cases our cells have no tags, so let HFileScannerImpl
>> > > just return a NoTagsByteBufferKeyValue if there are no tags. This
>> > > saves a lot of CPU time when sending no-tags cells to RPC, because we
>> > > can just return the length instead of getting the serialized size by
>> > > calculating the offset/length of each field (row/cf/cq..).
>> > > 2. Move the subclasses' getSerializedSize implementation from
>> > > ExtendedCell to their own classes, which means we no longer need to
>> > > call ExtendedCell's getSerializedSize() first and then forward to the
>> > > subclass's getSerializedSize(withTags).
>> > > 3. Give an estimated size for the result ArrayList to avoid frequent
>> > > list resizing during a big scan; we now estimate the array size as
>> > > min(scan.rows, 512). It also helps a lot.
>> > >
>> > > Thanks.
>> > >
>> > > 1.
>> > > https://issues.apache.org/jira/browse/HBASE-21657?focusedCommentId=16735455&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16735455
>> > > 2.
>> > > https://issues.apache.org/jira/browse/HBASE-21657?focusedCommentId=16742330&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16742330
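
For readers following the thread, points 1 and 3 of the quoted message can be illustrated with a small, self-contained Java sketch. It does not use the real HBase classes: SimpleCell, NoTagsCell, FieldCell, initialResultCapacity and totalSerializedSize are hypothetical names made up for this example, and the clamp to zero for negative row counts is an added assumption, not something stated in the thread. It only shows the idea of returning a buffer length directly versus summing per-field lengths, and of pre-sizing the result list as min(scan.rows, 512).

```java
import java.util.ArrayList;
import java.util.List;

public class CellSizeSketch {

    // Hypothetical stand-in for a cell interface that declares the size
    // method on the base interface itself (not the real HBase Cell).
    interface SimpleCell {
        int getSerializedSize();
    }

    // A cell without tags held in one contiguous buffer: the serialized
    // size is simply the buffer length, with no per-field arithmetic.
    static final class NoTagsCell implements SimpleCell {
        private final byte[] buf;

        NoTagsCell(byte[] buf) {
            this.buf = buf;
        }

        @Override
        public int getSerializedSize() {
            return buf.length;
        }
    }

    // A cell whose size is computed by adding up each field's length.
    static final class FieldCell implements SimpleCell {
        private final byte[] row;
        private final byte[] family;
        private final byte[] qualifier;
        private final byte[] value;

        FieldCell(byte[] row, byte[] family, byte[] qualifier, byte[] value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }

        @Override
        public int getSerializedSize() {
            return row.length + family.length + qualifier.length + value.length;
        }
    }

    // Initial capacity for the result list: min(scanRows, 512).
    // Clamping negative inputs to 0 is an assumption made for this sketch,
    // because ArrayList rejects a negative initial capacity.
    static int initialResultCapacity(int scanRows) {
        return Math.max(0, Math.min(scanRows, 512));
    }

    // Sums the serialized size of every cell in the list.
    static int totalSerializedSize(List<SimpleCell> cells) {
        int total = 0;
        for (SimpleCell cell : cells) {
            total += cell.getSerializedSize();
        }
        return total;
    }

    public static void main(String[] args) {
        int scanRows = 10_000;
        // Pre-size the list so a big scan does not trigger repeated growth
        // while the first results are collected.
        List<SimpleCell> results = new ArrayList<>(initialResultCapacity(scanRows));
        results.add(new NoTagsCell(new byte[32]));
        results.add(new FieldCell(new byte[4], new byte[2], new byte[3], new byte[16]));
        System.out.println("capacity hint: " + initialResultCapacity(scanRows));
        System.out.println("total bytes: " + totalSerializedSize(results));
    }
}
```

The cap keeps a very large row limit from allocating a huge backing array up front, while small scans still get a list sized to fit.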
