[ https://issues.apache.org/jira/browse/HBASE-15214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135904#comment-15135904 ]
Hudson commented on HBASE-15214:
--------------------------------

ABORTED: Integrated in HBase-0.98-matrix #293 (See [https://builds.apache.org/job/HBase-0.98-matrix/293/])
HBASE-15214 Valid mutate Ops fail with RPC Codec in use and region moves (anoopsamjohn: rev b3fe2556641ad5b36caded1a995f09443d88ab5d)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestMultiParallel.java
* hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java


> Valid mutate Ops fail with RPC Codec in use and region moves across
> -------------------------------------------------------------------
>
>                 Key: HBASE-15214
>                 URL: https://issues.apache.org/jira/browse/HBASE-15214
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.0
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>            Priority: Critical
>             Fix For: 2.0.0, 1.2.0, 1.3.0, 1.1.4, 1.0.4, 0.98.18
>
>         Attachments: HBASE-15214-0.98.patch, HBASE-15214-branch-1.0.patch,
> HBASE-15214-branch-1.1.patch, HBASE-15214-branch-1.patch, HBASE-15214.patch,
> HBASE-15214_V2.patch, HBASE-15214_V3.patch
>
>
> Test failures in HBASE-15198 led to this bug. Until now we have not used
> cell blocks (codec usage) for write requests (client -> server). Once we
> enabled Codec usage by default, we saw this issue.
> A multi request came to an RS with mutations for different regions. One of
> the regions that was on this RS had become unavailable. In
> RSRpcServices#multi, we fail the entire RegionAction (with N mutations in
> it) in that MultiRequest. Then we continue with the remaining
> RegionActions, whose regions might be available. (The failed RegionAction
> will be retried from the client after it fetches the latest region
> location.) This all works fine in a pure PB request world. When a Codec is
> used, we won't convert the Mutation Cells to PB Cells and pack them in the
> PB message.
> Instead we pass all Cells serialized into one byte[] cellblock. Using a
> Decoder, we iterate over these cells on the server side. Each Mutation PB
> knows only the number of cells associated with it. In the case above, when
> an entire RegionAction is skipped, the N Mutations under it may have
> corresponding Cells in the cellblock. We are not skipping those Cells in
> the iterator. This makes the later Mutations (for other regions) refer to
> the wrong Cells and try to put them into a different region. As a result,
> HRegion#checkRow() throws WrongRegionException, which is treated as a
> sanity check failure, so a DNRIOE (DoNotRetryIOException) is thrown back
> to the client. The op therefore fails for the user code.
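
The misalignment described above can be illustrated with a small standalone sketch. This is not the code from the attached patches, and none of the types below are HBase classes: RegionActionStub, process() and the skipCellsOfFailedActions flag are hypothetical names, and cells are modelled as plain strings tagged with the region they were written for. It only shows why a shared cell iterator must be advanced past the cells of a failed RegionAction.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustrative sketch only: none of these types are HBase classes.
class CellBlockSkipSketch {

    /** Hypothetical stand-in for a RegionAction: a region plus the cell count of each mutation. */
    static final class RegionActionStub {
        final String region;
        final boolean regionAvailable;
        final int[] cellsPerMutation;

        RegionActionStub(String region, boolean regionAvailable, int... cellsPerMutation) {
            this.region = region;
            this.regionAvailable = regionAvailable;
            this.cellsPerMutation = cellsPerMutation;
        }
    }

    /**
     * Walks the actions in order, pulling cells from one shared iterator.
     * When skipCellsOfFailedActions is false, a failed action consumes nothing,
     * which reproduces the misalignment described in the issue.
     */
    static List<String> process(List<RegionActionStub> actions, List<String> cellBlock,
                                boolean skipCellsOfFailedActions) {
        Iterator<String> cells = cellBlock.iterator();
        List<String> applied = new ArrayList<>();
        for (RegionActionStub action : actions) {
            for (int cellCount : action.cellsPerMutation) {
                for (int i = 0; i < cellCount; i++) {
                    if (action.regionAvailable) {
                        // Record which cell ends up being applied to which region.
                        applied.add(action.region + "<-" + cells.next());
                    } else if (skipCellsOfFailedActions) {
                        // Discard the failed action's cells so later actions stay aligned.
                        cells.next();
                    }
                }
            }
        }
        return applied;
    }

    /** Region A is unavailable (2 mutations, 3 cells); region B is available (1 mutation, 2 cells). */
    static List<String> demo(boolean skipCellsOfFailedActions) {
        List<RegionActionStub> actions = Arrays.asList(
                new RegionActionStub("A", false, 1, 2),
                new RegionActionStub("B", true, 2));
        List<String> cellBlock = Arrays.asList("A:c1", "A:c2", "A:c3", "B:c1", "B:c2");
        return process(actions, cellBlock, skipCellsOfFailedActions);
    }

    public static void main(String[] args) {
        System.out.println(demo(false)); // [B<-A:c1, B<-A:c2]  region B receives region A's cells
        System.out.println(demo(true));  // [B<-B:c1, B<-B:c2]  region B receives its own cells
    }
}
```

In the first call, region B is handed cells that belong to region A, which corresponds to the situation where HRegion#checkRow() would reject the row. In the second call the iterator is advanced by the per-mutation cell counts of the failed action, so region B reads its own cells.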