[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614395#comment-15614395 ]

Hudson commented on HBASE-16570:
--------------------------------

SUCCESS: Integrated in Jenkins build HBase-1.3-JDK8 #60 (See [https://builds.apache.org/job/HBase-1.3-JDK8/60/])
Revert "HBASE-16570 Compute region locality in parallel at startup" (garyh: rev 1f4b1b350a9403650e748be2723dd35a4917c032)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/RegionLocationFinder.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestRegionLocationFinder.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java

> Compute region locality in parallel at startup
> ----------------------------------------------
>
> Key: HBASE-16570
> URL: https://issues.apache.org/jira/browse/HBASE-16570
> Project: HBase
> Issue Type: Sub-task
> Reporter: binlijin
> Assignee: binlijin
> Fix For: 2.0.0, 1.4.0, 1.3.1
>
> Attachments: HBASE-16570-master_V1.patch, HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, HBASE-16570-master_V4.patch, HBASE-16570.branch-1.3-addendum.patch, HBASE-16570_addnum.patch, HBASE-16570_addnum_v2.patch

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
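The idea behind the reverted patch -- computing each region's HDFS block locality concurrently instead of one region at a time during master startup -- can be sketched as follows. This is an illustrative sketch only; the class and method names here are hypothetical stand-ins, not the actual RegionLocationFinder API:

```java
// Sketch: fan per-region locality lookups out to a thread pool and collect the
// results. "localityOf" stands in for the real (slow) HDFS block-location query.
import java.util.*;
import java.util.concurrent.*;

public class ParallelLocality {
    // Hypothetical placeholder for RegionLocationFinder's per-region lookup.
    static float localityOf(String region) {
        return Math.abs(region.hashCode() % 100) / 100.0f;
    }

    public static Map<String, Float> computeAll(List<String> regions, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one lookup per region; lookups run concurrently.
            Map<String, Future<Float>> futures = new HashMap<>();
            for (String r : regions) {
                futures.put(r, pool.submit(() -> localityOf(r)));
            }
            // Gather results; total wall time is roughly the slowest lookup,
            // not the sum of all lookups as in the serial version.
            Map<String, Float> result = new HashMap<>();
            for (Map.Entry<String, Future<Float>> e : futures.entrySet()) {
                try {
                    result.put(e.getKey(), e.getValue().get());
                } catch (InterruptedException | ExecutionException ex) {
                    throw new RuntimeException(ex);
                }
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<String, Float> m = computeAll(Arrays.asList("r1", "r2", "r3"), 4);
        System.out.println(m.size()); // one locality value per region
    }
}
```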
[jira] [Issue Comment Deleted] (HBASE-16890) Analyze the performance of AsyncWAL and fix the same
[ https://issues.apache.org/jira/browse/HBASE-16890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ramkrishna.s.vasudevan updated HBASE-16890:
-------------------------------------------
    Comment: was deleted

(was: For this test I did not apply any patch other than the one above. Those things are actually not the real killers. But for our off-heap work we surely need to change the AsyncWALProtbufWriter#append part. And I don't have the lib with me, so things currently work by creating temp buffers only. If we do all of that we can reduce some more garbage.)

> Analyze the performance of AsyncWAL and fix the same
> ----------------------------------------------------
>
> Key: HBASE-16890
> URL: https://issues.apache.org/jira/browse/HBASE-16890
> Project: HBase
> Issue Type: Sub-task
> Components: wal
> Affects Versions: 2.0.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: AsyncWAL_disruptor.patch, Screen Shot 2016-10-25 at 7.34.47 PM.png, Screen Shot 2016-10-25 at 7.39.07 PM.png, Screen Shot 2016-10-25 at 7.39.48 PM.png, async.svg, classic.svg, contention.png, contention_defaultWAL.png
>
> Tests reveal that AsyncWAL under load in a single-node cluster performs slower than the default WAL. This task is to analyze and see if we can fix it.
> See the discussion at the tail of HBASE-15536.
[jira] [Commented] (HBASE-16747) Track memstore data size and heap overhead separately
[ https://issues.apache.org/jira/browse/HBASE-16747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614388#comment-15614388 ]

ramkrishna.s.vasudevan commented on HBASE-16747:
------------------------------------------------

+1. Great patch.

> Track memstore data size and heap overhead separately
> -----------------------------------------------------
>
> Key: HBASE-16747
> URL: https://issues.apache.org/jira/browse/HBASE-16747
> Project: HBase
> Issue Type: Sub-task
> Components: regionserver
> Reporter: Anoop Sam John
> Assignee: Anoop Sam John
> Fix For: 2.0.0
>
> Attachments: HBASE-16747.patch, HBASE-16747.patch, HBASE-16747_V2.patch, HBASE-16747_V2.patch, HBASE-16747_V3.patch, HBASE-16747_V3.patch, HBASE-16747_V3.patch, HBASE-16747_V4.patch, HBASE-16747_WIP.patch
>
> We track the memstore size in 3 places:
> 1. Globally at the RS level in RegionServerAccounting. This tracks the size of all memstores and is used to decide whether forced flushes are needed because of global heap pressure.
> 2. At the region level in HRegion. This is the sum of the sizes of all memstores within the region and is used to decide whether the region has reached its flush size (128 MB).
> 3. At the segment level. This drives the in-memory flush/compaction decisions.
> All of these use the Cell's heap size, which includes the data bytes as well as the Cell object's heap overhead. We also include the overhead from adding Cells into the Segment's data structures (like CSLM).
> Once we have an off-heap memstore, we will keep the cell data bytes in an off-heap area, so we can no longer track data size and heap overhead as one entity. We need to separate them and track each.
> The proposal here is to track cell data size and heap overhead separately at the global accounting layer. As of now we have only an on-heap memstore, so the global memstore boundary checks will consider both (add them up and check against the global max memstore size).
> Track cell data size alone (this can be on heap or off heap) at the region level. Region flushes use cell data size alone for the flush decision. A user configuring 128 MB as the flush size normally expects a 128 MB data flush. But because we were including the heap overhead as well, the actual data size flushed was well short of 128 MB. With this change we behave more like what a user expects.
> Segment-level in-memory flush/compaction also considers cell data size alone, but we still need to track the heap overhead. (Once an in-memory flush or normal flush happens, we have to adjust both the cell data size and the heap overhead.)
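The accounting split described above can be sketched with two separate counters. This is a hedged illustration, not the HBASE-16747 patch's actual API; the class and method names are invented for the example:

```java
// Sketch: track cell data bytes and heap overhead as independent counters.
// The region flush decision looks at data size only, so a 128 MB flush size
// yields roughly 128 MB of flushed data; the global heap-pressure check still
// considers the sum of both components.
import java.util.concurrent.atomic.AtomicLong;

public class MemStoreSizing {
    private final AtomicLong dataSize = new AtomicLong();     // cell key/value bytes
    private final AtomicLong heapOverhead = new AtomicLong(); // Cell objects + CSLM entries

    public void onCellAdded(long cellDataBytes, long cellHeapOverhead) {
        dataSize.addAndGet(cellDataBytes);
        heapOverhead.addAndGet(cellHeapOverhead);
    }

    // Region-level decision: data size alone.
    public boolean shouldFlushRegion(long flushSizeBytes) {
        return dataSize.get() >= flushSizeBytes;
    }

    // Global memstore boundary: data size plus heap overhead.
    public boolean globalPressure(long globalMaxBytes) {
        return dataSize.get() + heapOverhead.get() >= globalMaxBytes;
    }

    public static void main(String[] args) {
        MemStoreSizing s = new MemStoreSizing();
        s.onCellAdded(100, 48);
        System.out.println(s.shouldFlushRegion(100)); // true: data alone hit the limit
        System.out.println(s.globalPressure(200));    // false: 100 + 48 < 200
    }
}
```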
[jira] [Commented] (HBASE-16947) Some improvements for DumpReplicationQueues tool
[ https://issues.apache.org/jira/browse/HBASE-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614363#comment-15614363 ]

Guanghao Zhang commented on HBASE-16947:
----------------------------------------

HBASE-16947-branch-1.patch can be applied to branch-1.3.

> Some improvements for DumpReplicationQueues tool
> ------------------------------------------------
>
> Key: HBASE-16947
> URL: https://issues.apache.org/jira/browse/HBASE-16947
> Project: HBase
> Issue Type: Improvement
> Components: Operability, Replication
> Affects Versions: 2.0.0, 1.4.0
> Reporter: Guanghao Zhang
> Assignee: Guanghao Zhang
> Attachments: HBASE-16947-branch-1.patch, HBASE-16947-v1.patch, HBASE-16947.patch
>
> Recently we hit a too-many-replication-WALs problem in our production cluster. We needed the DumpReplicationQueues tool to analyze the replication queue info in zookeeper, so I backported HBASE-16450 to our 0.98-based branch and made some improvements:
> 1. Show the dead regionservers under the replication/rs znode. When there are too many WALs under a znode, they cannot be transferred atomically to the new rs znode, so the dead rs znode is left behind in zookeeper.
> 2. Summarize all the queues belonging to peers that have been deleted.
> 3. Aggregate the replication queue sizes of all regionservers. Regionservers report ReplicationLoad to the master, but there is no aggregate metric for replication.
> 4. Show how many WALs cannot be found on hdfs. The reason (WAL Not Found) needs more time to dig into.
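Item 3 above amounts to summing the per-regionserver queue sizes into one cluster-wide number. A minimal sketch, with assumed names (this is not the DumpReplicationQueues code, and the per-RS sizes are stand-ins for what ReplicationLoad reports):

```java
// Sketch: aggregate each regionserver's replication queue size into a
// cluster-wide total, the metric the tool description says is missing.
import java.util.*;

public class ReplicationQueueSummary {
    public static long aggregate(Map<String, Long> queueSizeByRs) {
        long total = 0;
        for (long size : queueSizeByRs.values()) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new HashMap<>();
        sizes.put("rs1,16020", 120L); // hypothetical RS name -> queued WAL count
        sizes.put("rs2,16020", 380L);
        System.out.println(aggregate(sizes)); // 500
    }
}
```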
[jira] [Updated] (HBASE-16947) Some improvements for DumpReplicationQueues tool
[ https://issues.apache.org/jira/browse/HBASE-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guanghao Zhang updated HBASE-16947:
-----------------------------------
    Attachment: HBASE-16947-branch-1.patch

> Some improvements for DumpReplicationQueues tool
> ------------------------------------------------
>
> Key: HBASE-16947
> URL: https://issues.apache.org/jira/browse/HBASE-16947
[jira] [Updated] (HBASE-16835) Revisit the zookeeper usage at client side
[ https://issues.apache.org/jira/browse/HBASE-16835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Duo Zhang updated HBASE-16835:
------------------------------
    Resolution: Fixed
    Hadoop Flags: Reviewed
    Status: Resolved (was: Patch Available)

Pushed to master. Thanks [~stack] for reviewing.

> Revisit the zookeeper usage at client side
> ------------------------------------------
>
> Key: HBASE-16835
> URL: https://issues.apache.org/jira/browse/HBASE-16835
> Project: HBase
> Issue Type: Sub-task
> Components: Client, Zookeeper
> Affects Versions: 2.0.0
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Fix For: 2.0.0
>
> Attachments: HBASE-16835-v1.patch, HBASE-16835.patch
>
> Watcher or not.
> Curator or not.
> Keep connection or not.
> ...
[jira] [Commented] (HBASE-16952) Replace hadoop-maven-plugins with protobuf-maven-plugin for building protos
[ https://issues.apache.org/jira/browse/HBASE-16952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614355#comment-15614355 ]

Duo Zhang commented on HBASE-16952:
-----------------------------------

+1. And the findbugs warnings are from the shaded protobuf classes, not the generated ones. We have already excluded the generated files in findbugs-exclude.xml. I think we should also exclude the shaded sources?

> Replace hadoop-maven-plugins with protobuf-maven-plugin for building protos
> ---------------------------------------------------------------------------
>
> Key: HBASE-16952
> URL: https://issues.apache.org/jira/browse/HBASE-16952
> Project: HBase
> Issue Type: Task
> Components: build
> Reporter: stack
> Assignee: stack
> Attachments: HBASE-16952.master.001.patch, HBASE-16952.master.002.patch, HBASE-16952.master.003.patch, HBASE-16952.master.003.patch
>
> protobuf-maven-plugin takes less configuration and avoids duplication -- having to add a .proto file in the protobuf dir as well as add it explicitly to pom.xml. This plugin also lets you set more than one source (hadoop-maven-plugins expects you to compile first w/ one dir and then the other).
> Thanks to [~Apache9] for pointing out this plugin.
[jira] [Commented] (HBASE-16947) Some improvements for DumpReplicationQueues tool
[ https://issues.apache.org/jira/browse/HBASE-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614345#comment-15614345 ]

Guanghao Zhang commented on HBASE-16947:
----------------------------------------

ReplicationQueuesClientArguments was not in branch-1. ReplicationPeerConfig.getNamespaces() was added by HBASE-16447, and it was only merged to master.

> Some improvements for DumpReplicationQueues tool
> ------------------------------------------------
>
> Key: HBASE-16947
> URL: https://issues.apache.org/jira/browse/HBASE-16947
[jira] [Commented] (HBASE-16952) Replace hadoop-maven-plugins with protobuf-maven-plugin for building protos
[ https://issues.apache.org/jira/browse/HBASE-16952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614337#comment-15614337 ]

stack commented on HBASE-16952:
-------------------------------

Will commit in the morning unless there is an objection (the findbugs and whitespace complaints are from generated files).

> Replace hadoop-maven-plugins with protobuf-maven-plugin for building protos
> ---------------------------------------------------------------------------
>
> Key: HBASE-16952
> URL: https://issues.apache.org/jira/browse/HBASE-16952
[jira] [Commented] (HBASE-16955) Fixup precommit protoc check to do new distributed protos and pb 3.1.0 build
[ https://issues.apache.org/jira/browse/HBASE-16955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614331#comment-15614331 ]

stack commented on HBASE-16955:
-------------------------------

Yeah, that's what I think. Just trying to prove it. The attached patch didn't run the protoc check when I expected it to. Will be back.

> Fixup precommit protoc check to do new distributed protos and pb 3.1.0 build
> ----------------------------------------------------------------------------
>
> Key: HBASE-16955
> URL: https://issues.apache.org/jira/browse/HBASE-16955
> Project: HBase
> Issue Type: Task
> Components: build, Protobufs
> Reporter: stack
> Assignee: stack
> Attachments: nothing_change.txt
>
> HBASE-15638 "Shade protobuf" and its follow-ons changed how we do protobufs. One, protobufs now live in the module they pertain to, so they are distributed throughout the modules; and two, we use 2.5.0 pb for externally consumed protobuf -- e.g. Coprocessor Endpoints -- but internally we use protobuf 3.1.0.
> A precommit check looks to see if any proto changes break protoc compile. This task is about updating the precommit to accommodate the changes brought about by HBASE-15638.
[jira] [Commented] (HBASE-16835) Revisit the zookeeper usage at client side
[ https://issues.apache.org/jira/browse/HBASE-16835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614324#comment-15614324 ]

stack commented on HBASE-16835:
-------------------------------

No. Looks great. +1.

> Revisit the zookeeper usage at client side
> ------------------------------------------
>
> Key: HBASE-16835
> URL: https://issues.apache.org/jira/browse/HBASE-16835
[jira] [Commented] (HBASE-16947) Some improvements for DumpReplicationQueues tool
[ https://issues.apache.org/jira/browse/HBASE-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614321#comment-15614321 ]

Guanghao Zhang commented on HBASE-16947:
----------------------------------------

Ok. I will upload a patch for branch-1 later.

> Some improvements for DumpReplicationQueues tool
> ------------------------------------------------
>
> Key: HBASE-16947
> URL: https://issues.apache.org/jira/browse/HBASE-16947
[jira] [Commented] (HBASE-16835) Revisit the zookeeper usage at client side
[ https://issues.apache.org/jira/browse/HBASE-16835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614314#comment-15614314 ]

Duo Zhang commented on HBASE-16835:
-----------------------------------

Any other concerns about the new patch? [~stack] Thanks.

> Revisit the zookeeper usage at client side
> ------------------------------------------
>
> Key: HBASE-16835
> URL: https://issues.apache.org/jira/browse/HBASE-16835
[jira] [Commented] (HBASE-16747) Track memstore data size and heap overhead separately
[ https://issues.apache.org/jira/browse/HBASE-16747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614312#comment-15614312 ]

ramkrishna.s.vasudevan commented on HBASE-16747:
------------------------------------------------

I will complete my final review today. I have seen this patch, but one more look and I will +1.

> Track memstore data size and heap overhead separately
> -----------------------------------------------------
>
> Key: HBASE-16747
> URL: https://issues.apache.org/jira/browse/HBASE-16747
[jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction
[ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614309#comment-15614309 ]

ramkrishna.s.vasudevan commented on HBASE-14918:
------------------------------------------------

Thanks for the results. Looks great.
bq. We run index compaction with varying number of segments in the pipeline before merging the index: greater than 1 (ic1), greater than 2 (ic2), greater than 3 (ic3).
So somewhere you have ensured that every segment is flattened while moving into the pipeline, and then the segments are merged when the count is 3. Can you just try what happens when you don't merge them?

> In-Memory MemStore Flush and Compaction
> ---------------------------------------
>
> Key: HBASE-14918
> URL: https://issues.apache.org/jira/browse/HBASE-14918
> Project: HBase
> Issue Type: Umbrella
> Affects Versions: 2.0.0
> Reporter: Eshcar Hillel
> Assignee: Eshcar Hillel
> Attachments: CellBlocksSegmentDesign.pdf, HBASE-16417-benchmarkresults.pdf, MSLABMove.patch
>
> A memstore serves as the in-memory component of a store unit, absorbing all updates to the store. From time to time these updates are flushed to a file on disk, where they are compacted (by eliminating redundancies) and compressed (i.e., written in a compressed format to reduce their storage size).
> We aim to speed up data access, and therefore suggest applying an in-memory memstore flush: flushing the active in-memory segment into an intermediate buffer where it can still be accessed by the application. Data in the buffer is subject to compaction and can be stored in any format that lets it take up less space in RAM. The less space the buffer consumes, the longer it can reside in memory before data is flushed to disk, resulting in better performance.
> Specifically, the optimization is beneficial for workloads with medium-to-high key churn which incur many redundant cells, like persistent messaging.
> We suggest structuring the solution as 4 subtasks (respectively, patches):
> (1) Infrastructure -- refactoring of the MemStore hierarchy, introducing segment (StoreSegment) as a first-class citizen, and decoupling the memstore scanner from the memstore implementation;
> (2) Adding a StoreServices facility at the region level to allow memstores to update region counters and access the region-level synchronization mechanism;
> (3) Implementation of a new memstore (CompactingMemstore) with a non-optimized immutable segment representation; and
> (4) Memory optimization, including a compressed format representation and off-heap allocations.
> This Jira continues the discussion in HBASE-13408. Design documents, evaluation results and previous patches can be found in HBASE-13408.
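The pipeline mechanism discussed in the comment above (in-memory flush into a pipeline of immutable segments, merged once the pipeline grows past the ic1/ic2/ic3 threshold) can be sketched roughly as follows. This is an illustration of the concept only, with segments modeled as string lists; it is not the CompactingMemstore implementation:

```java
// Sketch: an in-memory flush moves the flattened active segment into a
// pipeline; once the pipeline exceeds a threshold, all segments are merged
// into one, dropping duplicate keys (the redundancy-eliminating compaction).
import java.util.*;

public class CompactingPipeline {
    private final Deque<List<String>> pipeline = new ArrayDeque<>();
    private final int mergeThreshold; // the "ic1/ic2/ic3" knob

    public CompactingPipeline(int mergeThreshold) {
        this.mergeThreshold = mergeThreshold;
    }

    // In-memory flush: push the (already flattened) active segment.
    public void inMemoryFlush(List<String> activeSegment) {
        pipeline.addFirst(new ArrayList<>(activeSegment));
        if (pipeline.size() > mergeThreshold) {
            merge();
        }
    }

    // Merge all pipeline segments into a single sorted, deduplicated segment.
    private void merge() {
        TreeSet<String> merged = new TreeSet<>();
        for (List<String> seg : pipeline) {
            merged.addAll(seg);
        }
        pipeline.clear();
        pipeline.addFirst(new ArrayList<>(merged));
    }

    public int segmentCount() {
        return pipeline.size();
    }

    public static void main(String[] args) {
        CompactingPipeline p = new CompactingPipeline(2); // merge when > 2 segments
        p.inMemoryFlush(Arrays.asList("a", "b"));
        p.inMemoryFlush(Arrays.asList("b", "c"));
        p.inMemoryFlush(Arrays.asList("c", "d")); // third flush triggers the merge
        System.out.println(p.segmentCount()); // 1
    }
}
```

Setting the threshold higher keeps more flat segments around (cheaper per flush, more to scan); merging eagerly trades CPU for a smaller, deduplicated footprint -- which is exactly the trade-off the ic1/ic2/ic3 benchmark varies.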
[jira] [Commented] (HBASE-16947) Some improvements for DumpReplicationQueues tool
[ https://issues.apache.org/jira/browse/HBASE-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614292#comment-15614292 ]

stack commented on HBASE-16947:
-------------------------------

Thank you Guanghao Zhang. HBASE-16450 was not committed to branch-1. I fixed that, but this patch still fails if I try to apply it to branch-1:
{code}
[ERROR] /Users/stack/checkouts/hbase.git/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java:[626,6] Signal is internal proprietary API and may be removed in a future release
[ERROR] /Users/stack/checkouts/hbase.git/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/DumpReplicationQueues.java:[286,59] error: cannot find symbol
[ERROR]   symbol:   method getNamespaces()
[ERROR]   location: variable peerConfig of type ReplicationPeerConfig
[ERROR] /Users/stack/checkouts/hbase.git/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/DumpReplicationQueues.java:[300,4] error: cannot find symbol
[ERROR]   symbol:   class ReplicationQueuesClientArguments
[ERROR]   location: class DumpReplicationQueues
[ERROR] /Users/stack/checkouts/hbase.git/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/DumpReplicationQueues.java:[301,12] error: cannot find symbol
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR]
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn -rf :hbase-server
{code}
Does the above failure make sense to you?

> Some improvements for DumpReplicationQueues tool
> ------------------------------------------------
>
> Key: HBASE-16947
> URL: https://issues.apache.org/jira/browse/HBASE-16947
[jira] [Updated] (HBASE-16450) Shell tool to dump replication queues
[ https://issues.apache.org/jira/browse/HBASE-16450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-16450:
--------------------------
    Fix Version/s: 1.4.0

> Shell tool to dump replication queues
> -------------------------------------
>
> Key: HBASE-16450
> URL: https://issues.apache.org/jira/browse/HBASE-16450
> Project: HBase
> Issue Type: Improvement
> Components: Operability, Replication
> Affects Versions: 2.0.0, 1.3.0, 1.1.5, 1.2.2
> Reporter: Esteban Gutierrez
> Assignee: Esteban Gutierrez
> Fix For: 2.0.0, 1.3.0, 1.4.0
>
> Attachments: HBASE-16450.branch-1.001.patch, HBASE-16450.branch-1.002.patch, HBASE-16450.master.001.patch, HBASE-16450.master.002.patch, HBASE-16450.master.003.patch
>
> Currently there is no way to dump the list of configured queues and the replication queues when replication is enabled. Unfortunately the HBase master only offers an option to dump the whole content of the znodes, with no details on the queues being processed on each RS.
[jira] [Commented] (HBASE-16450) Shell tool to dump replication queues
[ https://issues.apache.org/jira/browse/HBASE-16450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614281#comment-15614281 ]

stack commented on HBASE-16450:
-------------------------------

Pushed to branch-1 too.

> Shell tool to dump replication queues
> -------------------------------------
>
> Key: HBASE-16450
> URL: https://issues.apache.org/jira/browse/HBASE-16450
[jira] [Commented] (HBASE-16835) Revisit the zookeeper usage at client side
[ https://issues.apache.org/jira/browse/HBASE-16835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15614275#comment-15614275 ]

Hadoop QA commented on HBASE-16835:
-----------------------------------

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
| 0 | mvndep | 0m 38s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 25s | master passed |
| +1 | compile | 3m 34s | master passed |
| +1 | checkstyle | 1m 6s | master passed |
| +1 | mvneclipse | 1m 40s | master passed |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: . |
| -1 | findbugs | 1m 0s | hbase-client in master has 1 extant Findbugs warnings. |
| +1 | javadoc | 3m 26s | master passed |
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 52s | the patch passed |
| +1 | compile | 4m 7s | the patch passed |
| +1 | javac | 4m 7s | the patch passed |
| +1 | checkstyle | 1m 19s | the patch passed |
| +1 | mvneclipse | 1m 53s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | xml | 0m 3s | The patch has no ill-formed XML file. |
| +1 | hadoopcheck | 33m 16s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha1. |
| -1 | hbaseprotoc | 0m 29s | root in the patch failed. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: . |
| +1 | findbugs | 3m 26s | the patch passed |
| +1 | javadoc | 3m 8s | the patch passed |
| +1 | unit | 1m 0s | hbase-client in the patch passed. |
| -1 | unit | 79m 28s | hbase-server in the patch failed. |
| -1 | unit | 2m 35s | root in the patch failed. |
| +1 | asflicense | 0m 39s | The patch does not generate ASF License warnings. |
| | | 154m 42s | |

|| Reason || Tests ||
| Timed out junit tests | org.apache.hadoop.hbase.client.TestAdmin2 |
| | org.apache.hadoop.hbase.client.TestHCM |
| | org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientWithRegionReplicas |
| | org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient |
| | org.apache.hadoop.hbase.client.TestSnapshotFromClientWithRegionReplicas |

|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 |
| JIRA
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614271#comment-15614271 ] Mikhail Antonov commented on HBASE-16570: - At this point I'd prefer it to stay reverted in 1.3 so that we can get the first release candidate out the door. Let's target the fix for 1.4, keeping the option open to backport to 1.3.1 later. > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570.branch-1.3-addendum.patch, > HBASE-16570_addnum.patch, HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14918) In-Memory MemStore Flush and Compaction
[ https://issues.apache.org/jira/browse/HBASE-14918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614268#comment-15614268 ] stack commented on HBASE-14918: --- 12 virtual or physical cores? bq. "For data compaction we do not use MSLABs to avoid the inherent space and computation overhead of copying data during compaction." [~eshcar] We avoid copying for the data case? Is that unreal? Thanks for running the compare. Interesting that you can saturate with 10 threads only. I should look into that. What do you conclude [~eshcar]? Or this is just exploratory work? Thanks. > In-Memory MemStore Flush and Compaction > --- > > Key: HBASE-14918 > URL: https://issues.apache.org/jira/browse/HBASE-14918 > Project: HBase > Issue Type: Umbrella >Affects Versions: 2.0.0 >Reporter: Eshcar Hillel >Assignee: Eshcar Hillel > Attachments: CellBlocksSegmentDesign.pdf, > HBASE-16417-benchmarkresults.pdf, MSLABMove.patch > > > A memstore serves as the in-memory component of a store unit, absorbing all > updates to the store. From time to time these updates are flushed to a file > on disk, where they are compacted (by eliminating redundancies) and > compressed (i.e., written in a compressed format to reduce their storage > size). > We aim to speed up data access, and therefore suggest to apply in-memory > memstore flush. That is to flush the active in-memory segment into an > intermediate buffer where it can be accessed by the application. Data in the > buffer is subject to compaction and can be stored in any format that allows > it to take up smaller space in RAM. The less space the buffer consumes the > longer it can reside in memory before data is flushed to disk, resulting in > better performance. > Specifically, the optimization is beneficial for workloads with > medium-to-high key churn which incur many redundant cells, like persistent > messaging. > We suggest to structure the solution as 4 subtasks (respectively, patches). 
> (1) Infrastructure - refactoring of the MemStore hierarchy, introducing > segment (StoreSegment) as first-class citizen, and decoupling memstore > scanner from the memstore implementation; > (2) Adding StoreServices facility at the region level to allow memstores > update region counters and access region level synchronization mechanism; > (3) Implementation of a new memstore (CompactingMemstore) with non-optimized > immutable segment representation, and > (4) Memory optimization including compressed format representation and off > heap allocations. > This Jira continues the discussion in HBASE-13408. > Design documents, evaluation results and previous patches can be found in > HBASE-13408. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16960) RegionServer hang when aborting
[ https://issues.apache.org/jira/browse/HBASE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614249#comment-15614249 ] binlijin commented on HBASE-16960: -- [~saint@gmail.com] can you take a look at this? I have a patch that changes SyncFuture.get() to SyncFuture.get(long timeout) to fix this problem. > RegionServer hang when aborting > --- > > Key: HBASE-16960 > URL: https://issues.apache.org/jira/browse/HBASE-16960 > Project: HBase > Issue Type: Bug >Reporter: binlijin > Attachments: RingBufferEventHandler.png, > RingBufferEventHandler_exception.png, SyncFuture.png, > SyncFuture_exception.png, rs1081.jstack > > > We see the regionserver hang when aborting several times, which takes all regions on > this regionserver out of service, and then all affected applications stop > working. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
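The proposed fix can be sketched as replacing an unbounded wait with a timed one. The class below is a hypothetical, simplified stand-in for SyncFuture (the name, fields, and structure are illustrative, not the actual HBase implementation):

```java
// Hypothetical, simplified stand-in for SyncFuture: get(long) waits at most
// timeoutMs instead of forever, so an aborting RegionServer can give up,
// check its state, and exit rather than hang in Object.wait().
public class TimedSyncFuture {
    private boolean done = false;
    private Throwable throwable = null;

    // Called by the WAL sync machinery when the sync completes (t == null) or fails.
    public synchronized void done(Throwable t) {
        this.done = true;
        this.throwable = t;
        notifyAll();
    }

    // Returns true if the sync completed cleanly; false on failure or timeout.
    public synchronized boolean get(long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!done) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false; // timed out: caller can check for abort and bail out
            }
            try {
                wait(remaining); // loop guards against spurious wakeups
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return throwable == null;
    }
}
```

A caller that previously blocked forever in get() would instead loop on get(timeout), checking between attempts whether the server is aborting.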
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614242#comment-15614242 ] Hudson commented on HBASE-16570: SUCCESS: Integrated in Jenkins build HBase-1.3-JDK7 #53 (See [https://builds.apache.org/job/HBase-1.3-JDK7/53/]) Revert "HBASE-16570 Compute region locality in parallel at startup (garyh: rev 1f4b1b350a9403650e748be2723dd35a4917c032) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/RegionLocationFinder.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestRegionLocationFinder.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/BaseLoadBalancer.java * (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestBaseLoadBalancer.java > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570.branch-1.3-addendum.patch, > HBASE-16570_addnum.patch, HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16960) RegionServer hang when aborting
[ https://issues.apache.org/jira/browse/HBASE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614234#comment-15614234 ] binlijin commented on HBASE-16960: -- The problem can happen when:
(1) FSHLog#rollWriter throws an exception and LogRoller calls regionserver.abort
(2) RingBufferEventHandler.onEvent processes an FSWALEntry and throws DamagedWALException
(3) RingBufferEventHandler.onEvent processes the safe point and sets RingBufferEventHandler.exception=null
(4) RingBufferEventHandler.onEvent processes a SyncFuture (the MemStoreFlusher.1 FSHLog.sync) with endOfBatch=false
(5) RingBufferEventHandler.onEvent processes an FSWALEntry (ASYNC_WAL FSHLog.append)
There are no further events, so the MemStoreFlusher.1 FSHLog.sync hangs. > RegionServer hang when aborting > --- > > Key: HBASE-16960 > URL: https://issues.apache.org/jira/browse/HBASE-16960 > Project: HBase > Issue Type: Bug >Reporter: binlijin > Attachments: RingBufferEventHandler.png, > RingBufferEventHandler_exception.png, SyncFuture.png, > SyncFuture_exception.png, rs1081.jstack > > > We see the regionserver hang when aborting several times, which takes all regions on > this regionserver out of service, and then all affected applications stop > working. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
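The sequence above can be condensed into a toy model. Everything below is a hypothetical sketch (class and field names are illustrative, not the actual RingBufferEventHandler code): once the pending exception is cleared at the safe point in step (3), a SyncFuture queued in step (4) with endOfBatch=false is never completed, so its waiter blocks forever.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the reported failure mode (names are illustrative). A sync
// future queued after the handler's exception was cleared is never completed,
// so the thread waiting on it hangs.
public class LostExceptionSketch {
    public static class SyncFuture {
        public Boolean result = null; // null = still waiting (a real waiter blocks)
        void done(boolean ok) { result = ok; }
    }

    private Exception pendingException = null;
    private final List<SyncFuture> queued = new ArrayList<>();

    // Step (2): an append fails and records a pending exception.
    public void onAppendFailure() { pendingException = new Exception("DamagedWAL"); }

    // Step (3): the safe point clears the pending exception, losing the failure.
    public void onSafePoint() { pendingException = null; }

    // Step (4): a sync with endOfBatch=false is only queued; with the exception
    // already gone, nothing fails it, and no batch-end event arrives to complete it.
    public void onSync(SyncFuture f, boolean endOfBatch) {
        if (pendingException != null) { f.done(false); return; }
        queued.add(f);
        if (endOfBatch) {
            for (SyncFuture q : queued) q.done(true);
            queued.clear();
        }
    }
}
```

In this model, had the exception still been pending when the sync arrived, the future would have been failed immediately instead of queued, and the flusher would have returned with an error rather than hanging.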
[jira] [Commented] (HBASE-16960) RegionServer hang when aborting
[ https://issues.apache.org/jira/browse/HBASE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614217#comment-15614217 ] binlijin commented on HBASE-16960: -- 2016-10-25 14:07:58,294 WARN [RS_OPEN_REGION-hadoop1081:16020-0.append-pool3-t1] wal.FSHLog: Failed appending 783246980, requesting roll of WAL 2016-10-25 14:07:59,054 ERROR [regionserver/hadoop1081:16020.logRoller] wal.FSHLog: Failed close of WAL writer hdfs://hadoop/hbase/WALs/hadoop1081,16020,1477194496398/hadoop1081%2C16020%2C1477194496398.regiongroup-1.1477375578315, unflushedEntries=8014 2016-10-25 14:07:59,054 FATAL [regionserver/hadoop1081:16020.logRoller] regionserver.HRegionServer: ABORTING region server hadoop1081,16020,1477194496398: Failed log close in log roller org.apache.hadoop.hbase.regionserver.wal.FailedLogCloseException: hdfs://hadoop/hbase/WALs/hadoop1081,16020,1477194496398/hadoop1081%2C16020%2C1477194496398.regiongroup-1.1477375578315, unflushedEntries=8014 2016-10-25 14:07:59,057 WARN [RS_OPEN_REGION-hadoop1081:16020-0.append-pool3-t1] wal.FSHLog: Failed appending 783261485, requesting roll of WAL 2016-10-25 14:07:59,100 INFO [regionserver/hadoop1081:16020.logRoller] regionserver.LogRoller: LogRoller exiting. > RegionServer hang when aborting > --- > > Key: HBASE-16960 > URL: https://issues.apache.org/jira/browse/HBASE-16960 > Project: HBase > Issue Type: Bug >Reporter: binlijin > Attachments: RingBufferEventHandler.png, > RingBufferEventHandler_exception.png, SyncFuture.png, > SyncFuture_exception.png, rs1081.jstack > > > We see regionserver hang when aborting several times and cause all regions on > this regionserver out of service and then all affected applications stop > works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16960) RegionServer hang when aborting
[ https://issues.apache.org/jira/browse/HBASE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614212#comment-15614212 ] binlijin commented on HBASE-16960: -- The MemStoreFlusher.1 thread is waiting on a SyncFuture; this SyncFuture's heap dump is SyncFuture.png/SyncFuture_exception.png, and the RingBufferEventHandler's heap dump is RingBufferEventHandler.png/RingBufferEventHandler_exception.png. > RegionServer hang when aborting > --- > > Key: HBASE-16960 > URL: https://issues.apache.org/jira/browse/HBASE-16960 > Project: HBase > Issue Type: Bug >Reporter: binlijin > Attachments: RingBufferEventHandler.png, > RingBufferEventHandler_exception.png, SyncFuture.png, > SyncFuture_exception.png, rs1081.jstack > > > We see the regionserver hang when aborting several times, which takes all regions on > this regionserver out of service, and then all affected applications stop > working. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16960) RegionServer hang when aborting
[ https://issues.apache.org/jira/browse/HBASE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] binlijin updated HBASE-16960: - Attachment: SyncFuture.png RingBufferEventHandler.png SyncFuture_exception.png RingBufferEventHandler_exception.png > RegionServer hang when aborting > --- > > Key: HBASE-16960 > URL: https://issues.apache.org/jira/browse/HBASE-16960 > Project: HBase > Issue Type: Bug >Reporter: binlijin > Attachments: RingBufferEventHandler.png, > RingBufferEventHandler_exception.png, SyncFuture.png, > SyncFuture_exception.png, rs1081.jstack > > > We see regionserver hang when aborting several times and cause all regions on > this regionserver out of service and then all affected applications stop > works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16960) RegionServer hang when aborting
[ https://issues.apache.org/jira/browse/HBASE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614193#comment-15614193 ] binlijin commented on HBASE-16960: -- The problem is that MemStoreFlusher.1 calls FSHLog.sync but never gets a result, so it does not return.
{code}
"MemStoreFlusher.1" prio=10 tid=0x7f553e0dc800 nid=0x27c91 in Object.wait() [0x7f5519d73000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	at org.apache.hadoop.hbase.regionserver.wal.SyncFuture.get(SyncFuture.java:167)
	- locked <0x7f593d16e128> (a org.apache.hadoop.hbase.regionserver.wal.SyncFuture)
	at org.apache.hadoop.hbase.regionserver.wal.FSHLog.blockOnSync(FSHLog.java:1523)
	at org.apache.hadoop.hbase.regionserver.wal.FSHLog.publishSyncThenBlockOnCompletion(FSHLog.java:1517)
	at org.apache.hadoop.hbase.regionserver.wal.FSHLog.sync(FSHLog.java:1607)
	at org.apache.hadoop.hbase.regionserver.HRegion.internalPrepareFlushCache(HRegion.java:2289)
	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2110)
	at org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2075)
	at org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:1967)
	at org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:1893)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:510)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
	at org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
	at java.lang.Thread.run(Thread.java:756)
{code}
> RegionServer hang when aborting > --- > > Key: HBASE-16960 > URL: https://issues.apache.org/jira/browse/HBASE-16960 > Project: HBase > Issue Type: Bug >Reporter: binlijin > Attachments: rs1081.jstack > > > We see the regionserver hang when aborting several times, which takes all regions on > this regionserver out of service, and then all affected applications stop > working. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16959) Export snapshot to local file system of a single node
[ https://issues.apache.org/jira/browse/HBASE-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614191#comment-15614191 ] Matteo Bertozzi commented on HBASE-16959: - Why not just run with -Dmapreduce.jobtracker.address=local, so the ExportSnapshot will run on the same machine and export locally? > Export snapshot to local file system of a single node > - > > Key: HBASE-16959 > URL: https://issues.apache.org/jira/browse/HBASE-16959 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Xiang Li >Priority: Critical > > ExportSnapshot allows users to specify "file://" in "copy-to". > Based on the implementation (it uses map jobs), it works as follows: > (1) The manifest of the snapshot(.hbase-snapshot) is exported to the local > file system of the HBase client node where the command is issued > (2) The data of the snapshot(archive) is exported to the local file system > of the nodes where the map jobs run, so it is spread everywhere. > *That causes 2 problems we have met so far:* > (1) The last step to verify the snapshot integrity fails, because not all > the data can be found on the HBase client node where the command is issued. > "-no-target-verify" can be of help here to suppress the verification, but it > is not a good idea > (2) When the HBase client (where the command is issued) is also a NodeManager > of Yarn, and it happens to have a map job (to write data of the snapshot) running > on it, the "copy-to" directory will be created first, when writing the > manifest, by user=hbase, and then user=yarn (if it is not controlled) will try > to write data into it. If the directory permission is not set properly, let's > say, umask = 022 and both hbase and yarn are in the hadoop group, the "copy-to" is > created with no write permission (777-022=755, so rwxr-xr-x) for the same > group, and user=yarn cannot write data into the "copy-to" directory, as it is > created by user=hbase.
We have the following exception > {code} > Error: java.io.IOException: Mkdirs failed to create > file:/tmp/snap_export/archive/data/default/table_xxx/regionid_xxx/info > (exists=false, > cwd=file:/hadoop/yarn/local/usercache/hbase/appcache/application_1477577812726_0001/container_1477577812726_0001_01_04) > at > org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:449) > at > org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435) > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909) > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890) > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787) > at > org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:275) > at > org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:193) > at > org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:119) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > {code} > We can control the permission to resolve that, but it is not a good idea > either. > *Proposal* > If exporting to "file://", add reduce to aggregate all "distributed" data of > the snapshot to the HBase client node where the command is issued, to be > together with the manifest of the snapshot. That can resolve the verification > problem above in (1) > For problem (2), have no idea so far -- This message was sent by Atlassian JIRA (v6.3.4#6332)
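The permission arithmetic quoted in the description (777-022=755) is the umask being masked off the requested mode. A quick check of that claim (the class name is illustrative):

```java
// Checks the description's claim: with umask 022, a directory requested as
// mode 777 is created as 755 (rwxr-xr-x), so a same-group user such as yarn
// has no write bit on a directory that user hbase created.
public class UmaskDemo {
    public static void main(String[] args) {
        int umask = 0022;                  // octal literals
        int requested = 0777;
        int effective = requested & ~umask; // how umask is actually applied
        System.out.println(Integer.toOctalString(effective)); // 755
        int groupWriteBit = 0020;
        System.out.println((effective & groupWriteBit) != 0); // false: group cannot write
    }
}
```

Note that the "777-022" subtraction only coincides with the bitwise definition because no octal digit here borrows; the real operation is mode & ~umask.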
[jira] [Updated] (HBASE-16960) RegionServer hang when aborting
[ https://issues.apache.org/jira/browse/HBASE-16960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] binlijin updated HBASE-16960: - Attachment: rs1081.jstack > RegionServer hang when aborting > --- > > Key: HBASE-16960 > URL: https://issues.apache.org/jira/browse/HBASE-16960 > Project: HBase > Issue Type: Bug >Reporter: binlijin > Attachments: rs1081.jstack > > > We see regionserver hang when aborting several times and cause all regions on > this regionserver out of service and then all affected applications stop > works. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16960) RegionServer hang when aborting
binlijin created HBASE-16960: Summary: RegionServer hang when aborting Key: HBASE-16960 URL: https://issues.apache.org/jira/browse/HBASE-16960 Project: HBase Issue Type: Bug Reporter: binlijin We see the regionserver hang when aborting several times, which takes all regions on this regionserver out of service, and then all affected applications stop working. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16954) Unify HTable#checkAndDelete with AP
[ https://issues.apache.org/jira/browse/HBASE-16954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614150#comment-15614150 ] Hadoop QA commented on HBASE-16954: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 40s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 24s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 47s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 42s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color} | {color:green} master passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 53s {color} | {color:red} hbase-client in master has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 51s {color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 58s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 29m 30s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 55s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 42s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 89m 30s {color} | {color:red} hbase-server in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 138m 10s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Timed out junit tests | org.apache.hadoop.hbase.client.TestFromClientSide | | | org.apache.hadoop.hbase.TestClusterBootOrder | | | org.apache.hadoop.hbase.client.TestAdmin2 | | | org.apache.hadoop.hbase.client.TestHCM | | | org.apache.hadoop.hbase.client.TestSnapshotFromClientWithRegionReplicas | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835640/HBASE-16954.v1.patch | | JIRA Issue | HBASE-16954 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 35beac91a89d 3.13.0-93-generic #140-Ubuntu SMP Mon Jul 18 21:21:05 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | |
[jira] [Commented] (HBASE-16886) hbase-client: scanner with reversed=true and small=true gets no result
[ https://issues.apache.org/jira/browse/HBASE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614147#comment-15614147 ] huzheng commented on HBASE-16886: - Fine, thanks. > hbase-client: scanner with reversed=true and small=true gets no result > -- > > Key: HBASE-16886 > URL: https://issues.apache.org/jira/browse/HBASE-16886 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.2.3, 1.1.7, 0.98.23 >Reporter: huzheng > Labels: patch > Attachments: 16886.addendum, 16886.v4.branch-1.patch, > HBASE-16886.v0.patch, HBASE-16886.v1.patch, HBASE-16886.v2.patch, > HBASE-16886.v3.patch, HBASE-16886.v4.0.98.patch, > HBASE-16886.v4.branch-1.patch, HBASE-16886.v4.branch-1.patch, > HBASE-16886.v4.branch-1.patch, HBASE-16886.v4.master.patch, > TestReversedSmallScan.java > > > Assume HBase has four regions (-oo, b), [b, c), [c, d), [d,+oo), and all > rowKeys are located in region [d, +oo); using a Reversed Small Scanner will > get no result. > The attached file shows this failed test case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16959) Export snapshot to local file system of a single node
[ https://issues.apache.org/jira/browse/HBASE-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiang Li updated HBASE-16959: - Description: ExportSnapshot allows uses to specify "file://" in "copy-to". Based on the implementation (use Map jobs), it works as follow: (1) The manifest of the snapshot(.hbase-snapshot) is exported to the local file system of the HBase client node where the command is issued (2) The data of the snapshot(archive) is exported to the local file system of the nodes where the map jobs run, so spread everywhere. *That causes 2 problems we meet so far:* (1) The last step to verify the snapshot integrity fails, due to that not all the data can be found on the HBase client node where the command is issued. "-no-target-verify" can be of help here to suppress the verification, but it is not a good idea (2) When the HBase client (where the command is issued) is also a NodeManager of Yarn, and it happens to have a map job (to write data of snapshot) running on it, the "copy-to" directory will be created firstly when writing the manifest by user=hbase and then user=yarn(if it is not controlled) will try to write data into it. If the directory permission is not set properly, let say, umask = 022, both hbase and yarn are in hadoop group, the "copy-to" is created with no write permission(777-022=755, so rwxr-xr-x) for the same group, user=yarn can not write data into the "copy-to" directory, as it is created by user=hbase. 
We have the following exception {code} Error: java.io.IOException: Mkdirs failed to create file:/tmp/snap_export/archive/data/default/table_xxx/regionid_xxx/info (exists=false, cwd=file:/hadoop/yarn/local/usercache/hbase/appcache/application_1477577812726_0001/container_1477577812726_0001_01_04) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:449) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787) at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:275) at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:193) at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:119) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) {code} We can control the permission to resolve that, but it is not a good idea either. *Proposal* If exporting to "file://", add reduce to aggregate all "distributed" data of the snapshot to the HBase client node where the command is issued, to be together with the manifest of the snapshot. That can resolve the verification problem above in (1) For problem (2), have no idea so far was: ExportSnapshot allows uses to specify "file://" in "copy-to". 
Based on the implementation (use Map jobs), it works as follow (1) The manifest of the snapshot(.hbase-snapshot) is exported to the local file system of the HBase client node where the command is issued (2) The data of the snapshot(archive) is exported to the local file system of the nodes where the map jobs run, so spread everywhere. *That causes 2 problems we meet so far:* (1) The last step to verify the snapshot integrity fails, due to that not all the data can be found on the HBase client node where the command is issued. "-no-target-verify" can be of help here to suppress the verification, but it is not a good idea (2) When the HBase client (where the command is issued) is also a NodeManager of Yarn, and it happens to have a map job (to write data of snapshot) running on it, the "copy-to" directory will be created firstly when writing the manifest by user=hbase and then user=yarn(if it is not controlled) will try to write data into it. If the directory permission is not set properly, let say, umask = 022, both hbase and yarn are in hadoop group, the "copy-to" is created with no write permission(777-022=755, so rwxr-xr-x) for the same group, user=yarn can not write data into the "copy-to" directory, as it is created by user=hbase. We have the following exception {code} Error: java.io.IOException: Mkdirs failed to create
[jira] [Updated] (HBASE-16959) Export snapshot to local file system of a single node
[ https://issues.apache.org/jira/browse/HBASE-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiang Li updated HBASE-16959: - Description: ExportSnapshot allows uses to specify "file://" in "copy-to". Based on the implementation (use Map jobs), it works as follow (1) The manifest of the snapshot(.hbase-snapshot) is exported to the local file system of the HBase client node where the command is issued (2) The data of the snapshot(archive) is exported to the local file system of the nodes where the map jobs run, so spread everywhere. *That causes 2 problems we meet so far:* (1) The last step to verify the snapshot integrity fails, due to that not all the data can be found on the HBase client node where the command is issued. "-no-target-verify" can be of help here to suppress the verification, but it is not a good idea (2) When the HBase client (where the command is issued) is also a NodeManager of Yarn, and it happens to have a map job (to write data of snapshot) running on it, the "copy-to" directory will be created firstly when writing the manifest by user=hbase and then user=yarn(if it is not controlled) will try to write data into it. If the directory permission is not set properly, let say, umask = 022, both hbase and yarn are in hadoop group, the "copy-to" is created with no write permission(777-022=755, so rwxr-xr-x) for the same group, user=yarn can not write data into the "copy-to" directory, as it is created by user=hbase. 
We have the following exception {code} Error: java.io.IOException: Mkdirs failed to create file:/tmp/snap_export/archive/data/default/table_xxx/regionid_xxx/info (exists=false, cwd=file:/hadoop/yarn/local/usercache/hbase/appcache/application_1477577812726_0001/container_1477577812726_0001_01_04) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:449) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787) at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:275) at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:193) at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:119) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) {code} We can control the permission to resolve that, but it is not a good idea either. *Proposal* If exporting to "file://", add a reduce step to aggregate all the "distributed" data of the snapshot onto the HBase client node where the command is issued, together with the manifest of the snapshot. That resolves the verification problem above in (1). For problem (2), no idea so far. was: ExportSnapshot allows uses to specify "file://" in "copy-to". 
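The permission arithmetic described above (umask 022 turning a requested 777 directory mode into 755, which drops the group write bit) can be sketched in a small stand-alone Java snippet. This is illustrative only, not HBase or Hadoop code:

```java
// Illustrative model of why a umask of 022 leaves a directory that a
// same-group user (e.g. yarn, when the dir is owned by hbase) cannot write.
public class UmaskDemo {
    // Apply a umask to a requested mode, as POSIX mkdir does.
    static int effectiveMode(int requested, int umask) {
        return requested & ~umask;
    }

    // 0020 is the group-write bit in the octal permission mask.
    static boolean groupCanWrite(int mode) {
        return (mode & 0020) != 0;
    }

    public static void main(String[] args) {
        int mode = effectiveMode(0777, 0022); // 0755, i.e. rwxr-xr-x
        System.out.println(String.format("%o", mode)); // 755
        System.out.println(groupCanWrite(mode));       // false
        // With umask 002 the group keeps write access (0775), so a
        // same-group user such as yarn could write into the directory.
        System.out.println(groupCanWrite(effectiveMode(0777, 0002))); // true
    }
}
```

Controlling the umask or the group permissions this way is exactly the workaround the reporter considers (and rejects) above.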
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614118#comment-15614118 ] Hadoop QA commented on HBASE-16570: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} docker {color} | {color:red} 0m 45s {color} | {color:red} Docker failed to build yetus/hbase:b2c5d84. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835725/HBASE-16570.branch-1.3-addendum.patch | | JIRA Issue | HBASE-16570 | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/4216/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570.branch-1.3-addendum.patch, > HBASE-16570_addnum.patch, HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614096#comment-15614096 ] binlijin commented on HBASE-16570: -- bq. I don't see where this would impact master startup time RegionLocationFinder.asyncGetBlockDistribution(HRegionInfo hri) only obtains a ListenableFuture and submits a task (computing the region's HDFSBlocksDistribution) to the ListeningExecutorService; the ListeningExecutorService runs tasks concurrently, so there are 5 threads computing regions' HDFSBlocksDistribution in parallel. > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570.branch-1.3-addendum.patch, > HBASE-16570_addnum.patch, HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
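The scheme binlijin describes, submitting one task per region to a small pool so the locality lookups run concurrently rather than serially, can be sketched as follows. This is a stand-alone illustration using java.util.concurrent with a placeholder computeBlockDistribution(); the actual patch uses Guava's ListeningExecutorService inside RegionLocationFinder:

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch: fan out per-region locality computation to a fixed pool,
// then collect the results via the returned futures.
public class ParallelLocalityDemo {
    // Placeholder for the (expensive) per-region HDFS locality lookup.
    static String computeBlockDistribution(String region) {
        return "distribution-for-" + region;
    }

    static Map<String, String> computeAll(List<String> regions, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submitting returns immediately; the lookups proceed in parallel.
            Map<String, Future<String>> futures = new LinkedHashMap<>();
            for (String region : regions) {
                futures.put(region, pool.submit(() -> computeBlockDistribution(region)));
            }
            // Block only when collecting, after all tasks are in flight.
            Map<String, String> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<String>> e : futures.entrySet()) {
                result.put(e.getKey(), e.getValue().get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> r = computeAll(Arrays.asList("r1", "r2", "r3"), 5);
        System.out.println(r.get("r2")); // distribution-for-r2
    }
}
```

With a pool of 5 (the thread count mentioned in the comment), up to five regions' distributions are computed at once, which is where the startup-time win comes from.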
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614081#comment-15614081 ] binlijin commented on HBASE-16570: -- done > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570.branch-1.3-addendum.patch, > HBASE-16570_addnum.patch, HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] binlijin updated HBASE-16570: - Status: Patch Available (was: Reopened) > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570.branch-1.3-addendum.patch, > HBASE-16570_addnum.patch, HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] binlijin updated HBASE-16570: - Attachment: HBASE-16570.branch-1.3-addendum.patch > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570.branch-1.3-addendum.patch, > HBASE-16570_addnum.patch, HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614078#comment-15614078 ] Yu Li commented on HBASE-16570: --- Maybe a full patch with a fix for the issue Gary mentioned for branch-1.3, and an addendum patch for master and branch-1? bq. I don't see where this would impact master startup time And it seems there are some questions to answer here [~aoxiang] > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570.branch-1.3-addendum.patch, > HBASE-16570_addnum.patch, HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614060#comment-15614060 ] Ted Yu commented on HBASE-16570: This was only reverted from branch-1.3 Try naming your patch 16570.branch-1.3-addendum.patch and get QA to test it. > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570_addnum.patch, > HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16931) Setting cell's seqId to zero in compaction flow might cause RS down.
[ https://issues.apache.org/jira/browse/HBASE-16931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Li updated HBASE-16931: -- Fix Version/s: 1.3.0 Plan to commit to branch-1.3 and close this JIRA if there are no objections in another 12 hours, thanks. > Setting cell's seqId to zero in compaction flow might cause RS down. > > > Key: HBASE-16931 > URL: https://issues.apache.org/jira/browse/HBASE-16931 > Project: HBase > Issue Type: Bug > Components: regionserver >Affects Versions: 2.0.0 >Reporter: binlijin >Assignee: binlijin >Priority: Critical > Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.4, 1.1.8 > > Attachments: HBASE-16931-master.patch, HBASE-16931.branch-1.patch, > HBASE-16931.branch-1.v2.patch, HBASE-16931_master_v2.patch, > HBASE-16931_master_v3.patch, HBASE-16931_master_v4.patch, > HBASE-16931_master_v5.patch > > > Compactor#performCompaction > do { > hasMore = scanner.next(cells, scannerContext); > // output to writer: > for (Cell c : cells) { > if (cleanSeqId && c.getSequenceId() <= smallestReadPoint) { > CellUtil.setSequenceId(c, 0); > } > writer.append(c); > } > cells.clear(); > } while (hasMore); > scanner.next chooses at most "hbase.hstore.compaction.kv.max" kvs, and the > last cell is still referenced by StoreScanner.prevCell; so if cleanSeqId > zeroes that cell, the next scanner.next call may throw an exception in > StoreScanner.checkScanOrder and bring the region server down. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
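The aliasing hazard in the issue description, where prevCell points at the very object whose sequence id the compaction loop zeroes, can be modeled outside HBase with a toy scanner. This is an illustrative sketch, not the actual StoreScanner or Compactor code:

```java
import java.util.*;

// Stand-alone model of the bug: the scanner retains a reference to the
// last cell it returned (prevCell) to check scan order, so zeroing that
// same cell's sequence id in the caller corrupts the next comparison.
public class SeqIdAliasingDemo {
    static final class Cell {
        final String row;
        long seqId;
        Cell(String row, long seqId) { this.row = row; this.seqId = seqId; }
    }

    static final class Scanner {
        private final Iterator<Cell> it;
        Cell prevCell; // last cell handed out (models StoreScanner.prevCell)
        Scanner(List<Cell> cells) { this.it = cells.iterator(); }

        // Returns one cell per call (models the kv.max batching) and
        // verifies scan order against prevCell before handing it out:
        // within a row, sequence ids must not increase.
        Cell next() {
            if (!it.hasNext()) return null;
            Cell c = it.next();
            if (prevCell != null && prevCell.row.equals(c.row)
                    && c.seqId > prevCell.seqId) {
                throw new IllegalStateException(
                        "out of order: seqId " + c.seqId + " after " + prevCell.seqId);
            }
            prevCell = c;
            return c;
        }
    }

    // Runs the compaction-style loop; returns true iff the order check fired.
    static boolean compact(boolean cleanSeqId) {
        Scanner scanner = new Scanner(Arrays.asList(
                new Cell("row1", 5), new Cell("row1", 3)));
        try {
            Cell c;
            while ((c = scanner.next()) != null) {
                if (cleanSeqId) {
                    c.seqId = 0; // mutates the SAME object prevCell points at
                }
            }
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(compact(false)); // false: order check passes
        System.out.println(compact(true));  // true: zeroed prevCell trips it
    }
}
```

Without the zeroing, seqIds 5 then 3 are legally decreasing; with it, the retained cell reads as seqId 0, so the following seqId 3 looks out of order, which is the crash path described above.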
[jira] [Commented] (HBASE-16886) hbase-client: scanner with reversed=true and small=true gets no result
[ https://issues.apache.org/jira/browse/HBASE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614040#comment-15614040 ] Ted Yu commented on HBASE-16886: Looks like RC for 1.3 would use today's git hash. Still waiting ... > hbase-client: scanner with reversed=true and small=true gets no result > -- > > Key: HBASE-16886 > URL: https://issues.apache.org/jira/browse/HBASE-16886 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.2.3, 1.1.7, 0.98.23 >Reporter: huzheng > Labels: patch > Attachments: 16886.addendum, 16886.v4.branch-1.patch, > HBASE-16886.v0.patch, HBASE-16886.v1.patch, HBASE-16886.v2.patch, > HBASE-16886.v3.patch, HBASE-16886.v4.0.98.patch, > HBASE-16886.v4.branch-1.patch, HBASE-16886.v4.branch-1.patch, > HBASE-16886.v4.branch-1.patch, HBASE-16886.v4.master.patch, > TestReversedSmallScan.java > > > Assume HBase has four regions (-oo, b), [b, c), [c, d), [d, +oo), and all > rowKeys are located in region [d, +oo). Using a reversed small scanner will > get no result. > The attached file shows this failing test case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
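For the region layout in the issue description, the lookup a reversed scan must perform, namely finding the region that contains its start row (here [d, +oo) for any row >= "d"), can be sketched with a TreeMap keyed by region start keys. This is illustrative only, not the hbase-client implementation:

```java
import java.util.*;

// Illustrative region lookup for the four example regions
// (-oo, b), [b, c), [c, d), [d, +oo), keyed by start key.
// The empty string stands in for the -oo start key of the first region.
public class RegionLookupDemo {
    static String regionFor(TreeMap<String, String> startKeys, String row) {
        // floorEntry: greatest start key <= row, i.e. the region holding row.
        return startKeys.floorEntry(row).getValue();
    }

    public static void main(String[] args) {
        TreeMap<String, String> regions = new TreeMap<>();
        regions.put("", "(-oo, b)");
        regions.put("b", "[b, c)");
        regions.put("c", "[c, d)");
        regions.put("d", "[d, +oo)");
        System.out.println(regionFor(regions, "dog")); // [d, +oo)
        System.out.println(regionFor(regions, "a1"));  // (-oo, b)
    }
}
```

If the reversed small scan instead starts from the wrong region for its start row, every row in [d, +oo) is missed and the scan returns nothing, which matches the reported symptom.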
[jira] [Updated] (HBASE-14123) HBase Backup/Restore Phase 2
[ https://issues.apache.org/jira/browse/HBASE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-14123: -- Attachment: 14123-master.v30.txt v30. Addresses comments on RB > HBase Backup/Restore Phase 2 > > > Key: HBASE-14123 > URL: https://issues.apache.org/jira/browse/HBASE-14123 > Project: HBase > Issue Type: Umbrella >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Attachments: 14123-master.v14.txt, 14123-master.v15.txt, > 14123-master.v16.txt, 14123-master.v17.txt, 14123-master.v18.txt, > 14123-master.v19.txt, 14123-master.v2.txt, 14123-master.v20.txt, > 14123-master.v21.txt, 14123-master.v24.txt, 14123-master.v25.txt, > 14123-master.v27.txt, 14123-master.v28.txt, 14123-master.v29.full.txt, > 14123-master.v3.txt, 14123-master.v30.txt, 14123-master.v5.txt, > 14123-master.v6.txt, 14123-master.v7.txt, 14123-master.v8.txt, > 14123-master.v9.txt, 14123-v14.txt, HBASE-14123-for-7912-v1.patch, > HBASE-14123-for-7912-v6.patch, HBASE-14123-v1.patch, HBASE-14123-v10.patch, > HBASE-14123-v11.patch, HBASE-14123-v12.patch, HBASE-14123-v13.patch, > HBASE-14123-v15.patch, HBASE-14123-v16.patch, HBASE-14123-v2.patch, > HBASE-14123-v3.patch, HBASE-14123-v4.patch, HBASE-14123-v5.patch, > HBASE-14123-v6.patch, HBASE-14123-v7.patch, HBASE-14123-v9.patch > > > Phase 2 umbrella JIRA. See HBASE-7912 for design document and description. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16886) hbase-client: scanner with reversed=true and small=true gets no result
[ https://issues.apache.org/jira/browse/HBASE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15614021#comment-15614021 ] huzheng commented on HBASE-16886: - Thanks for reviewing and committing. Can I update the issue to Resolved? > hbase-client: scanner with reversed=true and small=true gets no result > -- > > Key: HBASE-16886 > URL: https://issues.apache.org/jira/browse/HBASE-16886 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.2.3, 1.1.7, 0.98.23 >Reporter: huzheng > Labels: patch > Attachments: 16886.addendum, 16886.v4.branch-1.patch, > HBASE-16886.v0.patch, HBASE-16886.v1.patch, HBASE-16886.v2.patch, > HBASE-16886.v3.patch, HBASE-16886.v4.0.98.patch, > HBASE-16886.v4.branch-1.patch, HBASE-16886.v4.branch-1.patch, > HBASE-16886.v4.branch-1.patch, HBASE-16886.v4.master.patch, > TestReversedSmallScan.java > > > Assume HBase has four regions (-oo, b), [b, c), [c, d), [d, +oo), and all > rowKeys are located in region [d, +oo). Using a reversed small scanner will > get no result. > The attached file shows this failing test case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16959) Export snapshot to local file system of a single node
[ https://issues.apache.org/jira/browse/HBASE-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiang Li updated HBASE-16959: - Summary: Export snapshot to local file system of a single node (was: Export snapshot to local file system) > Export snapshot to local file system of a single node > - > > Key: HBASE-16959 > URL: https://issues.apache.org/jira/browse/HBASE-16959 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Xiang Li >Priority: Critical > > ExportSnapshot allows users to specify "file://" in "copy-to". > Based on the implementation (which uses map jobs), it works as follows: > (1) The manifest of the snapshot (.hbase-snapshot) is exported to the local > file system of the HBase client node where the command is issued > (2) The data of the snapshot (archive) is exported to the local file systems > of the nodes where the map jobs run, so it is spread everywhere. > That causes 2 problems we have met so far: > (1) The last step, verifying the snapshot integrity, fails because not all > the data can be found on the HBase client node where the command is issued. > "-no-target-verify" can suppress the verification, but that is not a good idea > (2) When the HBase client node (where the command is issued) is also a YARN > NodeManager and happens to have a map job (writing snapshot data) running > on it, the "copy-to" directory is first created by user=hbase when writing > the manifest, and then user=yarn (if not otherwise controlled) tries to > write data into it. If the directory permission is not set properly, say > umask = 022 with both hbase and yarn in the hadoop group, "copy-to" is > created without group write permission (777-022=755, i.e. rwxr-xr-x), so > user=yarn cannot write data into the "copy-to" directory, as it was > created by user=hbase. 
We have the following exception > {code} > Error: java.io.IOException: Mkdirs failed to create > file:/tmp/snap_export/archive/data/default/table_xxx/regionid_xxx/info > (exists=false, > cwd=file:/hadoop/yarn/local/usercache/hbase/appcache/application_1477577812726_0001/container_1477577812726_0001_01_04) > at > org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:449) > at > org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435) > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909) > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890) > at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787) > at > org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:275) > at > org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:193) > at > org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:119) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) > {code} > We can control the permission to resolve that, but it is not a good idea > either. > *Proposal* > Add a reduce step to move all the "distributed" data of the snapshot to the > HBase client node where the command is issued, together with the manifest of > the snapshot. That resolves the verification problem above in (1) > For problem (2), no idea so far -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16959) Export snapshot to local file system
[ https://issues.apache.org/jira/browse/HBASE-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiang Li updated HBASE-16959: - Description: ExportSnapshot allows uses to specify "file://" in "copy-to". Based on the implementation (use Map jobs), it works as follow (1) The manifest of the snapshot(.hbase-snapshot) is exported to the local file system of the HBase client node where the command is issued (2) The data of the snapshot(archive) is exported to the local file system of the nodes where the map jobs run, so spread everywhere. That causes 2 problems we meet so far: (1) The last step to verify the snapshot integrity fails, due to that not all the data can be found on the HBase client node where the command is issued. "-no-target-verify" can be of help here to suppress the verification, but it is not a good idea (2) When the HBase client (where the command is issued) is also a NodeManager of Yarn, and it happens to have a map job (to write data of snapshot) running on it, the "copy-to" directory will be created firstly when writing the manifest by user=hbase and then user=yarn(if it is not controlled) will try to write data into it. If the directory permission is not set properly, let say, umask = 022, both hbase and yarn are in hadoop group, the "copy-to" is created with no write permission(777-022=755, so rwxr-xr-x) for the same group, user=yarn can not write data into the "copy-to" directory, as it is created by user=hbase. 
We have the following exception {code} Error: java.io.IOException: Mkdirs failed to create file:/tmp/snap_export/archive/data/default/table_xxx/regionid_xxx/info (exists=false, cwd=file:/hadoop/yarn/local/usercache/hbase/appcache/application_1477577812726_0001/container_1477577812726_0001_01_04) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:449) at org.apache.hadoop.fs.ChecksumFileSystem.create(ChecksumFileSystem.java:435) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787) at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:275) at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:193) at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:119) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158) {code} We can control the permission to resolve that, but it is not a good idea either. *Propoal* Add reduce to move all "distributed" data of the snapshot to the HBase client node where the command is issued, to be together with the manifest of the snapshot. 
That can resolve verification problem (1) above. For problem (2), we have no idea so far. was: Current ExportSnapshot allows users to specify "file://" in "copy-to", but it works as follows (1) The manifest of the snapshot(.hbase-snapshot) is exported to local file system of the HBase client node where the command is issued (2) The data > Export snapshot to local file system > > > Key: HBASE-16959 > URL: https://issues.apache.org/jira/browse/HBASE-16959 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Xiang Li >Priority: Critical > > ExportSnapshot allows users to specify "file://" in "copy-to". > Based on the implementation (which uses map jobs), it works as follows > (1) The manifest of the snapshot(.hbase-snapshot) is exported to the local > file system of the HBase client node where the command is issued > (2) The data of the snapshot(archive) is exported to the local file system > of the nodes where the map jobs run, so spread everywhere. > That causes 2 problems we meet so far: > (1) The last step to verify the snapshot integrity fails, due to that not all > the data can be found on the HBase client node where the command is issued. > "-no-target-verify" can be of help here to suppress the verification, but it > is not a good idea > (2) When the HBase client (where the command is issued) is also a NodeManager > of Yarn, and it happens to have a map job (to write data of snapshot) running > on it, the "copy-to" directory will
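The permission arithmetic in the description above (umask 022 turning a requested 777 directory into 755, leaving no group write bit for user=yarn) can be checked with a small sketch. This is illustrative editor-added code, not part of ExportSnapshot:

```java
// Sketch of how a umask strips permission bits, reproducing the
// rwxr-xr-x (755) "copy-to" directory described above.
public class UmaskDemo {
    // Effective mode = requested mode with the umask bits cleared.
    static int effectiveMode(int requested, int umask) {
        return requested & ~umask;
    }

    public static void main(String[] args) {
        int mode = effectiveMode(0777, 0022); // octal literals
        // 0777 & ~0022 = 0755: owner rwx, group r-x, other r-x.
        // The group write bit is gone, so user=yarn (in the same group
        // as user=hbase) fails with "Mkdirs failed to create ...".
        System.out.println(Integer.toOctalString(mode)); // prints 755
    }
}
```

With umask 022 the group and other write bits are always cleared, which is why adjusting the umask or group permission only works around the problem rather than fixing it, as the reporter notes.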
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613992#comment-15613992 ] binlijin commented on HBASE-16570: -- Why do we need a patch named for branch-1.3? > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570_addnum.patch, > HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16957) Remove directory layout/ filesystem references from Cleaners and a few other modules in master
[ https://issues.apache.org/jira/browse/HBASE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613989#comment-15613989 ] Hadoop QA commented on HBASE-16957: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 22 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 3m 15s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 36s {color} | {color:green} hbase-14439 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 43s {color} | {color:green} hbase-14439 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 26s {color} | {color:green} hbase-14439 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 11s {color} | {color:green} hbase-14439 passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 39s {color} | {color:red} hbase-common in hbase-14439 has 1 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 53s {color} | {color:red} hbase-server in hbase-14439 has 5 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 49s {color} | {color:green} hbase-14439 passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 7s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 48s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 58s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 28s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 7 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 25m 42s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 1m 36s {color} | {color:green} the patch passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: . 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 17s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 41s {color} | {color:green} hbase-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 13m 46s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 12s {color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 102m 17s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.regionserver.TestMinVersions | | | hadoop.hbase.regionserver.TestSplitTransaction | | | hadoop.hbase.regionserver.TestDateTieredCompactionPolicy | | | hadoop.hbase.mob.TestMobFileCache | | |
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613977#comment-15613977 ] binlijin commented on HBASE-16570: -- Yes, startup is only true for retainAssignment. And the master/branch-1 all have the problem. > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570_addnum.patch, > HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
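The idea behind this issue, computing region locality for many regions concurrently instead of one by one at master startup, can be sketched roughly as below. The region names and the locality function are placeholders, not the actual RegionLocationFinder code:

```java
// Illustrative sketch: fan locality lookups out over a thread pool and
// collect the results. In the real code each lookup would query HDFS
// block locations, which is what makes parallelism pay off at startup.
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelLocalitySketch {
    // Stand-in for an HDFS block-location lookup; deterministic for the demo.
    static double computeLocality(String region) {
        return Math.abs(region.hashCode() % 100) / 100.0;
    }

    static Map<String, Double> computeAll(List<String> regions, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit every lookup first so they run concurrently...
            Map<String, Future<Double>> futures = new LinkedHashMap<>();
            for (String r : regions) {
                futures.put(r, pool.submit((Callable<Double>) () -> computeLocality(r)));
            }
            // ...then block once per region while gathering the answers.
            Map<String, Double> result = new LinkedHashMap<>();
            for (Map.Entry<String, Future<Double>> e : futures.entrySet()) {
                try {
                    result.put(e.getKey(), e.getValue().get());
                } catch (InterruptedException | ExecutionException ex) {
                    throw new RuntimeException(ex);
                }
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> regions = Arrays.asList("r1", "r2", "r3", "r4");
        System.out.println(computeAll(regions, 4).size()); // 4
    }
}
```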
[jira] [Updated] (HBASE-16848) Usage for show_peer_tableCFs command doesn't include peer
[ https://issues.apache.org/jira/browse/HBASE-16848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-16848: --- Labels: replication (was: ) > Usage for show_peer_tableCFs command doesn't include peer > - > > Key: HBASE-16848 > URL: https://issues.apache.org/jira/browse/HBASE-16848 > Project: HBase > Issue Type: Bug >Reporter: Ted Yu >Priority: Minor > Labels: replication > > {code} > hbase(main):003:0> show_peer_tableCFs > ERROR: wrong number of arguments (0 for 1) > Here is some help for this command: > Show replicable table-cf config for the specified peer. > hbase> show_peer_tableCFs > {code} > The sample usage should include peer id. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16835) Revisit the zookeeper usage at client side
[ https://issues.apache.org/jira/browse/HBASE-16835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-16835: -- Attachment: HBASE-16835-v1.patch Rename ClusterRegistry to AsyncRegistry and add some docs. > Revisit the zookeeper usage at client side > -- > > Key: HBASE-16835 > URL: https://issues.apache.org/jira/browse/HBASE-16835 > Project: HBase > Issue Type: Sub-task > Components: Client, Zookeeper >Affects Versions: 2.0.0 >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0 > > Attachments: HBASE-16835-v1.patch, HBASE-16835.patch > > > Watcher or not. > Curator or not. > Keep connection or not. > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16959) Export snapshot to local file system
[ https://issues.apache.org/jira/browse/HBASE-16959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiang Li updated HBASE-16959: - Description: Current ExportSnapshot allows users to specify "file://" in "copy-to", but it works as follows (1) The manifest of the snapshot(.hbase-snapshot) is exported to local file system of the HBase client node where the command is issued (2) The data > Export snapshot to local file system > > > Key: HBASE-16959 > URL: https://issues.apache.org/jira/browse/HBASE-16959 > Project: HBase > Issue Type: New Feature > Components: snapshots >Reporter: Xiang Li >Priority: Critical > > Current ExportSnapshot allows users to specify "file://" in "copy-to", but it > works as follows > (1) The manifest of the snapshot(.hbase-snapshot) is exported to local file > system of the HBase client node where the command is issued > (2) The data -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613948#comment-15613948 ] binlijin edited comment on HBASE-16570 at 10/28/16 1:41 AM: [~chenheng] [~te...@apache.org] [~ghelmling] can you help review the patch HBASE-16570_addnum_v2.patch ? was (Author: aoxiang): [~chenheng][~te...@apache.org][~ghelmling] can you help review the patch HBASE-16570_addnum_v2.patch ? > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570_addnum.patch, > HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613948#comment-15613948 ] binlijin commented on HBASE-16570: -- [~chenheng][~te...@apache.org][~ghelmling] can you help review the patch HBASE-16570_addnum_v2.patch ? > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570_addnum.patch, > HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16959) Export snapshot to local file system
Xiang Li created HBASE-16959: Summary: Export snapshot to local file system Key: HBASE-16959 URL: https://issues.apache.org/jira/browse/HBASE-16959 Project: HBase Issue Type: New Feature Components: snapshots Reporter: Xiang Li Priority: Critical -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] binlijin updated HBASE-16570: - Attachment: HBASE-16570_addnum_v2.patch > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570_addnum.patch, > HBASE-16570_addnum_v2.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16952) Replace hadoop-maven-plugins with protobuf-maven-plugin for building protos
[ https://issues.apache.org/jira/browse/HBASE-16952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613930#comment-15613930 ] Hudson commented on HBASE-16952: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1867 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1867/]) Revert "HBASE-16952 Replace hadoop-maven-plugins with (stack: rev 738ff821dd092a1206cb39f6a024620df5710256) * (edit) hbase-examples/pom.xml * (edit) hbase-protocol/pom.xml * (edit) hbase-rest/pom.xml * (edit) hbase-endpoint/pom.xml * (edit) hbase-protocol/README.txt * (edit) hbase-rsgroup/README.txt * (edit) hbase-protocol-shaded/pom.xml * (edit) hbase-spark/pom.xml * (add) hbase-protocol-shaded/src/main/protobuf/RowProcessor.proto * (edit) hbase-endpoint/README.txt * (edit) pom.xml * (delete) hbase-rest/README.txt * (add) hbase-protocol-shaded/src/main/protobuf/CellSetMessage.proto * (edit) hbase-spark/README.txt * (edit) hbase-examples/README.txt * (edit) hbase-protocol-shaded/README.txt * (edit) hbase-rsgroup/pom.xml * (delete) hbase-protocol/src/main/java/org/apache/hadoop/hbase/ipc/protobuf/generated/TestProcedureProtos.java > Replace hadoop-maven-plugins with protobuf-maven-plugin for building protos > --- > > Key: HBASE-16952 > URL: https://issues.apache.org/jira/browse/HBASE-16952 > Project: HBase > Issue Type: Task > Components: build >Reporter: stack >Assignee: stack > Attachments: HBASE-16952.master.001.patch, > HBASE-16952.master.002.patch, HBASE-16952.master.003.patch, > HBASE-16952.master.003.patch > > > hadoop-maven-plugins takes less configuration and avoids duplication -- > having to add .proto file in protobuf dir as well as add it explicitly to > pom.xml. This plugin also lets you set more than one source > (hadoop-maven-plugins expects you to compile first w/ one dir and then the > other). > Thanks to [~Apache9] for pointing out this plugin. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (HBASE-16955) Fixup precommit protoc check to do new distributed protos and pb 3.1.0 build
[ https://issues.apache.org/jira/browse/HBASE-16955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613897#comment-15613897 ] Duo Zhang edited comment on HBASE-16955 at 10/28/16 1:15 AM: - I think HBASE-16952 is enough to solve this problem? The problem is the old protobuf plugin can not locate the right protoc executable, but now we just use the new plugin to manage the protoc executable and do not depend on the external environment. was (Author: apache9): I think HBASE-16592 is enough to solve this problem? The problem is the old protobuf plugin can not locate the right protoc executable, but now we just use the new plugin to manage the protoc executable and do not depend on the external environment. > Fixup precommit protoc check to do new distributed protos and pb 3.1.0 build > > > Key: HBASE-16955 > URL: https://issues.apache.org/jira/browse/HBASE-16955 > Project: HBase > Issue Type: Task > Components: build, Protobufs >Reporter: stack >Assignee: stack > Attachments: nothing_change.txt > > > HBASE-15638 Shade protobuf and a follow-ons changed how we do protobufs. > One, protobufs are in the module they pertain to so distributed throughout > the modules and secondly, we do 2.5.0 pb for externally consumed protobuf -- > e.g. Coprocessor Endpoints -- but internally we use protobuf 3.1.0. > A precommit check looks to see if any proto changes break protoc compile. > This task is about updating the precommit to accommodate the changes brought > about by HBASE-15638. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16955) Fixup precommit protoc check to do new distributed protos and pb 3.1.0 build
[ https://issues.apache.org/jira/browse/HBASE-16955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613897#comment-15613897 ] Duo Zhang commented on HBASE-16955: --- I think HBASE-16592 is enough to solve this problem? The problem is the old protobuf plugin can not locate the right protoc executable, but now we just use the new plugin to manage the protoc executable and do not depend on the external environment. > Fixup precommit protoc check to do new distributed protos and pb 3.1.0 build > > > Key: HBASE-16955 > URL: https://issues.apache.org/jira/browse/HBASE-16955 > Project: HBase > Issue Type: Task > Components: build, Protobufs >Reporter: stack >Assignee: stack > Attachments: nothing_change.txt > > > HBASE-15638 Shade protobuf and a follow-ons changed how we do protobufs. > One, protobufs are in the module they pertain to so distributed throughout > the modules and secondly, we do 2.5.0 pb for externally consumed protobuf -- > e.g. Coprocessor Endpoints -- but internally we use protobuf 3.1.0. > A precommit check looks to see if any proto changes break protoc compile. > This task is about updating the precommit to accommodate the changes brought > about by HBASE-15638. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16835) Revisit the zookeeper usage at client side
[ https://issues.apache.org/jira/browse/HBASE-16835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613886#comment-15613886 ] Duo Zhang commented on HBASE-16835: --- Fine. Let me change the class name. > Revisit the zookeeper usage at client side > -- > > Key: HBASE-16835 > URL: https://issues.apache.org/jira/browse/HBASE-16835 > Project: HBase > Issue Type: Sub-task > Components: Client, Zookeeper >Affects Versions: 2.0.0 >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0 > > Attachments: HBASE-16835.patch > > > Watcher or not. > Curator or not. > Keep connection or not. > ... -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16954) Unify HTable#checkAndDelete with AP
[ https://issues.apache.org/jira/browse/HBASE-16954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613849#comment-15613849 ] Heng Chen commented on HBASE-16954: --- +1 > Unify HTable#checkAndDelete with AP > --- > > Key: HBASE-16954 > URL: https://issues.apache.org/jira/browse/HBASE-16954 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: ChiaPing Tsai >Assignee: ChiaPing Tsai >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-16954.v0.patch, HBASE-16954.v1.patch > > > The HTable#checkAndDelete(byte[], byte[], byte[], byte[], Delete) can be > implemented by HTable#checkAndDelete(byte[], byte[], byte[], byte[], > CompareType.EQUAL, Delete). As a result, all HTable#checkAndDelete methods > can be unified with AP. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
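The unification described above is a plain overload-delegation pattern: the value-equality variant forwards to the general variant with an EQUAL comparison, so both overloads share one code path (the AsyncProcess). A minimal sketch with stand-in types, not the real HTable API:

```java
// Illustrative sketch of the overload delegation described in this issue.
// The enum and method signatures are simplified stand-ins for the real
// HBase CompareType and HTable#checkAndDelete.
public class CheckAndDeleteSketch {
    enum CompareOp { LESS, EQUAL, GREATER }

    static String lastCall; // records which overload did the work, for the demo

    // General form: takes an explicit comparison operator.
    static boolean checkAndDelete(byte[] row, byte[] family, byte[] qualifier,
                                  CompareOp op, byte[] value) {
        lastCall = "general:" + op;
        return true; // a real implementation would go through the AsyncProcess
    }

    // Value-equality form: simply delegates with CompareOp.EQUAL, so there
    // is only one place that actually performs the check-and-delete.
    static boolean checkAndDelete(byte[] row, byte[] family, byte[] qualifier,
                                  byte[] value) {
        return checkAndDelete(row, family, qualifier, CompareOp.EQUAL, value);
    }

    public static void main(String[] args) {
        byte[] b = new byte[0];
        checkAndDelete(b, b, b, b);
        System.out.println(lastCall); // prints general:EQUAL
    }
}
```

The design benefit is that any later change (retries, batching, metrics) lands in the general overload and the specialized one inherits it for free.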
[jira] [Commented] (HBASE-16947) Some improvements for DumpReplicationQueues tool
[ https://issues.apache.org/jira/browse/HBASE-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613812#comment-15613812 ] Guanghao Zhang commented on HBASE-16947: Thanks for committing. HBASE-16450 was only merged to master and branch-1.3. And now branch-1.3 is ready for release, so I didn't attach a patch for 1.3. This improvement needs HBASE-16450 to be merged to all branches first. I'd like to help backport HBASE-16450 and this to all active branches. Does that need a new issue? Thanks. > Some improvements for DumpReplicationQueues tool > > > Key: HBASE-16947 > URL: https://issues.apache.org/jira/browse/HBASE-16947 > Project: HBase > Issue Type: Improvement > Components: Operability, Replication >Affects Versions: 2.0.0, 1.4.0 >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang > Attachments: HBASE-16947-v1.patch, HBASE-16947.patch > > > Recently we met a too-many-replication-WALs problem in our production cluster. > We need the DumpReplicationQueues tool to analyze the replication queue info > in zookeeper. So I backported HBASE-16450 to our branch based on 0.98 and made some > improvements to it. > 1. Show the dead regionservers under the replication/rs znode. When there are too > many WALs under a znode, it can not be atomically transferred to a new rs znode, so > the dead rs znode is left on zookeeper. > 2. Make a summary of all the queues whose peer has been deleted. > 3. Aggregate the size of the replication queue across all regionservers. Each > regionserver reports ReplicationLoad to the master, but there is no aggregate > metric for replication. > 4. Show how many WALs can not be found on HDFS. The reason (WAL Not > Found) needs more time to dig into. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
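Point 3 of the improvement list above, aggregating per-regionserver replication queue sizes into a cluster-wide total, can be sketched as follows. The method and server names are hypothetical illustrations, not the actual DumpReplicationQueues code, which reads the ReplicationLoad reports:

```java
// Illustrative sketch: summing the replication queue size reported by
// each regionserver into one cluster-wide metric.
import java.util.LinkedHashMap;
import java.util.Map;

public class QueueSizeAggregator {
    // Total queue size across the cluster, given a per-server report.
    static long aggregate(Map<String, Long> sizeByRegionServer) {
        long total = 0;
        for (long size : sizeByRegionServer.values()) {
            total += size;
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        sizes.put("rs1.example.com", 120L); // hypothetical server names
        sizes.put("rs2.example.com", 80L);
        System.out.println(aggregate(sizes)); // prints 200
    }
}
```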
[jira] [Commented] (HBASE-16952) Replace hadoop-maven-plugins with protobuf-maven-plugin for building protos
[ https://issues.apache.org/jira/browse/HBASE-16952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613801#comment-15613801 ] Hadoop QA commented on HBASE-16952: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 18s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 45s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 12m 36s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 32s {color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 58s {color} | {color:red} hbase-protocol-shaded in master has 24 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 3m 23s {color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 18s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 5m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} cc {color} | {color:green} 5m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 5m 36s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 12m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 2m 33s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 4 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 10s {color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 28m 51s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha1. {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: . 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 6m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 54s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 16s {color} | {color:green} hbase-protocol in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 26s {color} | {color:green} hbase-protocol-shaded in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 16s {color} | {color:green} hbase-rsgroup in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 21s {color} | {color:green} hbase-endpoint in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 17s {color} | {color:green} hbase-examples in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 24s {color} | {color:green} hbase-rest in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 4s {color} | {color:green} hbase-spark in the patch
[jira] [Commented] (HBASE-15809) Basic Replication WebUI
[ https://issues.apache.org/jira/browse/HBASE-15809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613791#comment-15613791 ] Guanghao Zhang commented on HBASE-15809: Sounds good. I am very glad to try it on our cluster. I will do a backport (because our cluster uses a branch based on 0.98) after you upload the patch. > Basic Replication WebUI > --- > > Key: HBASE-15809 > URL: https://issues.apache.org/jira/browse/HBASE-15809 > Project: HBase > Issue Type: New Feature > Components: Replication, UI >Affects Versions: 2.0.0 >Reporter: Matteo Bertozzi >Assignee: Appy >Priority: Minor > Fix For: 2.0.0, 1.4.0, 0.98.24 > > Attachments: HBASE-15809-v0.patch, HBASE-15809-v0.png, > HBASE-15809-v1.patch > > > At the moment the only way to have some insight on replication from the webui > is looking at zkdump and metrics. > The basic information useful to get started debugging is: peer information > and a view of the WAL offsets for each peer. > https://reviews.apache.org/r/47275/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613779#comment-15613779 ] binlijin commented on HBASE-16570: -- Wait for a moment.. I need to check it more. > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570_addnum.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16955) Fixup precommit protoc check to do new distributed protos and pb 3.1.0 build
[ https://issues.apache.org/jira/browse/HBASE-16955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613760#comment-15613760 ] Hadoop QA commented on HBASE-16955: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 21s {color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 3s {color} | {color:blue} The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for instructions. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 21s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 48s {color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 29m 4s {color} | {color:green} Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha1. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 28s {color} | {color:green} hbase-protocol-shaded in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 57s {color} | {color:green} hbase-spark in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 35m 40s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.12.0 Server=1.12.0 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835689/nothing_change.txt | | JIRA Issue | HBASE-16955 | | Optional Tests | asflicense cc unit hbaseprotoc | | uname | Linux 283f101e2bb8 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 738ff82 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/4211/testReport/ | | modules | C: hbase-protocol-shaded hbase-spark U: . | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/4211/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. 
> Fixup precommit protoc check to do new distributed protos and pb 3.1.0 build > > > Key: HBASE-16955 > URL: https://issues.apache.org/jira/browse/HBASE-16955 > Project: HBase > Issue Type: Task > Components: build, Protobufs >Reporter: stack >Assignee: stack > Attachments: nothing_change.txt > > > HBASE-15638 Shade protobuf and a follow-ons changed how we do protobufs. > One, protobufs are in the module they pertain to so distributed throughout > the modules and secondly, we do 2.5.0 pb for externally consumed protobuf -- > e.g. Coprocessor Endpoints -- but internally we use protobuf 3.1.0. > A precommit check looks to see if any proto changes break protoc compile. > This task is about updating the precommit to accommodate the changes brought > about by HBASE-15638. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613748#comment-15613748 ] binlijin commented on HBASE-16570: -- can you check if the HBASE-16570_addnum.patch can fix this? Sorry for the mistake. > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570_addnum.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] binlijin updated HBASE-16570: - Attachment: HBASE-16570_addnum.patch > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch, HBASE-16570_addnum.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16957) Remove directory layout/ filesystem references from Cleaners and a few other modules in master
[ https://issues.apache.org/jira/browse/HBASE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-16957: - Status: Patch Available (was: In Progress) > Remove directory layout/ filesystem references from Cleaners and a few other > modules in master > -- > > Key: HBASE-16957 > URL: https://issues.apache.org/jira/browse/HBASE-16957 > Project: HBase > Issue Type: Sub-task > Components: Filesystem Integration, master >Reporter: Umesh Agashe >Assignee: Umesh Agashe > Attachments: HBASE-16957-hbase-14439.v1.patch, > HBASE-16957-hbase-14439.v3.patch, HBASE-16957-jbase-14439.v2.patch > > > Remove directory layout/ filesystem references from Cleaners and a few other > modules in master > {code} > hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterStatusServlet.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/CleanerChore.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileLinkCleaner.java > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16957) Remove directory layout/ filesystem references from Cleaners and a few other modules in master
[ https://issues.apache.org/jira/browse/HBASE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-16957: - Attachment: HBASE-16957-hbase-14439.v3.patch Fixed typo in the name of the file. > Remove directory layout/ filesystem references from Cleaners and a few other > modules in master > -- > > Key: HBASE-16957 > URL: https://issues.apache.org/jira/browse/HBASE-16957 > Project: HBase > Issue Type: Sub-task > Components: Filesystem Integration, master >Reporter: Umesh Agashe >Assignee: Umesh Agashe > Attachments: HBASE-16957-hbase-14439.v1.patch, > HBASE-16957-hbase-14439.v3.patch, HBASE-16957-jbase-14439.v2.patch > > > Remove directory layout/ filesystem references from Cleaners and a few other > modules in master > {code} > hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterStatusServlet.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/CleanerChore.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileLinkCleaner.java > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16957) Remove directory layout/ filesystem references from Cleaners and a few other modules in master
[ https://issues.apache.org/jira/browse/HBASE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-16957: - Attachment: HBASE-16957-jbase-14439.v2.patch Uploading patch after updating Javadoc for MasterStorage.getChores() as suggested. For convenience here are the changes: {code} diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/MasterStorage.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/MasterStorage.java index bb82512..2f3b4a4 100644 --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/MasterStorage.java +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/MasterStorage.java @@ -68,7 +68,14 @@ public abstract class MasterStorage { /** * Get Chores that are required to be run from time to time for the underlying MasterStorage - * implementation. + * implementation. A few setup methods e.g. {@link #enableSnapshots()} may have their own chores. + * The returned list of chores or their configuration may vary depending on when in sequence + * this method is called with respect to other methods. Generally, a call to this method for + * getting and scheduling chores, needs to be after storage is setup properly by calling those + * methods first. + * + * Please refer to the documentation of specific method implementation for more details. + * * @param stopper the stopper * @return storage chores. */ diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/legacy/LegacyMasterStorage.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/legacy/LegacyMasterStorage.java index 0f622fd..aa4de2c 100644 --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/legacy/LegacyMasterStorage.java +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/fs/legacy/LegacyMasterStorage.java @@ -128,6 +128,10 @@ public class LegacyMasterStorage extends MasterStorage { return chores; } + /** + * This method modifies chores configuration for snapshots. 
Please call this method before + * instantiating and scheduling list of chores with {@link #getChores(Stoppable, Map)}. + */ @Override public void enableSnapshots() { super.enableSnapshots(); {code} > Remove directory layout/ filesystem references from Cleaners and a few other > modules in master > -- > > Key: HBASE-16957 > URL: https://issues.apache.org/jira/browse/HBASE-16957 > Project: HBase > Issue Type: Sub-task > Components: Filesystem Integration, master >Reporter: Umesh Agashe >Assignee: Umesh Agashe > Attachments: HBASE-16957-hbase-14439.v1.patch, > HBASE-16957-jbase-14439.v2.patch > > > Remove directory layout/ filesystem references from Cleaners and a few other > modules in master > {code} > hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterStatusServlet.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/CleanerChore.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileLinkCleaner.java > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-16958) Balancer recomputes block distributions every time balanceCluster() runs
[ https://issues.apache.org/jira/browse/HBASE-16958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling resolved HBASE-16958. --- Resolution: Duplicate Assignee: (was: Gary Helmling) Fix Version/s: (was: 1.3.0) I re-opened HBASE-16570 to fix the issue that is described here. > Balancer recomputes block distributions every time balanceCluster() runs > > > Key: HBASE-16958 > URL: https://issues.apache.org/jira/browse/HBASE-16958 > Project: HBase > Issue Type: Bug > Components: Balancer >Reporter: Gary Helmling > > The change in HBASE-16570 modified the balancer to compute block > distributions in parallel with a pool of 5 threads. However, because it does > this every time Cluster is instantiated, it effectively bypasses the cache of > block locations added in HBASE-14473: > In the LoadBalancer.balanceCluster() implementations (in > StochasticLoadBalancer, SimpleLoadBalancer), we create a new Cluster instance. > In the Cluster constructor, we call registerRegion() on every HRegionInfo. > In registerRegion(), we do the following: > {code} > regionLocationFutures.set(regionIndex, > regionFinder.asyncGetBlockDistribution(region)); > {code} > Then, back in the Cluster constructor, we do a get() on each ListenableFuture in a loop. > So while we are doing the calls to get block locations in parallel with 5 > threads, we're recomputing them every time balanceCluster() is called and not > taking advantage of the cache at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
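The core of the problem described above is that parallelizing the lookups did not remove the need for the cache. A minimal, self-contained Java sketch (hypothetical names, not HBase's actual RegionLocationFinder API) shows how a cached path keeps repeated balanceCluster()-style runs from recomputing block distributions:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: if each balancer run recomputes block distributions,
// the cache is bypassed; computeIfAbsent makes the expensive lookup run at
// most once per region. Names here are hypothetical stand-ins.
class LocalityCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    final AtomicInteger computations = new AtomicInteger();

    // Expensive call standing in for querying HDFS for block locations.
    private String computeBlockDistribution(String regionName) {
        computations.incrementAndGet();
        return "distribution-for-" + regionName;
    }

    // Cached path: compute at most once per region, then serve from the map.
    String getBlockDistribution(String regionName) {
        return cache.computeIfAbsent(regionName, this::computeBlockDistribution);
    }

    public static void main(String[] args) {
        LocalityCache finder = new LocalityCache();
        for (int run = 0; run < 5; run++) {              // five balancer runs
            for (String region : new String[] {"r1", "r2"}) {
                finder.getBlockDistribution(region);
            }
        }
        // Only 2 computations despite 10 lookups; recomputing per run would do 10.
        System.out.println(finder.computations.get());
    }
}
```

The cache-aside shape is the point: the parallel thread pool only helps on the first, cold computation.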
[jira] [Updated] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-16570: -- Fix Version/s: (was: 1.3.0) 1.3.1 > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Reopened] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling reopened HBASE-16570: --- I've reverted this from branch-1.3 for the moment, until the issue that I described can be addressed. I don't see where this would impact master startup time. If we need to pre-initialize this on startup, let's do it in a background thread only on startup. We need to make sure that locality is not recomputed on every run and that we use the cache instead. > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.4.0, 1.3.1 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
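The suggestion above, pre-initializing locality data in a background thread at startup so later balancer runs only read the cache, can be sketched as follows. This is an illustrative sketch with hypothetical names, not the actual master startup code:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: warm the locality cache once, off the startup path, instead of
// recomputing it on every balancer invocation. Hypothetical names throughout.
class StartupWarmup {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        ConcurrentMap<String, String> localityCache = new ConcurrentHashMap<>();

        // Kick off cache warm-up without blocking master startup.
        Future<?> warmup = pool.submit(() -> {
            for (String region : new String[] {"r1", "r2", "r3"}) {
                localityCache.put(region, "distribution-for-" + region);
            }
        });

        // ... master startup continues here; the balancer would read the
        // cache lazily rather than block on this future ...

        warmup.get();  // only for the demo, to make the output deterministic
        System.out.println(localityCache.size());
        pool.shutdown();
    }
}
```

Subsequent balancer runs then consult `localityCache` instead of re-querying the filesystem.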
[jira] [Commented] (HBASE-15531) Favored Nodes Enhancements
[ https://issues.apache.org/jira/browse/HBASE-15531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613662#comment-15613662 ] stack commented on HBASE-15531: --- Sounds good. > Favored Nodes Enhancements > -- > > Key: HBASE-15531 > URL: https://issues.apache.org/jira/browse/HBASE-15531 > Project: HBase > Issue Type: Umbrella >Reporter: Francis Liu >Assignee: Francis Liu > > We've been working on enhancing favored nodes at Yahoo! See the draft document. > Feel free to comment. I'll add more info. > https://docs.google.com/document/d/1948RKX_-kGUNOHjiYFiKPZnmKWybJkahJYsNibVhcCk/edit?usp=sharing > These enhancements have recently started running in production. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15531) Favored Nodes Enhancements
[ https://issues.apache.org/jira/browse/HBASE-15531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613652#comment-15613652 ] Thiruvel Thirumoolan commented on HBASE-15531: -- As part of these enhancements, we are planning to remove the following classes, since no one is using favored nodes. 1. FavoredNodeLoadBalancer. FavoredNodeLoadBalancer will be replaced by FavoredStochasticLoadBalancer with additional pickers and a RSGroup based equivalent. 2. RegionPlacementMaintainer (RPM) - This is currently large; we will replace it with smaller tools. > Favored Nodes Enhancements > -- > > Key: HBASE-15531 > URL: https://issues.apache.org/jira/browse/HBASE-15531 > Project: HBase > Issue Type: Umbrella >Reporter: Francis Liu >Assignee: Francis Liu > > We've been working on enhancing favored nodes at Yahoo! See the draft document. > Feel free to comment. I'll add more info. > https://docs.google.com/document/d/1948RKX_-kGUNOHjiYFiKPZnmKWybJkahJYsNibVhcCk/edit?usp=sharing > These enhancements have recently started running in production. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16886) hbase-client: scanner with reversed=true and small=true gets no result
[ https://issues.apache.org/jira/browse/HBASE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613619#comment-15613619 ] Hudson commented on HBASE-16886: SUCCESS: Integrated in Jenkins build HBase-1.4 #500 (See [https://builds.apache.org/job/HBase-1.4/500/]) HBASE-16886 hbase-client: scanner with reversed=true and small=true gets (tedyu: rev d4826e1665085b0ef697db548f8b6277be256591) * (add) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestSmallReversedScanner.java * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientSmallReversedScanner.java * (edit) hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientSmallReversedScanner.java > hbase-client: scanner with reversed=true and small=true gets no result > -- > > Key: HBASE-16886 > URL: https://issues.apache.org/jira/browse/HBASE-16886 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.2.3, 1.1.7, 0.98.23 >Reporter: huzheng > Labels: patch > Attachments: 16886.addendum, 16886.v4.branch-1.patch, > HBASE-16886.v0.patch, HBASE-16886.v1.patch, HBASE-16886.v2.patch, > HBASE-16886.v3.patch, HBASE-16886.v4.0.98.patch, > HBASE-16886.v4.branch-1.patch, HBASE-16886.v4.branch-1.patch, > HBASE-16886.v4.branch-1.patch, HBASE-16886.v4.master.patch, > TestReversedSmallScan.java > > > Assume HBase has four regions (-oo, b), [b, c), [c, d), [d, +oo), and all > rowKeys are located in region [d, +oo); using a reversed small scanner will > get no result. > The attached file shows this failed test case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
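The failure mode described above comes down to region navigation: a reversed scan must start in the region holding the largest matching row and walk backwards through region boundaries. The following self-contained sketch (a hypothetical helper, not the actual ClientSmallReversedScanner logic) illustrates the correct walk over the four regions from the report:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Regions split at b, c, d; all rows live in [d, +oo). A reversed scan that
// fails to begin in that last region returns nothing -- the reported bug.
class ReversedRegionWalk {
    // Region start keys; "" stands for -oo.
    static final List<String> REGION_STARTS = Arrays.asList("", "b", "c", "d");

    // Index of the region whose range contains 'row'.
    static int regionFor(String row) {
        int idx = 0;
        for (int i = 0; i < REGION_STARTS.size(); i++) {
            if (REGION_STARTS.get(i).compareTo(row) <= 0) idx = i;
        }
        return idx;
    }

    public static void main(String[] args) {
        SortedSet<String> rows = new TreeSet<>(Arrays.asList("d1", "d2", "e1"));
        List<String> result = new ArrayList<>();
        // Reversed scan: start in the region of the largest row, walk backwards.
        for (int region = regionFor(rows.last()); region >= 0; region--) {
            String start = REGION_STARTS.get(region);
            String end = region + 1 < REGION_STARTS.size()
                    ? REGION_STARTS.get(region + 1) : null;
            List<String> inRegion = new ArrayList<>();
            for (String r : rows) {
                if (r.compareTo(start) >= 0 && (end == null || r.compareTo(end) < 0)) {
                    inRegion.add(r);
                }
            }
            Collections.reverse(inRegion);  // rows come back in descending order
            result.addAll(inRegion);
        }
        System.out.println(result);  // all rows, descending: [e1, d2, d1]
    }
}
```

Starting the walk anywhere other than the last region would skip every row, which matches the empty result the reporter observed.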
[jira] [Updated] (HBASE-16955) Fixup precommit protoc check to do new distributed protos and pb 3.1.0 build
[ https://issues.apache.org/jira/browse/HBASE-16955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-16955: -- Status: Patch Available (was: Open) Nothing changed, just to see what the build does. > Fixup precommit protoc check to do new distributed protos and pb 3.1.0 build > > > Key: HBASE-16955 > URL: https://issues.apache.org/jira/browse/HBASE-16955 > Project: HBase > Issue Type: Task > Components: build, Protobufs >Reporter: stack >Assignee: stack > Attachments: nothing_change.txt > > > HBASE-15638 Shade protobuf and a follow-ons changed how we do protobufs. > One, protobufs are in the module they pertain to so distributed throughout > the modules and secondly, we do 2.5.0 pb for externally consumed protobuf -- > e.g. Coprocessor Endpoints -- but internally we use protobuf 3.1.0. > A precommit check looks to see if any proto changes break protoc compile. > This task is about updating the precommit to accommodate the changes brought > about by HBASE-15638. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16955) Fixup precommit protoc check to do new distributed protos and pb 3.1.0 build
[ https://issues.apache.org/jira/browse/HBASE-16955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-16955: -- Attachment: nothing_change.txt Am now thinking I don't have to do anything here after making it so mvn runs the protoc. Here is a nothing patch to see if we trigger protoc check in two different modules... > Fixup precommit protoc check to do new distributed protos and pb 3.1.0 build > > > Key: HBASE-16955 > URL: https://issues.apache.org/jira/browse/HBASE-16955 > Project: HBase > Issue Type: Task > Components: build, Protobufs >Reporter: stack >Assignee: stack > Attachments: nothing_change.txt > > > HBASE-15638 Shade protobuf and a follow-ons changed how we do protobufs. > One, protobufs are in the module they pertain to so distributed throughout > the modules and secondly, we do 2.5.0 pb for externally consumed protobuf -- > e.g. Coprocessor Endpoints -- but internally we use protobuf 3.1.0. > A precommit check looks to see if any proto changes break protoc compile. > This task is about updating the precommit to accommodate the changes brought > about by HBASE-15638. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16957) Remove directory layout/ filesystem references from Cleaners and a few other modules in master
[ https://issues.apache.org/jira/browse/HBASE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613559#comment-15613559 ] Hadoop QA commented on HBASE-16957: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 16s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 22 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 40s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 19s {color} | {color:green} hbase-14439 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 38s {color} | {color:green} hbase-14439 passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 4s {color} | {color:green} hbase-14439 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 33s {color} | {color:green} hbase-14439 passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: . {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 38s {color} | {color:red} hbase-common in hbase-14439 has 1 extant Findbugs warnings. {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 1m 46s {color} | {color:red} hbase-server in hbase-14439 has 5 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 24s {color} | {color:green} hbase-14439 passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 55s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m 28s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 58s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 37s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 7 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 28m 0s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 1m 34s {color} | {color:green} the patch passed {color} | | {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 0s {color} | {color:blue} Skipped patched modules with no Java source: . 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 39s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 2m 34s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 44s {color} | {color:green} hbase-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 13m 38s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 19m 59s {color} | {color:red} root in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 96m 40s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.filter.TestDependentColumnFilter | | | hadoop.hbase.snapshot.TestMobRestoreSnapshotHelper | | | hadoop.hbase.coprocessor.TestCoprocessorInterface | | |
[jira] [Commented] (HBASE-16956) Refactor FavoredNodePlan to use regionNames as keys
[ https://issues.apache.org/jira/browse/HBASE-16956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613500#comment-15613500 ] Devaraj Das commented on HBASE-16956: - Sorry, just read the description.. Yeah patch looks fine to me (pending QA). > Refactor FavoredNodePlan to use regionNames as keys > --- > > Key: HBASE-16956 > URL: https://issues.apache.org/jira/browse/HBASE-16956 > Project: HBase > Issue Type: Sub-task >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-16956.master.001.patch > > > We would like to rely on the FNPlan cache whether a region is offline or not. > Sticking to regionNames as keys makes that possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16941) FavoredNodes - Split/Merge code paths
[ https://issues.apache.org/jira/browse/HBASE-16941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16941: - Status: Patch Available (was: Open) [~devaraj], I have split the patch as you requested in HBASE-15532. Submitting the patch so precommit builds can run. If there are any warnings/errors, I will fix them. Review board link @ https://reviews.apache.org/r/53242/ > FavoredNodes - Split/Merge code paths > - > > Key: HBASE-16941 > URL: https://issues.apache.org/jira/browse/HBASE-16941 > Project: HBase > Issue Type: Sub-task >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16941.master.001.patch > > > This jira is to deal with the split/merge logic discussed as part of > HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15532) core favored nodes enhancements
[ https://issues.apache.org/jira/browse/HBASE-15532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-15532: - Status: Open (was: Patch Available) > core favored nodes enhancements > --- > > Key: HBASE-15532 > URL: https://issues.apache.org/jira/browse/HBASE-15532 > Project: HBase > Issue Type: Sub-task >Reporter: Francis Liu >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-15532.master.000.patch, > HBASE-15532.master.001.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16941) FavoredNodes - Split/Merge code paths
[ https://issues.apache.org/jira/browse/HBASE-16941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thiruvel Thirumoolan updated HBASE-16941: - Attachment: HBASE-16941.master.001.patch Attaching prelim patch for precommit tests to run. > FavoredNodes - Split/Merge code paths > - > > Key: HBASE-16941 > URL: https://issues.apache.org/jira/browse/HBASE-16941 > Project: HBase > Issue Type: Sub-task >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan > Fix For: 2.0.0 > > Attachments: HBASE-16941.master.001.patch > > > This jira is to deal with the split/merge logic discussed as part of > HBASE-15532. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16957) Remove directory layout/ filesystem references from Cleaners and a few other modules in master
[ https://issues.apache.org/jira/browse/HBASE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-16957: Status: In Progress (was: Patch Available) moving out of patch available for the javadoc update on MasterStorage. > Remove directory layout/ filesystem references from Cleaners and a few other > modules in master > -- > > Key: HBASE-16957 > URL: https://issues.apache.org/jira/browse/HBASE-16957 > Project: HBase > Issue Type: Sub-task > Components: Filesystem Integration, master >Reporter: Umesh Agashe >Assignee: Umesh Agashe > Attachments: HBASE-16957-hbase-14439.v1.patch > > > Remove directory layout/ filesystem references from Cleaners and a few other > modules in master > {code} > hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterStatusServlet.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/CleanerChore.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileLinkCleaner.java > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16957) Remove directory layout/ filesystem references from Cleaners and a few other modules in master
[ https://issues.apache.org/jira/browse/HBASE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613354#comment-15613354 ] Sean Busbey commented on HBASE-16957: - draft release note: {quote} Incompatible config changes: * hbase.master.logcleaner.plugins default value that users should always add when customizing changes from org.apache.hadoop.hbase.master.cleaner.TimeToLiveLogCleaner to org.apache.hadoop.hbase.fs.legacy.cleaner.TimeToLiveLogCleaner * hbase.master.hfilecleaner.plugins default value that users should always add when customizing changes from org.apache.hadoop.hbase.master.cleaner.TimeToLiveHFileCleaner to org.apache.hadoop.hbase.fs.legacy.cleaner.TimeToLiveHFileCleaner Downstream users who have set custom plugins for either log cleaning or hfile cleaning should be sure to update their configs to use the new names of the respective time to live implementations. {quote} > Remove directory layout/ filesystem references from Cleaners and a few other > modules in master > -- > > Key: HBASE-16957 > URL: https://issues.apache.org/jira/browse/HBASE-16957 > Project: HBase > Issue Type: Sub-task > Components: Filesystem Integration, master >Reporter: Umesh Agashe >Assignee: Umesh Agashe > Attachments: HBASE-16957-hbase-14439.v1.patch > > > Remove directory layout/ filesystem references from Cleaners and a few other > modules in master > {code} > hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterStatusServlet.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/CleanerChore.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileLinkCleaner.java > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16957) Remove directory layout/ filesystem references from Cleaners and a few other modules in master
[ https://issues.apache.org/jira/browse/HBASE-16957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613334#comment-15613334 ] Sean Busbey commented on HBASE-16957: - {code}
+  @Override
+  public Iterable<ScheduledChore> getChores(Stoppable stopper, Map<String, Object> params) {
+    ArrayList<ScheduledChore> chores = (ArrayList<ScheduledChore>) super.getChores(stopper, params);
+
+    int cleanerInterval = getConfiguration().getInt("hbase.master.cleaner.interval", 60 * 1000);
+    // add log cleaner chore
+    chores.add(new LogCleaner(cleanerInterval, stopper, getConfiguration(), getFileSystem(),
+        LegacyLayout.getOldLogDir(getRootContainer().path)));
+    // add hfile archive cleaner chore
+    chores.add(new HFileCleaner(cleanerInterval, stopper, getConfiguration(), getFileSystem(),
+        LegacyLayout.getArchiveDir(getRootContainer().path), params));
+
+    return chores;
+  }
+
+  @Override
+  public void enableSnapshots() {
+    super.enableSnapshots();
+    if (!isSnapshotsEnabled()) {
+      // Extract cleaners from conf
+      Set<String> hfileCleaners = new HashSet<>();
+      String[] cleaners = getConfiguration().getStrings(HFileCleaner.MASTER_HFILE_CLEANER_PLUGINS);
+      if (cleaners != null) Collections.addAll(hfileCleaners, cleaners);
+
+      // add snapshot related cleaners
+      hfileCleaners.add(SnapshotHFileCleaner.class.getName());
+      hfileCleaners.add(HFileLinkCleaner.class.getName());
+
+      // Set cleaners conf
+      getConfiguration().setStrings(HFileCleaner.MASTER_HFILE_CLEANER_PLUGINS,
+          hfileCleaners.toArray(new String[hfileCleaners.size()]));
+    }
+  }
{code}
Looks like the interface for MasterStorage should specify that {{enableSnapshots()}} will be called before {{getChores}}, just to make sure we keep doing this correctly in the future.
> Remove directory layout/ filesystem references from Cleaners and a few other > modules in master > -- > > Key: HBASE-16957 > URL: https://issues.apache.org/jira/browse/HBASE-16957 > Project: HBase > Issue Type: Sub-task > Components: Filesystem Integration, master >Reporter: Umesh Agashe >Assignee: Umesh Agashe > Attachments: HBASE-16957-hbase-14439.v1.patch > > > Remove directory layout/ filesystem references from Cleaners and a few other > modules in master > {code} > hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterStatusServlet.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/CleanerChore.java > hbase-server/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileLinkCleaner.java > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
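The ordering contract flagged in the review above, that enableSnapshots() mutates the cleaner-plugin configuration which getChores() later reads, can be reduced to a small self-contained sketch (hypothetical names, not the MasterStorage API):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Minimal sketch of the ordering dependency: getChores() instantiates chores
// from whatever the config holds at call time, so config-mutating setup such
// as enableSnapshots() must run first. All names here are illustrative.
class ChoreOrdering {
    private final Set<String> hfileCleanerPlugins =
            new LinkedHashSet<>(Collections.singletonList("TimeToLiveHFileCleaner"));

    void enableSnapshots() {
        // Add snapshot-related cleaners to the configured plugin list.
        hfileCleanerPlugins.add("SnapshotHFileCleaner");
        hfileCleanerPlugins.add("HFileLinkCleaner");
    }

    List<String> getChores() {
        // Chores are built from whatever the config holds *now*.
        return new ArrayList<>(hfileCleanerPlugins);
    }

    public static void main(String[] args) {
        ChoreOrdering storage = new ChoreOrdering();
        storage.enableSnapshots();  // must happen before chores are scheduled
        System.out.println(storage.getChores().size());  // 3: snapshot cleaners included
    }
}
```

Calling getChores() before enableSnapshots() would schedule only the default cleaner and silently miss the snapshot cleaners, which is the hazard the proposed Javadoc guards against.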
[jira] [Commented] (HBASE-16956) Refactor FavoredNodePlan to use regionNames as keys
[ https://issues.apache.org/jira/browse/HBASE-16956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613300#comment-15613300 ] Devaraj Das commented on HBASE-16956: - [~thiruvel] not sure I understand the motivation for this change. Could you please shed some light? Thanks! > Refactor FavoredNodePlan to use regionNames as keys > --- > > Key: HBASE-16956 > URL: https://issues.apache.org/jira/browse/HBASE-16956 > Project: HBase > Issue Type: Sub-task >Reporter: Thiruvel Thirumoolan >Assignee: Thiruvel Thirumoolan >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-16956.master.001.patch > > > We would like to rely on the FNPlan cache whether a region is offline or not. > Sticking to regionNames as keys makes that possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613276#comment-15613276 ] Gary Helmling commented on HBASE-16570: --- I had opened a new JIRA, but yeah that's a better idea. I'll re-open here, revert from 1.3, and close out the new HBASE-16958 as a dupe. > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.3.0, 1.4.0 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16952) Replace hadoop-maven-plugins with protobuf-maven-plugin for building protos
[ https://issues.apache.org/jira/browse/HBASE-16952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-16952: -- Attachment: HBASE-16952.master.003.patch > Replace hadoop-maven-plugins with protobuf-maven-plugin for building protos > --- > > Key: HBASE-16952 > URL: https://issues.apache.org/jira/browse/HBASE-16952 > Project: HBase > Issue Type: Task > Components: build >Reporter: stack >Assignee: stack > Attachments: HBASE-16952.master.001.patch, > HBASE-16952.master.002.patch, HBASE-16952.master.003.patch, > HBASE-16952.master.003.patch > > > protobuf-maven-plugin takes less configuration and avoids duplication -- > having to add a .proto file in the protobuf dir as well as add it explicitly to > pom.xml. This plugin also lets you set more than one source > (hadoop-maven-plugins expects you to compile first w/ one dir and then the > other). > Thanks to [~Apache9] for pointing out this plugin. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14417) Incremental backup and bulk loading
[ https://issues.apache.org/jira/browse/HBASE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14417: --- Attachment: 14417.v23.txt Patch v23 is a working version. All backup / restore tests pass. > Incremental backup and bulk loading > --- > > Key: HBASE-14417 > URL: https://issues.apache.org/jira/browse/HBASE-14417 > Project: HBase > Issue Type: New Feature >Affects Versions: 2.0.0 >Reporter: Vladimir Rodionov >Assignee: Ted Yu >Priority: Critical > Labels: backup > Fix For: 2.0.0 > > Attachments: 14417.v1.txt, 14417.v11.txt, 14417.v13.txt, > 14417.v2.txt, 14417.v21.txt, 14417.v23.txt, 14417.v6.txt > > > Currently, incremental backup is based on WAL files. Bulk data loading > bypasses WALs for obvious reasons, breaking incremental backups. The only way > to continue backups after bulk loading is to create a new full backup of a > table. This may not be feasible for customers who do bulk loading regularly > (say, every day). > Google doc for design: > https://docs.google.com/document/d/1ACCLsecHDvzVSasORgqqRNrloGx4mNYIbvAU7lq5lJE -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16886) hbase-client: scanner with reversed=true and small=true gets no result
[ https://issues.apache.org/jira/browse/HBASE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613215#comment-15613215 ] Hudson commented on HBASE-16886: SUCCESS: Integrated in Jenkins build HBase-Trunk_matrix #1866 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1866/]) HBASE-16886 hbase-client: scanner with reversed=true and small=true gets (tedyu: rev d35b65883c07a4d8d378ce633bb1cc5185ad43c5) * (edit) hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestClientSmallReversedScanner.java * (add) hbase-server/src/test/java/org/apache/hadoop/hbase/client/TestSmallReversedScanner.java * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientSmallReversedScanner.java HBASE-16886 hbase-client: scanner with reversed=true and small=true gets (tedyu: rev a9526f6fdb12efc7d6195185cfc2c8e6aa927af1) * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionUtils.java * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/ClientSmallReversedScanner.java > hbase-client: scanner with reversed=true and small=true gets no result > -- > > Key: HBASE-16886 > URL: https://issues.apache.org/jira/browse/HBASE-16886 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 1.3.0, 1.4.0, 1.2.3, 1.1.7, 0.98.23 >Reporter: huzheng > Labels: patch > Attachments: 16886.addendum, 16886.v4.branch-1.patch, > HBASE-16886.v0.patch, HBASE-16886.v1.patch, HBASE-16886.v2.patch, > HBASE-16886.v3.patch, HBASE-16886.v4.0.98.patch, > HBASE-16886.v4.branch-1.patch, HBASE-16886.v4.branch-1.patch, > HBASE-16886.v4.branch-1.patch, HBASE-16886.v4.master.patch, > TestReversedSmallScan.java > > > Assume HBase has four regions (-oo, b), [b, c), [c, d), [d, +oo), and all > rowKeys are located in region [d, +oo). Using a Reversed Small Scanner will > get no result. > The attached file shows this failed test case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16947) Some improvements for DumpReplicationQueues tool
[ https://issues.apache.org/jira/browse/HBASE-16947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613214#comment-15613214 ] Hudson commented on HBASE-16947: SUCCESS: Integrated in Jenkins build HBase-Trunk_matrix #1866 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1866/]) HBASE-16947 Some improvements for DumpReplicationQueues tool (stack: rev 7b74dd0374638b93e8e274d3c5800bdbff21ac52) * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueuesZKImpl.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/DumpReplicationQueues.java * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/replication/ReplicationQueuesClientZKImpl.java > Some improvements for DumpReplicationQueues tool > > > Key: HBASE-16947 > URL: https://issues.apache.org/jira/browse/HBASE-16947 > Project: HBase > Issue Type: Improvement > Components: Operability, Replication >Affects Versions: 2.0.0, 1.4.0 >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang > Attachments: HBASE-16947-v1.patch, HBASE-16947.patch > > > Recently we hit a too-many-replication-WALs problem in our production cluster. > We needed the DumpReplicationQueues tool to analyze the replication queue info > in zookeeper, so I backported HBASE-16450 to our 0.98-based branch and made some > improvements to it. > 1. Show the dead regionservers under the replication/rs znode. When there are too > many WALs under a znode, they can't be atomically transferred to the new rs znode, so the > dead rs znode is left behind on zookeeper. > 2. Summarize all the queues belonging to peers that have been deleted. > 3. Aggregate the replication queue sizes across all regionservers. Each > regionserver reports ReplicationLoad to the master, but there is no aggregate > metric for replication. > 4. Show how many WALs cannot be found on hdfs. The reason (WAL Not Found) needs > more time to dig into. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16952) Replace hadoop-maven-plugins with protobuf-maven-plugin for building protos
[ https://issues.apache.org/jira/browse/HBASE-16952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613217#comment-15613217 ] Hudson commented on HBASE-16952: SUCCESS: Integrated in Jenkins build HBase-Trunk_matrix #1866 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1866/]) HBASE-16952 Replace hadoop-maven-plugins with protobuf-maven-plugin for (stack: rev d0e61b0e9ae3e998074834c500a663f9412629bc) * (add) hbase-protocol/src/main/java/org/apache/hadoop/hbase/ipc/protobuf/generated/TestProcedureProtos.java * (edit) hbase-protocol/README.txt * (edit) hbase-protocol/pom.xml * (edit) hbase-endpoint/README.txt * (edit) hbase-examples/README.txt * (delete) hbase-protocol-shaded/src/main/protobuf/RowProcessor.proto * (edit) hbase-rest/pom.xml * (edit) hbase-spark/pom.xml * (delete) hbase-protocol-shaded/src/main/protobuf/CellSetMessage.proto * (edit) pom.xml * (edit) hbase-examples/pom.xml * (edit) hbase-protocol-shaded/README.txt * (edit) hbase-protocol-shaded/pom.xml * (add) hbase-rest/README.txt * (edit) hbase-rsgroup/README.txt * (edit) hbase-rsgroup/pom.xml * (edit) hbase-endpoint/pom.xml * (edit) hbase-spark/README.txt > Replace hadoop-maven-plugins with protobuf-maven-plugin for building protos > --- > > Key: HBASE-16952 > URL: https://issues.apache.org/jira/browse/HBASE-16952 > Project: HBase > Issue Type: Task > Components: build >Reporter: stack >Assignee: stack > Attachments: HBASE-16952.master.001.patch, > HBASE-16952.master.002.patch, HBASE-16952.master.003.patch > > > protobuf-maven-plugin takes less configuration and avoids duplication -- > having to add a .proto file in the protobuf dir as well as add it explicitly to > pom.xml. This plugin also lets you set more than one source > (hadoop-maven-plugins expects you to compile first w/ one dir and then the > other). > Thanks to [~Apache9] for pointing out this plugin. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16958) Balancer recomputes block distributions every time balanceCluster() runs
Gary Helmling created HBASE-16958: - Summary: Balancer recomputes block distributions every time balanceCluster() runs Key: HBASE-16958 URL: https://issues.apache.org/jira/browse/HBASE-16958 Project: HBase Issue Type: Bug Components: Balancer Reporter: Gary Helmling Assignee: Gary Helmling Fix For: 1.3.0 The change in HBASE-16570 modified the balancer to compute block distributions in parallel with a pool of 5 threads. However, because it does this every time Cluster is instantiated, it effectively bypasses the cache of block locations added in HBASE-14473: In the LoadBalancer.balanceCluster() implementations (in StochasticLoadBalancer, SimpleLoadBalancer), we create a new Cluster instance. In the Cluster constructor, we call registerRegion() on every HRegionInfo. In registerRegion(), we do the following: {code} regionLocationFutures.set(regionIndex, regionFinder.asyncGetBlockDistribution(region)); {code} Then, back in the Cluster constructor, we do a get() on each ListenableFuture in a loop. So while we are doing the calls to get block locations in parallel with 5 threads, we're recomputing them every time balanceCluster() is called and not taking advantage of the cache at all. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
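The cache-bypass described above can be sketched with a simplified stand-in (hypothetical names and a plain double as the "distribution"; the real RegionLocationFinder caches HDFSBlocksDistribution objects fetched via namenode calls). The point is only the lookup pattern: a cached finder recomputes on a miss and reuses the cached value on every later balanceCluster() pass, instead of recomputing for each new Cluster instance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Simplified stand-in for RegionLocationFinder's expensive locality lookup.
class RegionLocationFinderSketch {
    final AtomicInteger computations = new AtomicInteger();
    private final Map<String, Double> cache = new ConcurrentHashMap<>();

    // Simulates computing a region's block locality (an expensive HDFS
    // namenode round-trip in real HBase).
    private double computeBlockDistribution(String regionName) {
        computations.incrementAndGet();
        return Math.abs(regionName.hashCode() % 100) / 100.0; // placeholder value
    }

    // Cached lookup: recomputes only on a miss, as the HBASE-14473 cache intended.
    double getBlockDistribution(String regionName) {
        return cache.computeIfAbsent(regionName, this::computeBlockDistribution);
    }
}

public class BalancerCacheDemo {
    public static void main(String[] args) {
        RegionLocationFinderSketch finder = new RegionLocationFinderSketch();
        String[] regions = {"region-a", "region-b", "region-c"};

        // First "balanceCluster" pass: every region is a cache miss.
        for (String r : regions) finder.getBlockDistribution(r);
        // Second pass: all cache hits, no recomputation.
        for (String r : regions) finder.getBlockDistribution(r);

        System.out.println(finder.computations.get()); // 3, not 6
    }
}
```

In the buggy path, by contrast, each Cluster constructor issues fresh asyncGetBlockDistribution() calls, so the computation count grows with every balancer run regardless of the cache.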
[jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
[ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613157#comment-15613157 ] stack commented on HBASE-16570: --- Reopen [~ghelmling]? Revert from 1.3 in meantime. > Compute region locality in parallel at startup > -- > > Key: HBASE-16570 > URL: https://issues.apache.org/jira/browse/HBASE-16570 > Project: HBase > Issue Type: Sub-task >Reporter: binlijin >Assignee: binlijin > Fix For: 2.0.0, 1.3.0, 1.4.0 > > Attachments: HBASE-16570-master_V1.patch, > HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch, > HBASE-16570-master_V4.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)