[jira] Commented: (HBASE-2797) Another NPE in ReadWriteConsistencyControl
[ https://issues.apache.org/jira/browse/HBASE-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883548#action_12883548 ]

Dave Latham commented on HBASE-2797:
------------------------------------

Also getting them with a similar stack trace:

Exception in thread regionserver/192.168.41.19:60020.leaseChecker java.lang.NullPointerException
	at org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.getThreadReadPoint(ReadWriteConsistencyControl.java:40)
	at org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.getNext(MemStore.java:532)
	at org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.seek(MemStore.java:558)
	at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:320)
	at org.apache.hadoop.hbase.regionserver.StoreScanner.checkReseek(StoreScanner.java:306)
	at org.apache.hadoop.hbase.regionserver.StoreScanner.peek(StoreScanner.java:143)
	at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:127)
	at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:117)
	at java.util.PriorityQueue.siftDownUsingComparator(PriorityQueue.java:644)
	at java.util.PriorityQueue.siftDown(PriorityQueue.java:612)
	at java.util.PriorityQueue.poll(PriorityQueue.java:523)
	at org.apache.hadoop.hbase.regionserver.KeyValueHeap.close(KeyValueHeap.java:151)
	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.close(HRegion.java:1971)
	at org.apache.hadoop.hbase.regionserver.HRegionServer$ScannerListener.leaseExpired(HRegionServer.java:1962)
	at org.apache.hadoop.hbase.Leases.run(Leases.java:98)

Another NPE in ReadWriteConsistencyControl
------------------------------------------

                Key: HBASE-2797
                URL: https://issues.apache.org/jira/browse/HBASE-2797
            Project: HBase
         Issue Type: Bug
   Affects Versions: 0.20.5
           Reporter: Dave Latham
           Assignee: ryan rawson
           Priority: Blocker
            Fix For: 0.20.6

This occurred on a cluster with 46 slaves running a couple of MR jobs, one doing heavy writes copying everything from one table to a new table with a different schema. After one regionserver went down, about 40 of them died within an hour before it was caught and the jobs stopped. Let me know if any other piece of context would be particularly helpful. This exception appears in the .out file:

Exception in thread regionserver/192.168.41.2:60020 java.lang.NullPointerException
	at org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.getThreadReadPoint(ReadWriteConsistencyControl.java:40)
	at org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.getNext(MemStore.java:532)
	at org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.seek(MemStore.java:558)
	at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:320)
	at org.apache.hadoop.hbase.regionserver.StoreScanner.checkReseek(StoreScanner.java:306)
	at org.apache.hadoop.hbase.regionserver.StoreScanner.peek(StoreScanner.java:143)
	at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:127)
	at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:117)
	at java.util.PriorityQueue.siftDownUsingComparator(PriorityQueue.java:644)
	at java.util.PriorityQueue.siftDown(PriorityQueue.java:612)
	at java.util.PriorityQueue.poll(PriorityQueue.java:523)
	at org.apache.hadoop.hbase.regionserver.KeyValueHeap.close(KeyValueHeap.java:151)
	at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.close(HRegion.java:1971)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.closeAllRegions(HRegionServer.java:1610)
	at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:621)
	at java.lang.Thread.run(Thread.java:619)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
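Both traces bottom out in ReadWriteConsistencyControl.getThreadReadPoint, called from threads (the lease checker, the shutdown path) that are closing scanners rather than serving requests. A plausible mechanism for this class of NPE, sketched below as an assumption rather than the actual HBase source, is a per-thread read point held in a ThreadLocal: a thread that never set its read point gets null back, which throws when unboxed to a primitive long.

```java
// Minimal sketch (NOT the actual HBase source) of how a ThreadLocal
// read point can produce the NPE in the traces above. The class and
// field names mirror the report but the implementation is hypothetical.
public class ReadPointSketch {
    // Hypothetical stand-in for the per-thread read point.
    private static final ThreadLocal<Long> perThreadReadPoint = new ThreadLocal<Long>();

    // Returning a primitive long auto-unboxes the Long, so any thread
    // that never set a read point throws NullPointerException here.
    public static long getThreadReadPoint() {
        return perThreadReadPoint.get();
    }

    public static void main(String[] args) {
        try {
            getThreadReadPoint();
            System.out.println("read point was set");
        } catch (NullPointerException e) {
            System.out.println("NPE: no read point set on this thread");
        }
    }
}
```

Under this reading, any fix would need to either set the thread-local before the scanner is touched or tolerate an unset read point on close paths; the actual patch for the issue may of course take a different approach.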
[jira] Created: (HBASE-2803) Remove remaining Get code from Store.java,etc
Remove remaining Get code from Store.java, etc.
-----------------------------------------------

                Key: HBASE-2803
                URL: https://issues.apache.org/jira/browse/HBASE-2803
            Project: HBase
         Issue Type: Bug
           Reporter: ryan rawson
            Fix For: 0.21.0

There is still remaining Get code due to HBASE-2248; remove it!
[jira] Commented: (HBASE-2801) Pid file should be cleaned up when stop succeeds
[ https://issues.apache.org/jira/browse/HBASE-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883606#action_12883606 ]

stack commented on HBASE-2801:
------------------------------

It should be cleaning it up? It's not?

Pid file should be cleaned up when stop succeeds
------------------------------------------------

                Key: HBASE-2801
                URL: https://issues.apache.org/jira/browse/HBASE-2801
            Project: HBase
         Issue Type: Bug
           Reporter: Alex Newman
           Assignee: Alex Newman
           Priority: Minor
[jira] Commented: (HBASE-2801) Pid file should be cleaned up when stop succeeds
[ https://issues.apache.org/jira/browse/HBASE-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883610#action_12883610 ]

Alex Newman commented on HBASE-2801:
------------------------------------

Correct

Pid file should be cleaned up when stop succeeds
------------------------------------------------

                Key: HBASE-2801
                URL: https://issues.apache.org/jira/browse/HBASE-2801
            Project: HBase
         Issue Type: Bug
           Reporter: Alex Newman
           Assignee: Alex Newman
           Priority: Minor
[jira] Created: (HBASE-2804) [replication] Support ICVs in a master-master setup
[replication] Support ICVs in a master-master setup
----------------------------------------------------

                Key: HBASE-2804
                URL: https://issues.apache.org/jira/browse/HBASE-2804
            Project: HBase
         Issue Type: New Feature
           Reporter: Jean-Daniel Cryans
            Fix For: 0.21.0

Currently an ICV ends up as a Put in the HLogs, which ReplicationSource ships to ReplicationSink, which in turn only recreates the Put and not the ICV itself. This means that in a master-master replication setup where the same counters are incremented on both sides, the Puts will actually overwrite each other. We need to find a way to support this use case.
[jira] Commented: (HBASE-2804) [replication] Support ICVs in a master-master setup
[ https://issues.apache.org/jira/browse/HBASE-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883712#action_12883712 ]

ryan rawson commented on HBASE-2804:
------------------------------------

It would make sense to 'shard' ICVs by datacenter, where each datacenter gets its own ICV column; anyone wishing to know the total would just get all the columns and sum them. Different datacenters would not overwrite each other. The only problem is that this is more of an application-level thing, and it isn't baked into the API anywhere, setting people up for failure down the road.

The problem with doing something like shipping deltas is that it becomes difficult to bring up a new cluster, since the cluster will need a 'starting point' combined with a sequence of deltas that must mesh perfectly, or else the replica cluster will be out of sync.

[replication] Support ICVs in a master-master setup
----------------------------------------------------

                Key: HBASE-2804
                URL: https://issues.apache.org/jira/browse/HBASE-2804
            Project: HBase
         Issue Type: New Feature
           Reporter: Jean-Daniel Cryans
            Fix For: 0.21.0

Currently an ICV ends up as a Put in the HLogs, which ReplicationSource ships to ReplicationSink, which in turn only recreates the Put and not the ICV itself. This means that in a master-master replication setup where the same counters are incremented on both sides, the Puts will actually overwrite each other. We need to find a way to support this use case.
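The sharding idea from the comment above can be sketched in plain Java. A HashMap stands in for the HBase row, and the "counter:datacenter" qualifier scheme is invented for illustration; this is an application-level sketch, not an HBase client API.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of per-datacenter counter sharding: each datacenter increments
// only its own column (e.g. "hits:dc1"), so replicated Puts from
// different datacenters land in different columns and never collide.
// Readers sum every shard to recover the total. The Map models the row.
public class ShardedCounter {
    private final Map<String, Long> row = new HashMap<String, Long>();

    // Each datacenter writes only to its own qualifier.
    public void increment(String counter, String datacenter, long amount) {
        row.merge(counter + ":" + datacenter, amount, Long::sum);
    }

    // The total is reconstructed at read time by summing all shards.
    public long total(String counter) {
        String prefix = counter + ":";
        return row.entrySet().stream()
                .filter(e -> e.getKey().startsWith(prefix))
                .mapToLong(Map.Entry::getValue)
                .sum();
    }
}
```

Because a datacenter only ever writes its own column, a replicated Put carrying dc1's latest shard value overwrites only dc1's column on the remote cluster, which is exactly the value it should have; the collision described in the issue disappears, at the cost of pushing the summation into the application.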
[jira] Resolved: (HBASE-2501) Refactor StoreFile Code
[ https://issues.apache.org/jira/browse/HBASE-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ryan rawson resolved HBASE-2501.
--------------------------------

    Resolution: Fixed

FIXED!

Refactor StoreFile Code
-----------------------

                Key: HBASE-2501
                URL: https://issues.apache.org/jira/browse/HBASE-2501
            Project: HBase
         Issue Type: Improvement
         Components: regionserver
   Affects Versions: 0.20.5
           Reporter: Nicolas Spiegelberg
           Assignee: ryan rawson
           Priority: Minor
            Fix For: 0.21.0

Currently, the StoreFile code is a thin wrapper around an HFile.Reader. With the addition of BloomFilters and other features that operate at the HFile layer, we need to clarify the difference between a StoreFile and an HFile. To that end, we need to refactor the StoreFile.Reader code and the code that inter-operates with it.
[jira] Created: (HBASE-2805) [shell] Add support for 'x = get TABLENAME, ...'
[shell] Add support for 'x = get TABLENAME, ...'
------------------------------------------------

                Key: HBASE-2805
                URL: https://issues.apache.org/jira/browse/HBASE-2805
            Project: HBase
         Issue Type: Improvement
           Reporter: stack

In the shell, if you do a get, it emits the content on STDOUT. It'd be better if this behavior only happened when you did not supply an 'x = ' prefix. In the latter case, x would hold the Result returned by the get. This kind of behavior should come across as natural enough. For example, if you fire up the Python interpreter and no variable is supplied to catch a result, the content is emitted on STDOUT.
[jira] Commented: (HBASE-50) Snapshot of table
[ https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883779#action_12883779 ]

Li Chongxin commented on HBASE-50:
----------------------------------

bq. isSnapshot in HRI?
bq. Will keeping snapshot data in .META. work? .META. is by region but regions are deleted after a split but you want your snapshot to live beyond this?

Snapshot data, actually the reference counts of hfiles, will be kept in the .META. table, but in a different row than the original region row, so this reference count information will not be deleted after a split. Reference count information is kept there because it is also region-centric: the reference counts for a region's hfiles are kept together in one .META. row, whether an hfile is still in use or has been archived. I described this in Appendix A of the document.

bq. In zk, writeZnode and readZnode ain't the best names for methods... what kinda znodes are these? (Jon says these already exist, that they are not your fault)

Actually the method names for snapshots are startSnapshotOnZK, abortSnapshotOnZK, and registerRSForSnapshot in ZooKeeperWrapper. I put writeZnode and readZnode in the diagram because I think I can use them inside the above methods. Do you think we should make writeZnode and readZnode private and just use them inside ZooKeeperWrapper?

bq. Can you make a SnapShot class into which encapsulate all related to snapshotting rather than adding new data members to HMaster? Maybe you do encapsulate it all into snapshotmonitor?

I haven't figured out all the data members in the design. I will create a Snapshot class to encapsulate the related fields if necessary during implementation.

bq. Can you call RSSnapshotHandler just SnapshotHandler?

Sure.

bq. You probably don't need to support String overloads.

You mean methods in HBaseAdmin?

A repository has been created on github with the initial content of hbase/trunk: http://github.com/lichongxin/hbase-snapshot

Snapshot of table
-----------------

                Key: HBASE-50
                URL: https://issues.apache.org/jira/browse/HBASE-50
            Project: HBase
         Issue Type: New Feature
           Reporter: Billy Pearson
           Assignee: Li Chongxin
           Priority: Minor
        Attachments: HBase Snapshot Design Report V2.pdf, HBase Snapshot Design Report V3.pdf, HBase Snapshot Implementation Plan.pdf, Snapshot Class Diagram.png

Having an option to take a snapshot of a table would be very useful in production. What I would like to see this option do is a merge of all the data into one or more files stored in the same folder on the DFS. This way we could save data in case of a software bug in Hadoop or user code. The other advantage would be being able to export a table to multiple locations. Say I had a read-only table that must be online. I could take a snapshot of it when needed, export it to a separate data center, and have it loaded there; then I would have it online at multiple data centers for load balancing and failover. I understand that Hadoop takes away the need for backups to protect against failed servers, but this does not protect us from software bugs that might delete or alter data in ways we did not plan. We should have a way to roll back a dataset.
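The point about reference counts surviving a split can be sketched with plain maps standing in for .META. rows. The row-key prefixes below are invented for illustration; the actual schema is the one described in Appendix A of the design document.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: snapshot hfile reference counts live in their own .META. row,
// keyed separately from the region row, so deleting the region row
// after a split leaves the counts intact. Row-key prefixes are made up.
public class MetaRowsSketch {
    private final Map<String, Map<String, Long>> meta = new HashMap<String, Map<String, Long>>();

    public void putRegionRow(String region) {
        meta.put("region:" + region, new HashMap<String, Long>());
    }

    // Reference counts for a region's hfiles share one dedicated row,
    // whether the hfile is still in use or has been archived.
    public void setRefCount(String region, String hfile, long refs) {
        meta.computeIfAbsent("snapshot-refs:" + region, k -> new HashMap<String, Long>())
            .put(hfile, refs);
    }

    // A split deletes the parent region's row...
    public void splitRegion(String region) {
        meta.remove("region:" + region);
    }

    // ...but the refcount row is keyed differently and survives.
    public Long refCount(String region, String hfile) {
        Map<String, Long> row = meta.get("snapshot-refs:" + region);
        return row == null ? null : row.get(hfile);
    }
}
```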