[jira] Commented: (HBASE-2797) Another NPE in ReadWriteConsistencyControl

2010-06-29 Thread Dave Latham (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-2797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883548#action_12883548 ]

Dave Latham commented on HBASE-2797:


Also getting them, with a similar stack trace:

Exception in thread regionserver/192.168.41.19:60020.leaseChecker java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.getThreadReadPoint(ReadWriteConsistencyControl.java:40)
    at org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.getNext(MemStore.java:532)
    at org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.seek(MemStore.java:558)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:320)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.checkReseek(StoreScanner.java:306)
    at org.apache.hadoop.hbase.regionserver.StoreScanner.peek(StoreScanner.java:143)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:127)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:117)
    at java.util.PriorityQueue.siftDownUsingComparator(PriorityQueue.java:644)
    at java.util.PriorityQueue.siftDown(PriorityQueue.java:612)
    at java.util.PriorityQueue.poll(PriorityQueue.java:523)
    at org.apache.hadoop.hbase.regionserver.KeyValueHeap.close(KeyValueHeap.java:151)
    at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.close(HRegion.java:1971)
    at org.apache.hadoop.hbase.regionserver.HRegionServer$ScannerListener.leaseExpired(HRegionServer.java:1962)
    at org.apache.hadoop.hbase.Leases.run(Leases.java:98)
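For what it's worth, a minimal sketch of one likely failure mode (an assumption on my part, not confirmed by the report): getThreadReadPoint() returns a primitive long backed by a ThreadLocal<Long>, so a thread that never had its read point set (such as the lease-checker thread closing a scanner) would hit an NPE on auto-unboxing. Class and field names below are hypothetical stand-ins:

```java
public class ThreadReadPointSketch {
    // Stand-in for the per-thread read point inside ReadWriteConsistencyControl.
    private static final ThreadLocal<Long> perThreadReadPoint = new ThreadLocal<Long>();

    // Mirrors the shape of getThreadReadPoint(): the primitive return type
    // forces auto-unboxing, so a null (never-initialized) value throws NPE.
    static long getThreadReadPoint() {
        return perThreadReadPoint.get(); // NPE if this thread never set a read point
    }

    // Returns true when calling from an uninitialized thread raises the NPE.
    static boolean triggersNpe() {
        try {
            getThreadReadPoint();
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("NPE on uninitialized thread: " + triggersNpe());
    }
}
```

If this is indeed the cause, either initializing the read point for non-handler threads or null-checking in getThreadReadPoint() would avoid the crash.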

 Another NPE in ReadWriteConsistencyControl
 --

 Key: HBASE-2797
 URL: https://issues.apache.org/jira/browse/HBASE-2797
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.20.5
Reporter: Dave Latham
Assignee: ryan rawson
Priority: Blocker
 Fix For: 0.20.6


 This occurred on a cluster with 46 slaves running a couple of MR jobs, one of
 which was doing heavy writes, copying everything from one table to a new table
 with a different schema.  After one regionserver went down, about 40 of them
 died within an hour before the problem was caught and the jobs stopped.  Let
 me know if any other piece of context would be particularly helpful.
 This exception appears in the .out file:
 Exception in thread regionserver/192.168.41.2:60020 java.lang.NullPointerException
     at org.apache.hadoop.hbase.regionserver.ReadWriteConsistencyControl.getThreadReadPoint(ReadWriteConsistencyControl.java:40)
     at org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.getNext(MemStore.java:532)
     at org.apache.hadoop.hbase.regionserver.MemStore$MemStoreScanner.seek(MemStore.java:558)
     at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:320)
     at org.apache.hadoop.hbase.regionserver.StoreScanner.checkReseek(StoreScanner.java:306)
     at org.apache.hadoop.hbase.regionserver.StoreScanner.peek(StoreScanner.java:143)
     at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:127)
     at org.apache.hadoop.hbase.regionserver.KeyValueHeap$KVScannerComparator.compare(KeyValueHeap.java:117)
     at java.util.PriorityQueue.siftDownUsingComparator(PriorityQueue.java:644)
     at java.util.PriorityQueue.siftDown(PriorityQueue.java:612)
     at java.util.PriorityQueue.poll(PriorityQueue.java:523)
     at org.apache.hadoop.hbase.regionserver.KeyValueHeap.close(KeyValueHeap.java:151)
     at org.apache.hadoop.hbase.regionserver.HRegion$RegionScanner.close(HRegion.java:1971)
     at org.apache.hadoop.hbase.regionserver.HRegionServer.closeAllRegions(HRegionServer.java:1610)
     at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:621)
     at java.lang.Thread.run(Thread.java:619)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HBASE-2803) Remove remaining Get code from Store.java, etc

2010-06-29 Thread ryan rawson (JIRA)
Remove remaining Get code from Store.java, etc
-

 Key: HBASE-2803
 URL: https://issues.apache.org/jira/browse/HBASE-2803
 Project: HBase
  Issue Type: Bug
Reporter: ryan rawson
 Fix For: 0.21.0


There is still Get code remaining due to HBASE-2248; remove it!




[jira] Commented: (HBASE-2801) Pid file should be cleaned up when stop succeeds

2010-06-29 Thread stack (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883606#action_12883606 ]

stack commented on HBASE-2801:
--

It should be cleaning it up? It's not?

 Pid file should be cleaned up when stop succeeds
 

 Key: HBASE-2801
 URL: https://issues.apache.org/jira/browse/HBASE-2801
 Project: HBase
  Issue Type: Bug
Reporter: Alex Newman
Assignee: Alex Newman
Priority: Minor






[jira] Commented: (HBASE-2801) Pid file should be cleaned up when stop succeeds

2010-06-29 Thread Alex Newman (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-2801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883610#action_12883610 ]

Alex Newman commented on HBASE-2801:


Correct

 Pid file should be cleaned up when stop succeeds
 

 Key: HBASE-2801
 URL: https://issues.apache.org/jira/browse/HBASE-2801
 Project: HBase
  Issue Type: Bug
Reporter: Alex Newman
Assignee: Alex Newman
Priority: Minor






[jira] Created: (HBASE-2804) [replication] Support ICVs in a master-master setup

2010-06-29 Thread Jean-Daniel Cryans (JIRA)
[replication] Support ICVs in a master-master setup
---

 Key: HBASE-2804
 URL: https://issues.apache.org/jira/browse/HBASE-2804
 Project: HBase
  Issue Type: New Feature
Reporter: Jean-Daniel Cryans
 Fix For: 0.21.0


Currently an ICV ends up as a Put in the HLogs, which ReplicationSource ships 
to ReplicationSink, which in turn only recreates the Put and not the ICV itself. 
This means that in a master-master replication setup where the same counters 
are incremented on both sides, the Puts will actually overwrite each other.

We need to find a way to support this use case.




[jira] Commented: (HBASE-2804) [replication] Support ICVs in a master-master setup

2010-06-29 Thread ryan rawson (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883712#action_12883712 ]

ryan rawson commented on HBASE-2804:


It would make sense to 'shard' ICVs by datacenter, where each datacenter gets 
its own ICV column; anyone wishing to know the total would just get all the 
columns and sum them.  Different datacenters would not overwrite each other.  The 
only problem is that this is more of an application-level thing, and it isn't 
baked into the API anywhere, setting people up for failure down the road.

The problem with doing something like shipping deltas is that it becomes 
difficult to bring up a new cluster, since the cluster will need a 'starting 
point' combined with a sequence of deltas that must mesh perfectly or else the 
replica cluster will be out of sync.
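A minimal in-memory sketch of the sharding idea above (the Map stands in for the columns of one counter row under a single family; in a real cluster each shard would be bumped server-side, e.g. via HTable.incrementColumnValue, and the datacenter ids here are hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// Models one counter row: one column per datacenter. Because each
// datacenter only ever increments its own column, replicated writes
// from the other datacenter land in a different column and cannot
// overwrite the local count.
public class ShardedCounter {
    private final Map<String, Long> columns = new HashMap<String, Long>();

    // Increment this datacenter's shard and return its new value.
    public long increment(String datacenterId, long amount) {
        long next = columns.getOrDefault(datacenterId, 0L) + amount;
        columns.put(datacenterId, next);
        return next;
    }

    // A reader wanting the overall value fetches all shard columns and sums.
    public long total() {
        long sum = 0;
        for (long value : columns.values()) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        ShardedCounter counter = new ShardedCounter();
        counter.increment("dc1", 3);
        counter.increment("dc2", 2);
        counter.increment("dc1", 1);
        System.out.println(counter.total()); // 4 from dc1 + 2 from dc2 = 6
    }
}
```

As noted, this lives at the application level: nothing in the API stops a client from incrementing another datacenter's column.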

 [replication] Support ICVs in a master-master setup
 ---

 Key: HBASE-2804
 URL: https://issues.apache.org/jira/browse/HBASE-2804
 Project: HBase
  Issue Type: New Feature
Reporter: Jean-Daniel Cryans
 Fix For: 0.21.0


 Currently an ICV ends up as a Put in the HLogs, which ReplicationSource ships 
 to ReplicationSink, which in turn only recreates the Put and not the ICV 
 itself. This means that in a master-master replication setup where the same 
 counters are incremented on both sides, the Puts will actually overwrite each 
 other.
 We need to find a way to support this use case.




[jira] Resolved: (HBASE-2501) Refactor StoreFile Code

2010-06-29 Thread ryan rawson (JIRA)

 [ https://issues.apache.org/jira/browse/HBASE-2501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ryan rawson resolved HBASE-2501.


Resolution: Fixed

FIXED!

 Refactor StoreFile Code
 ---

 Key: HBASE-2501
 URL: https://issues.apache.org/jira/browse/HBASE-2501
 Project: HBase
  Issue Type: Improvement
  Components: regionserver
Affects Versions: 0.20.5
Reporter: Nicolas Spiegelberg
Assignee: ryan rawson
Priority: Minor
 Fix For: 0.21.0


 Currently, the StoreFile code is a thin wrapper around an HFile.Reader.  With 
 the addition of BloomFilters and other features that operate at the HFile 
 layer, we need to clarify the difference between a StoreFile & HFile.  To 
 that end, we need to refactor the StoreFile.Reader code and the code that 
 inter-operates with it.




[jira] Created: (HBASE-2805) [shell] Add support for 'x = get TABLENAME, ...'

2010-06-29 Thread stack (JIRA)
[shell] Add support for 'x = get TABLENAME, ...'
--

 Key: HBASE-2805
 URL: https://issues.apache.org/jira/browse/HBASE-2805
 Project: HBase
  Issue Type: Improvement
Reporter: stack


In the shell, if you do a get, it emits the content on STDOUT.  It'd be better 
if this behavior only happened when you did not supply an 'x = ' prefix.  In the 
latter case, x would hold the Result returned by the get.  This kind of behavior 
should come across as natural enough.  For example, if you fire up the Python 
interpreter and no variable is supplied to catch a result, the content is emitted 
on STDOUT.




[jira] Commented: (HBASE-50) Snapshot of table

2010-06-29 Thread Li Chongxin (JIRA)

[ https://issues.apache.org/jira/browse/HBASE-50?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883779#action_12883779 ]

Li Chongxin commented on HBASE-50:
--

bq. isSnapshot in HRI? 
bq. Will keeping snapshot data in .META. work? .META. is by region but regions 
are deleted after a split but you want your snapshot to live beyond this?

Snapshot data, actually the reference counts of hfiles, will be kept in the 
.META. table, but in a different row than the original region row, so this 
reference count information will not be deleted after a split. Reference count 
information is kept there because it is also a region-centric view: the 
reference counts for a region's hfiles are kept together in a single row in 
.META., no matter whether an hfile is still in use or has been archived. I 
described this in Appendix A of the document.
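A toy model of the reference-counting rule described above (a sketch under my own naming; the real design keeps these counts in a per-region .META. row rather than in memory):

```java
import java.util.HashMap;
import java.util.Map;

// Tracks how many snapshots still reference each hfile of a region.
// An hfile (live or archived) may only be physically deleted once its
// reference count drops to zero.
public class HFileRefCounts {
    private final Map<String, Integer> refs = new HashMap<String, Integer>();

    // Taking a snapshot bumps the count for every hfile it references.
    public void addReference(String hfile) {
        refs.put(hfile, refs.getOrDefault(hfile, 0) + 1);
    }

    // Deleting a snapshot releases one reference; returns true when the
    // hfile is no longer referenced and is safe to delete.
    public boolean release(String hfile) {
        int remaining = refs.getOrDefault(hfile, 0) - 1;
        if (remaining <= 0) {
            refs.remove(hfile);
            return true;
        }
        refs.put(hfile, remaining);
        return false;
    }

    public static void main(String[] args) {
        HFileRefCounts counts = new HFileRefCounts();
        counts.addReference("hfile-A");
        counts.addReference("hfile-A");
        System.out.println(counts.release("hfile-A")); // still referenced by one snapshot
        System.out.println(counts.release("hfile-A")); // last reference gone: safe to delete
    }
}
```

Keeping the counts in a row separate from the region's own .META. row is what lets them outlive a split, as described above.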

bq. In zk, writeZnode and readZnode ain't the best names for methods... what 
kinda znodes are these? (Jon says these already exist, that they are not your 
fault)

Actually, the snapshot method names in ZooKeeperWrapper are startSnapshotOnZK, 
abortSnapshotOnZK, and registerRSForSnapshot. I put writeZnode and readZnode in 
the diagram because I think I can use them inside the above methods.
Do you think we should make writeZnode and readZnode private and just use them 
inside ZooKeeperWrapper?

bq. Can you make a SnapShot class into which encapsulate all related to 
snapshotting rather than adding new data members to HMaster? Maybe you do 
encapsulate it all into snapshotmonitor?

I haven't figured out all the data members in the design. I will create a 
Snapshot class to encapsulate the related fields if necessary during 
implementation.

bq. Can you call RSSnapshotHandler just SnapshotHandler?

sure

bq. You probably don't need to support String overloads.

You mean methods in HBaseAdmin?

A repository has been created in github with the initial content of hbase/trunk
http://github.com/lichongxin/hbase-snapshot

 Snapshot of table
 -

 Key: HBASE-50
 URL: https://issues.apache.org/jira/browse/HBASE-50
 Project: HBase
  Issue Type: New Feature
Reporter: Billy Pearson
Assignee: Li Chongxin
Priority: Minor
 Attachments: HBase Snapshot Design Report V2.pdf, HBase Snapshot 
 Design Report V3.pdf, HBase Snapshot Implementation Plan.pdf, Snapshot Class 
 Diagram.png


 Having an option to take a snapshot of a table would be very useful in 
 production.
 What I would like to see this option do is a merge of all the data into 
 one or more files stored in the same folder on the DFS. This way we could 
 save data in case of a software bug in Hadoop or user code. 
 The other advantage would be the ability to export a table to multiple locations. 
 Say I had a read_only table that must be online. I could take a snapshot of 
 it when needed, export it to a separate data center, and have it loaded 
 there; then I would have it online at multiple data centers for load 
 balancing and failover.
 I understand that Hadoop removes the need for backups to protect 
 against failed servers, but that does not protect us from software bugs that 
 might delete or alter data in ways we did not plan. We should have a way to 
 roll back a dataset.
