Region size per region on the table page

2013-08-01 Thread samar.opensource

Hi Devs/Users,
   Most of the time we want to know if our table split logic is
accurate or if our current regions are well balanced for a table. I was
wondering if we can expose the size of each region on table.jsp too, in
the table's region list. If people think it is useful I can pick it up.
Also let me know if it already exists.


Samar
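
For anyone who wants this before it shows up in the UI, a rough sketch of
computing per-region sizes straight from HDFS. This assumes the 0.94-style
layout where each region directory sits under hbase.rootdir/<tableName>;
the table name "myTable" is a made-up placeholder.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class RegionSizes {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // Assumed 0.94 layout: <hbase.rootdir>/<tableName>/<encodedRegionName>/...
    Path root = new Path(conf.get("hbase.rootdir"));
    FileSystem fs = root.getFileSystem(conf);
    Path tableDir = new Path(root, "myTable"); // placeholder table name
    for (FileStatus entry : fs.listStatus(tableDir)) {
      if (!entry.isDir()) continue; // skip .tableinfo and other plain files
      long bytes = fs.getContentSummary(entry.getPath()).getLength();
      System.out.println(entry.getPath().getName() + "\t" + bytes);
    }
  }
}

Printing one encoded region name per line with its total on-disk size makes
it easy to eyeball how well balanced the table is.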


Re: Region size per region on the table page

2013-08-01 Thread Jean-Marc Spaggiari
Hi Samar

Hannibal is already doing what you are looking for.

Cheers,

JMS

2013/8/1 samar.opensource samar.opensou...@gmail.com

 Hi Devs/Users,
Most of the time we want to know if our table split logic is accurate
 or if our current regions are well balanced for a table. I was wondering if
 we can expose the size of each region on table.jsp too, in the table's
 region list. If people think it is useful I can pick it up. Also let me
 know if it already exists.

 Samar



AssignmentManager looping?

2013-08-01 Thread Jean-Marc Spaggiari
My master keeps logging this:

2013-07-31 21:52:59,201 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Region
270a9c371fcbe9cd9a04986e0b77d16b not found on server
node7,60020,1375319044055; failed processing
2013-07-31 21:52:59,201 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region
270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but
it doesn't exist anymore, probably already processed its split
[... the same pair of WARN lines repeats several times per second ...]
2013-07-31 21:53:00,417 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Region
270a9c371fcbe9cd9a04986e0b77d16b not found on server
node7,60020,1375319044055; failed processing
2013-07-31 21:53:00,417 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region
270a9c371fcbe9cd9a04986e0b77d16b from server node7,60020,1375319044055 but
it doesn't exist anymore, probably already processed its split

hbase@node3:~/hbase-0.94.3$ cat logs/hbase-hbase-master-node3.log* | grep
"Region 270a9c371fcbe9cd9a04986e0b77d16b not found" | wc
   5042   65546  927728


Then it crashed.
2013-07-31 22:22:46,072 FATAL org.apache.hadoop.hbase.master.HMaster:
Master server abort: loaded coprocessors are: []
2013-07-31 22:22:46,073 FATAL org.apache.hadoop.hbase.master.HMaster:
Unexpected state : work_proposed,\x02\xE8\x92'\x00\x00\x00\x00
http://video.inportnews.ca/search/all/source/sun-news-network/harry-potter-in-translation/68463493001/page/1526,1375307272709.d95bb27cc026511c2a8c8ad155e79bf6.
state=OPENING, ts=1375323766008, server=node7,60020,1375319044055 .. Cannot
transit it to OFFLINE.
java.lang.IllegalStateException: Unexpected state :
work_proposed,\x02\xE8\x92'\x00\x00\x00\x00
http://video.inportnews.ca/search/all/source/sun-news-network/harry-potter-in-translation/68463493001/page/1526,1375307272709.d95bb27cc026511c2a8c8ad155e79bf6.
state=OPENING, ts=1375323766008, server=node7,60020,1375319044055 .. Cannot
transit it to OFFLINE.
        at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
        at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
        at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
        at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)
2013-07-31 22:22:46,075 INFO 

Re: AssignmentManager looping?

2013-08-01 Thread Kevin O'dell
Does it exist in meta or hdfs?
On Aug 1, 2013 8:24 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org
wrote:

 My master keeps logging this:

 2013-07-31 21:52:59,201 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Region
 270a9c371fcbe9cd9a04986e0b77d16b not found on server
 node7,60020,1375319044055; failed processing
 [rest of quoted log and stack trace snipped; see the original message above]

slow operation in postPut

2013-08-01 Thread Pavel Hančar
 Hello,
I have a class extending BaseRegionObserver and I use the postPut method to
run a slow procedure. I'd like to run several of these procedures in multiple
threads. Is it possible to run multiple HTable.put(put) calls concurrently? I
tried, but I get this error in each thread:

Exception in thread "Thread-3" java.lang.IndexOutOfBoundsException: Index:
1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:604)
at java.util.ArrayList.remove(ArrayList.java:445)
at
org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:966)
at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:811)
at org.apache.hadoop.hbase.client.HTable.put(HTable.java:786)
at img.PutFilesThread.run(PutFilesThread.java:74)
at java.lang.Thread.run(Thread.java:724)

Does anybody have an idea?
  Thanks,
  Pavel Hančar


Re: slow operation in postPut

2013-08-01 Thread Ted Yu
HTable is not thread safe. 

On Aug 1, 2013, at 5:58 AM, Pavel Hančar pavel.han...@gmail.com wrote:

 Hello,
 I have a class extending BaseRegionObserver and I use the postPut method to
 run a slow procedure.
 [rest of quoted message snipped]


Re: AssignmentManager looping?

2013-08-01 Thread Kevin O'dell
Yes you can if HBase is down. First I would copy .META. out of HDFS to local
disk, and then you can search it for split issues. Deleting those znodes
should clear this up though.
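
For reference, the cleanup described above, sketched with the plain ZooKeeper
Java client; the quorum address is an assumption, the /hbase/unassigned path
is taken from this thread, and this should only be run while HBase is down
(the hbase zkcli shell works just as well):

import org.apache.zookeeper.ZooKeeper;

public class ClearUnassigned {
  public static void main(String[] args) throws Exception {
    // Assumed quorum address; adjust to your cluster.
    ZooKeeper zk = new ZooKeeper("node3:2181", 30000, null);
    for (String child : zk.getChildren("/hbase/unassigned", false)) {
      System.out.println("deleting /hbase/unassigned/" + child);
      zk.delete("/hbase/unassigned/" + child, -1); // -1 matches any version
    }
    zk.close();
  }
}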
On Aug 1, 2013 8:52 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org
wrote:

 I can't check the meta since HBase is down.

 Regarding HDFS, I took a few random lines like:
 2013-08-01 08:45:57,260 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Region
 28328fdb7181cbd9cc4d6814775e8895 not found on server
 node4,60020,1375319042033; failed processing
 2013-08-01 08:45:57,260 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for region
 28328fdb7181cbd9cc4d6814775e8895 from server node4,60020,1375319042033 but
 it doesn't exist anymore, probably already processed its split

 And each time, there is nothing like that in HDFS:
 hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -lsr / | grep
 28328fdb7181cbd9cc4d6814775e8895

 On ZK side:
 [zk: localhost:2181(CONNECTED) 3] ls /hbase/splitlog

 [zk: localhost:2181(CONNECTED) 10] ls /hbase/unassigned
 [28328fdb7181cbd9cc4d6814775e8895, a8781a598c46f19723a2405345b58470,
 b7ebfeb63b10997736fd12920fde2bb8, d95bb27cc026511c2a8c8ad155e79bf6,
 270a9c371fcbe9cd9a04986e0b77d16b, aff4d1d8bf470458bb19525e8aef0759]

 Can I just delete those znodes? Worst case, hbck will find them back from
 HDFS if required?

 JM

 2013/8/1 Kevin O'dell kevin.od...@cloudera.com

  Does it exist in meta or hdfs?
  On Aug 1, 2013 8:24 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org
  wrote:
 
  My master keeps logging this:
  [rest of quoted log and stack trace snipped; see the original message above]

Re: slow operation in postPut

2013-08-01 Thread yonghu
If I want multi-threaded access that is thread safe, which class should I use?


On Thu, Aug 1, 2013 at 3:08 PM, Ted Yu yuzhih...@gmail.com wrote:

 HTable is not thread safe.

 On Aug 1, 2013, at 5:58 AM, Pavel Hančar pavel.han...@gmail.com wrote:

  Hello,
  I have a class extending BaseRegionObserver and I use the postPut method to
  run a slow procedure.
  [rest of quoted message snipped]



Import HBase snapshots possible?

2013-08-01 Thread Siddharth Karandikar
Hi there,

I am testing out the newly added snapshot capability, ExportSnapshot in
particular.
It's working fine for me. I am able to run ExportSnapshot properly.

But the biggest (noob) issue is: once exported, is there any way to
import those snapshots back into HBase? I don't see any ImportSnapshot
util there.


Thanks,
Siddharth


Re: Import HBase snapshots possible?

2013-08-01 Thread Siddharth Karandikar
Btw, I am running on hbase-0.95.1-hadoop1.

On Thu, Aug 1, 2013 at 7:05 PM, Siddharth Karandikar
siddharth.karandi...@gmail.com wrote:
 Hi there,

 I am testing out the newly added snapshot capability, ExportSnapshot in
 particular.
 [rest of quoted message snipped]


Re: slow operation in postPut

2013-08-01 Thread yonghu
Use HTablePool instead. For more info, see
http://hbase.apache.org/book/client.html.
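
A minimal sketch of that per-thread pattern with the 0.94-era HTablePool; the
table name, column family, and pool size are made-up placeholders:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTableInterface;
import org.apache.hadoop.hbase.client.HTablePool;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PooledPuts {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    final HTablePool pool = new HTablePool(conf, 10); // the pool itself is thread safe
    Runnable worker = new Runnable() {
      public void run() {
        HTableInterface table = pool.getTable("myTable"); // one handle per thread
        try {
          Put put = new Put(Bytes.toBytes("row-" + Thread.currentThread().getId()));
          put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));
          table.put(put);
        } catch (Exception e) {
          e.printStackTrace();
        } finally {
          try { table.close(); } catch (Exception ignore) {} // returns the handle to the pool
        }
      }
    };
    for (int i = 0; i < 4; i++) {
      new Thread(worker).start();
    }
  }
}

The key point is that each thread checks out its own HTableInterface and never
shares it, which avoids the concurrent flushCommits seen in the stack trace.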


On Thu, Aug 1, 2013 at 3:32 PM, yonghu yongyong...@gmail.com wrote:

 If I want multi-threaded access that is thread safe, which class should I
 use?
 [rest of quoted thread snipped]





Re: slow operation in postPut

2013-08-01 Thread Ted Yu
See 9.3.1.1. Connection Pooling in http://hbase.apache.org/book.html

On Thu, Aug 1, 2013 at 6:32 AM, yonghu yongyong...@gmail.com wrote:

 If I want multi-threaded access that is thread safe, which class should I
 use?
 [rest of quoted thread snipped]
 



Re: Import HBase snapshots possible?

2013-08-01 Thread Matteo Bertozzi
The ExportSnapshot tool will export the snapshot data+metadata, in theory, to
another HBase cluster.
So on the second cluster you'll now be able to do 'list_snapshots' from the
shell and see the exported snapshot.
Now you can simply do clone_snapshot 'snapshot_name', 'new_table_name' and
you're restoring a snapshot on the second cluster.

Assuming that you have removed the snapshot from cluster1 and you want to
export your snapshot back,
you just use ExportSnapshot again to move the snapshot from cluster2 to
cluster1,
and same as before you do a clone_snapshot to restore it.

Matteo



On Thu, Aug 1, 2013 at 2:35 PM, Siddharth Karandikar 
siddharth.karandi...@gmail.com wrote:

 Hi there,

 I am testing out the newly added snapshot capability, ExportSnapshot in
 particular.
 [rest of quoted message snipped]



Re: Import HBase snapshots possible?

2013-08-01 Thread Siddharth Karandikar
Can't I export it to plain HDFS? I think that would be very useful.

On Thu, Aug 1, 2013 at 7:08 PM, Matteo Bertozzi theo.berto...@gmail.com wrote:
 The ExportSnapshot tool will export the snapshot data+metadata, in theory,
 to another HBase cluster.
 [rest of quoted message snipped]



Re: Import HBase snapshots possible?

2013-08-01 Thread Matteo Bertozzi
Yes, you can export to an HDFS path.
$ bin/hbase class org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot
-snapshot MySnapshot -copy-to hdfs:///srv2:8082/hbase

so you can export to some /my-backup-dir on your HDFS,
and then you have to export it back to an HBase cluster when you want to
restore it

Matteo



On Thu, Aug 1, 2013 at 2:45 PM, Siddharth Karandikar 
siddharth.karandi...@gmail.com wrote:

 Can't I export it to plain HDFS? I think that would be very useful.
 [rest of quoted thread snipped]



Re: Import HBase snapshots possible?

2013-08-01 Thread Siddharth Karandikar
Yeah, that's right. But the issue is, the HDFS I am exporting to is
not under HBase.
Can you please provide some example command to do this?


Thanks,
Siddharth

On Thu, Aug 1, 2013 at 7:17 PM, Matteo Bertozzi theo.berto...@gmail.com wrote:
 Yes, you can export to an HDFS path.
 $ bin/hbase class org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot
 -snapshot MySnapshot -copy-to hdfs:///srv2:8082/hbase
 [rest of quoted thread snipped]



Re: Import HBase snapshots possible?

2013-08-01 Thread Matteo Bertozzi
Ok, so to export a snapshot from your HBase cluster, you can do
$ bin/hbase class org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot
-snapshot MySnapshot -copy-to hdfs:///srv2:8082/my-backup-dir

Now on cluster2, hdfs:///srv2:8082, you have your my-backup-dir that contains
the exported snapshot (note that the snapshot data is under the hidden dirs
.snapshots and .archive).

Now if you want to restore the snapshot, you have to export it back to an
HBase cluster.
So on cluster2, you can do:
$ bin/hbase class org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot -D
hbase.rootdir=hdfs:///srv2:8082/my-backup-dir -snapshot MySnapshot -copy-to
hdfs:///hbaseSrv:8082/hbase


so, to recap:
 - You take a snapshot
 - You export the snapshot from HBase Cluster-1 to a simple HDFS dir in
Cluster-2
 - Then, when you want to restore:
 - You export the snapshot from the HDFS dir in Cluster-2 to an HBase
cluster (it can be a different one from the original)
 - From the hbase shell you can just do: clone_snapshot 'snapshotName',
'newTableName' if the table does not exist, or restore_snapshot
'snapshotName' if there's a table with the same name
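
The same round trip can also be driven from Java, since ExportSnapshot is a
Hadoop Tool (as the ToolRunner frames in the stack traces later in this
thread show); the snapshot name and destination URI here are placeholders:

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.snapshot.ExportSnapshot;
import org.apache.hadoop.util.ToolRunner;

public class ExportDriver {
  public static void main(String[] args) throws Exception {
    // Equivalent to the command line above, with placeholder arguments.
    int rc = ToolRunner.run(HBaseConfiguration.create(), new ExportSnapshot(),
        new String[] { "-snapshot", "MySnapshot",
                       "-copy-to", "hdfs://srv2:8082/my-backup-dir" });
    System.exit(rc);
  }
}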


Matteo



On Thu, Aug 1, 2013 at 2:54 PM, Siddharth Karandikar 
siddharth.karandi...@gmail.com wrote:

 Yeah, that's right. But the issue is, the HDFS I am exporting to is
 not under HBase.
 Can you please provide some example command to do this?
 [rest of quoted thread snipped]



Re: Import HBase snapshots possible?

2013-08-01 Thread Siddharth Karandikar
Tried what you suggested. Here is what I get -

ssk01:~/siddharth/tools/hbase-0.95.1-hadoop1 # ./bin/hbase
org.apache.hadoop.hbase.snapshot.ExportSnapshot
-Dhbase.rootdir=hdfs://10.209.17.88:9000/hbase -snapshot s1 -copy-to
/root/siddharth/tools/hbase-0.95.1-hadoop1/data/
Exception in thread "main" java.lang.IllegalArgumentException: Wrong
FS: hdfs://10.209.17.88:9000/hbase/.hbase-snapshot/s1/.snapshotinfo,
expected: file:///
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:381)
        at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:55)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:393)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:125)
        at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
        at org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:296)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.getSnapshotFiles(ExportSnapshot.java:371)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:618)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.innerMain(ExportSnapshot.java:690)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:694)


Am I missing something?


Thanks,
Siddharth


On Thu, Aug 1, 2013 at 7:31 PM, Matteo Bertozzi theo.berto...@gmail.com wrote:
 Ok, so to export a snapshot from your HBase cluster, you can do
 $ bin/hbase class org.apache.hadoop.hbase.snapshot.tool.ExportSnapshot
 -snapshot MySnapshot -copy-to hdfs:///srv2:8082/my-backup-dir
 [rest of quoted thread snipped]



Re: AssignmentManager looping?

2013-08-01 Thread Jean-Marc Spaggiari
I tried to remove the znodes but got the same result. So I shut down all
the RSs and restarted HBase, and now I have 0 regions for this table.
Running HBCK now. Seems that it has a lot to do...

2013/8/1 Kevin O'dell kevin.od...@cloudera.com

 Yes you can if HBase is down. First I would copy .META. out of HDFS to
 local disk, and then you can search it for split issues. Deleting those
 znodes should clear this up though.
 [rest of quoted thread snipped]

Re: Import HBase snapshots possible?

2013-08-01 Thread Matteo Bertozzi
you have to use 3 slashes, otherwise it is interpreted as a local
file-system path:
-Dhbase.rootdir=hdfs:///10.209.17.88:9000/hbase

Matteo



On Thu, Aug 1, 2013 at 3:09 PM, Siddharth Karandikar 
siddharth.karandi...@gmail.com wrote:

 Tried what you suggested. Here is what I get -

 ssk01:~/siddharth/tools/hbase-0.95.1-hadoop1 # ./bin/hbase
 org.apache.hadoop.hbase.snapshot.ExportSnapshot
 -Dhbase.rootdir=hdfs://10.209.17.88:9000/hbase -snapshot s1 -copy-to
 /root/siddharth/tools/hbase-0.95.1-hadoop1/data/
 Exception in thread "main" java.lang.IllegalArgumentException: Wrong
 FS: hdfs://10.209.17.88:9000/hbase/.hbase-snapshot/s1/.snapshotinfo,
 expected: file:///
 [rest of quoted thread snipped]

Re: Import HBase snapshots possible?

2013-08-01 Thread Siddharth Karandikar
It's failing with '//' as well as '///'. The error suggests that it expects
the local fs.

With 3 slashes (///):

ssk01:~/siddharth/tools/hbase-0.95.1-hadoop1 # ./bin/hbase
org.apache.hadoop.hbase.snapshot.ExportSnapshot
-Dhbase.rootdir=hdfs:///10.209.17.88:9000/hbase/s2 -snapshot s2
-copy-to /root/siddharth/tools/hbase-0.95.1-hadoop1/data/
Exception in thread "main" java.io.IOException: Incomplete HDFS URI,
no host: hdfs:///10.209.17.88:9000/hbase/s2
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:85)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
        at org.apache.hadoop.hbase.util.FSUtils.getRootDir(FSUtils.java:860)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:594)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.innerMain(ExportSnapshot.java:690)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:694)

With 2 slashes (//):

ssk01:~/siddharth/tools/hbase-0.95.1-hadoop1 # ./bin/hbase
org.apache.hadoop.hbase.snapshot.ExportSnapshot
-Dhbase.rootdir=hdfs://10.209.17.88:9000/hbase/s2 -snapshot s2
-copy-to /root/siddharth/tools/hbase-0.95.1-hadoop1/data/
Exception in thread "main" java.lang.IllegalArgumentException: Wrong
FS: hdfs://10.209.17.88:9000/hbase/s2/.hbase-snapshot/s2/.snapshotinfo,
expected: file:///
        at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:381)
        at org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:55)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:393)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
        at org.apache.hadoop.fs.ChecksumFileSystem$ChecksumFSInputChecker.<init>(ChecksumFileSystem.java:125)
        at org.apache.hadoop.fs.ChecksumFileSystem.open(ChecksumFileSystem.java:283)
        at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
        at org.apache.hadoop.hbase.snapshot.SnapshotDescriptionUtils.readSnapshotInfo(SnapshotDescriptionUtils.java:296)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.getSnapshotFiles(ExportSnapshot.java:371)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.run(ExportSnapshot.java:618)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.innerMain(ExportSnapshot.java:690)
        at org.apache.hadoop.hbase.snapshot.ExportSnapshot.main(ExportSnapshot.java:694)




Btw, I tried one more thing. From my HDFS location, I just did a copy like -
ssk01:~/siddharth/tools/hadoop-1.1.2 # ./bin/hadoop fs -copyToLocal
hdfs://10.209.17.88:9000/hbase/s1/.hbase-snapshot/s1
/root/siddharth/tools/hbase-0.95.1-hadoop1/data/.hbase-snapshot/

After doing this, I am able to see s1 in 'list_snapshots'. But it is
failing at 'clone_snapshot'.

hbase(main):014:0> clone_snapshot 's1', 'ts1'

ERROR: java.io.IOException: Table 'ts1' not yet enabled, after 199617ms.

Here is some help for this command:
Create a new table by cloning the snapshot content.
There're no copies of data involved.
And writing on the newly created table will not influence the snapshot data.

Examples:
  hbase> clone_snapshot 'snapshotName', 'tableName'



On Thu, Aug 1, 2013 at 7:44 PM, Matteo Bertozzi theo.berto...@gmail.com wrote:
 you have to use 3 slashes, otherwise it is interpreted as a local
 file-system path:
 -Dhbase.rootdir=hdfs:///10.209.17.88:9000/hbase
 [rest of quoted thread snipped]

Re: Import HBase snapshots possible?

2013-08-01 Thread Matteo Bertozzi
You can't just copy the .snapshot folder... so now you have RSs that are
failing, since the files for the cloned table are not available.

When you specify the hbase.rootdir, you have to specify just the
hbase.rootdir, the one in /etc/hbase-site.xml, which doesn't contain the name
of the snapshot/table that you want to export (e.g.
hdfs://10.209.17.88:9000/hbase, not hdfs://10.209.17.88:9000/hbase/s2)



Matteo



On Thu, Aug 1, 2013 at 3:25 PM, Siddharth Karandikar 
siddharth.karandi...@gmail.com wrote:

 It's failing with '//' as well as '///'. The error suggests that it expects
 the local fs.
 [rest of quoted thread snipped]

Why HBase integration with Hive makes Hive slow

2013-08-01 Thread Hao Ren

Hi,

I have a cluster (1 master + 3 slaves) on which there are Hive, HBase, and
Hadoop.


In order to run a daily row-level update routine, we need to integrate
HBase with Hive, but the performance is not good.


E.g. there are 2 tables in Hive:
hbase_table: an HBase table created via Hive
hive_table: a native Hive table
Both hold the same data set.

When running:
select count(*) from hbase_table; ===> takes 500 s
select count(*) from hive_table; ===> takes 6 s

I have tried a lot of queries on the two tables. But hbase_table is
always very slow.


To be clear, I created the hbase_table as below:

CREATE TABLE hbase_table (
idvisite string,
client_list Array<string>,
nb_client int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" =
":key,clients:id_list,clients:nb")
TBLPROPERTIES ("hbase.table.name" = "table_test")
;

And my HBase is in pseudo-distributed mode.

I guess that at the beginning of a Hive query execution, Hive loads data 
from HBase, and the SerDe takes a long time.


Could someone tell me how to improve this poor performance?
Is this caused by a wrongly configured integration?
Is a fully-distributed mode needed here ?

Thank you in advance for your time.

Hao.


--
Hao Ren
ClaraVista
www.claravista.fr


ETL like merge databases to HBase

2013-08-01 Thread Shengjie Min
Hi All,

I have a use case: I have a few applications running independently, let's
say applications A, B, C. Each has a DB associated. I want an aggregated
view on all the databases so that I don't have to jump into different DBs
to find the info I need. Is there a tool out there that allows me to move data
from A, B, C to a single/centralised HBase cluster? It would be even nicer if,
when DB A, B, or C gets updated by the apps, the updates were synchronised to
HBase too.

-- 
All the best,
Shengjie Min


Re: ETL like merge databases to HBase

2013-08-01 Thread Ted Yu
bq. Each has a DB associated

They're RDBMS, I assume ?

Have you looked at Sqoop ?
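
For example, a Sqoop import straight into HBase might look like this (a
sketch; the JDBC URL, table, and column names are all hypothetical):

    sqoop import \
      --connect jdbc:mysql://dbhost-a/appdb --username etl -P \
      --table customers \
      --hbase-table merged_view --column-family d \
      --hbase-row-key customer_id \
      --incremental lastmodified --check-column updated_at

Rerunning the incremental import on a schedule approximates the
keep-in-sync requirement; true change capture would need a log-based tool.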

On Thu, Aug 1, 2013 at 8:04 AM, Shengjie Min kelvin@gmail.com wrote:

 Hi All,

 I have a use case I have a few applications running independently, Let's
 say applications A, B, C. Each has a DB associated. I wanna a aggregated
 view on all the databases so that I don't have to jump into different dbs
 to find the info I need. Is there a tool out there, allows me to move data
 from A,B,C to a single/centralised HBase cluster? Would be even nicer, if
 DB A, B, C gets updated by the apps, the updates will be synchronised to
 HBase too.

 --
 All the best,
 Shengjie Min



Re: Recursive delete upon cleanup

2013-08-01 Thread Ron Echeverri
On Tue, Jul 30, 2013 at 6:19 PM, Ted Yu yuzhih...@gmail.com wrote:
 I searched HBase 0.94 code base, hadoop 1 and hadoop 2 code base.
 I didn't find where 'Try with recursive flag' was logged.
 Mind giving us a bit more information on the Hadoop / HBase releases you
 were using ?

You're right, that error message is coming from our filesystem client.
I think that the HBase version is 0.94.5, but I'll need to look into
that later (more pressing things have arisen, of course). Thank you
for the help.

rone


Re: AssignmentManager looping?

2013-08-01 Thread Jimmy Xiang
Something went wrong with the split.  It should be easy to fix your cluster.
However, it will be more interesting to find out how it happened. Do you
remember what has happened since it was good previously? Do you have all
the logs?


On Thu, Aug 1, 2013 at 7:08 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org
 wrote:

 I tried to remove the znodes but got the same result. So I shut down all
 the RS and restarted HBase, and now I have 0 regions for this table.
 Running HBCK. Seems that it has a lot to do...

 2013/8/1 Kevin O'dell kevin.od...@cloudera.com

  Yes you can if HBase is down, first I would copy .META out of HDFS local
  and then you can search it for split issues. Deleting those znodes should
  clear this up though.
  On Aug 1, 2013 8:52 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org
  wrote:
 
   I can't check the meta since HBase is down.
  
   Regarding HDFS, I took few random lines like:
   2013-08-01 08:45:57,260 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Region
   28328fdb7181cbd9cc4d6814775e8895 not found on server
   node4,60020,1375319042033; failed processing
   2013-08-01 08:45:57,260 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for
  region
   28328fdb7181cbd9cc4d6814775e8895 from server node4,60020,1375319042033
  but
   it doesn't exist anymore, probably already processed its split
  
   And each time, there is nothing like that.
   hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -lsr / | grep
   28328fdb7181cbd9cc4d6814775e8895
  
   On ZK side:
   [zk: localhost:2181(CONNECTED) 3] ls /hbase/splitlog
  
   [zk: localhost:2181(CONNECTED) 10] ls /hbase/unassigned
   [28328fdb7181cbd9cc4d6814775e8895, a8781a598c46f19723a2405345b58470,
   b7ebfeb63b10997736fd12920fde2bb8, d95bb27cc026511c2a8c8ad155e79bf6,
   270a9c371fcbe9cd9a04986e0b77d16b, aff4d1d8bf470458bb19525e8aef0759]
  
   Can I just delete those zknodes? Worst case hbck will find them back
 from
   HDFS if required?
  
   JM
  
   2013/8/1 Kevin O'dell kevin.od...@cloudera.com
  
Does it exist in meta or hdfs?
On Aug 1, 2013 8:24 AM, Jean-Marc Spaggiari 
 jean-m...@spaggiari.org
  
wrote:
   
 My master keep logging that:

 2013-07-31 21:52:59,201 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Region
 270a9c371fcbe9cd9a04986e0b77d16b not found on server
 node7,60020,1375319044055; failed processing
 2013-07-31 21:52:59,201 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
 for
region
 270a9c371fcbe9cd9a04986e0b77d16b from server
  node7,60020,1375319044055
but
 it doesn't exist anymore, probably already processed its split
 2013-07-31 21:52:59,339 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Region
 270a9c371fcbe9cd9a04986e0b77d16b not found on server
 node7,60020,1375319044055; failed processing
 2013-07-31 21:52:59,339 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
 for
region
 270a9c371fcbe9cd9a04986e0b77d16b from server
  node7,60020,1375319044055
but
 it doesn't exist anymore, probably already processed its split
 2013-07-31 21:52:59,461 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Region
 270a9c371fcbe9cd9a04986e0b77d16b not found on server
 node7,60020,1375319044055; failed processing
 2013-07-31 21:52:59,461 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
 for
region
 270a9c371fcbe9cd9a04986e0b77d16b from server
  node7,60020,1375319044055
but
 it doesn't exist anymore, probably already processed its split
 2013-07-31 21:52:59,636 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Region
 270a9c371fcbe9cd9a04986e0b77d16b not found on server
 node7,60020,1375319044055; failed processing
 2013-07-31 21:52:59,636 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
 for
region
 270a9c371fcbe9cd9a04986e0b77d16b from server
  node7,60020,1375319044055
but
 it doesn't exist anymore, probably already processed its split
 2013-07-31 21:53:00,074 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Region
 270a9c371fcbe9cd9a04986e0b77d16b not found on server
 node7,60020,1375319044055; failed processing
 2013-07-31 21:53:00,074 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
 for
region
 270a9c371fcbe9cd9a04986e0b77d16b from server
  node7,60020,1375319044055
but
 it doesn't exist anymore, probably already processed its split
 2013-07-31 21:53:00,261 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Region
 270a9c371fcbe9cd9a04986e0b77d16b not found on server
 node7,60020,1375319044055; failed processing
 2013-07-31 21:53:00,261 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
 for
region
 270a9c371fcbe9cd9a04986e0b77d16b from server
  node7,60020,1375319044055
but

Pagination with HBase

2013-08-01 Thread Jonathan Cardoso
Hi!

Is there a way to scan a HBase table getting, for example, the first 100
results, then later get the next 100 and so on... Just like in SQL we do
with LIMIT and OFFSET?


*Jonathan Cardoso** **
Universidade Federal de Goias*


Re: AssignmentManager looping?

2013-08-01 Thread Jean-Marc Spaggiari
Hi Jimmy,

I should still have all the logs.

What I did is pretty simple.

I tried to turn the cluster off while a single regioned 250GB table was
under major_compaction to get splitted.

I will targz all the logs for the few last days and make that available.

On the other side, I'm still not able to bring it back up...

JM

2013/8/1 Jimmy Xiang jxi...@cloudera.com

 Something went wrong with split.  It should be easy to fix your cluster.
 However, it will be more interesting to find out how it happened. Do you
 remember what has happened since it was good previously? Do you have all
 the logs?


 On Thu, Aug 1, 2013 at 7:08 AM, Jean-Marc Spaggiari 
 jean-m...@spaggiari.org
  wrote:

  I tried to remove the znodes but got the same result. So I shutted down
 all
  the RS and restarted HBase, and now I have 0 regions for this table.
  Running HBCK. Seems that it has a lot to do...
 
  2013/8/1 Kevin O'dell kevin.od...@cloudera.com
 
   Yes you can if HBase is down, first I would copy .META out of HDFS
 local
   and then you can search it for split issues. Deleting those znodes
 should
   clear this up though.
   On Aug 1, 2013 8:52 AM, Jean-Marc Spaggiari jean-m...@spaggiari.org
 
   wrote:
  
I can't check the meta since HBase is down.
   
Regarding HDFS, I took few random lines like:
2013-08-01 08:45:57,260 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Region
28328fdb7181cbd9cc4d6814775e8895 not found on server
node4,60020,1375319042033; failed processing
2013-08-01 08:45:57,260 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT for
   region
28328fdb7181cbd9cc4d6814775e8895 from server
 node4,60020,1375319042033
   but
it doesn't exist anymore, probably already processed its split
   
And each time, there is nothing like that.
hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -lsr / | grep
28328fdb7181cbd9cc4d6814775e8895
   
On ZK side:
[zk: localhost:2181(CONNECTED) 3] ls /hbase/splitlog
   
[zk: localhost:2181(CONNECTED) 10] ls /hbase/unassigned
[28328fdb7181cbd9cc4d6814775e8895, a8781a598c46f19723a2405345b58470,
b7ebfeb63b10997736fd12920fde2bb8, d95bb27cc026511c2a8c8ad155e79bf6,
270a9c371fcbe9cd9a04986e0b77d16b, aff4d1d8bf470458bb19525e8aef0759]
   
Can I just delete those zknodes? Worst case hbck will find them back
  from
HDFS if required?
   
JM
   
2013/8/1 Kevin O'dell kevin.od...@cloudera.com
   
 Does it exist in meta or hdfs?
 On Aug 1, 2013 8:24 AM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org
   
 wrote:

  My master keep logging that:
 
  2013-07-31 21:52:59,201 WARN
  org.apache.hadoop.hbase.master.AssignmentManager: Region
  270a9c371fcbe9cd9a04986e0b77d16b not found on server
  node7,60020,1375319044055; failed processing
  2013-07-31 21:52:59,201 WARN
  org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
  for
 region
  270a9c371fcbe9cd9a04986e0b77d16b from server
   node7,60020,1375319044055
 but
  it doesn't exist anymore, probably already processed its split
  2013-07-31 21:52:59,339 WARN
  org.apache.hadoop.hbase.master.AssignmentManager: Region
  270a9c371fcbe9cd9a04986e0b77d16b not found on server
  node7,60020,1375319044055; failed processing
  2013-07-31 21:52:59,339 WARN
  org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
  for
 region
  270a9c371fcbe9cd9a04986e0b77d16b from server
   node7,60020,1375319044055
 but
  it doesn't exist anymore, probably already processed its split
  2013-07-31 21:52:59,461 WARN
  org.apache.hadoop.hbase.master.AssignmentManager: Region
  270a9c371fcbe9cd9a04986e0b77d16b not found on server
  node7,60020,1375319044055; failed processing
  2013-07-31 21:52:59,461 WARN
  org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
  for
 region
  270a9c371fcbe9cd9a04986e0b77d16b from server
   node7,60020,1375319044055
 but
  it doesn't exist anymore, probably already processed its split
  2013-07-31 21:52:59,636 WARN
  org.apache.hadoop.hbase.master.AssignmentManager: Region
  270a9c371fcbe9cd9a04986e0b77d16b not found on server
  node7,60020,1375319044055; failed processing
  2013-07-31 21:52:59,636 WARN
  org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
  for
 region
  270a9c371fcbe9cd9a04986e0b77d16b from server
   node7,60020,1375319044055
 but
  it doesn't exist anymore, probably already processed its split
  2013-07-31 21:53:00,074 WARN
  org.apache.hadoop.hbase.master.AssignmentManager: Region
  270a9c371fcbe9cd9a04986e0b77d16b not found on server
  node7,60020,1375319044055; failed processing
  2013-07-31 21:53:00,074 WARN
  org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
  for
 region
  270a9c371fcbe9cd9a04986e0b77d16b from server
   

Re: Pagination with HBase

2013-08-01 Thread Pavan Sudheendra
Use 
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#getStartRow()
and 
http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/Scan.html#getStopRow()


int i = 1;    // start row
int j = 100;  // end row
while (i < someVal) {
    // i denotes the start row and j denotes the end row;
    // probably convert i & j to strings to keep the keys human-readable
    Scan scan = new Scan(Bytes.toBytes(i), Bytes.toBytes(j));
    scan.addColumn(columnFamily, columnQualifier);

    // do something with the results

    i = i + 100;
    j = j + 100;
}

On Thu, Aug 1, 2013 at 10:31 PM, Jonathan Cardoso
jonathancar...@gmail.com wrote:
 Hi!

 Is there a way to scan a HBase table getting, for example, the first 100
 results, then later get the next 100 and so on... Just like in SQL we do
 with LIMIT and OFFSET?


 *Jonathan Cardoso** **
 Universidade Federal de Goias*



-- 
Regards-
Pavan


Re: Pagination with HBase

2013-08-01 Thread Ted Yu
Take a look at ColumnPaginationFilter.java and its unit test.

Cheers
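
For column paging, a minimal sketch (0.94-era client API; the row key, the
open HTable handle, and the (limit, offset) values are assumptions):

    // Return at most 10 columns per row, skipping the first 20:
    Get get = new Get(Bytes.toBytes("somerow"));
    get.setFilter(new ColumnPaginationFilter(10, 20));  // (limit, offset)
    Result result = table.get(get);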

On Thu, Aug 1, 2013 at 10:01 AM, Jonathan Cardoso
jonathancar...@gmail.comwrote:

 Hi!

 Is there a way to scan a HBase table getting, for example, the first 100
 results, then later get the next 100 and so on... Just like in SQL we do
 with LIMIT and OFFSET?


 *Jonathan Cardoso** **
 Universidade Federal de Goias*



Re: Pagination with HBase

2013-08-01 Thread Pavan Sudheendra
@Jonathan Ted Yu is right! Ignore my mail :)

On Thu, Aug 1, 2013 at 10:46 PM, Ted Yu yuzhih...@gmail.com wrote:
 Take a look at ColumnPaginationFilter.java and its unit test.

 Cheers

 On Thu, Aug 1, 2013 at 10:01 AM, Jonathan Cardoso
 jonathancar...@gmail.comwrote:

 Hi!

 Is there a way to scan a HBase table getting, for example, the first 100
 results, then later get the next 100 and so on... Just like in SQL we do
 with LIMIT and OFFSET?


 *Jonathan Cardoso** **
 Universidade Federal de Goias*




-- 
Regards-
Pavan


Re: AssignmentManager looping?

2013-08-01 Thread Kevin O'dell
JM,

Stop HBase
rmr /hbase from zkcli
Sideline META
Run offline meta repair
Start HBase
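
Spelled out as commands, that recipe is roughly the following (a sketch; the
sideline destination is hypothetical and paths assume a stock 0.94 layout):

    ./bin/stop-hbase.sh
    ./bin/hbase zkcli                  # then, at the prompt: rmr /hbase
    hadoop fs -mv /hbase/.META. /meta.sidelined
    ./bin/hbase org.apache.hadoop.hbase.util.hbck.OfflineMetaRepair
    ./bin/start-hbase.sh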
On Aug 1, 2013 1:01 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org
wrote:

 Hi Jimmy,

 I should still have all the logs.

 What I did is pretty simple.

 I tried to turn the cluster off while a single regioned 250GB table was
 under major_compaction to get splitted.

 I will targz all the logs for the few last days and make that available.

 On the other side, I'm still not able to bring it back up...

 JM

 2013/8/1 Jimmy Xiang jxi...@cloudera.com

  Something went wrong with split.  It should be easy to fix your cluster.
  However, it will be more interesting to find out how it happened. Do you
  remember what has happened since it was good previously? Do you have all
  the logs?
 
 
  On Thu, Aug 1, 2013 at 7:08 AM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org
   wrote:
 
   I tried to remove the znodes but got the same result. So I shutted down
  all
   the RS and restarted HBase, and now I have 0 regions for this table.
   Running HBCK. Seems that it has a lot to do...
  
   2013/8/1 Kevin O'dell kevin.od...@cloudera.com
  
Yes you can if HBase is down, first I would copy .META out of HDFS
  local
and then you can search it for split issues. Deleting those znodes
  should
clear this up though.
On Aug 1, 2013 8:52 AM, Jean-Marc Spaggiari 
 jean-m...@spaggiari.org
  
wrote:
   
 I can't check the meta since HBase is down.

 Regarding HDFS, I took few random lines like:
 2013-08-01 08:45:57,260 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Region
 28328fdb7181cbd9cc4d6814775e8895 not found on server
 node4,60020,1375319042033; failed processing
 2013-08-01 08:45:57,260 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
 for
region
 28328fdb7181cbd9cc4d6814775e8895 from server
  node4,60020,1375319042033
but
 it doesn't exist anymore, probably already processed its split

 And each time, there is nothing like that.
 hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -lsr / | grep
 28328fdb7181cbd9cc4d6814775e8895

 On ZK side:
 [zk: localhost:2181(CONNECTED) 3] ls /hbase/splitlog

 [zk: localhost:2181(CONNECTED) 10] ls /hbase/unassigned
 [28328fdb7181cbd9cc4d6814775e8895,
 a8781a598c46f19723a2405345b58470,
 b7ebfeb63b10997736fd12920fde2bb8, d95bb27cc026511c2a8c8ad155e79bf6,
 270a9c371fcbe9cd9a04986e0b77d16b, aff4d1d8bf470458bb19525e8aef0759]

 Can I just delete those zknodes? Worst case hbck will find them
 back
   from
 HDFS if required?

 JM

 2013/8/1 Kevin O'dell kevin.od...@cloudera.com

  Does it exist in meta or hdfs?
  On Aug 1, 2013 8:24 AM, Jean-Marc Spaggiari 
   jean-m...@spaggiari.org

  wrote:
 
   My master keep logging that:
  
   2013-07-31 21:52:59,201 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Region
   270a9c371fcbe9cd9a04986e0b77d16b not found on server
   node7,60020,1375319044055; failed processing
   2013-07-31 21:52:59,201 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Received
 SPLIT
   for
  region
   270a9c371fcbe9cd9a04986e0b77d16b from server
node7,60020,1375319044055
  but
   it doesn't exist anymore, probably already processed its split
   2013-07-31 21:52:59,339 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Region
   270a9c371fcbe9cd9a04986e0b77d16b not found on server
   node7,60020,1375319044055; failed processing
   2013-07-31 21:52:59,339 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Received
 SPLIT
   for
  region
   270a9c371fcbe9cd9a04986e0b77d16b from server
node7,60020,1375319044055
  but
   it doesn't exist anymore, probably already processed its split
   2013-07-31 21:52:59,461 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Region
   270a9c371fcbe9cd9a04986e0b77d16b not found on server
   node7,60020,1375319044055; failed processing
   2013-07-31 21:52:59,461 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Received
 SPLIT
   for
  region
   270a9c371fcbe9cd9a04986e0b77d16b from server
node7,60020,1375319044055
  but
   it doesn't exist anymore, probably already processed its split
   2013-07-31 21:52:59,636 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Region
   270a9c371fcbe9cd9a04986e0b77d16b not found on server
   node7,60020,1375319044055; failed processing
   2013-07-31 21:52:59,636 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Received
 SPLIT
   for
  region
   270a9c371fcbe9cd9a04986e0b77d16b from server
node7,60020,1375319044055
  but
   it doesn't exist anymore, probably already processed its split
   2013-07-31 21:53:00,074 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Region
  

Re: AssignmentManager looping?

2013-08-01 Thread Kevin O'dell
If that doesn't work, you probably have an invalid reference file, and you
will find it in the RS logs for the HLog split that never finishes.
On Aug 1, 2013 1:38 PM, Kevin O'dell kevin.od...@cloudera.com wrote:

 JM,

 Stop HBase
 rmr /hbase from zkcli
 Sideline META
 Run offline meta repair
 Start HBase
 On Aug 1, 2013 1:01 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org
 wrote:

 Hi Jimmy,

 I should still have all the logs.

 What I did is pretty simple.

 I tried to turn the cluster off while a single regioned 250GB table was
 under major_compaction to get splitted.

 I will targz all the logs for the few last days and make that available.

 On the other side, I'm still not able to bring it back up...

 JM

 2013/8/1 Jimmy Xiang jxi...@cloudera.com

  Something went wrong with split.  It should be easy to fix your cluster.
  However, it will be more interesting to find out how it happened. Do you
  remember what has happened since it was good previously? Do you have all
  the logs?
 
 
  On Thu, Aug 1, 2013 at 7:08 AM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org
   wrote:
 
   I tried to remove the znodes but got the same result. So I shutted
 down
  all
   the RS and restarted HBase, and now I have 0 regions for this table.
   Running HBCK. Seems that it has a lot to do...
  
   2013/8/1 Kevin O'dell kevin.od...@cloudera.com
  
Yes you can if HBase is down, first I would copy .META out of HDFS
  local
and then you can search it for split issues. Deleting those znodes
  should
clear this up though.
On Aug 1, 2013 8:52 AM, Jean-Marc Spaggiari 
 jean-m...@spaggiari.org
  
wrote:
   
 I can't check the meta since HBase is down.

 Regarding HDFS, I took few random lines like:
 2013-08-01 08:45:57,260 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Region
 28328fdb7181cbd9cc4d6814775e8895 not found on server
 node4,60020,1375319042033; failed processing
 2013-08-01 08:45:57,260 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
 for
region
 28328fdb7181cbd9cc4d6814775e8895 from server
  node4,60020,1375319042033
but
 it doesn't exist anymore, probably already processed its split

 And each time, there is nothing like that.
 hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -lsr / | grep
 28328fdb7181cbd9cc4d6814775e8895

 On ZK side:
 [zk: localhost:2181(CONNECTED) 3] ls /hbase/splitlog

 [zk: localhost:2181(CONNECTED) 10] ls /hbase/unassigned
 [28328fdb7181cbd9cc4d6814775e8895,
 a8781a598c46f19723a2405345b58470,
 b7ebfeb63b10997736fd12920fde2bb8,
 d95bb27cc026511c2a8c8ad155e79bf6,
 270a9c371fcbe9cd9a04986e0b77d16b,
 aff4d1d8bf470458bb19525e8aef0759]

 Can I just delete those zknodes? Worst case hbck will find them
 back
   from
 HDFS if required?

 JM

 2013/8/1 Kevin O'dell kevin.od...@cloudera.com

  Does it exist in meta or hdfs?
  On Aug 1, 2013 8:24 AM, Jean-Marc Spaggiari 
   jean-m...@spaggiari.org

  wrote:
 
   My master keep logging that:
  
   2013-07-31 21:52:59,201 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Region
   270a9c371fcbe9cd9a04986e0b77d16b not found on server
   node7,60020,1375319044055; failed processing
   2013-07-31 21:52:59,201 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Received
 SPLIT
   for
  region
   270a9c371fcbe9cd9a04986e0b77d16b from server
node7,60020,1375319044055
  but
   it doesn't exist anymore, probably already processed its split
   2013-07-31 21:52:59,339 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Region
   270a9c371fcbe9cd9a04986e0b77d16b not found on server
   node7,60020,1375319044055; failed processing
   2013-07-31 21:52:59,339 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Received
 SPLIT
   for
  region
   270a9c371fcbe9cd9a04986e0b77d16b from server
node7,60020,1375319044055
  but
   it doesn't exist anymore, probably already processed its split
   2013-07-31 21:52:59,461 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Region
   270a9c371fcbe9cd9a04986e0b77d16b not found on server
   node7,60020,1375319044055; failed processing
   2013-07-31 21:52:59,461 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Received
 SPLIT
   for
  region
   270a9c371fcbe9cd9a04986e0b77d16b from server
node7,60020,1375319044055
  but
   it doesn't exist anymore, probably already processed its split
   2013-07-31 21:52:59,636 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Region
   270a9c371fcbe9cd9a04986e0b77d16b not found on server
   node7,60020,1375319044055; failed processing
   2013-07-31 21:52:59,636 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Received
 SPLIT
   for
  region
   270a9c371fcbe9cd9a04986e0b77d16b 

Re: Pagination with HBase

2013-08-01 Thread anil gupta
If you need more insight into HBase Pagination, these links might help you:
http://search-hadoop.com/m/feqnAUeLR1
http://search-hadoop.com/m/m5zM2rTSkb


On Thu, Aug 1, 2013 at 10:18 AM, Pavan Sudheendra pavan0...@gmail.comwrote:

 @Jonathan Ted Yu is right! Ignore my mail :)

 On Thu, Aug 1, 2013 at 10:46 PM, Ted Yu yuzhih...@gmail.com wrote:
  Take a look at ColumnPaginationFilter.java and its unit test.
 
  Cheers
 
  On Thu, Aug 1, 2013 at 10:01 AM, Jonathan Cardoso
  jonathancar...@gmail.comwrote:
 
  Hi!
 
  Is there a way to scan a HBase table getting, for example, the first 100
  results, then later get the next 100 and so on... Just like in SQL we do
  with LIMIT and OFFSET?
 
 
  *Jonathan Cardoso** **
  Universidade Federal de Goias*
 



 --
 Regards-
 Pavan




-- 
Thanks  Regards,
Anil Gupta


Re: AssignmentManager looping?

2013-08-01 Thread Jean-Marc Spaggiari
So I had to remove a few reference files and run a few hbck passes to get
everything back online.

Summary: don't stop your cluster while it's major compacting huge tables ;)

Thanks all!

JM

2013/8/1 Kevin O'dell kevin.od...@cloudera.com

 If that doesn't work you probably have an invalid reference file and you
 will find that in RS logs for the HLog split that is never finishing.
 On Aug 1, 2013 1:38 PM, Kevin O'dell kevin.od...@cloudera.com wrote:

  JM,
 
  Stop HBase
  rmr /hbase from zkcli
  Sideline META
  Run offline meta repair
  Start HBase
  On Aug 1, 2013 1:01 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org
  wrote:
 
  Hi Jimmy,
 
  I should still have all the logs.
 
  What I did is pretty simple.
 
  I tried to turn the cluster off while a single regioned 250GB table was
  under major_compaction to get splitted.
 
  I will targz all the logs for the few last days and make that available.
 
  On the other side, I'm still not able to bring it back up...
 
  JM
 
  2013/8/1 Jimmy Xiang jxi...@cloudera.com
 
   Something went wrong with split.  It should be easy to fix your
 cluster.
   However, it will be more interesting to find out how it happened. Do
 you
   remember what has happened since it was good previously? Do you have
 all
   the logs?
  
  
   On Thu, Aug 1, 2013 at 7:08 AM, Jean-Marc Spaggiari 
   jean-m...@spaggiari.org
wrote:
  
I tried to remove the znodes but got the same result. So I shutted
  down
   all
the RS and restarted HBase, and now I have 0 regions for this table.
Running HBCK. Seems that it has a lot to do...
   
2013/8/1 Kevin O'dell kevin.od...@cloudera.com
   
 Yes you can if HBase is down, first I would copy .META out of HDFS
   local
 and then you can search it for split issues. Deleting those znodes
   should
 clear this up though.
 On Aug 1, 2013 8:52 AM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org
   
 wrote:

  I can't check the meta since HBase is down.
 
  Regarding HDFS, I took few random lines like:
  2013-08-01 08:45:57,260 WARN
  org.apache.hadoop.hbase.master.AssignmentManager: Region
  28328fdb7181cbd9cc4d6814775e8895 not found on server
  node4,60020,1375319042033; failed processing
  2013-08-01 08:45:57,260 WARN
  org.apache.hadoop.hbase.master.AssignmentManager: Received SPLIT
  for
 region
  28328fdb7181cbd9cc4d6814775e8895 from server
   node4,60020,1375319042033
 but
  it doesn't exist anymore, probably already processed its split
 
  And each time, there is nothing like that.
  hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -lsr / | grep
  28328fdb7181cbd9cc4d6814775e8895
 
  On ZK side:
  [zk: localhost:2181(CONNECTED) 3] ls /hbase/splitlog
 
  [zk: localhost:2181(CONNECTED) 10] ls /hbase/unassigned
  [28328fdb7181cbd9cc4d6814775e8895,
  a8781a598c46f19723a2405345b58470,
  b7ebfeb63b10997736fd12920fde2bb8,
  d95bb27cc026511c2a8c8ad155e79bf6,
  270a9c371fcbe9cd9a04986e0b77d16b,
  aff4d1d8bf470458bb19525e8aef0759]
 
  Can I just delete those zknodes? Worst case hbck will find them
  back
from
  HDFS if required?
 
  JM
 
  2013/8/1 Kevin O'dell kevin.od...@cloudera.com
 
   Does it exist in meta or hdfs?
   On Aug 1, 2013 8:24 AM, Jean-Marc Spaggiari 
jean-m...@spaggiari.org
 
   wrote:
  
My master keep logging that:
   
2013-07-31 21:52:59,201 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Region
270a9c371fcbe9cd9a04986e0b77d16b not found on server
node7,60020,1375319044055; failed processing
2013-07-31 21:52:59,201 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Received
  SPLIT
for
   region
270a9c371fcbe9cd9a04986e0b77d16b from server
 node7,60020,1375319044055
   but
it doesn't exist anymore, probably already processed its
 split
2013-07-31 21:52:59,339 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Region
270a9c371fcbe9cd9a04986e0b77d16b not found on server
node7,60020,1375319044055; failed processing
2013-07-31 21:52:59,339 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Received
  SPLIT
for
   region
270a9c371fcbe9cd9a04986e0b77d16b from server
 node7,60020,1375319044055
   but
it doesn't exist anymore, probably already processed its
 split
2013-07-31 21:52:59,461 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Region
270a9c371fcbe9cd9a04986e0b77d16b not found on server
node7,60020,1375319044055; failed processing
2013-07-31 21:52:59,461 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Received
  SPLIT
for
   region
270a9c371fcbe9cd9a04986e0b77d16b from server
 node7,60020,1375319044055
   but
it doesn't exist anymore, probably already processed its
 split

Re: AssignmentManager looping?

2013-08-01 Thread Kevin O'dell
Jimmy,

  Sounds like our dreaded reference file issue again. I spoke with JM and
he is going to try to reproduce this. My gut tells me our point of no
return may be in the wrong place due to some code change along the way, but
hbck could also just be doing something wonky.

JM,

  This cluster is not CM managed, correct?
On Aug 1, 2013 1:49 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org
wrote:

 So I had to remove few reference files and run few hbck to get everything
 back online.

 Summary: don't stop your cluster while it's major compacting huge tables ;)

 Thanks all!

 JM

 2013/8/1 Kevin O'dell kevin.od...@cloudera.com

  If that doesn't work you probably have an invalid reference file and you
  will find that in RS logs for the HLog split that is never finishing.
  On Aug 1, 2013 1:38 PM, Kevin O'dell kevin.od...@cloudera.com wrote:
 
   JM,
  
   Stop HBase
   rmr /hbase from zkcli
   Sideline META
   Run offline meta repair
   Start HBase
   On Aug 1, 2013 1:01 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org
 
   wrote:
  
   Hi Jimmy,
  
   I should still have all the logs.
  
   What I did is pretty simple.
  
   I tried to turn the cluster off while a single regioned 250GB table
 was
   under major_compaction to get splitted.
  
   I will targz all the logs for the few last days and make that
 available.
  
   On the other side, I'm still not able to bring it back up...
  
   JM
  
   2013/8/1 Jimmy Xiang jxi...@cloudera.com
  
Something went wrong with split.  It should be easy to fix your
  cluster.
However, it will be more interesting to find out how it happened. Do
  you
remember what has happened since it was good previously? Do you have
  all
the logs?
   
   
On Thu, Aug 1, 2013 at 7:08 AM, Jean-Marc Spaggiari 
jean-m...@spaggiari.org
 wrote:
   
 I tried to remove the znodes but got the same result. So I shutted
   down
all
 the RS and restarted HBase, and now I have 0 regions for this
 table.
 Running HBCK. Seems that it has a lot to do...

 2013/8/1 Kevin O'dell kevin.od...@cloudera.com

  Yes you can if HBase is down, first I would copy .META out of
 HDFS
local
  and then you can search it for split issues. Deleting those
 znodes
should
  clear this up though.
  On Aug 1, 2013 8:52 AM, Jean-Marc Spaggiari 
   jean-m...@spaggiari.org

  wrote:
 
   I can't check the meta since HBase is down.
  
   Regarding HDFS, I took few random lines like:
   2013-08-01 08:45:57,260 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Region
   28328fdb7181cbd9cc4d6814775e8895 not found on server
   node4,60020,1375319042033; failed processing
   2013-08-01 08:45:57,260 WARN
   org.apache.hadoop.hbase.master.AssignmentManager: Received
 SPLIT
   for
  region
   28328fdb7181cbd9cc4d6814775e8895 from server
node4,60020,1375319042033
  but
   it doesn't exist anymore, probably already processed its split
  
   And each time, there is nothing like that.
   hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -lsr / | grep
   28328fdb7181cbd9cc4d6814775e8895
  
   On ZK side:
   [zk: localhost:2181(CONNECTED) 3] ls /hbase/splitlog
  
   [zk: localhost:2181(CONNECTED) 10] ls /hbase/unassigned
   [28328fdb7181cbd9cc4d6814775e8895,
   a8781a598c46f19723a2405345b58470,
   b7ebfeb63b10997736fd12920fde2bb8,
   d95bb27cc026511c2a8c8ad155e79bf6,
   270a9c371fcbe9cd9a04986e0b77d16b,
   aff4d1d8bf470458bb19525e8aef0759]
  
   Can I just delete those zknodes? Worst case hbck will find
 them
   back
 from
   HDFS if required?
  
   JM
  
   2013/8/1 Kevin O'dell kevin.od...@cloudera.com
  
Does it exist in meta or hdfs?
On Aug 1, 2013 8:24 AM, Jean-Marc Spaggiari 
 jean-m...@spaggiari.org
  
wrote:
   
 My master keep logging that:

 2013-07-31 21:52:59,201 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Region
 270a9c371fcbe9cd9a04986e0b77d16b not found on server
 node7,60020,1375319044055; failed processing
 2013-07-31 21:52:59,201 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Received
   SPLIT
 for
region
 270a9c371fcbe9cd9a04986e0b77d16b from server
  node7,60020,1375319044055
but
 it doesn't exist anymore, probably already processed its
  split
 2013-07-31 21:52:59,339 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Region
 270a9c371fcbe9cd9a04986e0b77d16b not found on server
 node7,60020,1375319044055; failed processing
 2013-07-31 21:52:59,339 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Received
   SPLIT
 for
region
 270a9c371fcbe9cd9a04986e0b77d16b from server
  node7,60020,1375319044055
but
 it doesn't exist anymore, 

Re: Pagination with HBase

2013-08-01 Thread Jonathan Cardoso
Thanks! I've tested ColumnPaginationFilter but it's not exactly what I need.

Correct me if I'm wrong please, but ColumnPaginationFilter filters the
columns of the result; how many of them are retrieved is based on the
settings of the 'limit' and 'offset' properties.

But I need to make a Scan and get only the first X rows, not the first X
columns

*Jonathan Cardoso** **
Universidade Federal de Goias*


2013/8/1 anil gupta anilgupt...@gmail.com

 If you need more insight into HBase Pagination, these links might help you:
 http://search-hadoop.com/m/feqnAUeLR1
 http://search-hadoop.com/m/m5zM2rTSkb


 On Thu, Aug 1, 2013 at 10:18 AM, Pavan Sudheendra pavan0...@gmail.com
 wrote:

  @Jonathan Ted Yu is right! Ignore my mail :)
 
  On Thu, Aug 1, 2013 at 10:46 PM, Ted Yu yuzhih...@gmail.com wrote:
   Take a look at ColumnPaginationFilter.java and its unit test.
  
   Cheers
  
   On Thu, Aug 1, 2013 at 10:01 AM, Jonathan Cardoso
   jonathancar...@gmail.comwrote:
  
   Hi!
  
   Is there a way to scan a HBase table getting, for example, the first
 100
   results, then later get the next 100 and so on... Just like in SQL we
 do
   with LIMIT and OFFSET?
  
  
   *Jonathan Cardoso** **
   Universidade Federal de Goias*
  
 
 
 
  --
  Regards-
  Pavan
 



 --
 Thanks  Regards,
 Anil Gupta



Re: Pagination with HBase

2013-08-01 Thread Jonathan Cardoso
By anil's links I guess what I should use is PageFilter instead of
ColumnPaginationFilter.

*Jonathan Cardoso** **
Universidade Federal de Goias*
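
A minimal row-paging sketch with PageFilter (0.94-era client API; the open
HTable handle and page size are assumptions). Note that PageFilter is
evaluated independently on each region server, so the client still has to
stop after pageSize rows and restart the next scan just past the last key
it saw:

    final int pageSize = 100;
    byte[] lastRow = null;
    while (true) {
        Scan scan = new Scan();
        scan.setFilter(new PageFilter(pageSize));
        if (lastRow != null) {
            // resume just past the previous page: appending a zero byte
            // makes the start row effectively exclusive
            scan.setStartRow(Bytes.add(lastRow, new byte[] { 0 }));
        }
        int rows = 0;
        ResultScanner scanner = table.getScanner(scan);
        try {
            for (Result r : scanner) {
                lastRow = r.getRow();
                // ... process the row ...
                if (++rows == pageSize) break;  // enforce page size client-side
            }
        } finally {
            scanner.close();
        }
        if (rows < pageSize) break;  // no more pages
    }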


2013/8/1 Jonathan Cardoso jonathancar...@gmail.com

 Thanks! I've tested ColumnPaginationFilter but it's not exactly what I need.

 Correct me if I'm wrong please, but ColumnPaginationFilter filters the
 columns of the result, how many of them will be retrieved based on the
 settings of 'limit' and 'offset' properties.

 But I need to make a Scan and get only the first X rows, not the first X
 columns

 *Jonathan Cardoso** **
 Universidade Federal de Goias*


 2013/8/1 anil gupta anilgupt...@gmail.com

 If you need more insight into HBase Pagination, these links might help you:
 http://search-hadoop.com/m/feqnAUeLR1
 http://search-hadoop.com/m/m5zM2rTSkb


 On Thu, Aug 1, 2013 at 10:18 AM, Pavan Sudheendra pavan0...@gmail.com
 wrote:

  @Jonathan Ted Yu is right! Ignore my mail :)
 
  On Thu, Aug 1, 2013 at 10:46 PM, Ted Yu yuzhih...@gmail.com wrote:
   Take a look at ColumnPaginationFilter.java and its unit test.
  
   Cheers
  
   On Thu, Aug 1, 2013 at 10:01 AM, Jonathan Cardoso
   jonathancar...@gmail.comwrote:
  
   Hi!
  
   Is there a way to scan a HBase table getting, for example, the first
 100
   results, then later get the next 100 and so on... Just like in SQL
 we do
   with LIMIT and OFFSET?
  
  
   *Jonathan Cardoso** **
   Universidade Federal de Goias*
  
 
 
 
  --
  Regards-
  Pavan
 



 --
 Thanks  Regards,
 Anil Gupta





Re: AssignmentManager looping?

2013-08-01 Thread Jean-Marc Spaggiari
No, it's an HBase 0.94.10 cluster with Hadoop 1.0.3, everything installed
manually from JARs ;) It's a mess to monitor and I would have loved to have
it under CM now, but I have to deal with that ;)

I'm building a 2nd cluster at home so I will be able to replicate this one
to the other one, which might allow me to play even further with it...

I will try to reproduce the issue, give me just a couple of hours...

JM

2013/8/1 Kevin O'dell kevin.od...@cloudera.com

 Jimmy,

   Sounds like our dreaded reference file issue again. I spoke with JM and
 he is going to try to reproduce this  My gut tells me our point of no
 return may be in the wrong place due to some code change along the way, but
 hbck could also just be doing something wonky.

 JM,

   This cluster is not CM managed correct?
 On Aug 1, 2013 1:49 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org
 wrote:

  So I had to remove few reference files and run few hbck to get everything
  back online.
 
  Summary: don't stop your cluster while it's major compacting huge tables
 ;)
 
  Thanks all!
 
  JM
 
  2013/8/1 Kevin O'dell kevin.od...@cloudera.com
 
   If that doesn't work you probably have an invalid reference file and
 you
   will find that in RS logs for the HLog split that is never finishing.
   On Aug 1, 2013 1:38 PM, Kevin O'dell kevin.od...@cloudera.com
 wrote:
  
JM,
   
Stop HBase
rmr /hbase from zkcli
Sideline META
Run offline meta repair
Start HBase
On Aug 1, 2013 1:01 PM, Jean-Marc Spaggiari 
 jean-m...@spaggiari.org
  
wrote:
   
Hi Jimmy,
   
I should still have all the logs.
   
What I did is pretty simple.
   
I tried to turn the cluster off while a single regioned 250GB table
  was
under major_compaction to get splitted.
   
I will targz all the logs for the few last days and make that
  available.
   
On the other side, I'm still not able to bring it back up...
   
JM
   
2013/8/1 Jimmy Xiang jxi...@cloudera.com
   
 Something went wrong with split.  It should be easy to fix your
   cluster.
 However, it will be more interesting to find out how it happened.
 Do
   you
 remember what has happened since it was good previously? Do you
 have
   all
 the logs?


 On Thu, Aug 1, 2013 at 7:08 AM, Jean-Marc Spaggiari 
 jean-m...@spaggiari.org
  wrote:

  I tried to remove the znodes but got the same result. So I
 shutted
down
 all
  the RS and restarted HBase, and now I have 0 regions for this
  table.
  Running HBCK. Seems that it has a lot to do...
 
  2013/8/1 Kevin O'dell kevin.od...@cloudera.com
 
   Yes you can if HBase is down, first I would copy .META out of
  HDFS
 local
   and then you can search it for split issues. Deleting those
  znodes
 should
   clear this up though.
   On Aug 1, 2013 8:52 AM, Jean-Marc Spaggiari 
jean-m...@spaggiari.org
 
   wrote:
  
I can't check the meta since HBase is down.
   
Regarding HDFS, I took few random lines like:
2013-08-01 08:45:57,260 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Region
28328fdb7181cbd9cc4d6814775e8895 not found on server
node4,60020,1375319042033; failed processing
2013-08-01 08:45:57,260 WARN
org.apache.hadoop.hbase.master.AssignmentManager: Received
  SPLIT
for
   region
28328fdb7181cbd9cc4d6814775e8895 from server
 node4,60020,1375319042033
   but
it doesn't exist anymore, probably already processed its
 split
   
And each time, there is nothing like that.
hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -lsr / | grep
28328fdb7181cbd9cc4d6814775e8895
   
On ZK side:
[zk: localhost:2181(CONNECTED) 3] ls /hbase/splitlog
   
[zk: localhost:2181(CONNECTED) 10] ls /hbase/unassigned
[28328fdb7181cbd9cc4d6814775e8895,
a8781a598c46f19723a2405345b58470,
b7ebfeb63b10997736fd12920fde2bb8,
d95bb27cc026511c2a8c8ad155e79bf6,
270a9c371fcbe9cd9a04986e0b77d16b,
aff4d1d8bf470458bb19525e8aef0759]
   
Can I just delete those zknodes? Worst case hbck will find
  them
back
  from
HDFS if required?
   
JM
   
2013/8/1 Kevin O'dell kevin.od...@cloudera.com
   
 Does it exist in meta or hdfs?
 On Aug 1, 2013 8:24 AM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org
   
 wrote:

  My master keep logging that:
 
  2013-07-31 21:52:59,201 WARN
  org.apache.hadoop.hbase.master.AssignmentManager: Region
  270a9c371fcbe9cd9a04986e0b77d16b not found on server
  node7,60020,1375319044055; failed processing
  2013-07-31 21:52:59,201 WARN
  org.apache.hadoop.hbase.master.AssignmentManager:
 Received
SPLIT
  for
 region
  270a9c371fcbe9cd9a04986e0b77d16b 

Re: AssignmentManager looping?

2013-08-01 Thread Jimmy Xiang
It will be great if you can reproduce this issue.  One thing to keep in
mind is not to run hbck (repair) in this case, since hbck may have some
problems handling the split parent properly.

By the way, in trunk, region split uses multi row mutate to update meta,
which is more reliable.  So I think the issue should have been fixed in
trunk.


On Thu, Aug 1, 2013 at 11:07 AM, Jean-Marc Spaggiari 
jean-m...@spaggiari.org wrote:

 No,it's a HBase 0.94.10 cluster with Hadoop 1.0.3, everything installed
 manually from JARs ;) It's a mess to monitor and I would have loved to have
 it under CM now, but I have to deal with that ;)

 I'm building a 2nd cluster at home so I will be able to replicate this one
 to the other one, which might allow me to play even further with it...

 I will try to reproduce the issue, give me just couple of hours...

 JM

 2013/8/1 Kevin O'dell kevin.od...@cloudera.com

  Jimmy,
 
Sounds like our dreaded reference file issue again. I spoke with JM and
  he is going to try to reproduce this  My gut tells me our point of no
  return may be in the wrong place due to some code change along the way,
 but
  hbck could also just be doing something wonky.
 
  JM,
 
This cluster is not CM managed correct?
  On Aug 1, 2013 1:49 PM, Jean-Marc Spaggiari jean-m...@spaggiari.org
  wrote:
 
   So I had to remove few reference files and run few hbck to get
 everything
   back online.
  
   Summary: don't stop your cluster while it's major compacting huge
 tables
  ;)
  
   Thanks all!
  
   JM
  
   2013/8/1 Kevin O'dell kevin.od...@cloudera.com
  
If that doesn't work you probably have an invalid reference file and
  you
will find that in RS logs for the HLog split that is never finishing.
On Aug 1, 2013 1:38 PM, Kevin O'dell kevin.od...@cloudera.com
  wrote:
   
 JM,

 Stop HBase
 rmr /hbase from zkcli
 Sideline META
 Run offline meta repair
 Start HBase
 On Aug 1, 2013 1:01 PM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org
   
 wrote:

 Hi Jimmy,

 I should still have all the logs.

 What I did is pretty simple.

 I tried to turn the cluster off while a single regioned 250GB
 table
   was
 under major_compaction to get splitted.

 I will targz all the logs for the few last days and make that
   available.

 On the other side, I'm still not able to bring it back up...

 JM

 2013/8/1 Jimmy Xiang jxi...@cloudera.com

  Something went wrong with split.  It should be easy to fix your
cluster.
  However, it will be more interesting to find out how it
 happened.
  Do
you
  remember what has happened since it was good previously? Do you
  have
all
  the logs?
 
 
  On Thu, Aug 1, 2013 at 7:08 AM, Jean-Marc Spaggiari 
  jean-m...@spaggiari.org
   wrote:
 
   I tried to remove the znodes but got the same result. So I
  shutted
 down
  all
   the RS and restarted HBase, and now I have 0 regions for this
   table.
   Running HBCK. Seems that it has a lot to do...
  
   2013/8/1 Kevin O'dell kevin.od...@cloudera.com
  
Yes you can if HBase is down, first I would copy .META out
 of
   HDFS
  local
and then you can search it for split issues. Deleting those
   znodes
  should
clear this up though.
On Aug 1, 2013 8:52 AM, Jean-Marc Spaggiari 
 jean-m...@spaggiari.org
  
wrote:
   
 I can't check the meta since HBase is down.

 Regarding HDFS, I took few random lines like:
 2013-08-01 08:45:57,260 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Region
 28328fdb7181cbd9cc4d6814775e8895 not found on server
 node4,60020,1375319042033; failed processing
 2013-08-01 08:45:57,260 WARN
 org.apache.hadoop.hbase.master.AssignmentManager: Received
   SPLIT
 for
region
 28328fdb7181cbd9cc4d6814775e8895 from server
  node4,60020,1375319042033
but
 it doesn't exist anymore, probably already processed its
  split

 And each time, there is nothing like that.
 hadoop@node3:~/hadoop-1.0.3$ bin/hadoop fs -lsr / | grep
 28328fdb7181cbd9cc4d6814775e8895

 On ZK side:
 [zk: localhost:2181(CONNECTED) 3] ls /hbase/splitlog

 [zk: localhost:2181(CONNECTED) 10] ls /hbase/unassigned
 [28328fdb7181cbd9cc4d6814775e8895,
 a8781a598c46f19723a2405345b58470,
 b7ebfeb63b10997736fd12920fde2bb8,
 d95bb27cc026511c2a8c8ad155e79bf6,
 270a9c371fcbe9cd9a04986e0b77d16b,
 aff4d1d8bf470458bb19525e8aef0759]

 Can I just delete those zknodes? Worst case hbck will find
   them
 back
   from
 HDFS if required?

 JM

 2013/8/1 Kevin O'dell kevin.od...@cloudera.com

  Does it exist in meta or 

Re: Region size per region on the table page

2013-08-01 Thread samar.opensource

Hi Jean,
  You are right, Hannibal does that, but it's a separate process we need 
to install/maintain. I thought it would be nice if we had a quick and easy 
way to see it from the master-status page. The stats are already on the 
regionserver page (like the total size of the store); it would just make 
sense to have it on the table page too (IMO), to understand the data size 
distribution of the regions of a particular table.


Samar
On 01/08/13 5:51 PM, Jean-Marc Spaggiari wrote:

Hi Samar

Hannibal is already doing what you are looking for.

Cheers,

JMS

2013/8/1 samar.opensource samar.opensou...@gmail.com


Hi Devs/Users,
Most of the time we want to know if our table split logic is accurate
or if our current regions are well balanced for a table. I was wondering if
we can expose the size of region on the table.jsp too on the table region
table. If people think it is useful I can pick it up. Also let me know if
it already exists.

Samar





Re: Why HBase integration with Hive makes Hive slow

2013-08-01 Thread lars hofhansl
Need to set scanner caching, otherwise each call to next() will be a network RTT.
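
On the client that is a one-liner (a sketch; 500 is an arbitrary value):

    Scan scan = new Scan();
    scan.setCaching(500);        // fetch 500 rows per RPC instead of the default 1
    scan.setCacheBlocks(false);  // full scans shouldn't churn the block cache

When the scan is issued through Hive, the corresponding knob is the
hbase.client.scanner.caching property in the job configuration.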




 From: Hao Ren h@claravista.fr
To: user@hbase.apache.org 
Sent: Thursday, August 1, 2013 7:45 AM
Subject: Why HBase integration with Hive makes Hive slow
 

Hi,

I have a cluster (1 master + 3 slaves) on which there are Hive, HBase, and 
Hadoop.

In order to do a daily row-level update routine, we need to integrate 
HBase with Hive, but the performance is not good.

E.g. There are 2 tables in hive,
     hbase_table:  a hbase table created via Hive
     hive_table: a native hive table
  both hold the same data set.

When running:
     select count(*) from hbase_table; === takes 500 s
     select count(*) from hive_table; === takes 6 s

I have tried a lot of queries on the two tables. But hbase_table is 
always very slow.

To be clear, I created the hbase_table as below:

CREATE TABLE hbase_table (
idvisite string,
client_list Array<string>,
nb_client int)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" =
":key,clients:id_list,clients:nb")
TBLPROPERTIES("hbase.table.name" = "table_test")
;

And my HBase is in pseudo-distributed mode.

I guess that at the beginning of a Hive query execution, Hive loads data 
from HBase, and the SerDe takes a long time.

Could someone tell me how to improve this poor performance?
Is this caused by a wrongly configured integration?
Is a fully-distributed mode needed here ?

Thank you in advance for your time.

Hao.


-- 
Hao Ren
ClaraVista
www.claravista.fr

HDFS Restart with Replication

2013-08-01 Thread Patrick Schless
I'm running:
CDH4.1.2
HBase 0.92.1
Hadoop 2.0.0

Is there an issue with restarting a standby cluster with replication
running? I am doing the following on the standby cluster:

- stop hmaster
- stop name_node
- start name_node
- start hmaster

When the name node comes back up, it's reliably missing blocks. I started
with 0 missing blocks, and have run through this scenario a few times, and
am up to 46 missing blocks, all from the table that is the standby for our
production table (in a different datacenter). The missing blocks all are
from the same table, and look like:

blk_-2036986832155369224 /hbase/splitlog/data01.sea01.staging.tdb.com
,60020,1372703317824_hdfs%3A%2F%2Fname-node.sea01.staging.tdb.com
%3A8020%2Fhbase%2F.logs%2Fdata05.sea01.staging.tdb.com
%2C60020%2C1373557074890-splitting%2Fdata05.sea01.staging.tdb.com
%252C60020%252C1373557074890.1374960698485/tempodb-data/c9cdd64af0bfed70da154c219c69d62d/recovered.edits/01366319450.temp

Do I have to stop replication before restarting the standby?

Thanks,
Patrick


Reload configs

2013-08-01 Thread Patrick Schless
Is there a way to reload the HBase configs without restarting the whole
system (in other words, without an interruption of service)?

I'm on:
CDH4.1.2
HBase 0.92.1
Hadoop 2.0.0

Thanks,
Patrick


Re: Region size per region on the table page

2013-08-01 Thread Marcos Luis Ortiz Valmaseda
Hi, Bryan. If you file an issue for that, it would be nice to work on it.



2013/8/1 Bryan Beaudreault bbeaudrea...@hubspot.com

 Hannibal is very useful, but samar is right. It's another thing to install
 and maintain.  I'd hope that over time the need for tools like Hannibal
 would be lessened as some of its features make their way into the main
 install.  Hannibal does its work by crawling log files, whereas some
 (or all) of the data it provides could be provided through the HBase API,
 and thus the admin UI, in a less hacky way.

 If someone were willing to invest the time in adding such a metric to the
 hbase admin ui (and HBaseAdmin API please) it would bring us one step
 closer.


 On Thu, Aug 1, 2013 at 2:42 PM, samar.opensource 
 samar.opensou...@gmail.com
  wrote:

  Hi Jean,
You are right, Hannibal does that, but it's a separate process we need
 to
  install/maintain. I thought if we had a quick and easy way to see it from
  master-status page. The stats are already on the regionserver page(like
  total size of the store) , just that it would make sense to have it on
 the
  table page too(IMO) to understand the data size distribution of regions
 of
  a particular table.
 
  Samar
 
  On 01/08/13 5:51 PM, Jean-Marc Spaggiari wrote:
 
  Hi Samar
 
  Hannibal is already doing what you are looking for.
 
  Cheers,
 
  JMS
 
  2013/8/1 samar.opensource samar.opensou...@gmail.com
 
   Hi Devs/Users,
  Most of the time we want to know if our table split logic is
 accurate
   or if our current regions are well balanced for a table. I was
 wondering
  if
  we can expose the size of region on the table.jsp too on the table
  region
   table. If people think it is useful I can pick it up. Also let me know
 if
  it already exists.
 
  Samar
 
 
 




-- 
Marcos Ortiz Valmaseda
Product Manager at PDVSA
http://about.me/marcosortiz


Re: Reload configs

2013-08-01 Thread Jean-Marc Spaggiari
Hi Patrick,

I will say it depends on the configuration you want to change.
You can do a rolling restart so there is no service interruption.

JM
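
For example, with the scripts shipped in HBase's bin/ directory (a sketch;
whether your CDH packaging includes them is an assumption):

    ./bin/rolling-restart.sh     # restart the master, then each RS in turn
    # or, per region server, draining regions off first:
    ./bin/graceful_stop.sh --restart --reload --debug rs-host.example.com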

2013/8/1 Patrick Schless patrick.schl...@gmail.com

 Is there a way to reload the HBase configs without restarting the whole
 system (in other words, without an interruption of service)?

 I'm on:
 CDH4.1.2
 HBase 0.92.1
 Hadoop 2.0.0

 Thanks,
 Patrick



Re: HDFS Restart with Replication

2013-08-01 Thread Jean-Daniel Cryans
I can't think of a way how your missing blocks would be related to
HBase replication, there's something else going on. Are all the
datanodes checking back in?

J-D

On Thu, Aug 1, 2013 at 2:17 PM, Patrick Schless
patrick.schl...@gmail.com wrote:
 I'm running:
 CDH4.1.2
 HBase 0.92.1
 Hadoop 2.0.0

 Is there an issue with restarting a standby cluster with replication
 running? I am doing the following on the standby cluster:

 - stop hmaster
 - stop name_node
 - start name_node
 - start hmaster

 When the name node comes back up, it's reliably missing blocks. I started
 with 0 missing blocks, and have run through this scenario a few times, and
 am up to 46 missing blocks, all from the table that is the standby for our
 production table (in a different datacenter). The missing blocks all are
 from the same table, and look like:

 blk_-2036986832155369224 /hbase/splitlog/data01.sea01.staging.tdb.com
 ,60020,1372703317824_hdfs%3A%2F%2Fname-node.sea01.staging.tdb.com
 %3A8020%2Fhbase%2F.logs%2Fdata05.sea01.staging.tdb.com
 %2C60020%2C1373557074890-splitting%2Fdata05.sea01.staging.tdb.com
 %252C60020%252C1373557074890.1374960698485/tempodb-data/c9cdd64af0bfed70da154c219c69d62d/recovered.edits/01366319450.temp

 Do I have to stop replication before restarting the standby?

 Thanks,
 Patrick


Re: HDFS Restart with Replication

2013-08-01 Thread Patrick Schless
Yup, 14 datanodes, all check back in. However, all of the corrupt files
seem to be splitlogs from data05. This is true even though I've done
several restarts (each restart adding a few missing blocks). There's
nothing special about data05, and it seems to be in the cluster, the same
as anyone else.


On Thu, Aug 1, 2013 at 5:04 PM, Jean-Daniel Cryans jdcry...@apache.orgwrote:

 I can't think of a way how your missing blocks would be related to
 HBase replication, there's something else going on. Are all the
 datanodes checking back in?

 J-D

 On Thu, Aug 1, 2013 at 2:17 PM, Patrick Schless
 patrick.schl...@gmail.com wrote:
  I'm running:
  CDH4.1.2
  HBase 0.92.1
  Hadoop 2.0.0
 
  Is there an issue with restarting a standby cluster with replication
  running? I am doing the following on the standby cluster:
 
  - stop hmaster
  - stop name_node
  - start name_node
  - start hmaster
 
  When the name node comes back up, it's reliably missing blocks. I started
  with 0 missing blocks, and have run through this scenario a few times,
 and
  am up to 46 missing blocks, all from the table that is the standby for
 our
  production table (in a different datacenter). The missing blocks all are
  from the same table, and look like:
 
  blk_-2036986832155369224 /hbase/splitlog/data01.sea01.staging.tdb.com
  ,60020,1372703317824_hdfs%3A%2F%2Fname-node.sea01.staging.tdb.com
  %3A8020%2Fhbase%2F.logs%2Fdata05.sea01.staging.tdb.com
  %2C60020%2C1373557074890-splitting%2Fdata05.sea01.staging.tdb.com
 
 %252C60020%252C1373557074890.1374960698485/tempodb-data/c9cdd64af0bfed70da154c219c69d62d/recovered.edits/01366319450.temp
 
  Do I have to stop replication before restarting the standby?
 
  Thanks,
  Patrick



Re: HDFS Restart with Replication

2013-08-01 Thread Jean-Daniel Cryans
Can you follow the life of one of those blocks through the Namenode and
datanode logs? I'd suggest you start by doing a fsck on one of those
files with the option that gives the block locations first.

By the way why do you have split logs? Are region servers dying every
time you try out something?
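
Something like the following, with the path abbreviated (a sketch):

    hadoop fsck /hbase/splitlog/<one-of-the-corrupt-files> \
        -files -blocks -locations -racks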

On Thu, Aug 1, 2013 at 3:16 PM, Patrick Schless
patrick.schl...@gmail.com wrote:
 Yup, 14 datanodes, all check back in. However, all of the corrupt files
 seem to be splitlogs from data05. This is true even though I've done
 several restarts (each restart adding a few missing blocks). There's
 nothing special about data05, and it seems to be in the cluster, the same
 as anyone else.


 On Thu, Aug 1, 2013 at 5:04 PM, Jean-Daniel Cryans jdcry...@apache.orgwrote:

 I can't think of a way how your missing blocks would be related to
 HBase replication, there's something else going on. Are all the
 datanodes checking back in?

 J-D

 On Thu, Aug 1, 2013 at 2:17 PM, Patrick Schless
 patrick.schl...@gmail.com wrote:
  I'm running:
  CDH4.1.2
  HBase 0.92.1
  Hadoop 2.0.0
 
  Is there an issue with restarting a standby cluster with replication
  running? I am doing the following on the standby cluster:
 
  - stop hmaster
  - stop name_node
  - start name_node
  - start hmaster
 
  When the name node comes back up, it's reliably missing blocks. I started
  with 0 missing blocks, and have run through this scenario a few times,
 and
  am up to 46 missing blocks, all from the table that is the standby for
 our
  production table (in a different datacenter). The missing blocks all are
  from the same table, and look like:
 
  blk_-2036986832155369224 /hbase/splitlog/data01.sea01.staging.tdb.com
  ,60020,1372703317824_hdfs%3A%2F%2Fname-node.sea01.staging.tdb.com
  %3A8020%2Fhbase%2F.logs%2Fdata05.sea01.staging.tdb.com
  %2C60020%2C1373557074890-splitting%2Fdata05.sea01.staging.tdb.com
 
 %252C60020%252C1373557074890.1374960698485/tempodb-data/c9cdd64af0bfed70da154c219c69d62d/recovered.edits/01366319450.temp
 
  Do I have to stop replication before restarting the standby?
 
  Thanks,
  Patrick



Re: Region size per region on the table page

2013-08-01 Thread Bryan Beaudreault
Created https://issues.apache.org/jira/browse/HBASE-9113


On Thu, Aug 1, 2013 at 5:34 PM, Marcos Luis Ortiz Valmaseda 
marcosluis2...@gmail.com wrote:

 Hi, Bryan. If you file an issue for that, it would be nice to work on it.



 2013/8/1 Bryan Beaudreault bbeaudrea...@hubspot.com

  Hannibal is very useful, but samar is right: it's another thing to install
  and maintain. I'd hope that over time the need for tools like Hannibal
  would be lessened as some of its features make their way into the main
  install. Hannibal does its work by crawling log files, whereas some
  (or all) of the data it provides could be exposed through the HBase API,
  and thus the admin UI, in a less hacky way.
 
  If someone were willing to invest the time in adding such a metric to the
  hbase admin ui (and HBaseAdmin API please) it would bring us one step
  closer.
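
  (For what it's worth, per-region store file sizes are already reachable
  from the client side today; a rough sketch, assuming the 0.94
  ClusterStatus/RegionLoad API:)

    import org.apache.hadoop.hbase.ClusterStatus;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HServerLoad;
    import org.apache.hadoop.hbase.ServerName;
    import org.apache.hadoop.hbase.client.HBaseAdmin;

    HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
    ClusterStatus status = admin.getClusterStatus();
    for (ServerName sn : status.getServers()) {
      HServerLoad load = status.getLoad(sn);
      for (HServerLoad.RegionLoad rl : load.getRegionsLoad().values()) {
        // store file size per region, in MB; the region name encodes the table
        System.out.println(rl.getNameAsString() + " = " + rl.getStorefileSizeMB() + " MB");
      }
    }
    admin.close();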
 
 
  On Thu, Aug 1, 2013 at 2:42 PM, samar.opensource 
  samar.opensou...@gmail.com
   wrote:
 
   Hi Jean,
      You are right, Hannibal does that, but it is a separate process we
   need to install and maintain. I thought it would be handy to have a
   quick and easy way to see it from the master-status page. The stats are
   already on the regionserver page (like the total size of the store); it
   would just make sense to have them on the table page too (IMO), to
   understand the data size distribution of the regions of a particular
   table.

   Samar
  
   On 01/08/13 5:51 PM, Jean-Marc Spaggiari wrote:
  
   Hi Samar
  
   Hannibal is already doing what you are looking for.
  
   Cheers,
  
   JMS
  
   2013/8/1 samar.opensource samar.opensou...@gmail.com
  
    Hi Devs/Users,
       Most of the time we want to know whether our table split logic is
    accurate, or whether our current regions are well balanced for a table.
    I was wondering if we can expose the size of each region on table.jsp
    too, in the table's region listing. If people think it is useful I can
    pick it up. Also let me know if it already exists.

    Samar
  
  
  
 



 --
 Marcos Ortiz Valmaseda
 Product Manager at PDVSA
 http://about.me/marcosortiz



Re: Excessive .META scans

2013-08-01 Thread Varun Sharma
Just patched 6870 and it immediately fixed the problem!


On Tue, Jul 30, 2013 at 12:57 PM, Stack st...@duboce.net wrote:

 Try turning off

 http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#setRegionCachePrefetch(byte[]
 ,
 boolean)

 St.Ack
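
 (For reference, a minimal sketch of the two knobs mentioned in this thread -
 the per-table prefetch switch Stack points at above and the
 hbase.client.prefetch.limit setting J-D mentions below - assuming the 0.94
 client API; the table name is hypothetical:)

   import org.apache.hadoop.conf.Configuration;
   import org.apache.hadoop.hbase.HBaseConfiguration;
   import org.apache.hadoop.hbase.client.HTable;
   import org.apache.hadoop.hbase.util.Bytes;

   Configuration conf = HBaseConfiguration.create();
   // disable region-location prefetching for every table this client opens
   conf.setInt("hbase.client.prefetch.limit", 0);
   // or turn off region cache prefetch for a single table
   HTable.setRegionCachePrefetch(conf, Bytes.toBytes("mytable"), false);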


 On Tue, Jul 30, 2013 at 11:27 AM, Varun Sharma va...@pinterest.com
 wrote:

  JD, it's a big problem. The region server holding .META. has 2x the network
  traffic and 2x the CPU load; I can easily spot the region server holding
  .META. just by looking at the Ganglia graphs of the region servers side by
  side - I don't need to go to the master console. So we can't scale up the
  cluster or add more load, since it's bottlenecked on this one region
  server.

  Thanks Nicolas for the pointer; it seems quite probable that this is the
  issue - it was fixed in 0.94.8, so we don't have it. I will give it a
  shot.
 
 
  On Mon, Jul 29, 2013 at 10:43 AM, Nicolas Liochon nkey...@gmail.com
  wrote:
 
   It could be HBASE-6870?
  
  
   On Mon, Jul 29, 2013 at 7:37 PM, Jean-Daniel Cryans 
 jdcry...@apache.org
   wrote:
  
Can you tell who's doing it? You could enable IPC debug for a few
 secs
to see who's coming in with scans.
   
You could also try to disable pre-fetching, set
hbase.client.prefetch.limit to 0
   
Also, is it even causing a problem or you're just worried it might
since it doesn't look normal?
   
J-D
   
On Mon, Jul 29, 2013 at 10:32 AM, Varun Sharma va...@pinterest.com
wrote:
 Hi folks,

 We are seeing an issue with hbase 0.94.3 on CDH 4.2.0 with
 excessive
.META.
 reads...

 In the steady state where there are no client crashes and there are
  no
 region server crashes/region movement, the server holding .META. is
serving
 an incredibly large # of read requests on the .META. table.

 From my understanding, in the steady state, region locations should
  be
 indefinitely cached in the client. The client is running a work
 load
  of
 multiput(s), puts, gets and coprocessor calls.

 Thanks
 Varun
   
  
 



Hitting HBASE-7693 with hbase-0.94.9

2013-08-01 Thread Mohammad Tariq
Hello list,

Although the issue https://issues.apache.org/jira/browse/HBASE-7693 has
been fixed, it looks like I'm hitting it.

*Environment :*
hadoop-1.1.2
hbase-0.94.9
OS X 10.8.4 (12E55)

I'd really appreciate it if somebody could shed some light on this.

Here is the trace I see when I run my MR job against HBase :

2013-08-02 05:45:20.636 java[37884:1203] Unable to load realm mapping info
from SCDynamicStore

13/08/02 05:45:21 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable

13/08/02 05:45:21 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.

13/08/02 05:45:22 WARN mapred.JobClient: No job jar file set.  User classes
may not be found. See JobConf(Class) or JobConf#setJar(String).

13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT

13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client environment:host.name
=192.168.0.100

13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
environment:java.version=1.6.0_51

13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
environment:java.vendor=Apple Inc.

13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
environment:java.home=/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home

13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
environment:java.class.path=/workspace/hbasemr2/bin:/Users/miqbal1/hadoop-eco/hbase-0.94.9/lib/zookeeper-3.4.5.jar:/Users/miqbal1/hadoop-eco/hbase-0.94.9/lib/guava-11.0.2.jar:/Users/miqbal1/hadoop-eco/hbase-0.94.9/hbase-0.94.9.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/hadoop-core-1.1.2.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/asm-3.2.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/aspectjrt-1.6.11.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/aspectjtools-1.6.11.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-beanutils-1.7.0.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-beanutils-core-1.8.0.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-cli-1.2.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-codec-1.4.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-collections-3.2.1.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-configuration-1.6.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-daemon-1.0.1.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-digester-1.8.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-el-1.0.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-httpclient-3.0.1.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-io-2.1.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-lang-2.4.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-logging-1.1.1.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-logging-api-1.0.4.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-math-2.1.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/commons-net-3.1.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/core-3.1.1.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/hadoop-capacity-scheduler-1.1.2.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/hadoop-fairscheduler-1.1.2.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/hadoop-thriftfs-1.1.2.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/hsqldb-1.8.0.10.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/jackson-core-asl-1.8.8.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/jackson-mapper-asl-1.8.8.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/jasper-compiler-5.5.12.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/jasper-runtime-5.5.12.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/jdeb-0.8.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/jersey-core-1.8.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/jersey-json-1.8.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/jersey-server-1.8.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/jets3t-0.6.1.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/jetty-6.1.26.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/jetty-util-6.1.26.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/jsch-0.1.42.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/junit-4.5.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/kfs-0.2.2.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/log4j-1.2.15.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/mockito-all-1.8.5.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/oro-2.0.8.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/servlet-api-2.5-20081211.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/slf4j-api-1.4.3.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/slf4j-log4j12-1.4.3.jar:/Users/miqbal1/hadoop-eco/hadoop-1.1.2/lib/xmlenc-0.52.jar:/Users/miqbal1/ZIPS-N-JARS/protobuf-java-2.4.1.jar

13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
environment:java.library.path=.:/Library/Java/Extensions:/System/Library/Java/Extensions:/usr/lib/java

13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
environment:java.io.tmpdir=/var/folders/n3/d0ghj1ln2zl0kpd8zkz4zf04mdm1y2/T/


importtsv issue

2013-08-01 Thread 闫昆
Hi all
I used the importtsv tool to load data into HBase, but I only loaded about
5 GB of data, and HDFS looks like this:

Node       Last Contact  Admin State  Capacity (GB)  Used (GB)  Non-DFS (GB)  Remaining (GB)  Used (%)  Remaining (%)  Blocks  BP Used (GB)  BP Used (%)  Failed Vols
hydra0003  1             In Service   92.23          72.22      3.98          16.03           78.30     17.39          1213    72.22         78.30        0
hydra0004  0             In Service   92.23          69.34      3.92          18.97           75.18     20.57          1183    69.34         75.18        0
hydra0005  2             In Service   176.62         59.95      57.61         59.07           33.94     33.44          987     59.95         33.94        0
hydra0006  1             In Service   92.23          67.85      4.26          20.13           73.56     21.82          1153    67.85         73.56        0
hydra0007  1             In Service   176.62         55.71      8.21          112.71          31.54     63.81          959     55.71         31.54        0
Nodes 3, 4, and 7 are the regionserver-local datanodes. Why does HBase use so
much space, when I only stored about 5 GB of data in HDFS?
My HBase table has one column family and 151 columns...
Who can help me?
Thanks
Yan


Re: Pagination with HBase

2013-08-01 Thread Mohammad Tariq
Hello Jonathan,

You might have to pay special attention to the OFFSET part, though: the
OFFSET of the nth row will not remain n forever. As you insert new rows the
ordering changes, since rows in HBase are sorted by row key, not arranged in
insertion order.
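
For paging over rows, the usual pattern is a PageFilter plus a remembered
start row. A minimal sketch, assuming the 0.94 client API and a hypothetical
table name (note that PageFilter enforces its limit per region server, so the
client still has to cap the count and remember where it stopped):

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.ResultScanner;
  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.filter.PageFilter;
  import org.apache.hadoop.hbase.util.Bytes;

  final int pageSize = 100;
  HTable table = new HTable(HBaseConfiguration.create(), "mytable");
  byte[] lastRow = null;                    // last row key of the previous page, if any
  Scan scan = new Scan();
  scan.setFilter(new PageFilter(pageSize)); // server-side limit, applied per region server
  if (lastRow != null) {
    // resume just past the previous page: the next possible key is lastRow + 0x00
    scan.setStartRow(Bytes.add(lastRow, new byte[] { 0 }));
  }
  ResultScanner scanner = table.getScanner(scan);
  int n = 0;
  for (Result r : scanner) {
    lastRow = r.getRow();                   // remember where this page ended
    // ... process the row ...
    if (++n >= pageSize) break;             // hard client-side cap
  }
  scanner.close();
  table.close();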


Warm Regards,
Tariq
cloudfront.blogspot.com


On Thu, Aug 1, 2013 at 11:27 PM, Jonathan Cardoso
jonathancar...@gmail.comwrote:

 By anil's links I guess what I should use is PageFilter instead of
 ColumnPaginationFilter.

 *Jonathan Cardoso** **
 Universidade Federal de Goias*


 2013/8/1 Jonathan Cardoso jonathancar...@gmail.com

  Thanks! I've tested ColumnPaginationFilter, but it's not exactly what I need.
 
  Correct me if I'm wrong, please, but ColumnPaginationFilter filters the
  columns of the result: how many of them are retrieved is based on the
  'limit' and 'offset' settings.
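
  (A tiny sketch of that behaviour, assuming the 0.94 filter API - at most
  'limit' columns per row, skipping the first 'offset':)

    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.ColumnPaginationFilter;

    Scan scan = new Scan();
    // per-row column paging: at most 10 columns per row, starting at column index 20
    scan.setFilter(new ColumnPaginationFilter(10, 20));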
 
  But I need to make a Scan and get only the first X rows, not the first X
  columns
 
  *Jonathan Cardoso** **
  Universidade Federal de Goias*
 
 
  2013/8/1 anil gupta anilgupt...@gmail.com
 
  If you need more insight into HBase pagination, these links might help you:
  http://search-hadoop.com/m/feqnAUeLR1
  http://search-hadoop.com/m/m5zM2rTSkb
 
 
  On Thu, Aug 1, 2013 at 10:18 AM, Pavan Sudheendra pavan0...@gmail.com
  wrote:
 
   @Jonathan Ted Yu is right! Ignore my mail :)
  
   On Thu, Aug 1, 2013 at 10:46 PM, Ted Yu yuzhih...@gmail.com wrote:
Take a look at ColumnPaginationFilter.java and its unit test.
   
Cheers
   
On Thu, Aug 1, 2013 at 10:01 AM, Jonathan Cardoso
jonathancar...@gmail.comwrote:
   
Hi!
   
Is there a way to scan a HBase table getting, for example, the
 first
  100
results, then later get the next 100 and so on... Just like in SQL
  we do
with LIMIT and OFFSET?
   
   
*Jonathan Cardoso** **
Universidade Federal de Goias*
   
  
  
  
   --
   Regards-
   Pavan
  
 
 
 
  --
  Thanks  Regards,
  Anil Gupta
 
 
 



Re: importtsv issue

2013-08-01 Thread Ted Yu
The following snapshot was taken after the loading, right ?

Did you happen to take snapshot before the loading ?

Was the table empty before loading ?

Cheers

On Thu, Aug 1, 2013 at 5:54 PM, 闫昆 yankunhad...@gmail.com wrote:

 Hi all
 I used the importtsv tool to load data into HBase, but I only loaded about
 5 GB of data, and HDFS looks like this:

 Node       Last Contact  Admin State  Capacity (GB)  Used (GB)  Non-DFS (GB)  Remaining (GB)  Used (%)  Remaining (%)  Blocks  BP Used (GB)  BP Used (%)  Failed Vols
 hydra0003  1             In Service   92.23          72.22      3.98          16.03           78.30     17.39          1213    72.22         78.30        0
 hydra0004  0             In Service   92.23          69.34      3.92          18.97           75.18     20.57          1183    69.34         75.18        0
 hydra0005  2             In Service   176.62         59.95      57.61         59.07           33.94     33.44          987     59.95         33.94        0
 hydra0006  1             In Service   92.23          67.85      4.26          20.13           73.56     21.82          1153    67.85         73.56        0
 hydra0007  1             In Service   176.62         55.71      8.21          112.71          31.54     63.81          959     55.71         31.54        0

 Nodes 3, 4, and 7 are the regionserver-local datanodes. Why does HBase use
 so much space, when I only stored about 5 GB of data in HDFS?
 My HBase table has one column family and 151 columns...
 Who can help me?
 Thanks
 Yan



Re: Hitting HBASE-7693 with hbase-0.94.9

2013-08-01 Thread Ted Yu
Looking at ./src/core/org/apache/hadoop/net/DNS.java in hadoop branch-1,
here is line 79:

String hostname = attribute.get("PTR").get().toString();

It is not clear which part was null.

Reading HBASE-7693 once more, it says:

PTR records contain a trailing period, which then shows up in the input
split location causing the JobTracker to incorrectly match map jobs to
data-local map slots.

So it seems that the problem you encountered was different.
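
(One quick way to see what the resolver actually returns for the host in the
log above - a plain-JDK diagnostic sketch, with the IP taken from the trace:)

  import java.net.InetAddress;

  InetAddress addr = InetAddress.getByName("192.168.0.100");
  System.out.println(addr.getCanonicalHostName()); // what reverse DNS resolves to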

Cheers

On Thu, Aug 1, 2013 at 5:18 PM, Mohammad Tariq donta...@gmail.com wrote:

 Hello list,

 Although the issue https://issues.apache.org/jira/browse/HBASE-7693 has
 been fixed, it looks like i'm hitting it.

 *Environment :*
 hadoop-1.1.2
 hbase-0.94.9
 OS X 10.8.4 (12E55)

 I'd really appreciate if somebody could throw some light.

 Here is the trace I see when I run my MR job against HBase :

 2013-08-02 05:45:20.636 java[37884:1203] Unable to load realm mapping info
 from SCDynamicStore

 13/08/02 05:45:21 WARN util.NativeCodeLoader: Unable to load native-hadoop
 library for your platform... using builtin-java classes where applicable

 13/08/02 05:45:21 WARN mapred.JobClient: Use GenericOptionsParser for
 parsing the arguments. Applications should implement Tool for the same.

 13/08/02 05:45:22 WARN mapred.JobClient: No job jar file set.  User classes
 may not be found. See JobConf(Class) or JobConf#setJar(String).

 13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
 environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT

 13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client environment:host.name
 =192.168.0.100

 13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
 environment:java.version=1.6.0_51

 13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
 environment:java.vendor=Apple Inc.

 13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client

 environment:java.home=/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home

 13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client

 

Re: importtsv issue

2013-08-01 Thread 闫昆
My HBase is 0.94. I am not sure I use snapshots - my hbase-site.xml has no
snapshot configuration.
This was my first time loading data into HBase:
  first load: 2 million
  second load: 2 million
  third load: 10 million
all into the same HBase table, 'data_rk'.



2013/8/2 Ted Yu yuzhih...@gmail.com

 The following snapshot was taken after the loading, right ?

 Did you happen to take snapshot before the loading ?

 Was the table empty before loading ?

 Cheers

 On Thu, Aug 1, 2013 at 5:54 PM, 闫昆 yankunhad...@gmail.com wrote:

  Hi all
  I used the importtsv tool to load data into HBase, but I only loaded about
  5 GB of data, and HDFS looks like this:

  Node       Last Contact  Admin State  Capacity (GB)  Used (GB)  Non-DFS (GB)  Remaining (GB)  Used (%)  Remaining (%)  Blocks  BP Used (GB)  BP Used (%)  Failed Vols
  hydra0003  1             In Service   92.23          72.22      3.98          16.03           78.30     17.39          1213    72.22         78.30        0
  hydra0004  0             In Service   92.23          69.34      3.92          18.97           75.18     20.57          1183    69.34         75.18        0
  hydra0005  2             In Service   176.62         59.95      57.61         59.07           33.94     33.44          987     59.95         33.94        0
  hydra0006  1             In Service   92.23          67.85      4.26          20.13           73.56     21.82          1153    67.85         73.56        0
  hydra0007  1             In Service   176.62         55.71      8.21          112.71          31.54     63.81          959     55.71         31.54        0

  Nodes 3, 4, and 7 are the regionserver-local datanodes. Why does HBase use
  so much space, when I only stored about 5 GB of data in HDFS?
  My HBase table has one column family and 151 columns...
  Who can help me?
  Thanks
  Yan
 



Re: Hitting HBASE-7693 with hbase-0.94.9

2013-08-01 Thread Mohammad Tariq
Hello Ted,

Thank you so much for the quick response.

I need to dig a bit more in that case. Will get back here once I get
something.



Warm Regards,
Tariq
cloudfront.blogspot.com


On Fri, Aug 2, 2013 at 6:42 AM, Ted Yu yuzhih...@gmail.com wrote:

 Looking at ./src/core/org/apache/hadoop/net/DNS.java in hadoop branch-1,
 here is line 79:

 String hostname = attribute.get("PTR").get().toString();

 It is not clear which part was null.

 Reading HBASE-7693 once more, it says:

 PTR records contain a trailing period, which then shows up in the input
 split location causing the JobTracker to incorrectly match map jobs to
 data-local map slots.

 So it seems that the problem you encountered was different.

 Cheers

 On Thu, Aug 1, 2013 at 5:18 PM, Mohammad Tariq donta...@gmail.com wrote:

  Hello list,
 
  Although the issue https://issues.apache.org/jira/browse/HBASE-7693
 has
  been fixed, it looks like i'm hitting it.
 
  *Environment :*
  hadoop-1.1.2
  hbase-0.94.9
  OS X 10.8.4 (12E55)
 
  I'd really appreciate if somebody could throw some light.
 
  Here is the trace I see when I run my MR job against HBase :
 
  2013-08-02 05:45:20.636 java[37884:1203] Unable to load realm mapping
 info
  from SCDynamicStore
 
  13/08/02 05:45:21 WARN util.NativeCodeLoader: Unable to load
 native-hadoop
  library for your platform... using builtin-java classes where applicable
 
  13/08/02 05:45:21 WARN mapred.JobClient: Use GenericOptionsParser for
  parsing the arguments. Applications should implement Tool for the same.
 
  13/08/02 05:45:22 WARN mapred.JobClient: No job jar file set.  User
 classes
  may not be found. See JobConf(Class) or JobConf#setJar(String).
 
  13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
  environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52
 GMT
 
  13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client environment:host.name
  =192.168.0.100
 
  13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
  environment:java.version=1.6.0_51
 
  13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
  environment:java.vendor=Apple Inc.
 
  13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
 
 
 environment:java.home=/System/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home
 
  13/08/02 05:45:22 INFO zookeeper.ZooKeeper: Client
 
 
 

Re: ETL like merge databases to HBase

2013-08-01 Thread Shengjie Min
@Ted Yu Yes, they are pretty much all RDBMS. I've looked at Sqoop, but it looks
like Sqoop only does one-time migration? How about continuous updates?

Shengjie


On 2 August 2013 00:13, Ted Yu yuzhih...@gmail.com wrote:

 bq. Each has a DB associated

 They're RDBMS, I assume ?

 Have you looked at Sqoop ?

 On Thu, Aug 1, 2013 at 8:04 AM, Shengjie Min kelvin@gmail.com wrote:

  Hi All,
 
  I have a use case: I have a few applications running independently, let's
  say applications A, B, C. Each has a DB associated with it. I want an
  aggregated view over all the databases, so that I don't have to jump into
  different DBs to find the info I need. Is there a tool out there that
  allows me to move data from A, B, C to a single/centralised HBase cluster?
  It would be even nicer if, when DB A, B, or C gets updated by the apps,
  the updates were synchronised to HBase too.
 
  --
  All the best,
  Shengjie Min
 




-- 
All the best,
Shengjie Min


Re: ETL like merge databases to HBase

2013-08-01 Thread Ted Yu
Mind asking the question on the Sqoop mailing list?

http://sqoop.apache.org/mail-lists.html

On Thu, Aug 1, 2013 at 6:23 PM, Shengjie Min kelvin@gmail.com wrote:

 @Ted Yu Yes, they are pretty much all RDBMS. I've looked at Sqoop, but it
 looks like Sqoop only does one-time migration? How about continuous updates?

 Shengjie


 On 2 August 2013 00:13, Ted Yu yuzhih...@gmail.com wrote:

  bq. Each has a DB associated
 
  They're RDBMS, I assume ?
 
  Have you looked at Sqoop ?
 
  On Thu, Aug 1, 2013 at 8:04 AM, Shengjie Min kelvin@gmail.com
 wrote:
 
   Hi All,
  
   I have a use case: I have a few applications running independently, let's
   say applications A, B, C. Each has a DB associated with it. I want an
   aggregated view over all the databases, so that I don't have to jump into
   different DBs to find the info I need. Is there a tool out there that
   allows me to move data from A, B, C to a single/centralised HBase
   cluster? It would be even nicer if, when DB A, B, or C gets updated by
   the apps, the updates were synchronised to HBase too.
  
   --
   All the best,
   Shengjie Min
  
 



 --
 All the best,
 Shengjie Min



Re: importtsv issue

2013-08-01 Thread Ted Yu
By "snapshot" I meant the status of HDFS shown in your first email.

Which HBase 0.94 release were you using?

Do the "2 millions" below refer to a number of rows or an amount of data?

On Thu, Aug 1, 2013 at 6:14 PM, 闫昆 yankunhad...@gmail.com wrote:

  My HBase is 0.94. I am not sure I use snapshots - my hbase-site.xml has no
  snapshot configuration.
  This was my first time loading data into HBase:
    first load: 2 million
    second load: 2 million
    third load: 10 million
  all into the same HBase table, 'data_rk'.



 2013/8/2 Ted Yu yuzhih...@gmail.com

  The following snapshot was taken after the loading, right ?
 
  Did you happen to take snapshot before the loading ?
 
  Was the table empty before loading ?
 
  Cheers
 
  On Thu, Aug 1, 2013 at 5:54 PM, 闫昆 yankunhad...@gmail.com wrote:
 
   Hi all
   I used the importtsv tool to load data into HBase, but I only loaded
   about 5 GB of data, and HDFS looks like this:

   Node       Last Contact  Admin State  Capacity (GB)  Used (GB)  Non-DFS (GB)  Remaining (GB)  Used (%)  Remaining (%)  Blocks  BP Used (GB)  BP Used (%)  Failed Vols
   hydra0003  1             In Service   92.23          72.22      3.98          16.03           78.30     17.39          1213    72.22         78.30        0
   hydra0004  0             In Service   92.23          69.34      3.92          18.97           75.18     20.57          1183    69.34         75.18        0
   hydra0005  2             In Service   176.62         59.95      57.61         59.07           33.94     33.44          987     59.95         33.94        0
   hydra0006  1             In Service   92.23          67.85      4.26          20.13           73.56     21.82          1153    67.85         73.56        0
   hydra0007  1             In Service   176.62         55.71      8.21          112.71          31.54     63.81          959     55.71         31.54        0

   Nodes 3, 4, and 7 are the regionserver-local datanodes. Why does HBase
   use so much space, when I only stored about 5 GB of data in HDFS?
   My HBase table has one column family and 151 columns...
   Who can help me?
   Thanks
   Yan
  
 



Re: importtsv issue

2013-08-01 Thread 闫昆
I use hbase-0.94.6-cdh4.3.0.
In total, about 16 million rows, and the data size is about 4-5 GB.
I am sorry, my English is poor.
Thanks
Yu


2013/8/2 Ted Yu yuzhih...@gmail.com

 By "snapshot" I meant the status of HDFS shown in your first email.

 Which HBase 0.94 release were you using?

 Do the "2 millions" below refer to a number of rows or an amount of data?

 On Thu, Aug 1, 2013 at 6:14 PM, 闫昆 yankunhad...@gmail.com wrote:

  My HBase is 0.94. I am not sure I use snapshots - my hbase-site.xml has
  no snapshot configuration.
  This was my first time loading data into HBase:
    first load: 2 million
    second load: 2 million
    third load: 10 million
  all into the same HBase table, 'data_rk'.
 
 
 
  2013/8/2 Ted Yu yuzhih...@gmail.com
 
   The following snapshot was taken after the loading, right ?
  
   Did you happen to take snapshot before the loading ?
  
   Was the table empty before loading ?
  
   Cheers
  
   On Thu, Aug 1, 2013 at 5:54 PM, 闫昆 yankunhad...@gmail.com wrote:
  
Hi all
    I used the importtsv tool to load data into HBase, but I only loaded
    about 5 GB of data, and HDFS looks like this:

    Node       Last Contact  Admin State  Capacity (GB)  Used (GB)  Non-DFS (GB)  Remaining (GB)  Used (%)  Remaining (%)  Blocks  BP Used (GB)  BP Used (%)  Failed Vols
    hydra0003  1             In Service   92.23          72.22      3.98          16.03           78.30     17.39          1213    72.22         78.30        0
    hydra0004  0             In Service   92.23          69.34      3.92          18.97           75.18     20.57          1183    69.34         75.18        0
    hydra0005  2             In Service   176.62         59.95      57.61         59.07           33.94     33.44          987     59.95         33.94        0
    hydra0006  1             In Service   92.23          67.85      4.26          20.13           73.56     21.82          1153    67.85         73.56        0
    hydra0007  1             In Service   176.62         55.71      8.21          112.71          31.54     63.81          959     55.71         31.54        0

    Nodes 3, 4, and 7 are the regionserver-local datanodes. Why does HBase
    use so much space, when I only stored about 5 GB of data in HDFS?
    My HBase table has one column family and 151 columns...
    Who can help me?
Thanks
Yan
   
  
 



Re: ETL like merge databases to HBase

2013-08-01 Thread Jay Vyas
HBase doesn't have dynamic views on data outside of itself, but you can easily
re-run your Sqoop flow to dump information into HBase.

Actually, it might be easier to go with a pure RDBMS solution here, since
nowadays the master/slave architectures in Postgres and MySQL are mature enough
to handle this sort of thing, even for hundreds of thousands of rows.

Re: ETL like merge databases to HBase

2013-08-01 Thread Shahab Yunus
Though, as Ted suggested, it is better to discuss this on the Sqoop mailing
list (as Sqoop 2 is supposed to be more feature-rich), just to get this out:
Sqoop does support incremental imports if you can come up with a suitable
and compatible strategy. That might help you if you configure your imports
on some periodic schedule.
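
(To make that concrete - a rough, hypothetical sketch of one periodic
incremental pass in plain JDBC plus the 0.94 HBase client; the connection
string, tables, and 'updated_at' column are all assumed, not prescribed:)

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.PreparedStatement;
  import java.sql.ResultSet;
  import java.sql.Timestamp;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  // copy rows modified since the last successful run from source DB "A" into HBase
  Timestamp lastRun = Timestamp.valueOf("2013-08-01 00:00:00"); // persisted elsewhere
  Connection db = DriverManager.getConnection("jdbc:mysql://dbA/app");
  HTable table = new HTable(HBaseConfiguration.create(), "aggregated");
  PreparedStatement ps = db.prepareStatement(
      "SELECT id, payload FROM orders WHERE updated_at > ?");
  ps.setTimestamp(1, lastRun);
  ResultSet rs = ps.executeQuery();
  while (rs.next()) {
    Put put = new Put(Bytes.toBytes("A:" + rs.getString("id"))); // prefix keys per source app
    put.add(Bytes.toBytes("d"), Bytes.toBytes("payload"),
        Bytes.toBytes(rs.getString("payload")));
    table.put(put);
  }
  table.close();
  db.close();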

Regards,
Shahab


On Thu, Aug 1, 2013 at 10:17 PM, Jay Vyas jayunit...@gmail.com wrote:

 HBase doesn't have dynamic views on data outside of itself, but you can
 easily re-run your Sqoop flow to dump information into HBase.

 Actually, it might be easier to go with a pure RDBMS solution here, since
 nowadays the master/slave architectures in Postgres and MySQL are mature
 enough to handle this sort of thing, even for hundreds of thousands of rows.


issue about DNS error in running hbase mapreduce

2013-08-01 Thread ch huang
I used hadoop-dns-checker to check for DNS problems and everything seems OK,
but when I run an MR task against HBase it reports a problem. Does anyone have
a good idea?

# ./run-on-cluster.sh hosts1
 CH22 
The authenticity of host 'ch22 (192.168.10.22)' can't be established.
RSA key fingerprint is f3:4a:ca:a3:17:08:98:c2:0a:bd:27:99:a3:65:bc:89.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ch22,192.168.10.22' (RSA) to the list of known
hosts.
root@ch22's password:
sending incremental file list
created directory hadoop-dns
a.jar
hosts1
run.sh
sent 2394 bytes  received 69 bytes  547.33 bytes/sec
total size is 2618  speedup is 1.06
root@ch22's password:
# self check...
-- host : CH22
   host lookup : success (192.168.10.22)
   reverse lookup : success (CH22)
   is reachable : yes
# end self check
 Running on : CH22/192.168.10.22 =
-- host : CH22
   host lookup : success (192.168.10.22)
   reverse lookup : success (CH22)
   is reachable : yes
-- host : CH34
   host lookup : success (192.168.10.34)
   reverse lookup : success (CH34)
   is reachable : yes
-- host : CH35
   host lookup : success (192.168.10.35)
   reverse lookup : success (CH35)
   is reachable : yes
-- host : CH36
   host lookup : success (192.168.10.36)
   reverse lookup : success (CH36)
   is reachable : yes

 CH34 
root@ch34's password:
sending incremental file list
created directory hadoop-dns
a.jar
hosts1
run.sh
sent 2394 bytes  received 69 bytes  703.71 bytes/sec
total size is 2618  speedup is 1.06
root@ch34's password:
# self check...
-- host : CH34
   host lookup : success (192.168.10.34)
   reverse lookup : success (CH34)
   is reachable : yes
# end self check
 Running on : CH34/192.168.10.34 =
-- host : CH22
   host lookup : success (192.168.10.22)
   reverse lookup : success (CH22)
   is reachable : yes
-- host : CH34
   host lookup : success (192.168.10.34)
   reverse lookup : success (CH34)
   is reachable : yes
-- host : CH35
   host lookup : success (192.168.10.35)
   reverse lookup : success (CH35)
   is reachable : yes
-- host : CH36
   host lookup : success (192.168.10.36)
   reverse lookup : success (CH36)
   is reachable : yes
 CH35 
root@ch35's password:
sending incremental file list
created directory hadoop-dns
a.jar
hosts1
run.sh
sent 2394 bytes  received 69 bytes  703.71 bytes/sec
total size is 2618  speedup is 1.06
root@ch35's password:
# self check...
-- host : CH35
   host lookup : success (192.168.10.35)
   reverse lookup : success (CH35)
   is reachable : yes
# end self check
 Running on : CH35/192.168.10.35 =
-- host : CH22
   host lookup : success (192.168.10.22)
   reverse lookup : success (CH22)
   is reachable : yes
-- host : CH34
   host lookup : success (192.168.10.34)
   reverse lookup : success (CH34)
   is reachable : yes
-- host : CH35
   host lookup : success (192.168.10.35)
   reverse lookup : success (CH35)
   is reachable : yes
-- host : CH36
   host lookup : success (192.168.10.36)
   reverse lookup : success (CH36)
   is reachable : yes
 CH36 
root@ch36's password:
sending incremental file list
created directory hadoop-dns
a.jar
hosts1
run.sh
sent 2394 bytes  received 69 bytes  703.71 bytes/sec
total size is 2618  speedup is 1.06
root@ch36's password:
# self check...
-- host : CH36
   host lookup : success (192.168.10.36)
   reverse lookup : success (CH36)
   is reachable : yes
# end self check
 Running on : CH36/192.168.10.36 =
-- host : CH22
   host lookup : success (192.168.10.22)
   reverse lookup : success (CH22)
   is reachable : yes
-- host : CH34
   host lookup : success (192.168.10.34)
   reverse lookup : success (CH34)
   is reachable : yes
-- host : CH35
   host lookup : success (192.168.10.35)
   reverse lookup : success (CH35)
   is reachable : yes
-- host : CH36
   host lookup : success (192.168.10.36)
   reverse lookup : success (CH36)
   is reachable : yes

#  yarn jar mapreducehbaseTest.jar com.mediaadx.hbase.hadoop.test.TxtHbase
'' ''
13/08/02 13:10:17 WARN conf.Configuration: dfs.df.interval is deprecated.
Instead, use fs.df.interval
13/08/02 13:10:17 WARN conf.Configuration: hadoop.native.lib is deprecated.
Instead, use io.native.lib.available
13/08/02 13:10:17 WARN conf.Configuration: fs.default.name is deprecated.
Instead, use fs.defaultFS
13/08/02 13:10:17 WARN conf.Configuration: topology.script.number.args is
deprecated. Instead, use net.topology.script.number.args
13/08/02 13:10:17 WARN conf.Configuration: dfs.umaskmode is deprecated.
Instead, use fs.permissions.umask-mode
13/08/02 13:10:17 WARN conf.Configuration:
topology.node.switch.mapping.impl is deprecated. Instead, use
net.topology.node.switch.mapping.impl
13/08/02 13:10:17 WARN conf.Configuration: session.id is deprecated.
Instead, use dfs.metrics.session-id
13/08/02 13:10:17 INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId=
13/08/02 13:10:17 WARN conf.Configuration: slave.host.name is deprecated.
Instead, use