Decommission a Region Server

2011-04-08 Thread Vivek Krishna
Is there a procedure to decommission a region server? If I just kill a region server process, the master tries to reconnect again and again, and the master log looks like this: 11/04/08 21:08:35 WARN util.FSUtils: Waited 2533953ms for lease recovery on hdfs://…
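For reference, a common approach at the time was to turn the balancer off before stopping the region server, so the master does not fight the shutdown by reassigning regions back. A minimal sketch against the 0.90-era client API (assuming `balanceSwitch` is available in this version; the process itself is stopped out of band):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class DecommissionSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    // Turn the balancer off so regions moved off the stopping server
    // are not immediately rebalanced back onto it.
    boolean previous = admin.balanceSwitch(false);
    // Now stop the region server process out of band, e.g.
    //   bin/hbase-daemon.sh stop regionserver
    // and wait for its regions to be reassigned; then restore:
    admin.balanceSwitch(previous);
  }
}
```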

Drop an inconsistent table.

2011-03-25 Thread Vivek Krishna
The table is in an inconsistent state; the reason is that it was not able to locate a few regions. When I disable this table using the hbase shell, the master log reports a RetriesException and the table is stuck in transition. This takes a lot of time. Is it possible to force-drop this table? Or rather, what…
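For context, the normal drop path is disable-then-delete, and the disable step is the one that retries and times out when regions are stuck in transition. A minimal sketch against the 0.90-era client API (the table name is made up):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class DropTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);
    String table = "mytable"; // hypothetical table name
    // disableTable blocks until every region is offline; with regions
    // stuck in transition, this is the step that retries and times out.
    if (admin.isTableEnabled(table)) {
      admin.disableTable(table);
    }
    admin.deleteTable(table);
  }
}
```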

Yet another bulk import question

2011-03-24 Thread Vivek Krishna
Data Size - 20 GB. It took about an hour with the default HBase settings, and after varying several parameters we were able to get this done in ~20 minutes. This is still slow and we are trying to improve it. We wrote a Java client which would essentially `put` to HBase tables in batches. Our fine-tuning…
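A minimal sketch of the batched-put client described here, with the client-side knobs people usually varied in that era (auto-flush, write buffer size, optionally the WAL); the table name, family, and sizes are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchPutClient {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable"); // hypothetical table
    table.setAutoFlush(false);                  // buffer puts client-side
    table.setWriteBufferSize(12 * 1024 * 1024); // example: 12 MB buffer

    List<Put> batch = new ArrayList<Put>(5000);
    for (int i = 0; i < 5000; i++) {
      Put put = new Put(Bytes.toBytes("row-" + i));
      put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("v" + i));
      // put.setWriteToWAL(false); // faster, but data is lost on a crash
      batch.add(put);
    }
    table.put(batch);      // sent in buffer-sized chunks, not one RPC per Put
    table.flushCommits();  // push whatever is still buffered
    table.close();
  }
}
```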

Re: Yet another bulk import question

2011-03-24 Thread Vivek Krishna
… Vivek Krishna vivekris...@gmail.com wrote: Data Size - 20 GB. It took about an hour with the default HBase settings, and after varying several parameters we were able to get this done in ~20 minutes. This is still slow and we are trying to improve it. We wrote a Java client which would essentially `put`…

Re: File formats in Hadoop

2011-03-22 Thread Vivek Krishna
http://nosql.mypopescu.com/post/3220921756/hbase-internals-hfile-explained might help. Viv. On Tue, Mar 22, 2011 at 11:43 AM, Weishung Chung weish...@gmail.com wrote: My fellow superb HBase experts, I am looking at the HFile specs and have some questions: how is a particular table cell in a…

Manual Region Splitting Question.

2011-03-22 Thread Vivek Krishna
I have GBs of data to be dumped into HBase. After lots of trials and reading through the mailing list, I figured out that creating regions manually is a good option, because all data was hitting one node initially... My approach to creating regions is as follows. - I sampled about 1% of the…
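The sampled keys are typically turned into split points at table-creation time, so the table starts with many regions instead of one. A hedged sketch, assuming the 0.90-era `createTable(desc, splitKeys)` overload (table, family, and split keys are made up):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.util.Bytes;

public class PreSplitTable {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    HTableDescriptor desc = new HTableDescriptor("mytable"); // hypothetical
    desc.addFamily(new HColumnDescriptor("f"));

    // Split points taken from the sampled keys; each byte[] becomes
    // the start key of a new region, so writes spread across nodes.
    byte[][] splits = new byte[][] {
      Bytes.toBytes("g"), Bytes.toBytes("n"), Bytes.toBytes("u")
    };
    admin.createTable(desc, splits);
  }
}
```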

Re: Manual Region Splitting Question.

2011-03-22 Thread Vivek Krishna
…a through z, but you start inserting only keys starting with a, then you'll only hit the first regions. J-D. On Tue, Mar 22, 2011 at 11:46 AM, Vivek Krishna vivekris...@gmail.com wrote: I have GBs of data to be dumped into HBase. After lots of trials and reading through the mailing list…

Using split command in shell

2011-03-22 Thread Vivek Krishna
The command is `split table or region row`. How do I find what the region row is? I tried the ones shown on the node:60030 web page; it does not work. Viv

Re: Using split command in shell

2011-03-22 Thread Vivek Krishna
…such message in the region server log. J-D. On Tue, Mar 22, 2011 at 2:00 PM, Vivek Krishna vivekris...@gmail.com wrote: The command is `split table or region row`. How do I find what the region row is? I tried the ones shown on the node:60030 web page; it does not work. Viv

Re: Using split command in shell

2011-03-22 Thread Vivek Krishna
…+region? Viv. On Tue, Mar 22, 2011 at 6:15 PM, Stack st...@duboce.net wrote: On Tue, Mar 22, 2011 at 3:11 PM, Vivek Krishna vivekris...@gmail.com wrote: `split region_name`: I don't know what to use as the region name? hbase(main):001:0> help 'split' Split table or pass a region row…
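Region names can also be listed and split programmatically, which sidesteps copying them off the web UI. A hedged sketch against the 0.90-era API (table and region names are illustrative; `split()` accepts either a table name or a full region name):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.HTable;

public class SplitRegion {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable"); // hypothetical table
    HBaseAdmin admin = new HBaseAdmin(conf);

    // Each name looks like: table,startKey,timestamp.encodedId.
    for (HRegionInfo region : table.getRegionsInfo().keySet()) {
      System.out.println(region.getRegionNameAsString());
    }

    // Passing a full region name splits just that region;
    // passing the table name splits every region of the table.
    admin.split("mytable,g,1300812345678.abcdef0123456789."); // illustrative
  }
}
```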

RowKey using importtsv

2011-03-21 Thread Vivek Krishna
I have a TSV file like this: a b c d e. I want the row key to be a_d and the rest to be column fields. Is there a way of doing this using importtsv out of the box, or do I need to touch the code? Viv

Re: RowKey using importtsv

2011-03-21 Thread Vivek Krishna
…the ImportTsv class to add in the functionality to specify multiple fields as parts of the row key. On Mon, Mar 21, 2011 at 4:25 PM, Vivek Krishna vivekris...@gmail.com wrote: I have a TSV file like this: a b c d e. I want the row key to be a_d and the rest to be column fields…
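The importtsv of that era took a single row-key column, hence the suggestion to patch it. A hedged sketch of what such a custom mapper could look like, building the a_d composite key described in the thread (class name, family, and column naming are made up):

```java
import java.io.IOException;

import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CompositeKeyTsvMapper
    extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

  private static final byte[] FAMILY = Bytes.toBytes("f"); // assumed family

  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    String[] fields = line.toString().split("\t");
    // Row key = first field + "_" + fourth field, as asked in the thread.
    byte[] rowKey = Bytes.toBytes(fields[0] + "_" + fields[3]);
    Put put = new Put(rowKey);
    for (int i = 0; i < fields.length; i++) {
      put.add(FAMILY, Bytes.toBytes("c" + i), Bytes.toBytes(fields[i]));
    }
    context.write(new ImmutableBytesWritable(rowKey), put);
  }
}
```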

Re: Bulk Load question.

2011-03-21 Thread Vivek Krishna
Thanks Harsh. Viv. On Sat, Mar 19, 2011 at 11:52 AM, Harsh J qwertyman...@gmail.com wrote: Have you tried out the mix of importtsv + completebulkload? Would that work for you? On Sat, Mar 19, 2011 at 9:18 PM, Vivek Krishna vivekris...@gmail.com wrote: I have around 20 GB of data…
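In this pipeline, importtsv writes HFiles to a directory and completebulkload moves them into the table. The final step can also be driven from Java, assuming the 0.90-era `LoadIncrementalHFiles` class; a sketch (the table name and output path are illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoad {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable"); // hypothetical table
    // Directory of HFiles produced by an importtsv run that set
    // a bulk-output path (the path here is illustrative).
    LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
    loader.doBulkLoad(new Path("/bulk/output"), table);
  }
}
```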

StoreFileScanner Error.

2011-03-21 Thread Vivek Krishna
I keep getting this error very often: java.io.IOException: java.io.IOException: Could not seek StoreFileScanner[HFileScanner for reader reader=hdfs://e… I have a 30-node cluster with several writers writing data. Once the write is done and I run the rowcounter job to count the records, I face the…

Re: StoreFileScanner Error.

2011-03-21 Thread Vivek Krishna
On Mon, Mar 21, 2011 at 6:07 PM, Vivek Krishna vivekris...@gmail.com wrote: I keep getting this error very often: java.io.IOException: java.io.IOException: Could not seek StoreFileScanner[HFileScanner for reader reader=hdfs://e… I have a 30-node cluster with several writers writing data…

Bulk Load question.

2011-03-19 Thread Vivek Krishna
I have around 20 GB of data to be dumped into an HBase table. Initially I had a simple Java program to put the values in batches of (5000-1) records. I tried concurrent inserts, and each insert took about 15 seconds to write, which is very slow and was taking ages. The next approach was to use…

Row Counters

2011-03-16 Thread Vivek Krishna
1. How do I count rows fast in HBase? First I tried count 'test'; it takes ages. I saw that I could use RowCounter, but it looks like it is deprecated. When I try to use it, I get java.io.IOException: Cannot create a record reader because of a previous error. Please look at the previous logs lines…
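Besides the bundled RowCounter MapReduce job, a common faster-than-shell-count approach is a client-side scan that fetches only the first KeyValue of each row. A minimal sketch against the 0.90-era API (the caching value is illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;

public class CountRows {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "test");

    Scan scan = new Scan();
    scan.setCaching(1000);                    // fetch 1000 rows per RPC
    scan.setFilter(new FirstKeyOnlyFilter()); // only the first KV per row

    ResultScanner scanner = table.getScanner(scan);
    long count = 0;
    for (Result r : scanner) {
      count++;
    }
    scanner.close();
    System.out.println("rows: " + count);
  }
}
```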

Re: Row Counters

2011-03-16 Thread Vivek Krishna
I guess it is using the mapred class: 11/03/16 20:58:27 INFO mapred.JobClient: Task Id : attempt_201103161245_0005_m_04_0, Status : FAILED java.io.IOException: Cannot create a record reader because of a previous error. Please look at the previous logs lines from the task's full log for more…

Re: Row Counters

2011-03-16 Thread Vivek Krishna
…2011 at 1:59 PM, Vivek Krishna vivekris...@gmail.com wrote: I guess it is using the mapred class: 11/03/16 20:58:27 INFO mapred.JobClient: Task Id : attempt_201103161245_0005_m_04_0, Status : FAILED java.io.IOException: Cannot create a record reader because of a previous error. Please…