Re: Unable to create table

2011-03-29 Thread Hari Sreekumar
Here is the stack trace: 11/03/28 18:47:02 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=hadoop2:2181 sessionTimeout=18 watcher=hconnection 11/03/28 18:47:02 INFO zookeeper.ClientCnxn: Opening socket connection to server hadoop2/192.168.1.111:2181 11/03/28 18:47:02

Re: LZO Compression changes in 0.90 ?

2011-03-29 Thread Todd Lipcon
Yep. Kevin has apparently gotten too many job promotions recently, so it's mostly me maintaining it these days :) But he usually reviews and pulls my changes within a few days. -Todd On Mon, Mar 28, 2011 at 9:05 AM, Stack st...@duboce.net wrote: It doesn't matter IIUC (Correct me if I'm wrong

Re: How could I re-calculate every entries in hbase efficiently through mapreduce?

2011-03-29 Thread Stanley Xu
Dear all, Thanks for all your suggestions. I now understand that using a reduce phase to do things like sorting on the reduce side is usually unnecessary. So what I did is the following: 1. Use a Mapper and no Reducer to go through the whole table and recalculate the scores and

A lot of data is lost when name node crashed

2011-03-29 Thread Gaojinchao
I did some performance testing on HBase 0.90.1. When the name node crashed, I found some data was lost. I'm not sure exactly what caused it. It seems log splitting failed. I think the master should shut itself down when HDFS crashes. The log is: 2011-03-22 13:21:55,056 WARN

Re: Region server crashes when using replication

2011-03-29 Thread Eran Kutner
Thanks again J-D. I will avoid using stop_replication from now on. As for the shell, JRuby (or even Java for that matter) is not really our strong suit here, but I'll try to give it a look when I have some time. -eran On Mon, Mar 28, 2011 at 23:43, Jean-Daniel Cryans jdcry...@apache.org

Confusion regarding version of hadoop to use in hbase 0.90.1

2011-03-29 Thread Hari Sreekumar
Hi, I went through this thread ( http://www.apacheserver.net/Using-Hadoop-bundled-in-lib-directory-HBase-at1134614.htm) which mentions that the hadoop jars ( hadoop-0.20.2/hadoop-0.20.2-core.jar and hadoop-0.20.2/hadoop-0.20.2-core.jar) can simply be replaced by hbase-0.90.1/lib/hadoop-*.jar

Re: Confusion regarding version of hadoop to use in hbase 0.90.1

2011-03-29 Thread Harsh J
Hello, On Tue, Mar 29, 2011 at 5:55 PM, Hari Sreekumar hsreeku...@clickable.com wrote: Hi, I went through this thread ( http://www.apacheserver.net/Using-Hadoop-bundled-in-lib-directory-HBase-at1134614.htm) which mentions that the hadoop jars ( hadoop-0.20.2/hadoop-0.20.2-core.jar and

hole in META

2011-03-29 Thread Venkatesh
Hi, Using hbase-0.20.6. This has happened quite often. Is this a known issue in 0.20.6 that we wouldn't see (or would see less of) in 0.90.1? I've attempted to fix/avoid this in earlier instances by truncating the table and running add_table.rb. What is the best way to fix this in 0.20.6? Now it's there in

java.lang.IllegalArgumentException in incrementColumnValue and Increment

2011-03-29 Thread sulabh choudhury
Hi, I'm unable to use the Increment function; can anybody suggest what I am doing wrong? I enter data with: theput.add(Bytes.toBytes(uid),Bytes.toBytes(1), 130108782L + t, Bytes.toBytes(10)) Now when I try to increment the value I have tried...

Re: Confusion regarding version of hadoop to use in hbase 0.90.1

2011-03-29 Thread Hari Sreekumar
It worked when I added the append jar file to HADOOP_CLASSPATH in hadoop-env.sh. I think this line in bin/hadoop is the culprit: for f in $HADOOP_HOME/hadoop-*-core.jar; do CLASSPATH=${CLASSPATH}:$f; The append jar file doesn't match the hadoop-*-core.jar pattern, so it did not get added to
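Hari's diagnosis can be reproduced with a small shell sketch of that `bin/hadoop` loop (the jar names here are illustrative; the actual append jar name in your lib directory may differ):

```shell
# Simulate a lib directory holding a core jar plus an append-branch jar
tmp=$(mktemp -d)
touch "$tmp/hadoop-0.20.2-core.jar" "$tmp/hadoop-core-0.20-append.jar"

# The loop from bin/hadoop only matches the hadoop-*-core.jar pattern
CLASSPATH=""
for f in "$tmp"/hadoop-*-core.jar; do CLASSPATH=${CLASSPATH}:$f; done

echo "$CLASSPATH"
# Only the -core jar is picked up; the append jar (hadoop-core-*-append.jar)
# never matches the glob, which is why adding it to HADOOP_CLASSPATH in
# hadoop-env.sh works around the problem.
```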

Re: java.lang.IllegalArgumentException in incrementColumnValue and Increment

2011-03-29 Thread Stack
Try http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#increment(org.apache.hadoop.hbase.client.Increment) instead. It looks like it's what's taking over from ICV (and we should be deprecating ICV). St.Ack On Tue, Mar 29, 2011 at 8:22 AM, sulabh choudhury

Re: java.lang.IllegalArgumentException in incrementColumnValue and Increment

2011-03-29 Thread Jesse Hutton
Hi, It looks like the problem is that the initial value you're inserting in the column is an int, while HTable#incrementColumnValue() expects a long. Instead of: I enter data by :- theput.add(Bytes.toBytes(uid),Bytes.toBytes(1), 130108782L + t, Bytes.toBytes(10)) try:
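Jesse's explanation can be illustrated without a cluster: HBase's `Bytes.toBytes(10)` encodes a 4-byte int, while `incrementColumnValue()` decodes the stored cell as an 8-byte long and throws `IllegalArgumentException` on any other width. This plain-Java sketch uses `ByteBuffer` as a stand-in for the `Bytes` utility:

```java
import java.nio.ByteBuffer;

public class WidthMismatch {
    static byte[] encodeInt(int v)   { return ByteBuffer.allocate(4).putInt(v).array(); }
    static byte[] encodeLong(long v) { return ByteBuffer.allocate(8).putLong(v).array(); }

    // Mimics the length check Bytes.toLong performs before decoding
    static long decodeLong(byte[] b) {
        if (b.length != 8)
            throw new IllegalArgumentException("expected 8 bytes, got " + b.length);
        return ByteBuffer.wrap(b).getLong();
    }

    public static void main(String[] args) {
        System.out.println(encodeInt(10).length);   // 4 bytes -> increment fails
        System.out.println(encodeLong(10L).length); // 8 bytes -> increment works
        try {
            decodeLong(encodeInt(10)); // mirrors the reported IllegalArgumentException
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

This is why changing `Bytes.toBytes(10)` to `Bytes.toBytes(10L)` fixes the insert.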

Re: hole in META

2011-03-29 Thread Stack
On Tue, Mar 29, 2011 at 7:38 AM, Venkatesh vramanatha...@aol.com wrote: What is the best way to fix this in 0.20.6? Move to 0.90.1 to avoid holes in .META. and to avoid losing data. Let us know if we can help you with upgrade. St.Ack

Re: hole in META

2011-03-29 Thread Venkatesh
Thanks, St.Ack. Yeah, I'm eager to upgrade. I had to make one small change to the HBase client API to use the new version. I ran into a missing jar with the hadoop jar file when running a MapReduce job, which I couldn't fix. That is the only known issue with the upgrade; if I can fix that, I'll upgrade

Re: Replication and Rebalancer produces a lot of exceptions

2011-03-29 Thread Jeff Whiting
That was it. I didn't realize that was the exception on the remote cluster... makes sense now. ~Jeff On 3/28/2011 3:34 PM, Jean-Daniel Cryans wrote: The slave cluster is saying that the table user-session doesn't exist... is it the case? J-D On Mon, Mar 28, 2011 at 1:38 PM, Jeff

Re: java.lang.IllegalArgumentException in incrementColumnValue and Increment

2011-03-29 Thread sulabh choudhury
On Tue, Mar 29, 2011 at 8:56 AM, Stack st...@duboce.net wrote: Try http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/HTable.html#increment(org.apache.hadoop.hbase.client.Increment) instead. It looks like its whats taking over from ICV (and we should be decrementing ICV). St.Ack

Re: java.lang.IllegalArgumentException in incrementColumnValue and Increment

2011-03-29 Thread sulabh choudhury
Thanks Jesse. Changing the 10 to 10L made it work. On Tue, Mar 29, 2011 at 8:59 AM, Jesse Hutton jesse.hut...@gmail.comwrote: Hi, It looks like the problem is that the initial value you're inserting in the column is an int, while HTable#incrementColumnValue() expects a long. Instead of:

Re: java.lang.IllegalArgumentException in incrementColumnValue and Increment

2011-03-29 Thread sulabh choudhury
I just realized that using the increment function creates another version, with a new timestamp. Is there a way we can use the previous TS, hence overwriting the value? On Tue, Mar 29, 2011 at 9:38 AM, sulabh choudhury sula...@gmail.com wrote: Thanks Jesse. Changing the 10 to 10L made it

Re: hole in META

2011-03-29 Thread Stack
On Tue, Mar 29, 2011 at 9:09 AM, Venkatesh vramanatha...@aol.com wrote: I ran into a missing jar with the hadoop jar file when running a MapReduce job, which I couldn't fix. That is the only known issue with upgrade. If I can fix that, I'll upgrade. Tell us more. What's the complaint? Missing

Re: Replication and Rebalancer produces a lot of exceptions

2011-03-29 Thread Jean-Daniel Cryans
Do you think the issue could be better reported to the user? I find the error message obvious, but I've been around that code for a year now :) Thx! J-D On Tue, Mar 29, 2011 at 9:21 AM, Jeff Whiting je...@qualtrics.com wrote: That was it.  I didn't realize that was the exception on the remote

Re: A lot of data is lost when name node crashed

2011-03-29 Thread Jean-Daniel Cryans
I was expecting it would die, strange it didn't. Could you provide a bigger log, this one basically tells us the NN is gone but that's about it. Please put it on a web server or something else that's easily reachable for anyone (eg don't post the full thing here). Thx, J-D On Tue, Mar 29, 2011

question about region assignment

2011-03-29 Thread Jack Levin
Hello, we have this one table with about 12 regions that is super hot with writes; for some reason most of the regions were assigned to a single server, which caused it to flush and compact every 10 minutes, causing suboptimal performance. We do use random row keys, so I don't see how sorted ROWs

Re: question about region assignment

2011-03-29 Thread Ted Yu
Are you using 0.90.1 (where regions are randomly distributed across the cluster)? I logged HBASE-3373 but was told it is specific to our usage. On Tue, Mar 29, 2011 at 10:39 AM, Jack Levin magn...@gmail.com wrote: Hello, we have this one table what about 12 regions, that is super hot with writes,

Re: Performance test results

2011-03-29 Thread Jean-Daniel Cryans
Hey Eran, Usually this mailing list doesn't accept attachments (or it works for voodoo reasons) so you'd be better off pastebin'ing them. Some thoughts: - Inserting into a new table without pre-splitting it is a recipe for bad performance. Please pre-split it with methods such
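The pre-split advice above boils down to handing the table-creation call a `byte[][]` of split keys (in the 0.90 client this is `HBaseAdmin.createTable(desc, splitKeys)`). A hypothetical, cluster-free helper that computes evenly spaced one-byte split prefixes might look like this; it assumes keys are uniform over the full byte range, which (as Ted cautions later in this digest) is often not true:

```java
public class UniformSplits {
    // numRegions regions need numRegions - 1 split keys
    static byte[][] uniformSplits(int numRegions) {
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            splits[i - 1] = new byte[] { (byte) (i * 256 / numRegions) };
        }
        return splits;
    }

    public static void main(String[] args) {
        // For 4 regions: split points at 0x40, 0x80, 0xc0
        for (byte[] s : uniformSplits(4))
            System.out.printf("0x%02x%n", s[0] & 0xff);
    }
}
```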

Re: Unable to create table

2011-03-29 Thread Jean-Daniel Cryans
The 60 secs timeout means that the client was waiting on the master for some operation but the master took longer than 60 secs to do it, so its log should be the next place to look for something whack. BTW deleting the rows from .META. directly is probably the worst thing you can do. J-D On

Re: Hmaster had crashed as disabling table

2011-03-29 Thread Jean-Daniel Cryans
Oh yeah I see. So the issue is that if a region was closed and disabled when the first master was running, it won't be assigned anywhere and won't be in transition either (it's called being in RIT in the code). When the new master comes around, and disable is called, it does a check to see if the

Re: question about region assignment

2011-03-29 Thread Jack Levin
I am on 0.89-830 On Tue, Mar 29, 2011 at 10:44 AM, Ted Yu yuzhih...@gmail.com wrote: Are you using 0.90.1 (where regions are randomly distributed across the cluster)? I logged HBASE-3373 but was told it is specific to our usage. On Tue, Mar 29, 2011 at 10:39 AM, Jack Levin magn...@gmail.com

Re: Unable to create table

2011-03-29 Thread Hari Sreekumar
Hi J-D, Here is the tail of HMaster log: 2011-03-29 23:48:51,155 DEBUG org.apache.hadoop.hbase.master.handler.DeleteTableHandler: Waiting on region to clear regions in transition; AcContact,,1301416789483.0cd6d132b2f367f21e88f00778349215. state=OPENING, ts=1301422405271 2011-03-29 23:48:52,158

Re: serverAddress is wrong after copy files uesed distcp

2011-03-29 Thread Jean-Daniel Cryans
That's a very old version, you should consider upgrading. When you say it worked the first day but now it's broken, is it because you did another copy like that, or did it all of a sudden stop working? If I remember correctly, in 0.20.6 that message was usually followed by the master to take the

Re: Performance test results

2011-03-29 Thread Ted Dunning
Watch out when pre-splitting. Your key distribution may not be as uniform as you might think. This particularly happens when keys are represented in some printable form. Base 64, for instance only populates a small fraction of the base 256 key space. On Tue, Mar 29, 2011 at 10:54 AM,
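Ted's Base64 caveat is easy to check in plain Java: every byte of a Base64-encoded key is drawn from a 64-character alphabet, so uniform splits over the full 0x00-0xFF byte range would leave roughly three quarters of the regions empty. This sketch counts the distinct leading bytes Base64 can actually produce:

```java
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;

public class KeySpaceSkew {
    // How many distinct first bytes can a Base64-encoded key have?
    static int distinctFirstBytes() {
        Set<Byte> seen = new HashSet<>();
        for (int b = 0; b < 256; b++) {
            byte[] raw = { (byte) b, 0, 0 };
            seen.add(Base64.getEncoder().encode(raw)[0]);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // 64 of 256 possible leading byte values are ever used
        System.out.println(distinctFirstBytes());
    }
}
```

So split keys for Base64 (or hex, or decimal) row keys should be chosen from the encoded alphabet, not from the raw byte space.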

Re: Unable to create table

2011-03-29 Thread Jean-Daniel Cryans
There's a reason why disabling takes time, if you delete rows from .META. you might end up in an inconsistent situation and we'll have a hard time helping you :) So HBASE-3557 is what you want. Regarding your current issue, RIT is region in transition meaning that the region is in a state

Re: Unable to create table

2011-03-29 Thread Hari Sreekumar
Yep I know, I think I was also on the mailing list thread that inspired HBASE-3557 :) But what can I do better than that in the current version? In any case, I only do that when I catch an IOException. So why am I getting this IOException? I deleted the hbase folder in HDFS and tried doing

Re: java.lang.IllegalArgumentException in incrementColumnValue and Increment

2011-03-29 Thread Jesse Hutton
AFAIK (and maybe some experts can chime in here with details), there is no real way to overwrite a value in hbase. If you want to control the number of versions, you can set the max versions property on the column family, and that will be enforced whenever a major compaction occurs [1].

Re: Unable to create table

2011-03-29 Thread Jean-Daniel Cryans
hadoop3 was trying to open it, but it seems like it's not able to. It really looks like: https://issues.apache.org/jira/browse/HBASE-3669 You should also check what's going on on that slave. J-D On Tue, Mar 29, 2011 at 12:12 PM, Hari Sreekumar hsreeku...@clickable.com wrote: Yep I know, I

Tips on pre-splitting

2011-03-29 Thread Bill Graham
I've been thinking about this topic lately so I'll fork from another discussion to ask if anyone has a good approach to determining keys for pre-splitting from a known dataset. We have a key scenario similar to what Ted describes below. We periodically run MR jobs to transform and bulk load data

Export/Import and # of regions

2011-03-29 Thread Venkatesh
Hi, If I export existing table using Export MR job, truncate the table, increase region size, do a Import will it make use of the new region size? thanks V

Re: Export/Import and # of regions

2011-03-29 Thread Jean-Daniel Cryans
Yes but you'll start with a single region, instead of truncating you probably want instead to create a pre-split table. J-D On Tue, Mar 29, 2011 at 2:27 PM, Venkatesh vramanatha...@aol.com wrote:  Hi, If I export existing table using Export MR job, truncate the table, increase region

Re: Tips on pre-splitting

2011-03-29 Thread Ted Yu
I am not very familiar with Pig. Assuming reducer output file is SequenceFile, steps 2 and 3 can be automated. On Tue, Mar 29, 2011 at 2:15 PM, Bill Graham billgra...@gmail.com wrote: I've been thinking about this topic lately so I'll fork from another discussion to ask if anyone has a good

Re: Export/Import and # of regions

2011-03-29 Thread Jean-Daniel Cryans
Pre-splitting was discussed a few times on the mailing list today, and a few times in the past weeks, for example: http://search-hadoop.com/m/XB9Vr1gQc66 Import works on a pre-existing table so it won't recreate it. Also it doesn't know how your key space is constructed, so it cannot guess the

region in a bad state - how to manually fix

2011-03-29 Thread Bill Graham
Hi, We have an empty table that is somehow in a bad state that I'm unable to disable or drop. We're running 0.90.0 on CDH3b2. Is there a way that I can manually remove this table from HBase without making a mess of things? The table has 2 CFs and it's empty. When I do a scan I get this:

Re: Tips on pre-splitting

2011-03-29 Thread Bill Graham
The output is a text file. I'm sure I could write something using the HDFS Java API to pull the first line of each file, but I'm looking for an approach to extract these keys all via MR, if possible. On Tue, Mar 29, 2011 at 2:33 PM, Ted Yu yuzhih...@gmail.com wrote: I am not very familiar with

Re: Tips on pre-splitting

2011-03-29 Thread Ted Dunning
It should be pretty easy to down-sample the data to have no more than 1000-10,000 keys. Sort those and take every n-th key omitting the first and last key. This last can probably best be done as a conventional script after you have knocked down the data to small size. Note that most of your
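Ted's recipe (down-sample, sort, take every n-th key, drop the ends) can be sketched as a small helper; the names are illustrative, and the sampled keys would come from your MR job's output rather than the synthetic list used here:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SplitPointSampler {
    // From a sorted sample of keys, pick numRegions - 1 evenly spaced split
    // points, implicitly omitting the first and last keys of the sample.
    static List<String> splitPoints(List<String> sampledKeys, int numRegions) {
        List<String> sorted = new ArrayList<>(sampledKeys);
        Collections.sort(sorted);
        List<String> splits = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            splits.add(sorted.get(i * sorted.size() / numRegions));
        }
        return splits;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 1000; i++) keys.add(String.format("key%04d", i));
        System.out.println(splitPoints(keys, 4)); // [key0250, key0500, key0750]
    }
}
```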

Re: region in a bad state - how to manually fix

2011-03-29 Thread Stack
On Tue, Mar 29, 2011 at 2:44 PM, Bill Graham billgra...@gmail.com wrote: What would happen if I manually removed its regions from .META. and its directory from HDFS? Grep kpd_user_info2,,1300458908181.412ae74e76cc4000da10233b3e6f163e. in the master logs. See if it's on any server. If not, go

Re: Tips on pre-splitting

2011-03-29 Thread Bill Graham
last key.  This last can probably best be done as a conventional script after you have knocked down the data to small size. Yeah, the split command comes to mind. Note that most of your joins can just go away since all you want are the keys. Sure, but you need to make sure that you're still

Re: Performance test results

2011-03-29 Thread Jean-Daniel Cryans
Inline. J-D Hi J-D, I can't paste the entire file because it's 126K. Trying to attach it now as zip, let's see if that has more luck. In the jstack you posted, all the Gets were hitting HDFS, which is probably why it's slow. Until you can get something like HDFS-347 in your Hadoop you'll have

Re: HTable and threads

2011-03-29 Thread Jean-Daniel Cryans
Hey Joe, That TPE is used to do batch operations from a single HTable, but those pools cannot be shared the way the code works right now. If you don't need batch operations, you can set hbase.htable.threads.max to 1. It seems that when you call htable.close it doesn't close the TPE, which is a
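The leak J-D describes (close() flushing edits but never stopping the internal ThreadPoolExecutor, later filed as HBASE-3712 below) can be modeled with plain `java.util.concurrent`; this is a stand-in for HTable's pool, not its actual code:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class PoolShutdown {
    // Stand-in for the per-HTable pool that services batch operations
    final ExecutorService pool = Executors.newFixedThreadPool(4);

    void batchOp() {
        pool.submit(() -> { /* send a batch of edits */ });
    }

    // Analogue of the proposed fix: flush pending work, then stop the pool
    // so its worker threads do not linger after the table is closed.
    void close() throws InterruptedException {
        pool.shutdown();                        // stop accepting new tasks
        pool.awaitTermination(10, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        PoolShutdown t = new PoolShutdown();
        t.batchOp();
        t.close();
        System.out.println(t.pool.isTerminated()); // true
    }
}
```

Without the shutdown() call, each HTable-like object would leave idle worker threads alive, which is exactly the thread buildup Joe observed.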

Re: HTable and threads

2011-03-29 Thread Ted Yu
Are you suggesting that the thread pool be shutdown in this method ? public void close() throws IOException { flushCommits(); } On Tue, Mar 29, 2011 at 3:49 PM, Joe Pallas pal...@cs.stanford.edu wrote: Trying to understand why our test program was generating so many threads (HBase

Re: HTable and threads

2011-03-29 Thread Jean-Daniel Cryans
Yeah after flushing the remaining edits. On Tue, Mar 29, 2011 at 3:56 PM, Ted Yu yuzhih...@gmail.com wrote: Are you suggesting that the thread pool be shutdown in this method ?  public void close() throws IOException {    flushCommits();  } On Tue, Mar 29, 2011 at 3:49 PM, Joe Pallas

Re: HTable and threads

2011-03-29 Thread Ted Yu
See https://issues.apache.org/jira/browse/HBASE-3712 On Tue, Mar 29, 2011 at 3:58 PM, Jean-Daniel Cryans jdcry...@apache.orgwrote: Yeah after flushing the remaining edits. On Tue, Mar 29, 2011 at 3:56 PM, Ted Yu yuzhih...@gmail.com wrote: Are you suggesting that the thread pool be shutdown

Re: Replication and Rebalancer produces a lot of exceptions

2011-03-29 Thread Jeff Whiting
I'm not sure... the key to everything was picking up on the RemoteException in Unable to replicate because org.apache.hadoop.ipc.RemoteException and realizing that it was on the replication cluster. If it said something like Unable to replicate. Destination cluster threw an exception:

Re: Tips on pre-splitting

2011-03-29 Thread Ted Dunning
Your mileage may vary. If you are grouping records so that all the results are equal size, then uniquing the keys before sampling is good. On the other hand, if you have larger data items for repeated keys due to the grouping, then giving fewer big keys to some regionservers is good. You can

YCSB performance degradation with lzo

2011-03-29 Thread Jeff Whiting
I'm running some YCSB tests and am seeing performance loss when I enable lzo on the table when doing the load. There are times where the insert rate will drop to 0 operations per second. The drop in ops/sec is caused by: 16:17:51,410 INFO HRegion: Blocking updates for 'IPC Server handler 72 on
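The "Blocking updates" message above is what HBase prints when a region's memstore grows past its flush size times the block multiplier, which happens when compactions (made slower here by LZO) can't keep up with flushes. A hypothetical hbase-site.xml fragment raising that headroom, with values that are only illustrative, might look like:

```xml
<!-- Illustrative settings; tune for your own workload. -->
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <value>67108864</value> <!-- flush a memstore at 64 MB -->
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <value>4</value> <!-- block updates only at 4x the flush size -->
</property>
```

This trades memory for fewer update stalls; it does not fix a compaction backlog, it just buys time to absorb write bursts.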

Re: Export/Import and # of regions

2011-03-29 Thread Venkatesh
Thanks J-D. Using 0.20.6, I don't see that pre-split method in the 0.20.6 API spec. 1) Will the data still be accessible if I Import the data to a new table? (purely for backup reasons) I tried on a small data set and I could. Before I do Export/Import on a large table, I want to make sure. 2)

Re: Replication and Rebalancer produces a lot of exceptions

2011-03-29 Thread Jean-Daniel Cryans
Sounds good! I'll do that change. Thx, J-D On Tue, Mar 29, 2011 at 4:01 PM, Jeff Whiting je...@qualtrics.com wrote: I'm not sure...the key to everything was realizing picking up on RemoteException in Unable to replicate because org.apache.hadoop.ipc.RemoteException and realizing that it was

Re: Export/Import and # of regions

2011-03-29 Thread Jean-Daniel Cryans
 Thanks J-D..Using 0.20.6..I don't see that method with pre-split in 0.20.6 API spec It's new from 0.89, please consider upgrading. 1) Will the data still be accessible if I Import the data to a new table? (purely for backup reasons) I tried on small data set..I could.. Before I do

RE: HTable and threads

2011-03-29 Thread Buttler, David
Do the static methods on HTable (like isTableEnabled), also have this problem? From the code it looks like if you naively call the static method without a Configuration object it will create a configuration and put it into a HashMap where it will live around forever. This really bit me

Re: HTable and threads

2011-03-29 Thread Jean-Daniel Cryans
Do the static methods on HTable (like isTableEnabled), also have this problem? From the code it looks like if you naively call the static method without a Configuration object it will create a configuration and put it into a HashMap where it will live around forever. Good point then the

RE: HTable and threads

2011-03-29 Thread Buttler, David
Thanks. I agree HBaseAdmin is probably the way to go. I guess what was unexpected about this was that the static method HTable.isTableEnabled(tableName) really creates a configuration object under the hood and uses that configuration object to manage the connections. Maybe this method should

Re: hole in META

2011-03-29 Thread Venkatesh
I've regions like this... add_table.rb is unable to fix this... Is there anything else I could do to fix holes? startkey=yv018381 end-key=yv018381; startkey=yv018381 end-key=.; startkey=yv018381 end-key=.

Re: Unable to create table

2011-03-29 Thread Hari Sreekumar
Ah, yea. I have this error in the regionserver log: 2011-03-30 10:09:26,534 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: LRU Stats: total=1.63 MB, free=196.71 MB, max=198.34 MB, blocks=0, accesses=0, hits=0, hitRatio=�%, cachingAccesses=0, cachingHits=0, cachingHitsRatio=�%,

Re: hole in META

2011-03-29 Thread Stack
What is that? Overlapping regions? Can you try merging them with the merge tool? Else, study what's in HDFS. One may have nothing in it (check sizes); it might just be reference files only. If so, let's go from there and I'll describe how to merge. St.Ack On Tue, Mar 29, 2011 at 9:25 PM, Venkatesh