Is there a setting to cap row size?

2011-04-07 Thread Bryan Keller
I have a wide table schema for an HBase table, where I model a one-to-many relationship of purchase orders and line items. Each row is a purchase order, and I add columns for each line item. Under normal circumstances I don't expect more than a few thousand columns per row, totalling less than
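
For context, a minimal sketch of this wide-row pattern against the 0.90-era Java client; the table, family, and qualifier names below are illustrative, not from the original post:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    public class WideOrderRow {
      public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "orders");
        // One row per purchase order; one column per line item.
        Put put = new Put(Bytes.toBytes("order-12345"));
        for (int i = 0; i < 3; i++) {
          put.add(Bytes.toBytes("items"),        // column family
                  Bytes.toBytes("item-" + i),    // one qualifier per line item
                  Bytes.toBytes("sku=A,qty=1")); // illustrative value
        }
        table.put(put);
        table.close();
      }
    }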

Re: Is there a setting to cap row size?

2011-04-07 Thread Ryan Rawson
Sounds like you are having an HDFS-related problem. Check those datanode logs for errors. As for a setting for max row size, this might not be so easy to do, since at Put time we don't actually know anything about the existing row data. To find that out we'd have to go and read the row
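
Since the server cannot know the full row size at Put time without a read, one option is a client-side guard; a rough sketch (the cap and the use of Put.heapSize() as a proxy are assumptions, and this only bounds each individual Put, not the accumulated row):

    // Rough client-side guard: reject any single Put whose in-memory
    // size exceeds a chosen cap before it ever reaches the server.
    static final long MAX_PUT_SIZE = 10L * 1024 * 1024; // 10 MB, arbitrary

    static void checkedPut(HTable table, Put put) throws IOException {
      if (put.heapSize() > MAX_PUT_SIZE) {
        throw new IOException("Put too large: " + put.heapSize() + " bytes");
      }
      table.put(put);
    }

A true cap on total row size would still require reading the existing row first, which is exactly the cost described above.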

timing out for hdfs errors faster

2011-04-07 Thread Jack Levin
Hello, I get those errors sometimes: 2011-04-07 07:49:41,527 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.103.7.5:50010 for file /hbase/media_data/1c95bfcf0dd19800b1f44278627259ae/att/7725092577730365184 for block 802538788372768807:java.net.SocketTimeoutException: 6 millis

org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss

2011-04-07 Thread Shuja Rehman
Hi, I am trying to read from HBase with the following code: http://pastebin.com/wvVVUT3p It reads fine the first 4-5 times, but after that it starts throwing this exception: SEVERE: null org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.hadoop.hbase.ZooKeeperConnectionException:

file is already being created by NN_Recovery

2011-04-07 Thread Daniel Iancu
Hello everybody We've run into this, now popular, error on our cluster 2011-04-07 16:28:00,654 WARN IPC Server handler 0 on 8020 org.apache.hadoop.hdfs.StateChange - DIR* NameSystem.startFile: failed to create file

Re: file is already being created by NN_Recovery

2011-04-07 Thread Jack Levin
If you have dfs.socket.timeout set to 0, consider removing it; most of our issues like that went away after that. This problem occurs when you have a datanode crash and there is a conflict with the lease on the file (which should expire in one hour; this is an unconfigurable hard timeout). If you
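
A quick way to check what the client will actually use (a sketch; the 60000 ms fallback matches the 0.20-era DFSClient default, and an explicit 0 means no socket timeout at all, which is what causes the stuck reads described here):

    Configuration conf = HBaseConfiguration.create();
    // DFSClient falls back to 60000 ms when dfs.socket.timeout is unset;
    // a value of 0 disables the socket timeout entirely.
    int readTimeout = conf.getInt("dfs.socket.timeout", 60000);
    System.out.println("effective dfs.socket.timeout = " + readTimeout + " ms");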

Task process exit with nonzero status of 255.

2011-04-07 Thread Shahnawaz Saifi
Hi, while executing MR with 472G of data, I am running into the following error: java.io.IOException: Task process exit with nonzero status of 255. at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:418) 2011-04-07 07:54:15,442 INFO org.apache.zookeeper.ClientCnxn: Opening socket

Re: Task process exit with nonzero status of 255.

2011-04-07 Thread Jean-Daniel Cryans
Check the log of the zookeeper at the address that's printed; it may be a problem of too many connections (in which case you need to make sure you reuse the configuration objects). J-D On Thu, Apr 7, 2011 at 9:49 AM, Shahnawaz Saifi shahsa...@gmail.com wrote: Hi, While executing MR with 472G
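
The connection-reuse point in practice: HTable instances built from the same Configuration share one underlying ZooKeeper/HBase connection, so creating the Configuration once avoids opening a new session per task. A minimal sketch (table and column names are illustrative):

    // Create the Configuration once and share it; every HTable built
    // from the same Configuration reuses one underlying connection.
    private static final Configuration CONF = HBaseConfiguration.create();

    void doWork() throws IOException {
      HTable table = new HTable(CONF, "mytable"); // reuses shared connection
      try {
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("v"));
        table.put(put);
      } finally {
        table.close();
      }
    }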

Re: file is already being created by NN_Recovery

2011-04-07 Thread Stack
The RegionServer is down for sure? Else it sounds like an issue that was addressed by a new short-circuit API call added to HDFS on the hadoop-0.20-append branch. The patches that added this new call went into the branch quite a while ago. They are: HDFS-1554. New semantics
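
For reference, HDFS-1554 exposes lease recovery directly on DistributedFileSystem; a hedged sketch of the call as it appears on the append branch (the path is illustrative, and exact semantics depend on the branch revision):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    // Only available on hadoop-0.20-append builds that carry HDFS-1554.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    if (fs instanceof DistributedFileSystem) {
      DistributedFileSystem dfs = (DistributedFileSystem) fs;
      // Returns true once the lease is recovered and the file is closed.
      boolean closed = dfs.recoverLease(new Path("/hbase/.logs/example-wal"));
    }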

Re: timing out for hdfs errors faster

2011-04-07 Thread Jean-Daniel Cryans
Another question: why would the dfsclient setting for socket timeout (for data reading) be set so high by default if HBase is expected to be real time? Shouldn't it be a few seconds (5?). Not all clusters are used for real-time applications; also, usually users first try to cram as much data as

Re: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss

2011-04-07 Thread Jean-Daniel Cryans
You should be seeing more log lines related to ZooKeeper before that. Also make sure your client connects to the zk server. J-D On Thu, Apr 7, 2011 at 9:11 AM, Shuja Rehman shujamug...@gmail.com wrote: Hi I am trying to read from hbase the following code. http://pastebin.com/wvVVUT3p it

Re: timing out for hdfs errors faster

2011-04-07 Thread Stack
On Thu, Apr 7, 2011 at 7:58 AM, Jack Levin magn...@gmail.com wrote: Hello, I get those errors sometimes: 2011-04-07 07:49:41,527 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.103.7.5:50010 for file /hbase/media_data/1c95bfcf0dd19800b1f44278627259ae/att/7725092577730365184

Re: HTable.put hangs on bulk loading

2011-04-07 Thread Jean-Daniel Cryans
There's nothing of use in the pasted logs unfortunately, and the log didn't get attached to your mail (happens often). Consider putting it on a web server or pastebin. Also, I see you are on an older version; upgrading isn't going to fix your issue (which is probably related to your environment or

Re: timing out for hdfs errors faster

2011-04-07 Thread Stack
Jack: Pardon me. What J-D said. You were asking about DN timeout. Below I write about RS timeout. St.Ack On Thu, Apr 7, 2011 at 10:28 AM, Stack st...@duboce.net wrote: On Thu, Apr 7, 2011 at 7:58 AM, Jack Levin magn...@gmail.com wrote: Hello, I get those errors sometimes: 2011-04-07

Re: timing out for hdfs errors faster

2011-04-07 Thread Jack Levin
Thanks. How about setting hbase-site.xml with dfs.datanode.socket.write.timeout and dfs.datanode.socket.read.write.timeout? If a TCP connection is established but the hard drive fails right after that, I do not want to wait 60 seconds on the read; I want to quickly time out and move to the next datanode. -Jack

Re: timing out for hdfs errors faster

2011-04-07 Thread Jack Levin
I meant to say dfs.datanode.socket.read.timeout -Jack On Thu, Apr 7, 2011 at 10:54 AM, Jack Levin magn...@gmail.com wrote: Thanks, How about setting hbase-site.xml with dfs.datanode.socket.write.timeout dfs.datanode.socket.read.write.timeout If tcp connection is established, but harddrive
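
For what it's worth, the 0.20-era keys I know of are dfs.socket.timeout (read side) and dfs.datanode.socket.write.timeout (write side); a sketch of lowering them on the client Configuration, with illustrative values (in practice they would go into hbase-site.xml so the RegionServer's embedded DFSClient picks them up):

    Configuration conf = HBaseConfiguration.create();
    // Read-side DFSClient socket timeout (default 60000 ms on 0.20).
    conf.setInt("dfs.socket.timeout", 5000);                // 5s, illustrative
    // Write-side timeout toward datanodes (default 480000 ms on 0.20).
    conf.setInt("dfs.datanode.socket.write.timeout", 5000);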

Hadoop Append Github

2011-04-07 Thread Jason Rutherglen
Is https://github.com/facebook/hadoop-20-append the Github branch for Hadoop Append 0.20?

Re: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss

2011-04-07 Thread Shuja Rehman
Here is more log; now it is not connecting at all. 11/04/07 23:02:55 WARN hbase.HBaseConfiguration: instantiating HBaseConfiguration() is deprecated. Please use HBaseConfiguration#create() to construct a plain Configuration 11/04/07 23:02:55 INFO zookeeper.ZooKeeper: Client
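
The deprecation warning at the top is easy to silence, though it is unrelated to the connection loss itself; a minimal before/after:

    // Deprecated in 0.90:
    // HBaseConfiguration conf = new HBaseConfiguration();

    // Preferred: returns a plain Hadoop Configuration with hbase-*.xml loaded.
    Configuration conf = HBaseConfiguration.create();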

Re: Hadoop Append Github

2011-04-07 Thread Jean-Daniel Cryans
That's the one published by Facebook, the one maintained by Apache is https://github.com/apache/hadoop-common/tree/branch-0.20-append J-D On Thu, Apr 7, 2011 at 11:04 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Is https://github.com/facebook/hadoop-20-append the Github branch for

Re: Hadoop Append Github

2011-04-07 Thread Stack
That one looks dead Jason. There was a bulk upload in December and nought since. St.Ack On Thu, Apr 7, 2011 at 11:04 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Is https://github.com/facebook/hadoop-20-append the Github branch for Hadoop Append 0.20?

Re: Hadoop Append Github

2011-04-07 Thread Jason Rutherglen
Ah ok, Google turned up the one I posted; I wonder why this one was harder to find? Thanks! On Thu, Apr 7, 2011 at 11:07 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: That's the one published by Facebook, the one maintained by Apache is

Re: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss

2011-04-07 Thread Jean-Daniel Cryans
If you look at 204.13.166.85's zookeeper log, do you see anything that looks bad around the time you ran this? J-D On Thu, Apr 7, 2011 at 11:04 AM, Shuja Rehman shujamug...@gmail.com wrote: here is more log. now it is not connecting at all. 11/04/07 23:02:55 WARN hbase.HBaseConfiguration:

Re: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss

2011-04-07 Thread Shuja Rehman
I got the log file, and it says: 2011-04-07 11:17:41,864 - WARN [NIOServerCxn.Factory: 0.0.0.0/0.0.0.0:2181:NIOServerCnxn$Factory@247] - Too many connections from /182.178.254.222 - max is 10 2011-04-07 11:17:45,453 - WARN [NIOServerCxn.Factory:

Re: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss

2011-04-07 Thread Jean-Daniel Cryans
So regarding finding your logs and other stuff related to that, since you are using CDH you should always check their documentation. In ZooKeeper there's a configurable limit of 30 connections per IP. HTable.close won't close the connection since you can have multiple HTables using the same
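
In the 0.90 client the connection is keyed by the Configuration, so releasing the ZooKeeper session takes an explicit call; a sketch assuming the 0.90-era HConnectionManager API:

    import org.apache.hadoop.hbase.client.HConnectionManager;

    // Closing an HTable does not close the shared connection; other
    // HTables built from the same Configuration may still be using it.
    table.close();

    // When completely done with this Configuration, explicitly drop the
    // cached connection (and its ZooKeeper session).
    HConnectionManager.deleteConnection(conf, true);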

Re: HTable.put hangs on bulk loading

2011-04-07 Thread ajay.gov
Sorry, my server config was not attached. It's here: http://pastebin.com/U41QZGiq thanks -ajay ajay.gov wrote: I am doing a load test, for which I need to load a table with many rows. I have a small Java program that has a for loop and calls HTable.put. I am inserting a map of 2 items
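
For a load test like this, the usual 0.90-era client tuning is to buffer writes instead of issuing one RPC per Put; a sketch under that assumption (table name, columns, and sizes illustrative):

    HTable table = new HTable(conf, "loadtest");
    table.setAutoFlush(false);                  // buffer puts client-side
    table.setWriteBufferSize(2L * 1024 * 1024); // flush roughly every 2 MB
    for (int i = 0; i < 1000000; i++) {
      Put put = new Put(Bytes.toBytes("row-" + i));
      put.add(Bytes.toBytes("f"), Bytes.toBytes("q1"), Bytes.toBytes("v1"));
      put.add(Bytes.toBytes("f"), Bytes.toBytes("q2"), Bytes.toBytes("v2"));
      table.put(put);                           // goes into the write buffer
    }
    table.flushCommits();                       // push whatever is left
    table.close();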

Re: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.hadoop.hbase.ZooKeeperConnectionException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss

2011-04-07 Thread Jean-Daniel Cryans
To help usability, I created https://issues.apache.org/jira/browse/HBASE-3755 J-D On Thu, Apr 7, 2011 at 11:39 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: So regarding finding your logs and other stuff related to that, since you are using CDH you should always check their documentation.

Re: Hadoop Append Github

2011-04-07 Thread Jason Rutherglen
Is http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append different from the Github one at https://github.com/apache/hadoop-common/tree/branch-0.20-append? I can apply the HDFS-347 patch successfully to the SVN version; however, the Github one has a number of rejects. Are

Re: Hadoop Append Github

2011-04-07 Thread Jean-Daniel Cryans
As far as I can tell, they are at the same revision. J-D On Thu, Apr 7, 2011 at 1:19 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Is http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-append different than the Github one at

Re: Hadoop Append Github

2011-04-07 Thread Jason Rutherglen
How did you compare? On Thu, Apr 7, 2011 at 1:37 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: As far as I can tell, they are at the same revision. J-D On Thu, Apr 7, 2011 at 1:19 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote: Is

Re: Hadoop Append Github

2011-04-07 Thread Jason Rutherglen
It looks like they may [somehow] be different? The latest change to SVN happened 2011-01-10 whereas the Github one was changed Wed Apr 6 19:03:37 2011? Here's from Github: commit 53d6ff79e8c4ee850cf4e592ddd20b8e116a8513 Author: Konstantin Shvachko s...@apache.org Date: Wed Apr 6 19:03:37 2011

Re: Hadoop Append Github

2011-04-07 Thread Jean-Daniel Cryans
That last change on github was for trunk, not the append branch. The last one I see in that branch is: HDFS-1554. New semantics for recoverLease. Contributed by Hairong Kuang. Hairong Kuang (author) January 10, 2011 Same as in SVN. J-D On Thu, Apr 7, 2011 at 2:09 PM, Jason Rutherglen

Re: Hadoop Append Github

2011-04-07 Thread Jason Rutherglen
How, using Github, were you able to see only the log for the given branch/URL? I'm not sure why the patch won't apply. I ran diff, and there are differences, though they're mostly in the scripts and other non-source-code files. On Thu, Apr 7, 2011 at 2:20 PM, Jean-Daniel Cryans

Re: Hadoop Append Github

2011-04-07 Thread Jean-Daniel Cryans
So, from that page: https://github.com/apache/hadoop-common switch to the append branch, then click the history button on the right. Make sure you switch to the append branch in your cloned git repo too; by default you are on trunk. J-D On Thu, Apr 7, 2011 at 4:08 PM, Jason Rutherglen
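
The local-repo part of that, spelled out (standard git; branch name as given above):

    git clone https://github.com/apache/hadoop-common.git
    cd hadoop-common
    git checkout branch-0.20-append   # a fresh clone starts on trunk
    git log --oneline                 # history for this branch only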

Re: HTable.put hangs on bulk loading

2011-04-07 Thread Ajay Govindarajan
Thanks for pointing this out. I have uploaded the server config at: http://pastebin.com/U41QZGiq thanks -ajay From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Thursday, April 7, 2011 10:29 AM Subject: Re: HTable.put hangs on bulk

zookeeper warning with 0.90.1 hbase

2011-04-07 Thread Venkatesh
I see a lot of these warnings... everything seems to be working otherwise... Is this something that can be ignored? 2011-04-07 21:29:15,032 WARN Timer-0-SendThread(..:2181) org.apache.zookeeper.ClientCnxn - Session 0x0 for server :2181, unexpected error, closing socket connection and

Re: zookeeper warning with 0.90.1 hbase

2011-04-07 Thread Stack
Do they happen at the end of a map task or on shutdown? If so, yes, ignore them (or, if you want a nice clean shutdown, figure out how Session 0x0 was set up -- was it you? -- and call the appropriate close in time). St.Ack On Thu, Apr 7, 2011 at 6:33 PM, Venkatesh vramanatha...@aol.com wrote: I see