Re: RowCounter example run time

2010-05-23 Thread Jean-Daniel Cryans
I don't have a set requirement.  Just trying to learn more about the system and 25 minutes seemed excessive.  I really have nothing to compare against and have no expectations, but it takes about 900 seconds to run the count function in the shell.  My main goal is to figure out what

Re: Effect of turning major compactions off..

2010-05-26 Thread Jean-Daniel Cryans
can have a max of around 4 open files if there are 2000 regions per node)... Let me also check the logs a little more carefully and get back to the forum.. Thank you Vidhya On 5/26/10 9:38 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: I'm pretty sure something else is going on. 1

Re: Custom compaction

2010-05-26 Thread Jean-Daniel Cryans
Invisible. What's your need? J-D On Wed, May 26, 2010 at 3:56 PM, Vidhyashankar Venkataraman vidhy...@yahoo-inc.com wrote: Is there a way to customize the compaction function (like a hook provided by the API) or is it invisible to the user? Thank you Vidhya

Re: Performance at large number of regions/node

2010-05-27 Thread Jean-Daniel Cryans
With beefy nodes, don't be afraid of using bigger regions... and LZO. At StumbleUpon we have 1GB maxfilesize on our 13B rows table and LZO enabled on every table. The number of regions per node is a factor of so many things... size of rows, access pattern, hardware, etc. FWIW, I would say that you

Re: broken mirror fs hierarchy / incorrect links on download page

2010-05-29 Thread Jean-Daniel Cryans
HBase became an Apache Top Level project recently so everything is moving around, and this isn't an atomic operation ;) Thanks for reporting! J-D On Sat, May 29, 2010 at 9:52 AM, Charles Woerner charleswoer...@gmail.com wrote: Following the links from the Releases page (

Re: OOME during frequent updates...

2010-06-09 Thread Jean-Daniel Cryans
OOME is a Java exception, nothing HBase specific. It means that the JVM ran out of memory. BTW your log wasn't attached to your email (they are usually blocked), so please post it on a web server or pastebin it so we can help you. J-D On Wed, Jun 9, 2010 at 11:02 AM, Vidhyashankar Venkataraman

Re: OOME during frequent updates...

2010-06-09 Thread Jean-Daniel Cryans
On Wed, Jun 9, 2010 at 11:16 AM, Vidhyashankar Venkataraman vidhy...@yahoo-inc.com wrote: What do you mean by pastebinning it? I will try hosting it on a webserver.. pastebin.com I know that OOME is Java running out of heap space: Can you let me know what are the usual causes for OOME

Re: OOME during frequent updates...

2010-06-09 Thread Jean-Daniel Cryans
allow more than around 3-4 gigs of RAM.. On 6/9/10 11:26 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: On Wed, Jun 9, 2010 at 11:16 AM, Vidhyashankar Venkataraman vidhy...@yahoo-inc.com wrote: What do you mean by pastebinning it? I will try hosting it on a webserver.. pastebin.com I

Re: dead-lock at HTable flushCommits with multiple clients...

2010-06-10 Thread Jean-Daniel Cryans
Also 0.20.4 has the ExplicitColumnTracker that spins in an infinite loop in some situations. J-D On Thu, Jun 10, 2010 at 3:38 PM, Ryan Rawson ryano...@gmail.com wrote: hey, so you have discovered a particular 'trick' about how the HBase RPC works... at the lowest level there is only 1 socket

Re: hbase cluster cold start: master and region server did not connect!

2010-06-11 Thread Jean-Daniel Cryans
You can check the general health by using the webui, it runs on the master node at port 60010. For the errors, the context you gave is so limited that giving any meaningful answer is impossible. Please post full logs on a web server or on pastebin.com (or your preferred code pasting site) if it

Re: experiences with hbase-2492

2010-06-15 Thread Jean-Daniel Cryans
Friso, This is very interesting, and nobody answered probably because no one tried tcp_tw_recycle. I personally didn't even know about that config until a few minutes ago ;) So from the varnish mailing list, it seems that machines behind firewalls or NAT won't play well with that config, but I

Re: How to recover from an attempt to connect to an unavailable region?

2010-06-18 Thread Jean-Daniel Cryans
While I'm trying to figure out what is causing the region to be non-responsive, what's the best way to recover? I'd recover from that the same way I'd recover from any unavailable storage component, since that can happen with any DB/SAN/etc right? The link between your client and HBase could

Re: multiple reads from a Map - optimization question

2010-06-23 Thread Jean-Daniel Cryans
this can be achieved with much improved performance? Thank you. Regards, Raghava. On Tue, Jun 22, 2010 at 12:57 PM, Jean-Daniel Cryans jdcry...@apache.orgwrote: This is not super clear, some comments inline. J-D On Tue, Jun 22, 2010 at 12:49 AM, Raghava Mutharaju m.vijayaragh

Re: Cannot open filename exception in 0.20.5

2010-06-25 Thread Jean-Daniel Cryans
At first glance it looks like a double assignment of .META., and the file it's trying to get was probably already compacted by another region server. Check the master log to see why/how it happened. J-D On Fri, Jun 25, 2010 at 11:13 AM, Ted Yu yuzhih...@gmail.com wrote: Hi, I upgraded a 3 node

Re: Hbase cluster Monitoring

2010-06-25 Thread Jean-Daniel Cryans
We use ganglia at StumbleUpon, to enable metrics see http://hbase.apache.org/docs/r0.20.5/metrics.html J-D On Fri, Jun 25, 2010 at 11:52 AM, Palaniappan Thiyagarajan pthiyagara...@cashedge.com wrote: Hi, I would like to know what kind of monitoring you are having for your production env and

Re: Cannot open filename exception in 0.20.5

2010-06-25 Thread Jean-Daniel Cryans
:39,898 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Closing TED-PACKAGESUMMARY-1277485167424-0,,1277485173774: disabling compactions & flushes Any insight ? Thanks On Fri, Jun 25, 2010 at 11:31 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: At first glance it looks like a double

Re: HBase 0.20.5 issues

2010-07-01 Thread Jean-Daniel Cryans
(sorry it took so long to answer, we were all busy with the various meetings around the Bay Area) I can see the issue: 2010-06-30 13:48:16,135 DEBUG master.BaseScanner (BaseScanner.java:checkAssigned(580)) - Current assignment of .META.,,1 is not valid; serverAddress=, startCode=0 unknown. ...

Re: dilemma of memory and CPU for hbase.

2010-07-01 Thread Jean-Daniel Cryans
://hbase.apache.org/docs/r0.89.20100621/ for more info. Cloudera's CDH3b2 also has everything you need. J-D On Thu, Jul 1, 2010 at 12:03 PM, Jean-Daniel Cryans jdcry...@apache.org wrote: 653 regions is very low, even if you had a total of 3 region servers I wouldn't expect any problem. So to me

Re: dilemma of memory and CPU for hbase.

2010-07-01 Thread Jean-Daniel Cryans
the whole hbase.  Anyway, I am regenerating the data from scratch and let's see if it will work out. Jimmy. -- From: Jean-Daniel Cryans jdcry...@apache.org Sent: Thursday, July 01, 2010 2:17 PM To: user@hbase.apache.org Subject: Re: dilemma

Re: dilemma of memory and CPU for hbase.

2010-07-01 Thread Jean-Daniel Cryans
-0.20.4, and restart all hbase master and regionservers. recreate all tables, etc.essentially starting from scratch. Jimmy -- From: Jean-Daniel Cryans jdcry...@apache.org Sent: Thursday, July 01, 2010 5:10 PM To: user@hbase.apache.org Subject: Re

Re: dilemma of memory and CPU for hbase.

2010-07-01 Thread Jean-Daniel Cryans
-- From: Jean-Daniel Cryans jdcry...@apache.org Sent: Thursday, July 01, 2010 5:10 PM To: user@hbase.apache.org Subject: Re: dilemma of memory and CPU for hbase. add_table.rb doesn't actually write much in the file system, all your data is still there. It just

Re: hbase 0.20.5 and maven2

2010-07-01 Thread Jean-Daniel Cryans
You won't find it, because 0.20 isn't mavenized. Trunk is, so the next major version will be available. J-D On Thu, Jul 1, 2010 at 7:47 PM, Fabiano Beppler f.bepp...@gmail.com wrote: Hi, Does anyone know where I can find hbase-0.20.5 in a maven repository? Thanks in advance! Fabiano

Re: About HDFS-630 and hbase 0.20.5

2010-07-02 Thread Jean-Daniel Cryans
release: http://archive.cloudera.com/cdh/2/hadoop-0.20.1+169.89.releasenotes.html Jean-Daniel Cryans wrote: Yes, and to deploy the cloudera release on your cluster :) J-D On Thu, Jul 1, 2010 at 11:31 AM, Ferdy ferdy.gal...@kalooga.com wrote: Allright so I will use a cloudera release. If I

Re: Zookeeper exceptions while starting up region

2010-07-06 Thread Jean-Daniel Cryans
You need to configure hbase.zookeeper.quorum, else the region server cannot guess on which machine in the local network it should find ZooKeeper. This is described in http://hbase.apache.org/docs/r0.20.5/api/overview-summary.html#fully-distrib Also I recommend using 0.20.5 J-D On Tue, Jul 6,
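
For illustration, a minimal sketch of pointing a client at the quorum with the 0.20-era Java API (host and table names here are made up; the same property can also live in hbase-site.xml on the classpath):

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;

  public class QuorumExample {
    public static void main(String[] args) throws IOException {
      HBaseConfiguration conf = new HBaseConfiguration();
      // Without this, the client looks for ZooKeeper on localhost.
      conf.set("hbase.zookeeper.quorum", "zk1.example.com,zk2.example.com,zk3.example.com");
      HTable table = new HTable(conf, "mytable");
    }
  }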

Re: HBase on same boxes as HDFS Data nodes

2010-07-07 Thread Jean-Daniel Cryans
Jamie, Does your configuration meet the requirements? http://hbase.apache.org/docs/r0.20.5/api/overview-summary.html#requirements ulimit and xcievers, if not set, are usually time bombs that blow up when the cluster is under load. J-D On Wed, Jul 7, 2010 at 9:11 AM, Jamie Cockrill

Re: Java Client hangs after around 2 min

2010-07-07 Thread Jean-Daniel Cryans
Which HBase version are you running? Did you even take a look at the server logs? (this client log you pasted just informs us that the connection to ZooKeeper was successful) J-D On Wed, Jul 7, 2010 at 8:49 AM, manua agarwal.m...@gmail.com wrote: Hi All, I have set up the Hbase in

Re: Whether by applying HBase, an application still needs RDBMS?

2010-07-07 Thread Jean-Daniel Cryans
HBase is currently faster at writing than random reading, but long scans are faster than writing. Not sure exactly what that unnamed article is referring to. Also, about using any other DBMS in conjunction with HBase, I would simply recommend using the right tool for the right job. J-D On Wed,

Re: HBase on Hadoop 0.21

2010-07-07 Thread Jean-Daniel Cryans
HBase probably won't support 0.21 at all, since that release is marked unstable. HBase 0.90 will be on hadoop 0.20-append which has a different implementation for sync (HDFS-200 instead of HDFS-265). I personally expect that everything will be tied back together for Hadoop 0.22 J-D On Wed, Jul

Re: Whether by applying HBase, an application still needs RDBMS?

2010-07-07 Thread Jean-Daniel Cryans
, So, if the application is something like a teller application in banking, is HBase the right fit? regards Firdaus On 07/07/2010 11:36 PM, Jean-Daniel Cryans wrote: HBase is currently faster at writing than random reading, but long scans are faster than writing. Not sure exactly what

Re: How to specify HBase cluster end-points from HBase client code in HBase 0.20.0

2010-07-07 Thread Jean-Daniel Cryans
Passing the hbase.zookeeper.quorum config will do exactly what you need in 0.89, but I'm not sure that it will work in 0.20 J-D On Wed, Jul 7, 2010 at 10:46 AM, Jun Li jltz922...@gmail.com wrote: Hello, In my current application environment, I need to have two HBase clusters running in two
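
A hedged sketch of the two-cluster setup, one Configuration per cluster, each carrying its own quorum (cluster addresses are hypothetical, and per the caveat above this is only known to work on 0.89):

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;

  public class TwoClusters {
    public static void main(String[] args) throws IOException {
      HBaseConfiguration confA = new HBaseConfiguration();
      confA.set("hbase.zookeeper.quorum", "zk-a.example.com");
      HBaseConfiguration confB = new HBaseConfiguration();
      confB.set("hbase.zookeeper.quorum", "zk-b.example.com");
      // Each HTable now talks to a different cluster.
      HTable tableA = new HTable(confA, "mytable");
      HTable tableB = new HTable(confB, "mytable");
    }
  }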

Re: HBase on same boxes as HDFS Data nodes

2010-07-07 Thread Jean-Daniel Cryans
. I've upped my MAX_FILESIZE on my table to 1GB to see if that helps (not sure if it will!). Thanks, Jamie On 7 July 2010 18:12, Jean-Daniel Cryans jdcry...@apache.org wrote: xcievers exceptions will be in the datanodes' logs, and your problem totally looks like it. 0.20.5 will have

Re: Java Client hangs after around 2 min

2010-07-07 Thread Jean-Daniel Cryans
You configured your table to use LZO, but it's not on the classpath. Please read and follow http://wiki.apache.org/hadoop/UsingLzoCompression J-D On Wed, Jul 7, 2010 at 11:58 AM, manua agarwal.m...@gmail.com wrote: Hi, I have created a swap space of 1Gb, reduced the heap size to 500Mb and

Re: Whether by applying HBase, an application still needs RDBMS?

2010-07-07 Thread Jean-Daniel Cryans
remain. On Wed, Jul 7, 2010 at 1:13 PM, nocones77-gro...@yahoo.com wrote: From: Jean-Daniel Cryans jdcry...@apache.org Also, about using any other DBMS in conjunction with  HBase, I would simply recommend using the right tool for the  right  job. This seems like a sensible approach to me

Re: Java Client hangs after around 2 min

2010-07-08 Thread Jean-Daniel Cryans
You don't, but the LZO code is in C++ so the jar only contains the bindings. You also need to copy the libraries like it does in http://wiki.apache.org/hadoop/UsingLzoCompression : cp build/native/Linux-amd64-64/lib/libgplcompression.* hbase/lib/native/Linux-amd64-64/ But adapted to your

Re: HBase on same boxes as HDFS Data nodes

2010-07-08 Thread Jean-Daniel Cryans
memory is about the same, but the memory cache is much much bigger, which presumably is healthier as, in theory, that ought to relinquish memory to processes that request it. Let's see if that does the trick! ta Jamie On 7 July 2010 19:30, Jean-Daniel Cryans jdcry...@apache.org wrote

Re: columns.to_java_bytes undefined method in HBase.rb line 554

2010-07-08 Thread Jean-Daniel Cryans
Ted Yu already answered that yesterday: If you look at HBase.rb, line 554, you would find the bug. Here is the correct call: split = KeyValue.parseColumn(column.to_java_bytes) I guess that line was copied from line 546 which isn't in a for loop. You can file a JIRA. J-D On Wed,

Re: HBase on same boxes as HDFS Data nodes

2010-07-08 Thread Jean-Daniel Cryans
More info on this blog post: http://www.larsgeorge.com/2010/05/hbase-file-locality-in-hdfs.html J-D On Thu, Jul 8, 2010 at 10:11 AM, Jean-Daniel Cryans jdcry...@apache.org wrote: This would be done at the expense of network IO, since you will lose locality for jobs that read/write to HBase

Re: HBase on same boxes as HDFS Data nodes

2010-07-08 Thread Jean-Daniel Cryans
..Following up on this thread the article.. Could some one elaborate why locality is lost upon restart? Is it because of random assignment by HMaster and/or HRegionServer is stateless or other reasons? thanks venkatesh -Original Message- From: Jean-Daniel Cryans jdcry

About data locality (Was: Re: HBase on same boxes as HDFS Data nodes)

2010-07-08 Thread Jean-Daniel Cryans
(changing the subject, let's not hijack threads) will the data move over time though...for example if i have lots of access to data in DataNode A ? without the current work that is in progress.. HBase has no control on that, but data will be moved if those regions are used. Like the article

Re: About data locality (Was: Re: HBase on same boxes as HDFS Data nodes)

2010-07-08 Thread Jean-Daniel Cryans
subject before last send ..forgot.. thanks again.. i'll be one active user with tons of Qs in the next few months :) -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Thu, Jul 8, 2010 1:58 pm Subject: About data locality (Was: Re

Re: zookeeper HBase

2010-07-08 Thread Jean-Daniel Cryans
It's not IO intense, it's IO latency sensitive eg. if other processes are sucking up most of the IO bandwidth then ZK will have a hard time taking quorum decisions. Disks are cheap, and a single 7.2k dedicated disk can be enough. J-D On Thu, Jul 8, 2010 at 5:38 PM, Arun Ramakrishnan

Re: online automatic region merge

2010-07-08 Thread Jean-Daniel Cryans
HBASE-1621 isn't about automatic merging and it's still very experimental. The issue with doing it automatically is that you have to figure out that two regions, together, are smaller in size than the max size at which a region splits. At the same time, just because two regions are small doesn't mean you want

Re: real world usage, any web applications built using hbase?

2010-07-10 Thread Jean-Daniel Cryans
At StumbleUpon, we have su.pr (URL shortener / advertising platform) that's totally based on HBase and has been in production for more than a year. Many other parts of our main product also rely on HBase. J-D On Sat, Jul 10, 2010 at 10:43 AM, S Ahmed sahmed1...@gmail.com wrote: It's my

Re: Production usage stats/experience

2010-07-10 Thread Jean-Daniel Cryans
. thanks in advance venkatesh -Original Message- From: Jean-Daniel Cryans jdcry...@apache.org To: user@hbase.apache.org Sent: Sat, Jul 10, 2010 1:47 pm Subject: Re: real world usage, any web applications built using hbase? At StumbleUpon, we have su.pr (URL shortener / advertising

Re: hbase restart after hdfs reconfig Issues

2010-07-11 Thread Jean-Daniel Cryans
This is all zookeeper logging, and although it's chatty I don't see anything wrong in it (the NodeExists is normal since in 0.20 there are some places where, instead of checking for a znode's existence, we try to create it and only update it if that fails). Also we log everything that comes out of

Re: Replication

2010-07-13 Thread Jean-Daniel Cryans
Which HBase version are you using? Currently replication is only available in trunk (and will be available in the next 0.89 release). The documentation is available at src/main/java/org/apache/hadoop/hbase/replication/package.html, you probably forgot to run the bin/replication/add_peer.rb script

Re: Handling downtime from hbase.client

2010-07-13 Thread Jean-Daniel Cryans
This kind of issue was discussed a couple of times on this mailing list. Basically, you can play with hbase.client.pause and hbase.client.retries.number but you won't find this satisfactory, which is why we opened https://issues.apache.org/jira/browse/HBASE-2445 J-D On Tue, Jul 13, 2010 at 10:38
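
A sketch of tuning those two properties on the client side so a dead cluster is detected faster (the values shown are arbitrary, not recommendations):

  import org.apache.hadoop.hbase.HBaseConfiguration;

  public class FailFastClient {
    public static void main(String[] args) {
      HBaseConfiguration conf = new HBaseConfiguration();
      conf.set("hbase.client.pause", "200");        // ms slept between retries
      conf.set("hbase.client.retries.number", "3"); // give up sooner than the default
      // ... build HTable instances from this conf as usual
    }
  }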

Re: Replication

2010-07-13 Thread Jean-Daniel Cryans
Thanks for the info on where I can find some documentation. There is info that ZooKeeper needs to run in standalone mode, is that true? Well you can run add_peer.rb when the clusters are running, but they won't pick up the change live (that part isn't done yet). So if you run the script while

Re: Handling downtime from hbase.client

2010-07-13 Thread Jean-Daniel Cryans
! -justin On 7/13/10 10:47 AM, Jean-Daniel Cryans wrote: This kind of issue was discussed a couple of times on this mailing list. Basically, you can play with hbase.client.pause and hbase.client.retries.number but you won't find this satisfactory, which is why we opened https

Re: Replication

2010-07-13 Thread Jean-Daniel Cryans
loading data connect slave or turn on replication on existing tables with data? W dniu 13.07.2010 19:56, Jean-Daniel Cryans pisze: Thanks for info where i can find some documentation. There is info about zookeeper that it need running in standalone mode it is true? Well you can run

Re: Replication

2010-07-13 Thread Jean-Daniel Cryans
)        at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:262) W dniu 13.07.2010 20:18, Jean-Daniel Cryans pisze: No, but you can use the new mapreduce utility org.apache.hadoop.hbase.mapreduce.CopyTable to copy whole tables between clusters. It's like distcp for HBase. Oh and looking

Re: regionserver crash under heavy load

2010-07-13 Thread Jean-Daniel Cryans
Please use a pasting service for the log traces. I personally use pastebin.com You probably had a GC that lasted too long, this is something out of the control of the application (apart from trying to put as little data in memory as possible, but you are inserting so...). Your log doesn't contain

Re: regionserver crash under heavy load

2010-07-13 Thread Jean-Daniel Cryans
? I don't work for cloudera, but IIRC the next beta for CDH3 is due for September. Jimmy -- From: Jean-Daniel Cryans jdcry...@apache.org Sent: Tuesday, July 13, 2010 2:55 PM To: user@hbase.apache.org Subject: Re: regionserver crash under heavy

Re: Replication

2010-07-14 Thread Jean-Daniel Cryans
14.07.2010 00:08, Jean-Daniel Cryans pisze: Just looked at the head of 0.20-append and I see it contains the missing patch (was committed as part of HDFS-1057). So that would mean that the file is just empty :) If you insert a few rows in the shell on the master cluster, do you see them some

Re: regionserver crash under heavy load

2010-07-14 Thread Jean-Daniel Cryans
release of the cloudera distribution for both hadoop and hbase. Jimmy -- From: Jean-Daniel Cryans jdcry...@apache.org Sent: Tuesday, July 13, 2010 4:24 PM To: user@hbase.apache.org Subject: Re: regionserver crash under heavy load Your region

Re: Trying to write too much to stdout destabilizes cluster across reboots

2010-07-14 Thread Jean-Daniel Cryans
Nice story Stu ;) So the first thing you saw in the master is that it was splitting the HLogs from the dead region servers (looks like they were kill -9 or failed very hard during/after your mapreduce job). As for all the HDFS error messages, it sounds like you are missing the last requirement on

Re: regionserver crash under heavy load

2010-07-14 Thread Jean-Daniel Cryans
/backup.tar.gz Jinsong -- From: Jean-Daniel Cryans jdcry...@apache.org Sent: Wednesday, July 14, 2010 11:16 AM To: user@hbase.apache.org Subject: Re: regionserver crash under heavy load So your region servers had their session expired but I don't see

Re: region servers crashing

2010-07-14 Thread Jean-Daniel Cryans
Dmitry, Your log shows this: 2010-07-12 15:10:03,299 WARN org.apache.hadoop.hbase.util.Sleeper: We slept 86246ms, ten times longer than scheduled: 1000 This is a pause that lasted more than a minute, the process was in that state (GC, swapping, mix of all of them) for some reason and it was

Re: Cluster destabilizes recovering from crash

2010-07-14 Thread Jean-Daniel Cryans
Stuart, I answered your last series of emails some hours ago, I still stand behind my observations: So the first thing you saw in the master is that it was splitting the HLogs from the dead region servers (looks like they were kill -9 or failed very hard during/after your mapreduce job).

Re: Cluster destabilizes recovering from crash

2010-07-14 Thread Jean-Daniel Cryans
Check xcievers too, giving more mem to HBase won't fix HDFS. J-D On Wed, Jul 14, 2010 at 5:53 PM, Stuart Smith stu24m...@yahoo.com wrote: Ok, back up again. This time, in addition to just watching restarting, the error logs led me to this:

Re: regionserver crash under heavy load

2010-07-15 Thread Jean-Daniel Cryans
About your new crash: 2010-07-15 04:09:03,248 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=3.3544312MB (3517376), Free=405.42056MB (425114272), Max=408.775MB (428631648), Counts: Blocks=0, Access=1985476, Hit=0, Miss=1985476, Evictions=0, Evicted=0, Ratios: Hit

Re: Hanging regionservers

2010-07-15 Thread Jean-Daniel Cryans
Nothing particular in that dump, and I'm not aware of any deadlock in 0.20.3, could we see the region server log? Thx J-D On Thu, Jul 15, 2010 at 4:17 PM, Luke Forehand luke.foreh...@networkedinsights.com wrote: First post evar! I have a 3 node cluster and have set

Re: Flaky tableExists()

2010-07-20 Thread Jean-Daniel Cryans
Looks like your .META. is confused about things, or your master is editing it in a weird way. AFAIK, it's not a known issue with 0.20.5. I would advise first scanning the .META. table and look if your rows are changing between shell invocations (just look at the first row of each table). If it
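
A minimal sketch of eyeballing .META. from the Java client (0.20-era API) rather than the shell:

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.ResultScanner;
  import org.apache.hadoop.hbase.client.Scan;
  import org.apache.hadoop.hbase.util.Bytes;

  public class MetaPeek {
    public static void main(String[] args) throws IOException {
      HTable meta = new HTable(new HBaseConfiguration(), ".META.");
      ResultScanner scanner = meta.getScanner(new Scan());
      try {
        for (Result r : scanner) {
          // Row keys look like: tablename,startkey,regionid
          System.out.println(Bytes.toString(r.getRow()));
        }
      } finally {
        scanner.close();
      }
    }
  }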

Re: HBase 0.89 and JDK version

2010-07-20 Thread Jean-Daniel Cryans
u18 frequently sigsegvs on users (and then they wonder why region servers are missing), and this is also true for Hadoop. u20 seems stable but a lot of people still prefer u16. J-D On Tue, Jul 20, 2010 at 4:23 PM, Syed Wasti mdwa...@hotmail.com wrote: Hi, We recently upgraded our QA cluster

Re: Best way to write data

2010-07-21 Thread Jean-Daniel Cryans
So you would buffer edits going to the same row? Unless you have your own write-ahead-log, you'd likely lose data on node failure. But WRT your question, 5 cells with different timestamps is as costly to store/query as 5 cells with the same timestamp. The major difference is that the former case
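
To make the timestamp point concrete, a sketch of one Put carrying two cells for the same row, one with a client-chosen timestamp and one left to the server (all names and the literal timestamp are made up):

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  public class TimestampPut {
    public static void main(String[] args) throws IOException {
      HTable table = new HTable(new HBaseConfiguration(), "mytable");
      Put put = new Put(Bytes.toBytes("row1"));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("a"), 1279000000000L, Bytes.toBytes("v1")); // explicit ts
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("b"), Bytes.toBytes("v2")); // server assigns ts
      table.put(put); // both cells travel to the region server in one call
    }
  }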

Re: querying number of live region servers

2010-07-21 Thread Jean-Daniel Cryans
HBaseAdmin.getClusterStatus().getServers() J-D On Wed, Jul 21, 2010 at 9:56 AM, Ted Yu yuzhih...@gmail.com wrote: Hi, Is there API to query the number of live region servers ? Thanks
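
In the 0.20 API that call returns the live region server count as an int; a minimal sketch:

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HBaseAdmin;

  public class LiveServers {
    public static void main(String[] args) throws IOException {
      HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
      int live = admin.getClusterStatus().getServers();
      System.out.println("live region servers: " + live);
    }
  }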

Re: restarting region server which shutdown due to GC pause

2010-07-21 Thread Jean-Daniel Cryans
times longer than scheduled: 1 Is it possible for HBase Master to restart dead region server in this case ? On Wed, Jul 21, 2010 at 10:02 AM, Jean-Daniel Cryans jdcry...@apache.orgwrote: HBaseAdmin.getClusterStatus().getServers() J-D On Wed, Jul 21, 2010 at 9:56 AM, Ted Yu yuzhih

Re: Regionserver died due to problem connecting to HMaster?

2010-07-21 Thread Jean-Daniel Cryans
ZooKeeper is only a canary, telling the region server that it was partitioned from the cluster for longer than the default timeout somehow, usually because of GC pauses. You should see lines like slept for x, longer than y before what you pasted. J-D On Wed, Jul 21, 2010 at 2:49 PM, Steve

Re: CLASSPATH setup for HBase

2010-07-21 Thread Jean-Daniel Cryans
Is HADOOP_CLASSPATH=${HBASE_CONF_DIR}' pointing to the right location on every machine in the cluster? While the job is running, you can go on one slave machine, issue a ps aux | grep java and check if the Child tasks have the correct classpath. J-D On Wed, Jul 21, 2010 at 3:03 PM, HAN LIU

Re: Regionserver died due to problem connecting to HMaster?

2010-07-21 Thread Jean-Daniel Cryans
and increase zookeeper timeout settings.  I will give them a trial after finishing queued data load. Other suggestions are most welcome. On Wed, Jul 21, 2010 at 2:55 PM, Jean-Daniel Cryans jdcry...@apache.orgwrote: ZooKeeper is only a canary, telling the region server

Re: HBase performace bulk load

2010-07-22 Thread Jean-Daniel Cryans
On Jul 22, 2010, at 4:43 PM, Jean-Daniel Cryans wrote: Han, This is bad, you must be doing something slow like creating a new HTable for each put call. Also you need to use the write buffer (disable auto flushing, then set the write buffer size on HTable during the map configuration) since
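
A sketch of the write-buffer setup described above, with one HTable reused for the whole task (table name and buffer size are arbitrary):

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.util.Bytes;

  public class BufferedWrites {
    public static void main(String[] args) throws IOException {
      HTable table = new HTable(new HBaseConfiguration(), "mytable"); // create once, not per put
      table.setAutoFlush(false);                  // stop sending an RPC on every put()
      table.setWriteBufferSize(12 * 1024 * 1024); // accumulate ~12MB of edits client-side
      for (int i = 0; i < 100000; i++) {
        Put put = new Put(Bytes.toBytes("row" + i));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(i));
        table.put(put); // buffered, shipped in batches
      }
      table.flushCommits(); // push whatever is left in the buffer
    }
  }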

Re: How does HBase ensure strong consistency?

2010-07-23 Thread Jean-Daniel Cryans
See http://hbase.apache.org/docs/r0.89.20100621/acid-semantics.html But the line Please note that this is not true _across rows_ for multirow batch mutations. is actually untrue for 0.89 since https://issues.apache.org/jira/browse/HBASE-2353 (wasn't reflected in that document). Hope that helps, J-D On

Re: TableMapReduceUtil parallel puts to same row

2010-07-27 Thread Jean-Daniel Cryans
TableOutputFormat is really just a wrapper around a HTable, see for yourself http://github.com/apache/hbase/blob/0.20/src/java/org/apache/hadoop/hbase/mapreduce/TableOutputFormat.java So there must be something else about the way you use it, or the way you use HTable directly. Showing bits of
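
A sketch of the job wiring that TableMapReduceUtil otherwise does for you (the table name is hypothetical):

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
  import org.apache.hadoop.mapreduce.Job;

  public class JobWiring {
    public static void main(String[] args) throws IOException {
      Job job = new Job(new HBaseConfiguration(), "write-to-hbase");
      job.getConfiguration().set(TableOutputFormat.OUTPUT_TABLE, "mytable");
      job.setOutputFormatClass(TableOutputFormat.class);
      // The record writer behind this is just an HTable with auto-flush turned off.
    }
  }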

Re: HBase minimum block size for sequential access

2010-07-27 Thread Jean-Daniel Cryans
Ryan (who wrote HFile) did a lot of testing around block size and didn't really see any difference when changing it. So I would recommend that you benchmark different values with your own data/usage pattern and see if you do have better/worse perfs. The tradeoff for larger values is that in order

Re: HBase minimum block size for sequential access

2010-07-27 Thread Jean-Daniel Cryans
Thanks for the heads up.  Do you know what happens if I set this value larger than 5MB?  We will always be scanning the data, and always in large blocks.   I have yet to calculate the typical size of a single scan but imagine that it will usually be larger than 1MB. I never tried that, hard

Re: TableMapReduceUtil parallel puts to same row

2010-07-27 Thread Jean-Daniel Cryans
commits. There seems to be some other issue too here. Thanks Karthik On Tue, Jul 27, 2010 at 10:18 AM, Jean-Daniel Cryans jdcry...@apache.orgwrote: Since they are in the same batch, they could end up on the same timestamp and one will hide the other. When not batched, there's always a few

Re: HBase minimum block size for sequential access

2010-07-28 Thread Jean-Daniel Cryans
, 2010, at 10:13 AM, Jean-Daniel Cryans wrote: After altering the table, issue a major compaction on it and everything will be re-written with the new block size.
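
Putting the two steps together, a hedged sketch with the 0.20 admin API (table/family names and block size are made up; note that building a fresh descriptor resets the family's other settings to defaults):

  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.HColumnDescriptor;
  import org.apache.hadoop.hbase.client.HBaseAdmin;

  public class ChangeBlockSize {
    public static void main(String[] args) throws Exception {
      HBaseAdmin admin = new HBaseAdmin(new HBaseConfiguration());
      admin.disableTable("mytable");
      HColumnDescriptor cf = new HColumnDescriptor("cf");
      cf.setBlocksize(1024 * 1024); // larger blocks for mostly-sequential scans
      admin.modifyColumn("mytable", "cf", cf);
      admin.enableTable("mytable");
      admin.majorCompact("mytable"); // rewrite existing store files with the new block size
    }
  }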

Re: sometimes more than 1 value stored, even though VERSIONS is 1

2010-07-29 Thread Jean-Daniel Cryans
Inline. J-D On Thu, Jul 29, 2010 at 8:54 AM, Ferdy Galema ferdy.gal...@kalooga.com wrote: Using Hbase 0.20.5 with Hadoop CDH2 0.20.1+169.89 I noticed something very strange. When overwriting a certain column in a column family with 1 VERSIONS, and removing that value later (for example

Re: GC [ParNew...] took 299 secs causing region server to die

2010-07-29 Thread Jean-Daniel Cryans
Well it says; Times: user=0.17 sys=0.04, real=299.23 secs So why did it take 0.04 of system time but 300 secs of real time? That's insane. Either the region server process was completely starved of CPU cycles (are you on EC2 or any virtualized service like that?), or the computer was put to

Re: Thousands of tables

2010-07-30 Thread Jean-Daniel Cryans
I see. Usually a whole customer fits within a region. Actually, only two or three customers don't fit in a single region. But then another question comes up. Even if I put all the data in a single table, given that the keys are written in order, and given that several

Re: GC [ParNew...] took 299 secs causing region server to die

2010-07-30 Thread Jean-Daniel Cryans
swappiness is something else, it's good to set it at 0 when you have enough RAM to fit everything, but it will still swap when you run out of it and it will be a big hit. I would advise monitoring the cluster, or at the very least looking at the output of the top command while the job is running.

Re: connectString to point to the server instead of localhost

2010-08-02 Thread Jean-Daniel Cryans
From http://hbase.apache.org/docs/r0.20.6/api/overview-summary.html#overview_description At minimum, you should set the list of servers that you want ZooKeeper to run on using the hbase.zookeeper.quorum property. This property defaults to localhost which is not suitable for a fully distributed

Re: Which LZO library to use?

2010-08-02 Thread Jean-Daniel Cryans
I think that the person who wrote the header of that page meant that the hadoop-gpl-compression project lacks fixes included in Kevin's repo. AFAIK you can hit those if you use LZOed files as input for MR, but I've been using the second one for more than a year without any issue (in HBase). J-D

Re: Regionserver tanked, can't seem to get master back up fully

2010-08-02 Thread Jean-Daniel Cryans
Is that coming from the master? If so, it means that it was trying to write recovered data from a failed region server and wasn't able to do so. It sounds bad. - Can we get full stack traces of that error? - Did you check the datanode logs for any exception? Very often (strong emphasis on very),

Re: Memory Consumption and Processing questions

2010-08-02 Thread Jean-Daniel Cryans
cluster restart, is there any memory of which region servers last served which regions or some other method to improve data locality? Nope, not yet. The new master code for 0.90 has some basics, but it's a bit complicated and we're not there yet. It basically requires asking the Namenode for

Re: Regionserver tanked, can't seem to get master back up fully

2010-08-03 Thread Jean-Daniel Cryans
correctly as per the API doc you mention. Thanks Jamie On 2 August 2010 19:18, Jean-Daniel Cryans jdcry...@apache.org wrote: Is that coming from the master? If so, it means that it was trying to write recovered data from a failed region server and wasn't able to do so. It sounds bad

Re: redundancy testing

2010-08-03 Thread Jean-Daniel Cryans
On a small cluster like that I wouldn't bother giving 3 machines to zookeeper since your cluster is as reliable as your master node. Instead, make sure that your master has some redundant hardware and put a standalone zookeeper on it. J-D On Tue, Aug 3, 2010 at 3:41 PM, Justin Cohen

Re: Question on changing block settings...

2010-08-03 Thread Jean-Daniel Cryans
Inline. J-D On Tue, Aug 3, 2010 at 2:44 PM, Vidhyashankar Venkataraman vidhy...@yahoo-inc.com wrote: A few issues I have been observing on changing block settings:  1.  What happens if we change the block size of a column family on an already populated database? Will this not throw apps on

Re: Deployment architecture for Hadoop, HBase Hive recommendations?

2010-08-03 Thread Jean-Daniel Cryans
Sorry took a day to answer, see inline. J-D On Mon, Aug 2, 2010 at 10:47 AM, Maxim Veksler ma...@vekslers.org wrote: Hello, We're setting up a data warehouse environment that includes Hadoop, HBase, Hive and our own in-house MR jobs. I would like with your permission to discuss the

Re: Put MR job.. Regionservers crashes..

2010-08-04 Thread Jean-Daniel Cryans
The first part of the log is usually a client that died while it was doing a request. The second part of the log is a session expiration. The log fragment is too small to tell if it was the region server that paused or the ZK ensemble that was unreachable during that time... are the zk servers

Re: Put MR job.. Regionservers crashes..

2010-08-04 Thread Jean-Daniel Cryans
Wow. I get tons of them in the logs.. And there aren't that many clients that got killed as reported by the MR job.. Is that the only case when these errors are reported? What about speculative execution? Or RPC timeouts (do you log that)? Ok good, so one of the two happened then.. I will

Re: How to delete rows in a FIFO manner

2010-08-06 Thread Jean-Daniel Cryans
If the inserts are coming from more than 1 client, and you are trying to delete from only 1 client, then likely it won't work. You could try using a pool of deleters (multiple threads that delete rows) that you feed from the scanner. Or you could run a MapReduce that would parallelize that for
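
A rough sketch of the deleter-pool idea (table name, chunk size, and thread count are arbitrary; the final partial chunk is left un-deleted here for brevity):

  import java.io.IOException;
  import java.util.ArrayList;
  import java.util.List;
  import java.util.concurrent.ExecutorService;
  import java.util.concurrent.Executors;
  import java.util.concurrent.TimeUnit;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.client.Delete;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.ResultScanner;
  import org.apache.hadoop.hbase.client.Scan;

  public class DeleterPool {
    public static void main(String[] args) throws IOException, InterruptedException {
      final HBaseConfiguration conf = new HBaseConfiguration();
      ExecutorService pool = Executors.newFixedThreadPool(4);
      HTable scanTable = new HTable(conf, "events");
      ResultScanner scanner = scanTable.getScanner(new Scan());
      List<byte[]> rows = new ArrayList<byte[]>();
      for (Result r : scanner) {
        rows.add(r.getRow());
        if (rows.size() == 100) { // hand off chunks of 100 row keys
          final List<byte[]> chunk = rows;
          rows = new ArrayList<byte[]>();
          pool.submit(new Runnable() {
            public void run() {
              try {
                HTable t = new HTable(conf, "events"); // HTable isn't thread-safe: one per worker
                for (byte[] row : chunk) {
                  t.delete(new Delete(row));
                }
              } catch (IOException e) {
                e.printStackTrace();
              }
            }
          });
        }
      }
      scanner.close();
      pool.shutdown();
      pool.awaitTermination(1, TimeUnit.HOURS);
    }
  }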

Re: DFSClient: DataStreamer Exception: java.io.IOException: Broken pipe

2010-08-09 Thread Jean-Daniel Cryans
HDFS errors + this message: 2010-08-07 08:41:40,392 INFO org.apache.zookeeper.ClientCnxn: Client session timed out, have not heard from server in 332256ms for sessionid points to a GC pause (that lasted about 5 minutes). Make sure you aren't swapping, that HBase isn't CPU starved, monitor your

Re: Server-side write buffer configuration

2010-08-09 Thread Jean-Daniel Cryans
Hard to tell if it's decent performance. How do you define decent? What kind of hardware are we talking about? Which version are you using? How much memory was given to HBase? Also did you set the write buffer on the client side on HTable? Did you also turn off auto-flushing? Do you monitor your

Re: Server-side write buffer configuration

2010-08-09 Thread Jean-Daniel Cryans
-Daniel Cryans wrote: Hard to tell if it's decent performance. How do you define decent? I consider it decent if it is roughly the best performance one can get using my schema on my machines What kind of hardware are we talking about? One machine for HBase master and 6 regionservers. Specs

Re: Server-side write buffer configuration

2010-08-09 Thread Jean-Daniel Cryans
? For example how did you do the first 3 steps described on that page? Also is there any documentation that describes the multiple-client in HBase? On Aug 9, 2010, at 2:37 PM, Jean-Daniel Cryans wrote: That's pretty powerful machines, I would expect more performance. You could try using the same

Re: Server-side write buffer configuration

2010-08-09 Thread Jean-Daniel Cryans
I see. 0.89 is still a developer release and I hear that it is not stable. But it sounds really tempting because it boosts performance by a lot. Can I trust it if my final goal is to insert about 100TB of data? What could be the possible issues? Also when shall I expect to see a stable

Re: YASQ (Yet Another Silly Question)

2010-08-10 Thread Jean-Daniel Cryans
The region server sets the timestamp to System.currentTimeMillis if it wasn't set by the user. J-D On Tue, Aug 10, 2010 at 11:38 AM, Michael Segel michael_se...@hotmail.com wrote: It's possible that when you're writing a row to HBase that you do not specify the timestamp. My question is
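
A small sketch that shows the server-assigned timestamp coming back on a read (all names are made up):

  import java.io.IOException;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.KeyValue;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.util.Bytes;

  public class DefaultTimestamp {
    public static void main(String[] args) throws IOException {
      HTable table = new HTable(new HBaseConfiguration(), "mytable");
      Put put = new Put(Bytes.toBytes("row1"));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v")); // no timestamp supplied
      table.put(put);
      Result result = table.get(new Get(Bytes.toBytes("row1")));
      KeyValue kv = result.getColumnLatest(Bytes.toBytes("cf"), Bytes.toBytes("q"));
      System.out.println("server-assigned ts: " + kv.getTimestamp());
    }
  }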

Re: About LZO

2010-08-11 Thread Jean-Daniel Cryans
I don't remember seeing anything like that, except that if your data isn't compressible then it will probably take a little longer to insert your data. The vmstat and top outputs are pretty useless in this case, as we can't tell exactly what's going on in the processes. I suggest you take a look

Re: Server-side write buffer configuration

2010-08-11 Thread Jean-Daniel Cryans
clients? Or maybe point me to a reference on such topics since I am not really an expert on Java. :p Thanks again for your reply. Han On Aug 9, 2010, at 4:13 PM, Jean-Daniel Cryans wrote: I see. 0.89 is the still a developer release and I hear that it is not stable. But it sounds really
