Re: Best way to write data

2010-07-22 Thread Héctor Izquierdo Seliva
On Wed, 21-07-2010 at 09:39 -0700, Jean-Daniel Cryans wrote: So you would buffer edits going to the same row? Unless you have your own write-ahead-log, you'd likely lose data on node failure. But WRT your question, 5 cells with different timestamps is as costly to store/query as 5

[ann] Lily NoSQL content repository out and open

2010-07-22 Thread Steven Noels
Hi, (summarized from http://bit.ly/lilynosqlout) slightly over a year ago, we set out on a course to investigate what content applications would encounter in this new era where data has moved from a liability and cost to an opportunity - if you have the infrastructure to scale. We decided to

Re: [ann] Lily NoSQL content repository out and open

2010-07-22 Thread Stack
Congrats on making the 'PoA' release lads. Keep on doing the good stuff. St.Ack On Thu, Jul 22, 2010 at 3:20 AM, Steven Noels stev...@outerthought.org wrote: Hi, (summarized from http://bit.ly/lilynosqlout) slightly over a year ago, we set out on a course to investigate what content

HBase performance bulk load

2010-07-22 Thread HAN LIU
Hi Guys, I've been doing some data insertion from HDFS to HBase and the performance seems to be really bad. It took about 3 hours to insert 15 GB of data. The mapreduce job is launched from one machine which grabs data from HDFS and inserts it into an HTable located at 3 other machines (1

Re: java driver versus thrift

2010-07-22 Thread Ryan Rawson
Hi, The situation is fairly complex and there is no super clear answer. Here are some facts: - Thrift requires the use of a shared-infrastructure client/server app that is another scaling factor - Thrift servers live a long time and thus can more effectively amortize the HTable cache across

Re: memstore flushing took long time

2010-07-22 Thread Ted Yu
I think you meant LeaseExpiredException. Here are the lines preceding the previous log: 2010-07-22 06:50:06,856 DEBUG org.apache.hadoop.hbase.regionserver.HRegion: Updates disabled for region, no outstanding scanners on

Re: HBase performance bulk load

2010-07-22 Thread Jean-Daniel Cryans
Yes, then you should really look at using the write buffer. J-D On Thu, Jul 22, 2010 at 3:22 PM, HAN LIU ha...@andrew.cmu.edu wrote: Thanks J-D. The only place where I create an HTable is in the constructor of my Mapper. The constructor is called only once for each map task, right? Han
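
For readers skimming the thread, here is a rough sketch of what a client-side write buffer does in general: puts accumulate locally and go out in one batch once a size threshold is reached, so each round trip carries many edits. This is not the HBase client API; the class name BufferedWriter, the send_batch callback, and the 64-byte threshold are all made up for illustration.

```python
class BufferedWriter:
    """Hypothetical client-side write buffer (illustration only, not HBase code)."""

    def __init__(self, send_batch, buffer_size_bytes):
        self._send_batch = send_batch      # callable that receives a list of (row, value)
        self._limit = buffer_size_bytes    # flush once this many bytes are pending
        self._pending = []
        self._pending_bytes = 0

    def put(self, row, value):
        # Queue the edit locally instead of sending it right away.
        self._pending.append((row, value))
        self._pending_bytes += len(row) + len(value)
        if self._pending_bytes >= self._limit:
            self.flush()

    def flush(self):
        # Send everything queued so far as a single batch.
        if self._pending:
            self._send_batch(self._pending)
            self._pending = []             # rebind so the sent batch is not mutated
            self._pending_bytes = 0


batches = []                               # stands in for the server side
writer = BufferedWriter(batches.append, buffer_size_bytes=64)
for i in range(10):
    writer.put(b"row-%02d" % i, b"0123456789")   # 6 + 10 = 16 bytes per put
writer.flush()                             # send the remainder, e.g. when a map task ends
```

In this sketch ten puts become three batches instead of ten separate sends. The trade-off is the one raised earlier in this listing: edits still sitting in the buffer are lost if the client dies before the final flush.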

Smallest production HBase cluster

2010-07-22 Thread Paul Smith
Anyone able to share their experience or thoughts on the 'smallest' production HBase cluster in operation? Thinking there may be some point in the # Nodes scale where one transitions from "that's silly" to "that's actually more like it". Anyone out there with a small HBase cluster in operation

HBase reliability testing tools

2010-07-22 Thread Mingjie Lai
Hi. Todd and Jonathan mentioned at the last HUG that there are some reliability testing tools for HBase floating around between developers. Could you point me to where I can find them? (As far as I know Gremlins is one of them.) Thanks, Mingjie

Re: memstore flushing took long time

2010-07-22 Thread Ted Yu
I checked time on related servers. There wasn't significant lag. On Thu, Jul 22, 2010 at 9:04 PM, Ted Yu yuzhih...@gmail.com wrote: Here is the master log snippet: http://pastebin.com/fyzSb2pv 10.201.8.208 is sjc1-hadoop4.sjc1.carrieriq.com sjc1-hadoop0.sjc1.carrieriq.com is hadoop namenode