Re: Long GC pause question

2011-01-07 Thread ChingShen
Hi J-D, Yes, I run an MR job on my cluster, and the long GC pause occurs when I set the MR configs as below. MR config (4-core CPU per RS/DN/TT node):
mapred.tasktracker.reduce.tasks.maximum = 3
mapred.tasktracker.map.tasks.maximum = 4
mapred.reduce.slowstart.completed.maps = 0.05
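
A rough back-of-the-envelope check on the numbers in this message, offered only as a sketch: it adds the map and reduce slot limits and compares the total with the core count. The helper name `max_concurrent_tasks` is hypothetical, and the idea that CPU oversubscription contributes to the pause is an assumption, not something the thread confirms.

```python
# Values copied from the message above.
cores_per_node = 4
map_slots = 4      # mapred.tasktracker.map.tasks.maximum
reduce_slots = 3   # mapred.tasktracker.reduce.tasks.maximum


def max_concurrent_tasks(map_slots, reduce_slots):
    """Hypothetical helper: upper bound on task JVMs running at once on one node."""
    return map_slots + reduce_slots


tasks = max_concurrent_tasks(map_slots, reduce_slots)
oversubscription = tasks / cores_per_node

# The RegionServer, DataNode and TaskTracker daemons share the same cores
# and are not counted here.
print(f"{tasks} task slots on {cores_per_node} cores ({oversubscription:.2f}x)")
```

With mapred.reduce.slowstart.completed.maps = 0.05, reducers can be scheduled once 5% of the maps have finished, so the map and reduce slots may well be busy at the same time.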

How to rename table's family name

2011-01-07 Thread 陈加俊
Hi everyone! How can I rename a table's column family? I created the table and its families and inserted a lot of data into it, but now I want to rename one family without losing any data.

Thrift WAL

2011-01-07 Thread Jan Lukavský
Hello everyone, we are missing the possibility to disable the WAL through the Thrift server. Is this option missing by design? Thanks, Jan

Re: Node Shutdown Problems

2011-01-07 Thread Wayne
Thanks for the reply. We are running hadoop branch-0.20-append. We have upped our xceivers to 4096 and restarted the hadoop cluster. We had upped them before but had yet to restart. Hopefully, again, this was just our mistake in not setting things up correctly from the start. So far so good. Thanks for your help.

regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner

2011-01-07 Thread Wayne
I see the message below as often as every few minutes. It appears to occur after compaction begins. Is this normal? Is it an indication of bigger issues? This is after having upped our xceivers. WARN org.apache.hadoop.hbase.regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreSc

Re: Bulk load using HFileOutputFormat.RecordWriter

2011-01-07 Thread Nanheng Wu
Also do you see a problem with dropping tables while serving queries for other tables? We are using hbase 20 right now and using this bulk load method we are getting great performance but it does require using a new table for each load. We want to clean up older data by dropping their tables either

Re: regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner

2011-01-07 Thread Ryan Rawson
It's nothing, just a small logging mistake; nothing is actually wrong. On Jan 7, 2011 9:29 AM, "Wayne" wrote: > I see the message below as often as every few minutes. It appears to occur > after compaction begins. Is this normal? Is it an indication of bigger > issues? This is after having upped ou

Re: Hardware requirement for HBase/Hadoop - looking for fast 1TB disks

2011-01-07 Thread John Overman
I would recommend Samsung Spinpoint F3 drives, which have slightly higher ratings than the WD1002FAEX on newegg, and are the fastest 1TB drive for streaming reads. They're listed as $52.99+shipping ($49.65 for 10 drives) at CompUPlus.com ( http://www.compuplus.com/Drives-and-storage/Samsung-1TB-Sp

Column family data distribution and performance

2011-01-07 Thread Chris Tarnas
I was wondering how much impact on read and write performance a column family would have on rows where they don't contain any data? I'm testing out an indexing method where rather than have a separate table for storing indexes I just keep them in the same table in an INDEX column family. The co

Re: Hardware requirement for HBase/Hadoop - looking for fast 1TB disks

2011-01-07 Thread John Overman
Sorry I don't have hbase specific benchmarks. I think it still operates with mostly streaming reads and writes, but it would probably depend on your specific application. On Fri, Jan 7, 2011 at 11:58 AM, John Overman wrote: > I would recommend Samsung Spinpoint F3 drives, which have slightly hi

Re: Column family data distribution and performance

2011-01-07 Thread Stack
On Fri, Jan 7, 2011 at 10:01 AM, Chris Tarnas wrote: > I was wondering how much impact on read and write performance a column family > would have on rows where they don't contain any data? > The index column family would have data, right, just not data for every row? If you don't query this ind

Re: regionserver.Store: Not in setorg.apache.hadoop.hbase.regionserver.StoreScanner

2011-01-07 Thread Stack
Fixed in 0.90.0. St.Ack On Fri, Jan 7, 2011 at 9:52 AM, Ryan Rawson wrote: > Its nothing, just a small logging mistake, nothing is actually wrong. > On Jan 7, 2011 9:29 AM, "Wayne" wrote: >> I see the message below as often as every few minutes. It appears to occur >> after compaction begins. Is

Re: Bulk load using HFileOutputFormat.RecordWriter

2011-01-07 Thread Stack
On Fri, Jan 7, 2011 at 9:31 AM, Nanheng Wu wrote: > Also do you see a problem with dropping tables while serving queries > for other tables? You mean in shell doing disable, drop? That functionality is kinda flakey in 0.20. It does not work reliably. The disable action runs through all regions

Re: Thrift WAL

2011-01-07 Thread Jean-Daniel Cryans
Not by design, it's really just missing. J-D On Fri, Jan 7, 2011 at 4:56 AM, Jan Lukavský wrote: > Hello everyone, > > we are missing possibility to disable WAL through Thrift server, is this > option missing by design? > > Thanks, > Jan

Re: Hardware requirement for HBase/Hadoop - looking for fast 1TB disks

2011-01-07 Thread Dieter Reuter
John, thanks for your answer and the link to the hard drive benchmark. I've already found a comparison between the WD1002FAEX and the F3 HD103SJ from Samsung. Performance is approx. the same, but the F3 is cheaper. Neither drive is recommended for 24x7 use, but for my POC I think it's quite OK

Re: Column family data distribution and performance

2011-01-07 Thread Chris Tarnas
On Jan 7, 2011, at 10:14 AM, Stack wrote: > On Fri, Jan 7, 2011 at 10:01 AM, Chris Tarnas wrote: >> I was wondering how much impact on read and write performance a column >> family would have on rows where they don't contain any data? >> > > The index column family would have data, right, jus

Re: problem with LZO compressor on write only loads

2011-01-07 Thread Ryan Rawson
Hey, Here at SU we continue to use version 0.1.0 of hadoop-gpl-compression. I know some of the newer versions had bugs which leaked DirectByteBuffer space, which might be what you are running into. Give the older version a shot, there really hasn't been much in the way of how LZO works in a whil

Re: How to rename table's family name

2011-01-07 Thread Stack
On Fri, Jan 7, 2011 at 4:30 AM, 陈加俊 wrote: > Hi everyone! > > How to rename the table's family name. > > I created the table and it's families , and insert many data into it, but I > want to rename one family name now premise is not lost data . > We do not have a mechanism to do this 陈加俊. Curren

HBase, Thrift and DemoClient.php

2011-01-07 Thread John Overman
I am trying to figure out how to use Thrift with PHP and C++, and I usually start with example code. I'm having problems with the DemoClient.php. After configuration, it was throwing the exception "shouldn't get here!" on line 158, which I just commented out. Now it starts a scanner, but I'm

Getting rid of "delete forward" in HBase 0.92+, please weigh in

2011-01-07 Thread Ryan Rawson
Hi all, We are thinking of getting rid of the "delete forward" misfeature in HBase. The one way we'd implement it would permanently remove this "feature", and prevent it from being put back in ever. What is a delete forward you ask? This is where you do a delete, but because deletes are really

Re: Column family data distribution and performance

2011-01-07 Thread Sean Bigdatafun
On Fri, Jan 7, 2011 at 10:01 AM, Chris Tarnas wrote: > I was wondering how much impact on read and write performance a column > family would have on rows where they don't contain any data? > > I'm testing out an indexing method where rather than have a separate table > for storing indexes I just

Video: The Underlying Technology of Facebook Messages

2011-01-07 Thread Nicolas Spiegelberg
For those interested, our engineering bloggers just posted the video of our tech talk about using HBase as the datastore behind Facebook messages. Thanks for being a great community! http://www.facebook.com/video/video.php?v=690851516105

Re: Video: The Underlying Technology of Facebook Messages

2011-01-07 Thread Stack
On Fri, Jan 7, 2011 at 5:35 PM, Nicolas Spiegelberg wrote: > For those interested, our engineering bloggers just posted the video of our > tech talk about using HBase as the datastore behind Facebook messages. > Thanks for being a great community! > > http://www.facebook.com/video/video.php?v=

log reply failures, how to resolve

2011-01-07 Thread Jack Levin
Greetings all. I have been observing some interesting problems that sometimes make an HBase start/restart very hard to achieve. Here is the situation: power goes out on a rack and kills some datanodes and some regionservers. We power things back on, HDFS reports all datanodes back to normal, and

Re: Getting rid of "delete forward" in HBase 0.92+, please weigh in

2011-01-07 Thread M. C. Srivas
+1 Just a clarification : by delete-forward, do you mean that a delete of a non-existent key causes a future insert of the key to get deleted? On Fri, Jan 7, 2011 at 4:23 PM, Ryan Rawson wrote: > Hi all, > > We are thinking of getting rid of the "delete forward" misfeature in > HBase. The on

Re: Getting rid of "delete forward" in HBase 0.92+, please weigh in

2011-01-07 Thread Ryan Rawson
Yes that's it exactly! On Jan 7, 2011 7:24 PM, "M. C. Srivas" wrote: > +1 > > Just a clarification : by delete-forward, do you mean that a delete of a > non-existent key causes a future insert of the key to get deleted? > > > On Fri, Jan 7, 2011 at 4:23 PM, Ryan Rawson wrote: > >> Hi all, >> >> W
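
To make the behavior confirmed above concrete, here is a minimal toy model of a timestamped delete marker masking a later insert. It is a sketch, not HBase code: the class `ToyStore` and its methods are hypothetical, and the rule that a marker hides every put with a timestamp at or below its own until a major compaction removes it is a simplifying assumption based on the description in this thread.

```python
class ToyStore:
    """Toy model of one cell; hypothetical, not the HBase implementation."""

    def __init__(self):
        self.puts = []        # list of (timestamp, value)
        self.tombstones = []  # timestamps of delete markers

    def put(self, ts, value):
        self.puts.append((ts, value))

    def delete(self, ts):
        # A delete only records a marker; nothing is removed right away.
        self.tombstones.append(ts)

    def get(self):
        # A put is visible only if its timestamp is newer than every marker.
        cutoff = max(self.tombstones, default=None)
        visible = [p for p in self.puts if cutoff is None or p[0] > cutoff]
        return max(visible, key=lambda p: p[0])[1] if visible else None

    def major_compact(self):
        # Assumed simplification: drop masked puts, then drop the markers.
        cutoff = max(self.tombstones, default=None)
        if cutoff is not None:
            self.puts = [p for p in self.puts if p[0] > cutoff]
        self.tombstones.clear()


store = ToyStore()
store.delete(ts=100)           # the key does not exist yet
store.put(ts=50, value="a")    # inserted afterwards, with an older timestamp
masked = store.get()           # the marker still hides the new insert

store.major_compact()          # the marker is gone after compaction
store.put(ts=50, value="a")
after_compaction = store.get() # the same insert is now visible
```

In this model the same put gives a different result depending on whether a major compaction has run in between, which is the kind of surprise the proposal aims to remove.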

question about merge-join (or AND operator betwween colums)

2011-01-07 Thread Jack Levin
Hello all, I have a scanner question, we have this table: hbase(main):002:0> scan 'mattest' ROW COLUMN+CELL 1 column=generic:, timestamp=1294454057618, value=1 1 column=ph

Re: question about merge-join (or AND operator betwween colums)

2011-01-07 Thread Phil Whelan
Hi Jack, I'm just trying to follow the logic and I'm a bit confused. > Note that  ['generic', 'photo'], utilizes 'OR' operator, and not > 'AND'.   Is it possible to create a scanner that will not AND and not > OR?, in which case something like this: Am I right in thinking you meant "AND and not OR"

Strange regionserver behavior with GZ compression

2011-01-07 Thread Chris Tarnas
Thanks in advance for any help. I've been quite pleased with HBase for this current project and until this problem it has worked quite well. Test cluster setup is CDH3b3 on 7 nodes: 5 data nodes with 48GB RAM, 8 cores, 4 disks, 2 masters with 8 cores, 2 disks 24GB RAM for master/zookeeper/nam