Yeah, those families are all needed -- but I didn't realize the files were so small. That's odd -- and you're right, that'd certainly throw it off. I'll merge them all and see if that helps.
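
For my own sanity, here is the back-of-the-envelope arithmetic behind the small files, as a rough sketch only. It assumes the flush is triggered on the combined size across all families and that the families fill evenly, which is how I read J-D's note below. The family count is not stated anywhere in this thread, so it is inferred from the ~5MB and ~64MB figures, and the function names are made up for illustration.

```python
# Rough model: a flush fires when the families together reach the flush
# size, and each family then writes its own file. Assumes even loading.

FLUSH_SIZE_MB = 64   # default flush size quoted in the thread
AVG_FILE_MB = 5      # average flushed file size quoted in the thread


def implied_family_count(flush_size_mb, avg_file_mb):
    """Estimate how many families share one flush (an inference, not a measurement)."""
    if avg_file_mb <= 0:
        raise ValueError("avg_file_mb must be positive")
    return round(flush_size_mb / avg_file_mb)


def per_family_file_mb(flush_size_mb, num_families):
    """Approximate size of each flushed file when families fill evenly."""
    if num_families < 1:
        raise ValueError("num_families must be at least 1")
    return flush_size_mb / num_families


if __name__ == "__main__":
    families = implied_family_count(FLUSH_SIZE_MB, AVG_FILE_MB)
    print("implied families:", families)
    print("MB per file now:", round(per_family_file_mb(FLUSH_SIZE_MB, families), 1))
    print("MB per file merged:", per_family_file_mb(FLUSH_SIZE_MB, 1))
```

Under those assumptions, merging everything into one family should make each flush write a single file close to the full flush size instead of a dozen or so tiny ones, which should also mean fewer compactions.
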

On Wed, Sep 1, 2010 at 5:24 PM, Jean-Daniel Cryans wrote:
> Took a quick look at your RS log, it looks like you are using a lot of
> families and loading them pretty much at the same rate. Look at lines
> that start with:
>
> INFO org.apache.hadoop.hbase.regionserver.Store: Added ...
>
> And you will see that you are dumping very small files on the
> filesystem, on average 5MB, that together account for ~64MB which is
> the default flush size (and then it generates tons of compactions
> which makes it even worse). Do you really need all those families? Try
> merging them and see the difference.
>
> J-D
>
> On Wed, Sep 1, 2010 at 5:03 PM, Bradford Stephens wrote:
>> 'allo,
>>
>> I changed the cluster from m1.large to c1.xlarge -- we're getting
>> about 4k inserts/node/minute instead of 2k. A small improvement,
>> but nowhere near what I'm used to, even from vague memories of old
>> clusters on EC2.
>>
>> I also stripped all the Cascading from my code and have a very basic
>> raw MR job -- we're basically reading raw text, splitting it into
>> fields, and adding those rows to HBase. About the simplest task you
>> could do.
>>
>> Ideas for next steps? What other info could I share?
>>
>> Cheers,
>> B
>>
>> On Wed, Sep 1, 2010 at 10:55 AM, Andrew Purtell wrote:
>>>> From: Gary Helmling
>>>>
>>>> If you're using AMIs based on the latest Ubuntu (10.04),
>>>> there's a known kernel issue that seems to be causing
>>>> high loads while idle. More info here:
>>>>
>>>> https://bugs.launchpad.net/ubuntu/+source/linux-ec2/+bug/574910
>>>
>>> Seems best to avoid using Lucid on EC2 for now, then.
>>>
>>> FYI, the EC2 scripts that I use build AMIs based on Amazon's old FC8 AMI
>>> (with updates).
>>> See http://github.com/apurtell/hbase-ec2
>>>
>>> - Andy
>>>
>>
>> --
>> Bradford Stephens,
>> Founder, Drawn to Scale
>> drawntoscalehq.com
>> 727.697.7528
>>
>> http://www.drawntoscalehq.com -- The intuitive, cloud-scale data
>> solution. Process, store, query, search, and serve all your data.
>>
>> http://www.roadtofailure.com -- The Fringes of Scalability, Social
>> Media, and Computer Science
>>
>

--
Bradford Stephens,
Founder, Drawn to Scale
drawntoscalehq.com
727.697.7528

http://www.drawntoscalehq.com -- The intuitive, cloud-scale data
solution. Process, store, query, search, and serve all your data.

http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science
