Re: Question for HBase users

2008-01-07 Thread Taeho Kang
Hi Lars, The test results for the effect of the optimization can be found at the bottom of this link: http://issues.apache.org/jira/browse/HADOOP-1687 However, if you are using a recent version of Hadoop (0.15 and up), then all the namenode optimizations would already be built in. Using Zip archive is a

Re: Question for HBase users

2008-01-06 Thread Lars George
Ted, This is actually both: I first tried Hadoop directly and then HBase in my second attempt. Lars Ted Dunning wrote: He didn't store all of these documents in separate files. He stored them in hbase, hence his pain with the upgrade. On 1/6/08 5:44 PM, "Taeho Kang" <[EMAIL PROTECTED]> w

Re: Question for HBase users

2008-01-06 Thread Lars George
Hi Taeho, Fortunately for us, we don't have a need for storing millions of files in HDFS just yet. We are adding only a few thousand files a day, so that gives us a handful of days. And we've been using Hadoop for more than a year, and its reliability has been superb. Sounds great. This is ju

Re: Question for HBase users

2008-01-06 Thread Ted Dunning
He didn't store all of these documents in separate files. He stored them in hbase, hence his pain with the upgrade. On 1/6/08 5:44 PM, "Taeho Kang" <[EMAIL PROTECTED]> wrote: > Thanks for sharing your "painful" experience with us, Lars. I always > wondered what would happen if HDFS tried to h

Re: Question for HBase users

2008-01-06 Thread Taeho Kang
Hello, Lars. > Thank you for the input. May I ask where your experience comes from? > What was the largest you have seen in real life? Just asking. Fortunately for us, we don't have a need for storing millions of files in HDFS just yet. We are adding only a few thousand files a day, so that gi

Re: Question for HBase users

2008-01-06 Thread Lars George
Hi Taeho, Thank you for the input. May I ask where your experience comes from? What was the largest you have seen in real life? Just asking. As far as consolidating files is concerned, this sounds like having to store small databases in Hadoop. But then how is that for performance when needin

Re: Question for HBase users

2008-01-06 Thread Taeho Kang
Thanks for sharing your "painful" experience with us, Lars. I always wondered what would happen if HDFS tried to host a few hundred million files. By the way, I think, with the current namenode design of Hadoop, it is unlikely that you will ever be able to host 500 million files on a single cluster

Re: Question for HBase users

2008-01-05 Thread Lars George
Ted, In an absolute worst case scenario. I know this is beta and all, but I am starting to use HBase in a production environment and need to limit downtime (which is what this architecture promises) to a minimum, or none at all if I can. All in all, if I cannot rely on HBase yet being stable, what woul

Re: Question for HBase users

2008-01-05 Thread Ted Dunning
Lars, Can you dump your documents to external storage (either HDFS or ordinary file space storage)? On 1/4/08 10:01 PM, "larsgeorge" <[EMAIL PROTECTED]> wrote: > > Jim, > > I have inserted about 5 million documents into HBase and translated them into > 15 languages (which means I end up with about 7

Re: Question for HBase users

2008-01-04 Thread larsgeorge
Jim, I have inserted about 5 million documents into HBase and translated them into 15 languages (which means I end up with about 75 million in the end). That data can only be recreated by processing the documents again, which is costly. So I am in need of a migration path. For me this is definitely a +1 for a migration tool

Re: Question for HBase users

2008-01-03 Thread Billy
Given that the working release only came out 3 months ago, I would hope no one has data stored in HBase at this time that is not backed up and/or stored somewhere else, so that puts me at a -1 on the migration utility. But I might be wrong about the above. Most people using HBase should be able to output the