Hi Lars,
The test results for the effect of the optimization can be found at the bottom of
this page: http://issues.apache.org/jira/browse/HADOOP-1687
However, if you had been using the latest version of Hadoop (0.15 and up), then
all of the namenode optimizations would have been built in.
Using a Zip archive is a
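One common way to do that consolidation, sketched roughly against Hadoop's SequenceFile API (the local ./docs directory and the /archive/docs.seq output path are made-up placeholders), is to pack thousands of small documents into a single keyed container file so the namenode tracks one file instead of one entry per document:

import java.io.File;
import java.nio.file.Files;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class PackSmallFiles {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // One SequenceFile on HDFS holds thousands of documents, so the namenode
    // tracks a single file (and its blocks) instead of one object per document.
    try (SequenceFile.Writer writer = SequenceFile.createWriter(conf,
        SequenceFile.Writer.file(new Path("/archive/docs.seq")),
        SequenceFile.Writer.keyClass(Text.class),
        SequenceFile.Writer.valueClass(BytesWritable.class))) {
      for (File doc : new File("./docs").listFiles()) {
        // Key = original file name, value = raw document bytes.
        byte[] body = Files.readAllBytes(doc.toPath());
        writer.append(new Text(doc.getName()), new BytesWritable(body));
      }
    }
  }
}

The trade-off is that individual documents are no longer addressable as HDFS paths; you either scan the container or keep an index alongside it.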
Ted,
This is actually both: I first tried Hadoop directly and then HBase in
my second attempt.
Lars
Ted Dunning wrote:
He didn't store all of these documents in separate files. He stored them in
hbase, hence his pain with the upgrade.
On 1/6/08 5:44 PM, "Taeho Kang" <[EMAIL PROTECTED]> wrote:
Hi Taeho,
Fortunately for us, we don't have a need for storing millions of files in
HDFS just yet. We are adding only a few thousand files a day, so that gives
us a handful of days. And we've been using Hadoop for more than a year, and its
reliability has been superb.
Sounds great.
This is ju
He didn't store all of these documents in separate files. He stored them in
hbase, hence his pain with the upgrade.
On 1/6/08 5:44 PM, "Taeho Kang" <[EMAIL PROTECTED]> wrote:
> Thanks for sharing your "painful" experience with us, Lars. I always
> wondered what would happen if HDFS tried to host a few hundred million files.
Hello, Lars.
> Thank you for the input. May I ask where your experience comes from?
> What was the largest you have seen in real life? Just asking.
Fortunately for us, we don't have a need for storing millions of files in
HDFS just yet. We are adding only a few thousand files a day, so that gi
Hi Taeho,
Thank you for the input. May I ask where your experience comes from?
What was the largest you have seen in real life? Just asking.
As far as consolidating files is concerned, this sounds like having to
store small databases in Hadoop. But then how is that for performance
when needing
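If the performance question above is about reading single documents back out of such a consolidated store, a MapFile (a sorted SequenceFile plus an index) supports keyed lookups without scanning the whole archive. A rough sketch, with the /archive/docs.map path and the doc-00042.txt key made up for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.MapFile;
import org.apache.hadoop.io.Text;

public class LookupDoc {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // MapFile.Reader consults the index first, then seeks directly to the
    // data block that holds the requested key.
    try (MapFile.Reader reader = new MapFile.Reader(new Path("/archive/docs.map"), conf)) {
      BytesWritable value = new BytesWritable();
      if (reader.get(new Text("doc-00042.txt"), value) != null) {
        System.out.println("found " + value.getLength() + " bytes");
      }
    }
  }
}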
Thanks for sharing your "painful" experience with us, Lars. I always
wondered what would happen if HDFS tried to host a few hundred million files.
By the way, I think with the current namenode design of Hadoop, it is
unlikely that you will ever be able to host 500 million files on a single
cluster
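The usual back-of-the-envelope reasoning, as a rough rule of thumb rather than an exact figure: the namenode keeps the whole namespace in memory, and each file and each block costs it on the order of 150 bytes of heap, so 500 million single-block files work out to roughly 140 GB of namenode heap:

public class NamenodeHeapEstimate {
  public static void main(String[] args) {
    long files = 500_000_000L;     // the 500 million files discussed above
    long blocksPerFile = 1;        // small documents, assume one block each
    long bytesPerObject = 150;     // rough heap cost per file or block object
    long heapBytes = (files + files * blocksPerFile) * bytesPerObject;
    System.out.println("~" + heapBytes / (1L << 30) + " GB of namenode heap");
  }
}

That is far more than a single namenode JVM could reasonably hold, which is why the object count, not raw data volume, is the limiting factor.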
Ted,
In an absolute worst case scenario. I know this is beta and all, but I am
starting to use HBase in a production environment and need to limit downtime
(which is what this architecture promises) to a minimum - none at all if I
can.
All in all, if I cannot yet rely on HBase being stable, what woul
Lars,
Can you dump your documents to external storage (either HDFS or ordinary
file space storage)?
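As a rough sketch of what such a dump could look like (written against the current HBase client API rather than the 0.x API this thread predates; the "documents" table and the content:body column are placeholders), a full-table scan can stream every row into a single SequenceFile on HDFS:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class DumpDocuments {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    try (Connection conn = ConnectionFactory.createConnection(conf);
         Table table = conn.getTable(TableName.valueOf("documents"));
         ResultScanner scanner = table.getScanner(new Scan());
         SequenceFile.Writer writer = SequenceFile.createWriter(conf,
             SequenceFile.Writer.file(new Path("/backup/documents.seq")),
             SequenceFile.Writer.keyClass(Text.class),
             SequenceFile.Writer.valueClass(BytesWritable.class))) {
      // Row key becomes the SequenceFile key; the document body becomes the value.
      for (Result row : scanner) {
        byte[] body = row.getValue(Bytes.toBytes("content"), Bytes.toBytes("body"));
        if (body != null) {
          writer.append(new Text(row.getRow()), new BytesWritable(body));
        }
      }
    }
  }
}

Writing into one container file also keeps the backup itself from turning into millions of small HDFS files and recreating the namenode problem discussed earlier in the thread.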
On 1/4/08 10:01 PM, "larsgeorge" <[EMAIL PROTECTED]> wrote:
>
> Jim,
>
> I have inserted about 5 million documents into HBase and translated them into
> 15 languages (meaning I end up with about 7
Jim,
I have inserted about 5 million documents into HBase and translated them into
15 languages (meaning I end up with about 75 million in the end). That data is
only recreatable if we run the costly processing again, so I am in need of a
migration path.
For me this is definitely a +1 for a migration tool
Seeing that the working release only came out 3 months ago, I would hope no
one had data stored in HBase at this time that is not backed up and/or stored
somewhere else. So that puts me at a -1 on the migration utility. But I
might be wrong; most people using HBase should be able to output the