>  So unless the process is drastically simpler than I've estimated, I
>  think my next stop is going to be a SimpleDB tutorial, keeping my
>  hbase work handy as another alternative.

Well, SimpleDB is out - the limited beta is closed. That leaves me with just S3.
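
For the S3 route, the hierarchical layout from option (a) below could be sketched roughly like this. The function name key_to_path and the two-level, 1000-per-level fan-out are assumptions made for illustration, and nothing here talks to S3 itself. S3 keys are flat strings, so the "directories" are just key prefixes.

```python
def key_to_path(key, per_dir=1000, levels=2):
    """Map a numeric key to a nested object path, e.g. 4000123 -> '004/000/4000123.txt'.

    per_dir and levels are illustrative defaults (assumptions), chosen so that
    a few million keys end up with at most 1000 entries under any one prefix.
    """
    if key < 0:
        raise ValueError("key must be non-negative")
    width = len(str(per_dir - 1))   # zero-pad so prefixes sort consistently
    bucket = key // per_dir         # which group of per_dir keys this one falls in
    parts = []
    for _ in range(levels):
        parts.append(str(bucket % per_dir).zfill(width))
        bucket //= per_dir
    parts.reverse()                 # most significant prefix first
    return "/".join(parts) + "/%d.txt" % key


if __name__ == "__main__":
    for k in (1, 1999, 4000123):
        print(k, "->", key_to_path(k))
```

With these defaults, keys 4000000 through 4000999 all land under the prefix 004/000/, so no single prefix holds more than 1000 objects.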

-- Jim


On Thu, May 8, 2008 at 10:19 AM, Jim R. Wilson <[EMAIL PROTECTED]> wrote:
> Unfortunately, I'm about to give up on hbase over ec2.
>
>  In my application, the hbase storage is very simple, write-once text
>  storage.  To get this to work on ec2, I've concluded I need the
>  following:
>
>  1. A cluster of hadoop machines running an appropriate version of
>  hadoop (0.16.3 at the time of this writing)
>
>  2. Hbase running on the same cluster, either connected to S3, which
>  I've been warned is slow, or HDFS on top of PersistentFS which may or
>  may not fare better.
>
>  3. Thrift service running atop hbase for interaction from remote
>  (outside ec2) Python and PHP scripts.
>
>  4. Static IPs for any hadoop nodes running data-transfer jobs due to
>  firewall restrictions on the MySQL end (outside ec2), and also so that
>  the Python/PHP scripts know where to find Thrift.
>
>  5. Mechanism to force all hbase nodes to write any memory-resident
>  changes to disc for backup purposes (Java).
>
>  Now, my particular problem is very simple - just numeric key to text
>  storage.  Ex: { "1":"Hello", "2":"World" }.  I've (nearly) come to the
>  conclusion that I would be much better off either:
>
>  a. Using an S3 bucket to store 1.txt, 2.txt etc (probably with a
>  hierarchical dir structure to keep the directories small - I've got
>  about 4 million such number/text pairs at the moment).
>
>  b. Using SimpleDB (which I've yet to learn, but expect to be similar
>  to hbase/BigTable)
>
>  c. Running an hbase/hadoop cluster somewhere else (I already have a
>  single-node cluster working great on our hosting provider's internal
>  network).
>
>  So unless the process is drastically simpler than I've estimated, I
>  think my next stop is going to be a SimpleDB tutorial, keeping my
>  hbase work handy as another alternative.
>
>  -- Jim R. Wilson (jimbojw)
>
