I am not sure how to handle permanent storage on EC2, but you can run Derby as a network server, much as you would run MySQL. Run the Derby server on the same node as the JobTracker; when you are done with the cluster, copy the directory that holds the Derby database to a permanent location. When you need the metastore again in a new cluster, copy that directory to any node in the cluster and start the Derby server against it.
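The copy-out and copy-back steps could be scripted roughly like this. This is only a sketch in Python using the standard library: the function names and all paths are made up for illustration, and it assumes the Derby server has been shut down before the directory is archived, since copying a live database directory may produce an inconsistent copy. Moving the archive to and from permanent storage is left out.

```python
import tarfile
from pathlib import Path


def archive_metastore(db_dir, archive_path):
    """Pack the Derby database directory into a .tar.gz archive.

    Assumes the Derby server is stopped (hypothetical helper, not part of Hive).
    """
    db_dir = Path(db_dir)
    archive_path = Path(archive_path)
    archive_path.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store the directory under its own name so it restores intact.
        tar.add(db_dir, arcname=db_dir.name)
    return archive_path


def restore_metastore(archive_path, target_parent):
    """Unpack the archive under target_parent and return the restored directory."""
    target_parent = Path(target_parent)
    target_parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        # The first member is the top-level database directory itself.
        top_level = tar.getnames()[0].split("/")[0]
        tar.extractall(target_parent)
    return target_parent / top_level
```

On the old cluster you would call something like archive_metastore("/path/to/metastore_db", "/backup/metastore_db.tar.gz"), and on the new cluster restore_metastore("/backup/metastore_db.tar.gz", "/path/to/derby/home"), then start the Derby server from the directory that contains the restored database. Both paths here are placeholders.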
Instructions on how to set up a Derby server are here: http://wiki.apache.org/hadoop/HiveDerbyServerMode
(Derby docs: http://db.apache.org/derby/docs/10.4/adminguide/cadminov825266.html)

Prasad

________________________________
From: Eva Tse <[email protected]>
Reply-To: <[email protected]>
Date: Thu, 11 Jun 2009 09:21:12 -0700
To: <[email protected]>
Subject: Setting up local repository w/ derby

We would like to know how/if possible to use the embedded derby driver to setup a local repository. If so, what configuration could achieve this. We want to test Hive on EC2 and don't want to provision another machine on the cloud to run mysql for the metastore just yet.

Thanks!
Eva.
