Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The following page has been changed by JoydeepSensarma:
http://wiki.apache.org/hadoop/Hive/HiveAws

------------------------------------------------------------------------------
  We walk you through the choices involved here and show some practical case studies that contain detailed setup and configuration instructions.

  == Running the Hive CLI ==
- Hive CLI environment is completely independent of Hadoop. The CLI takes in queries, compiles them into a plan consisting of map-reduce jobs and then submits them to a Hadoop Cluster. For this reason the CLI can be run from any node that has a Hive distribution, a Java Runtime Engine and that can connect to the Hadoop cluster. The Hive CLI also needs to access table metadata. By default this is persisted by Hive via an embedded Derby database into a folder named metastore_db on the local file system (however state can be persisted in any database - including remote mysql instances).
+ The CLI takes in Hive queries, compiles them into a plan (commonly, but not always, consisting of map-reduce jobs) and then submits them to a Hadoop Cluster. While it depends on Hadoop libraries for this purpose - it is otherwise relatively independent of the Hadoop cluster itself. For this reason the CLI can be run from any node that has a Hive distribution, a Hadoop distribution, a Java Runtime Engine. It can submit jobs to any compatible hadoop cluster (whose version matches that of the Hadoop libraries that Hive is using) that it can connect to. The Hive CLI also needs to access table metadata. By default this is persisted by Hive via an embedded Derby database into a folder named metastore_db on the local file system (however state can be persisted in any database - including remote mysql instances).

  There are two choices on where to run the Hive CLI from:
