The following page has been changed by AshishThusoo:
http://wiki.apache.org/hadoop/Hive/DeveloperGuide

The comment on the change is:
Start filling out the developer guide.

------------------------------------------------------------------------------
  = Developer Guide =
  == Code Organization and a brief architecture ==
  === Introduction ===
+ Hive comprises three main components:
+  * Serializers/Deserializers (trunk/serde) - This component contains the framework 
libraries that allow users to develop serializers and deserializers for their 
own data formats. It also contains some built-in 
serialization/deserialization families.
+  * MetaStore (trunk/metastore) - This component implements the metadata 
server, which holds all the information about the tables and partitions 
in the warehouse.
+  * Query Processor (trunk/ql) - This component implements the processing 
framework that converts SQL into a graph of map/reduce jobs, as well as the 
execution-time framework that runs those jobs in dependency order.
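
As an illustration of what running jobs "in the order of dependencies" can mean, here is a minimal sketch of a topological ordering over a job graph (Kahn's algorithm). The class name JobOrderSketch, the method executionOrder, and the job names in the usage note are hypothetical and are not taken from trunk/ql; the actual Query Processor may order its jobs differently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not a class from trunk/ql. */
public class JobOrderSketch {

    /**
     * Returns the job names ordered so that every job appears after all of
     * the jobs it depends on. The map goes from a job name to the names of
     * the jobs that must finish before it can start.
     */
    public static List<String> executionOrder(Map<String, List<String>> deps) {
        // Number of unfinished parents per job, in first-seen order.
        Map<String, Integer> pending = new LinkedHashMap<>();
        // Reverse edges: parent job -> jobs waiting on it.
        Map<String, List<String>> dependents = new LinkedHashMap<>();

        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            pending.putIfAbsent(e.getKey(), 0);
            for (String parent : e.getValue()) {
                pending.putIfAbsent(parent, 0);
                pending.merge(e.getKey(), 1, Integer::sum);
                dependents.computeIfAbsent(parent, k -> new ArrayList<>())
                          .add(e.getKey());
            }
        }

        // Jobs with no unfinished parents can run immediately.
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : pending.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String job = ready.remove();
            order.add(job);
            // Finishing a job may unblock the jobs that were waiting on it.
            for (String child : dependents.getOrDefault(job, Collections.<String>emptyList())) {
                if (pending.merge(child, -1, Integer::sum) == 0) {
                    ready.add(child);
                }
            }
        }

        if (order.size() != pending.size()) {
            throw new IllegalStateException("cycle in job graph");
        }
        return order;
    }
}
```

For example, if a hypothetical "join" job depends on "scanA" and "scanB", and "aggregate" depends on "join", both scans are scheduled first, then the join, then the aggregate.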
+ 
+ Apart from these major components, Hive also contains a number of other 
components:
+  * Command Line Interface (trunk/cli) - This component has all the Java code 
used by the Hive command line interface.
+  * Hive Server (trunk/service) - This component implements all the APIs that 
can be used by other clients (such as JDBC drivers) to talk to Hive.
+  * Common (trunk/common) - This component contains common infrastructure 
needed by the rest of the code. Currently, this contains all the Java sources 
for managing and passing Hive configurations (HiveConf) to all the other code 
components.
+  * Ant Utilities (trunk/ant) - This component contains the implementation of 
some Ant tasks that are used by the build infrastructure.
+  * Scripts (trunk/bin) - This component contains all the scripts provided in 
the distribution, including the scripts to run the Hive CLI (bin/hive).
+ 
+ The following top-level directories contain helper libraries, packaged 
configuration files, etc.:
+  * trunk/conf - This directory contains the packaged hive-default.xml and 
hive-site.xml.
+  * trunk/data - This directory contains some data sets and configurations 
used in the Hive tests.
+  * trunk/ivy - This directory contains the Ivy files used by the build 
infrastructure to manage dependencies on different Hadoop versions.
+  * trunk/lib - This directory contains the runtime libraries needed by Hive.
+  * trunk/testlibs - This directory contains the junit.jar used by the junit 
target in the build infrastructure.
+  * trunk/testutils (Deprecated)
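
The two files in trunk/conf are usually read as layers: following the Hadoop convention, values in hive-site.xml override those in hive-default.xml. The page above does not spell this out, so treat the sketch below as an illustration under that assumption. It uses plain java.util.Properties rather than HiveConf; the class ConfLayeringSketch, its method, and the override value in the usage note are hypothetical.

```java
import java.util.Properties;

/** Hypothetical illustration only; HiveConf itself is not used here. */
public class ConfLayeringSketch {

    /**
     * Builds a view in which site-level settings win and anything the site
     * file leaves unset falls back to the packaged defaults.
     */
    public static Properties layered(Properties defaults, Properties site) {
        // The constructor argument is consulted by getProperty() whenever
        // the key is missing from the Properties object itself.
        Properties merged = new Properties(defaults);
        merged.putAll(site);
        return merged;
    }
}
```

With a default of /user/hive/warehouse for hive.metastore.warehouse.dir and a hypothetical site override of /data/warehouse, getProperty on the merged object would return the site value, while keys set only in the defaults remain visible.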
+ 
  === SerDe ===
  === MetaStore ===
  === Query Processor ===
