[Hadoop Wiki] Update of "Hive/DeveloperGuide" by AshishThusoo

Apache Wiki Mon, 15 Dec 2008 12:21:11 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The following page has been changed by AshishThusoo:
http://wiki.apache.org/hadoop/Hive/DeveloperGuide

The comment on the change is:
Summaries of the various query processor components.

------------------------------------------------------------------------------
  === SerDe ===
  === MetaStore ===
  === Query Processor ===
+ The following are the main components of the Hive Query Processor:
+  * Parse and SemanticAnalysis (ql/parse) - This component contains the code 
for parsing SQL, converting it into Abstract Syntax Trees, converting the 
Abstract Syntax Trees into operator plans, and finally converting the operator 
plans into a directed graph of tasks, which are executed by Driver.java.
+  * Optimizer (ql/optimizer) - This component contains some simple rule-based 
optimizations, such as pruning unreferenced columns from table scans (column 
pruning), that the Hive Query Processor applies while converting SQL to a 
series of map/reduce tasks.
+  * Plan Components (ql/plan) - This component contains the classes (called 
descriptors) that the compiler (Parser, SemanticAnalysis and Optimizer) uses 
to pass to the operator trees the information needed by the execution code.
+  * MetaData Layer (ql/metadata) - This component is used by the query 
processor to interface with the MetaStore in order to retrieve information 
about tables, partitions and the columns of the table. This information is used 
by the compiler to compile SQL to a series of map/reduce tasks.
+  * Map/Reduce Execution Engine (ql/exec) - This component contains all the 
query operators and the framework that is used to invoke those operators from 
within the map/reduce tasks.
+  * Hadoop Record Readers, Input and Output Formatters for Hive (ql/io) - This 
component contains the record readers and the input and output formatters that 
Hive registers with a Hadoop Job.
+  * Sessions (ql/session) - A rudimentary session implementation for Hive.
+  * Type interfaces (ql/typeinfo) - This component provides all the type 
information for table columns that is retrieved from the MetaStore and the 
SerDes.
+  * Hive Function Framework (ql/udf) - Framework and implementation of Hive 
operators, functions and aggregate functions. This component also contains the 
interfaces that a user can implement to create user-defined functions.
+  * Tools (ql/tools) - Some simple tools provided by the query processing 
framework. Currently, this component contains the implementation of the lineage 
tool that can parse the query and show the source and destination tables of the 
query.
+ 
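To make the column pruning mentioned under Optimizer concrete, here is a minimal, self-contained sketch. It is not the code in ql/optimizer: the class `ColumnPruningSketch`, the `TableScan` type, the `pruneColumns` method, and the table and column names are all hypothetical, chosen only to illustrate the idea that a table scan should read just the columns the rest of the query refers to.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: these are not Hive's actual optimizer classes. */
public class ColumnPruningSketch {

    /** Hypothetical stand-in for a table scan: a table name plus the columns it reads. */
    static final class TableScan {
        final String table;
        final List<String> columns;

        TableScan(String table, List<String> columns) {
            this.table = table;
            this.columns = columns;
        }
    }

    /**
     * Returns a new scan that keeps only the columns referenced elsewhere
     * in the query, preserving the table's original column order.
     */
    static TableScan pruneColumns(TableScan scan, Set<String> referenced) {
        List<String> kept = new ArrayList<>();
        for (String column : scan.columns) {
            if (referenced.contains(column)) {
                kept.add(column);
            }
        }
        return new TableScan(scan.table, kept);
    }

    public static void main(String[] args) {
        // Assumed example table with four columns.
        TableScan scan = new TableScan("page_views",
                Arrays.asList("user_id", "url", "referrer", "ts"));
        // Suppose the query only mentions ts and user_id.
        Set<String> referenced = new LinkedHashSet<>(Arrays.asList("ts", "user_id"));
        TableScan pruned = pruneColumns(scan, referenced);
        System.out.println(pruned.columns); // prints [user_id, ts]
    }
}
```

In this sketch the rule is a pure function from one scan to another; a real rule-based optimizer would additionally have to walk the operator plan to collect the set of referenced columns.
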
  ==== Compiler ====
  ==== Parser ====
  ==== TypeChecking ====
