Author: khorgath
Date: Thu Sep 6 08:09:04 2012
New Revision: 1381506
URL: http://svn.apache.org/viewvc?rev=1381506&view=rev
Log:
HCATALOG-431 document hcat type to java class/pig type mapping (lefty via
khorgath)
Modified:
incubator/hcatalog/trunk/CHANGES.txt
incubator/hcatalog/trunk/src/docs/src/documentation/content/xdocs/inputoutput.xml
incubator/hcatalog/trunk/src/docs/src/documentation/content/xdocs/loadstore.xml
Modified: incubator/hcatalog/trunk/CHANGES.txt
URL:
http://svn.apache.org/viewvc/incubator/hcatalog/trunk/CHANGES.txt?rev=1381506&r1=1381505&r2=1381506&view=diff
==============================================================================
--- incubator/hcatalog/trunk/CHANGES.txt (original)
+++ incubator/hcatalog/trunk/CHANGES.txt Thu Sep 6 08:09:04 2012
@@ -38,6 +38,8 @@ Trunk (unreleased changes)
HCAT-427 Document storage-based authorization (lefty via gates)
IMPROVEMENTS
+ HCAT-431 document hcat type to java class/pig type mapping (lefty via
khorgath)
+
HCAT-492 Document CTAS workaround for Hive with JSON serde (lefty via
khorgath)
HCAT-487 HCatalog should tolerate a user-defined amount of bad records
(traviscrawford)
Modified:
incubator/hcatalog/trunk/src/docs/src/documentation/content/xdocs/inputoutput.xml
URL:
http://svn.apache.org/viewvc/incubator/hcatalog/trunk/src/docs/src/documentation/content/xdocs/inputoutput.xml?rev=1381506&r1=1381505&r2=1381506&view=diff
==============================================================================
---
incubator/hcatalog/trunk/src/docs/src/documentation/content/xdocs/inputoutput.xml
(original)
+++
incubator/hcatalog/trunk/src/docs/src/documentation/content/xdocs/inputoutput.xml
Thu Sep 6 08:09:04 2012
@@ -149,6 +149,84 @@ will be returned.</p>
</section>
+<!-- ==================================================================== -->
+<section>
+ <title>HCatRecord</title>
+
+<p>HCatRecord is the type supported for storing values in HCatalog tables.</p>
+<p>The types in an HCatalog table schema determine the types of objects
returned for different fields in HCatRecord. This table shows the mappings
between Java classes for MapReduce programs and HCatalog data types:</p>
+
+<table>
+ <tr>
+ <th><p class="center">HCatalog Data Type</p></th>
+ <th><p class="center">Java Class in MapReduce</p></th>
+ <th><p class="cell">Values</p></th>
+ </tr>
+ <tr>
+ <td><p class="center">TINYINT</p></td>
+ <td><p class="center">java.lang.Byte</p></td>
+ <td><p class="cell">-128 to 127</p></td>
+ </tr>
+ <tr>
+ <td><p class="center">SMALLINT</p></td>
+ <td><p class="center">java.lang.Short</p></td>
+ <td><p class="cell">-2<sup>15</sup> to 2<sup>15</sup>-1 (-32,768 to
32,767)</p></td>
+ </tr>
+ <tr>
+ <td><p class="center">INT</p></td>
+ <td><p class="center">java.lang.Integer</p></td>
+ <td><p class="cell">-2<sup>31</sup> to 2<sup>31</sup>-1 (-2,147,483,648 to
2,147,483,647)</p></td>
+ </tr>
+ <tr>
+ <td><p class="center">BIGINT</p></td>
+ <td><p class="center">java.lang.Long</p></td>
+ <td><p class="cell">-2<sup>63</sup> to 2<sup>63</sup>-1
(-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807)</p></td>
+ </tr>
+ <tr>
+ <td><p class="center">BOOLEAN</p></td>
+ <td><p class="center">java.lang.Boolean</p></td>
+ <td><p class="cell">true or false</p></td>
+ </tr>
+ <tr>
+ <td><p class="center">FLOAT</p></td>
+ <td><p class="center">java.lang.Float</p></td>
+ <td><p class="cell">single-precision floating-point value</p></td>
+ </tr>
+ <tr>
+ <td><p class="center">DOUBLE</p></td>
+ <td><p class="center">java.lang.Double</p></td>
+ <td><p class="cell">double-precision floating-point value</p></td>
+ </tr>
+ <tr>
+ <td><p class="center">BINARY</p></td>
+ <td><p class="center">byte[]</p></td>
+ <td><p class="cell">binary data</p></td>
+ </tr>
+ <tr>
+ <td><p class="center">STRING</p></td>
+ <td><p class="center">java.lang.String</p></td>
+ <td><p class="cell">character string</p></td>
+ </tr>
+ <tr>
+ <td><p class="center">STRUCT</p></td>
+ <td><p class="center">java.util.List</p></td>
+ <td><p class="cell">structured data</p></td>
+ </tr>
+ <tr>
+ <td><p class="center">ARRAY</p></td>
+ <td><p class="center">java.util.List</p></td>
+ <td><p class="cell">values of one data type</p></td>
+ </tr>
+ <tr>
+ <td><p class="center">MAP</p></td>
+ <td><p class="center">java.util.Map</p></td>
+ <td><p class="cell">key-value pairs</p></td>
+ </tr>
+</table>
+
+</section>
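
Note on the HCatRecord table added above: the mapping can be read as a lookup from HCatalog type name to the Java class a MapReduce program should expect for that field. The following is a minimal sketch of that lookup in plain Java. It does not use the HCatalog API; the class `HCatTypeTable` and its methods `javaClassFor` and `matches` are hypothetical names introduced only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative only: a plain-Java mirror of the HCatRecord mapping table.
// HCatTypeTable, javaClassFor and matches are hypothetical, not HCatalog API.
final class HCatTypeTable {
    private static final Map<String, Class<?>> JAVA_CLASS = new LinkedHashMap<>();
    static {
        JAVA_CLASS.put("TINYINT", Byte.class);
        JAVA_CLASS.put("SMALLINT", Short.class);
        JAVA_CLASS.put("INT", Integer.class);
        JAVA_CLASS.put("BIGINT", Long.class);
        JAVA_CLASS.put("BOOLEAN", Boolean.class);
        JAVA_CLASS.put("FLOAT", Float.class);
        JAVA_CLASS.put("DOUBLE", Double.class);
        JAVA_CLASS.put("BINARY", byte[].class);
        JAVA_CLASS.put("STRING", String.class);
        JAVA_CLASS.put("STRUCT", List.class);
        JAVA_CLASS.put("ARRAY", List.class);
        JAVA_CLASS.put("MAP", Map.class);
    }

    // Returns the Java class listed in the table for an HCatalog type name.
    static Class<?> javaClassFor(String hcatType) {
        Class<?> c = JAVA_CLASS.get(hcatType.toUpperCase(Locale.ROOT));
        if (c == null) {
            throw new IllegalArgumentException("Unknown HCatalog type: " + hcatType);
        }
        return c;
    }

    // True when a field value has the class the table lists for its type.
    // A null value is not an instance of any class, so it yields false here.
    static boolean matches(String hcatType, Object value) {
        return javaClassFor(hcatType).isInstance(value);
    }

    public static void main(String[] args) {
        System.out.println(javaClassFor("bigint").getName()); // java.lang.Long
        System.out.println(matches("INT", 42));   // Integer value for an INT column
        System.out.println(matches("INT", 42L));  // Long value for an INT column
    }
}
```

A check like `matches` is one way to catch a wrong cast early, for example treating an INT field as `java.lang.Long`.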
+
+<!-- ==================================================================== -->
<section>
<title>Running MapReduce with HCatalog</title>
<p>
Modified:
incubator/hcatalog/trunk/src/docs/src/documentation/content/xdocs/loadstore.xml
URL:
http://svn.apache.org/viewvc/incubator/hcatalog/trunk/src/docs/src/documentation/content/xdocs/loadstore.xml?rev=1381506&r1=1381505&r2=1381506&view=diff
==============================================================================
---
incubator/hcatalog/trunk/src/docs/src/documentation/content/xdocs/loadstore.xml
(original)
+++
incubator/hcatalog/trunk/src/docs/src/documentation/content/xdocs/loadstore.xml
Thu Sep 6 08:09:04 2012
@@ -68,51 +68,131 @@ immediately following the load statement
<section>
<title>HCatalog Data Types</title>
-<p>Restrictions apply to the types of columns HCatLoader can read.</p>
-<p>HCatLoader can read <strong>only</strong> the data types listed in the
table.
+<p>Restrictions apply to the types of columns HCatLoader can read from
HCatalog-managed tables.</p>
+<p>HCatLoader can read <em><strong>only</strong></em> the data types listed in
the table below.
The table shows how Pig will interpret the HCatalog data type.</p>
- <table>
- <tr>
- <td>
- <p><strong>HCatalog Data Type</strong></p>
- </td>
- <td>
- <p><strong>Pig Data Type</strong></p>
- </td>
- </tr>
- <tr>
- <td>
- <p>primitives (int, long, float, double, string) </p>
- </td>
- <td>
- <p>int, long, float, double, string to chararray </p>
- </td>
- </tr>
- <tr>
- <td>
- <p>map (key type should be string, valuetype must be string)</p>
- </td>
- <td>
- <p>map </p>
- </td>
- </tr>
- <tr>
- <td>
- <p>List&lt;any type&gt; </p>
- </td>
- <td>
- <p>bag </p>
- </td>
- </tr>
- <tr>
- <td>
- <p>struct&lt;any type fields&gt; </p>
- </td>
- <td>
- <p>tuple </p>
- </td>
+ <table>
+ <tr>
+ <th>
+ <p class="center">Primitives</p>
+ </th>
+ <th>
+ <p class="center"> </p>
+ </th>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell"><strong>HCatalog Data Type</strong></p>
+ </td>
+ <td>
+ <p class="cell"><strong>Pig Data Type</strong></p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">int</p>
+ </td>
+ <td>
+ <p class="cell">int</p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">long</p>
+ </td>
+ <td>
+ <p class="cell">long</p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">float</p>
+ </td>
+ <td>
+ <p class="cell">float</p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">double</p>
+ </td>
+ <td>
+ <p class="cell">double</p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">string</p>
+ </td>
+ <td>
+ <p class="cell">chararray</p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">boolean</p>
+ </td>
+ <td>
+ <p class="cell">boolean</p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">binary</p>
+ </td>
+ <td>
+ <p class="cell">bytearray</p>
+ </td>
+ </tr>
+</table>
+<table>
+ <tr>
+ <th>
+ <p class="center">Complex Types</p>
+ </th>
+ <th>
+ <p class="center"></p>
+ </th>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell"><strong>HCatalog Data Type</strong></p>
+ </td>
+ <td>
+ <p class="cell"><strong>Pig Data Type</strong></p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">map<br/>(key type should be string)</p>
+ </td>
+ <td>
+ <p class="cell">map </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">List<em>&lt;any type&gt;</em> </p>
+ </td>
+ <td>
+ <p class="cell">bag </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">struct<em>&lt;any type fields&gt;</em> </p>
+ </td>
+ <td>
+ <p class="cell">tuple </p>
+ </td>
</tr>
</table>
+
+<p><br/></p>
+<p>Currently HCatLoader cannot map the smallint and tinyint data types to Pig
data types.
+This issue exists in HCatalog version 0.4.0;
+the fix for Jira issue <a
href="https://issues.apache.org/jira/browse/HCATALOG-425">HCATALOG-425</a>
should be available in HCatalog version 0.5.0.</p>
+
</section>
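
Note on the HCatLoader tables above: read direction is HCatalog type to Pig type, and any type missing from the tables (smallint and tinyint in 0.4.0) has no Pig equivalent. The sketch below encodes that as a lookup returning an empty `Optional` for unmapped types. It is plain Java under stated assumptions, not HCatLoader code; `LoaderTypeSketch` and `pigTypeFor` are hypothetical names, and the key `list` follows the table's wording.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Illustrative only: the HCatLoader tables as a lookup.
// LoaderTypeSketch and pigTypeFor are hypothetical, not HCatalog or Pig API.
final class LoaderTypeSketch {
    private static final Map<String, String> HCAT_TO_PIG = new HashMap<>();
    static {
        // Primitives
        HCAT_TO_PIG.put("int", "int");
        HCAT_TO_PIG.put("long", "long");
        HCAT_TO_PIG.put("float", "float");
        HCAT_TO_PIG.put("double", "double");
        HCAT_TO_PIG.put("string", "chararray");
        HCAT_TO_PIG.put("boolean", "boolean");
        HCAT_TO_PIG.put("binary", "bytearray");
        // Complex types (map keys should be strings, per the table)
        HCAT_TO_PIG.put("map", "map");
        HCAT_TO_PIG.put("list", "bag");
        HCAT_TO_PIG.put("struct", "tuple");
    }

    // Empty result means the table lists no Pig type for this HCatalog type.
    static Optional<String> pigTypeFor(String hcatType) {
        return Optional.ofNullable(HCAT_TO_PIG.get(hcatType.toLowerCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        System.out.println(pigTypeFor("string").orElse("unsupported"));   // chararray
        System.out.println(pigTypeFor("smallint").orElse("unsupported")); // unsupported
    }
}
```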
<!-- ==================================================================== -->
@@ -355,51 +435,132 @@ HCatStorer with no argument:</p>
<section>
<title>HCatalog Data Types</title>
-<p>Restrictions apply to the types of columns HCatStorer can write.</p>
-<p>HCatStorer can write <strong>only</strong> the data types listed in the
table.
+<p>Restrictions apply to the types of columns HCatStorer can write to
HCatalog-managed tables.</p>
+<p>HCatStorer can write <em><strong>only</strong></em> the data types listed
in the table.
The table shows how HCatalog will interpret each Pig data type.</p>
- <table>
- <tr>
- <td>
- <p><strong>HCatalog Data Type</strong></p>
- </td>
- <td>
- <p><strong>Pig Data Type</strong></p>
- </td>
- </tr>
- <tr>
- <td>
- <p>primitives (int, long, float, double, string) </p>
- </td>
- <td>
- <p>int, long, float, double, string to chararray </p>
- </td>
- </tr>
- <tr>
- <td>
- <p>map (key type should be string, valuetype must be string)</p>
- </td>
- <td>
- <p>map </p>
- </td>
- </tr>
- <tr>
- <td>
- <p>List&lt;any type&gt; </p>
- </td>
- <td>
- <p>bag </p>
- </td>
- </tr>
- <tr>
- <td>
- <p>struct&lt;any type fields&gt; </p>
- </td>
- <td>
- <p>tuple </p>
- </td>
+
+ <table>
+ <tr>
+ <th>
+ <p class="center">Primitives</p>
+ </th>
+ <th>
+ <p class="center"> </p>
+ </th>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell"><strong>Pig Data Type</strong></p>
+ </td>
+ <td>
+ <p class="cell"><strong>HCatalog Data Type</strong></p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">int</p>
+ </td>
+ <td>
+ <p class="cell">int</p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">long</p>
+ </td>
+ <td>
+ <p class="cell">long</p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">float</p>
+ </td>
+ <td>
+ <p class="cell">float</p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">double</p>
+ </td>
+ <td>
+ <p class="cell">double</p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">chararray</p>
+ </td>
+ <td>
+ <p class="cell">string</p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">boolean</p>
+ </td>
+ <td>
+ <p class="cell">boolean</p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">bytearray</p>
+ </td>
+ <td>
+ <p class="cell">binary</p>
+ </td>
+ </tr>
+</table>
+<table>
+ <tr>
+ <th>
+ <p class="center">Complex Types</p>
+ </th>
+ <th>
+ <p class="center"> </p>
+ </th>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell"><strong>Pig Data Type</strong></p>
+ </td>
+ <td>
+ <p class="cell"><strong>HCatalog Data Type</strong></p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">map</p>
+ </td>
+ <td>
+ <p class="cell">map<br/>(key type should be string)</p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">bag </p>
+ </td>
+ <td>
+ <p class="cell">List<em>&lt;any type&gt;</em> </p>
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <p class="cell">tuple </p>
+ </td>
+ <td>
+ <p class="cell">struct<em>&lt;any type fields&gt;</em> </p>
+ </td>
</tr>
</table>
+
+<p><br/></p>
+<p>Currently HCatStorer cannot write Pig data to columns of the smallint and
tinyint data types.
+This issue exists in HCatalog version 0.4.0;
+the fix for Jira issue <a
href="https://issues.apache.org/jira/browse/HCATALOG-425">HCATALOG-425</a>
should be available in HCatalog version 0.5.0.</p>
+
</section>
</section>
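
Note on the HCatStorer tables above: write direction is Pig type to HCatalog type, so a store can only succeed if every field of the Pig relation has an entry in the tables. One way to picture that is a pre-flight check over a relation's schema. The sketch below is plain Java and only an assumption about how such a check might look; `StorerTypeSketch` and `hcatColumnsFor` are hypothetical names, and the schema is modelled as an ordered map from field name to Pig type name.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative only: checks a Pig schema against the HCatStorer tables.
// StorerTypeSketch and hcatColumnsFor are hypothetical, not HCatalog or Pig API.
final class StorerTypeSketch {
    private static final Map<String, String> PIG_TO_HCAT = new HashMap<>();
    static {
        // Primitives
        PIG_TO_HCAT.put("int", "int");
        PIG_TO_HCAT.put("long", "long");
        PIG_TO_HCAT.put("float", "float");
        PIG_TO_HCAT.put("double", "double");
        PIG_TO_HCAT.put("chararray", "string");
        PIG_TO_HCAT.put("boolean", "boolean");
        PIG_TO_HCAT.put("bytearray", "binary");
        // Complex types (map keys should be strings, per the table)
        PIG_TO_HCAT.put("map", "map");
        PIG_TO_HCAT.put("bag", "list");
        PIG_TO_HCAT.put("tuple", "struct");
    }

    // Maps each Pig field to its HCatalog column type, preserving field order.
    // Throws if a field's Pig type has no entry in the tables.
    static Map<String, String> hcatColumnsFor(Map<String, String> pigFields) {
        Map<String, String> columns = new LinkedHashMap<>();
        for (Map.Entry<String, String> field : pigFields.entrySet()) {
            String hcatType = PIG_TO_HCAT.get(field.getValue().toLowerCase(Locale.ROOT));
            if (hcatType == null) {
                throw new IllegalArgumentException(
                    "Field '" + field.getKey() + "' has unmapped Pig type " + field.getValue());
            }
            columns.put(field.getKey(), hcatType);
        }
        return columns;
    }

    public static void main(String[] args) {
        Map<String, String> schema = new LinkedHashMap<>();
        schema.put("name", "chararray");
        schema.put("payload", "bytearray");
        System.out.println(hcatColumnsFor(schema)); // {name=string, payload=binary}
    }
}
```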