Repository: sqoop
Updated Branches:
  refs/heads/trunk a7f5e0d29 -> d57f9fb06
SQOOP-3293: Document SQOOP-2976

(Fero Szabo via Szabolcs Vasas)

Project: http://git-wip-us.apache.org/repos/asf/sqoop/repo
Commit: http://git-wip-us.apache.org/repos/asf/sqoop/commit/d57f9fb0
Tree: http://git-wip-us.apache.org/repos/asf/sqoop/tree/d57f9fb0
Diff: http://git-wip-us.apache.org/repos/asf/sqoop/diff/d57f9fb0

Branch: refs/heads/trunk
Commit: d57f9fb06b55650adc75cd1972df0024d7e4dba1
Parents: a7f5e0d
Author: Szabolcs Vasas <va...@apache.org>
Authored: Wed Mar 21 15:51:02 2018 +0100
Committer: Szabolcs Vasas <va...@apache.org>
Committed: Wed Mar 21 15:51:46 2018 +0100

----------------------------------------------------------------------
 COMPILING.txt            | 10 ++++++++++
 src/docs/user/import.txt | 34 ++++++++++++++++++++++++++++++++--
 2 files changed, 42 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/sqoop/blob/d57f9fb0/COMPILING.txt
----------------------------------------------------------------------
diff --git a/COMPILING.txt b/COMPILING.txt
index 86be509..3b82250 100644
--- a/COMPILING.txt
+++ b/COMPILING.txt
@@ -411,3 +411,13 @@ To switch back to the previous version of Hadoop 0.20, for example, run:
 ++++
 ant test -Dhadoopversion=20
 ++++
+
+== Building the documentation
+
+Building the documentation requires that you have xmlto installed.
+You also need to set the XML_CATALOG_FILES environment variable.
+
+++++
+export XML_CATALOG_FILES=/usr/local/etc/xml/catalog
+ant docs
+++++

http://git-wip-us.apache.org/repos/asf/sqoop/blob/d57f9fb0/src/docs/user/import.txt
----------------------------------------------------------------------
diff --git a/src/docs/user/import.txt b/src/docs/user/import.txt
index 330d544..e91a5a8 100644
--- a/src/docs/user/import.txt
+++ b/src/docs/user/import.txt
@@ -257,7 +257,7 @@ username is +someuser+, then the import tool will write to
 the import with the +\--warehouse-dir+ argument.
 For example:

 ----
-$ sqoop import --connnect <connect-str> --table foo --warehouse-dir /shared \
+$ sqoop import --connect <connect-str> --table foo --warehouse-dir /shared \
     ...
 ----

@@ -266,7 +266,7 @@ This command would write to a set of files in the +/shared/foo/+ directory.

 You can also explicitly choose the target directory, like so:

 ----
-$ sqoop import --connnect <connect-str> --table foo --target-dir /dest \
+$ sqoop import --connect <connect-str> --table foo --target-dir /dest \
     ...
 ----

@@ -444,6 +444,27 @@ argument, or specify any Hadoop compression codec using the
 +\--compression-codec+ argument. This applies to SequenceFile, text,
 and Avro files.

+Enabling Logical Types in Avro and Parquet import for numbers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To enable the use of logical types in Sqoop's Avro schema generation,
+which is used during both Avro and Parquet imports, use the
+sqoop.avro.logical_types.decimal.enable flag. This is necessary if you
+want to store values as decimals in the Avro file format.
+
+Padding number types in Avro import
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Certain databases, such as Oracle and Postgres, store number and decimal
+values without padding. For example, 1.5 in a column declared
+as NUMBER(20,5) is stored as is in Oracle, while the equivalent
+DECIMAL(20,5) is stored as 1.50000 in a SQL Server instance.
+This leads to a scale mismatch during Avro import.
+
+To avoid this error, use the sqoop.avro.decimal_padding.enable flag
+to turn on padding with 0s. This flag has to be used together with the
+sqoop.avro.logical_types.decimal.enable flag set to true.
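The rescaling that padding performs can be illustrated outside Sqoop with plain Python decimals (an illustrative sketch of the behaviour, not Sqoop's actual implementation; the function name is made up):

```python
from decimal import Decimal

def pad_to_scale(value: Decimal, scale: int) -> Decimal:
    # Rescale so the value carries exactly `scale` fractional digits,
    # mimicking what padding with 0s achieves for a DECIMAL(20, 5) column.
    return value.quantize(Decimal(1).scaleb(-scale))

# An Oracle-style unpadded value vs. the padded form a fixed
# decimal scale of 5 expects:
print(pad_to_scale(Decimal("1.5"), 5))  # 1.50000
```

With padding disabled, the unpadded value 1.5 does not match the declared scale of 5, which is the mismatch described above.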
+
 Large Objects
 ^^^^^^^^^^^^^

@@ -777,3 +798,12 @@ rows copied into HDFS:

 ----
 $ sqoop import --connect jdbc:mysql://db.foo.com/corp \
     --table EMPLOYEES --validate
 ----
+
+Enabling logical types in Avro import and also turning on padding with 0s:
+
+----
+$ sqoop import -Dsqoop.avro.decimal_padding.enable=true -Dsqoop.avro.logical_types.decimal.enable=true \
+    --connect $CON --username $USER --password $PASS --query "select * from table_name where \$CONDITIONS" \
+    --target-dir hdfs://nameservice1//etl/target_path --as-avrodatafile --verbose -m 1
+
+----
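For reference, with the logical-type flag enabled, a DECIMAL(20,5) column maps to an Avro field along these lines (a sketch based on the Avro specification's decimal logical type; the field name "price" is illustrative, not taken from Sqoop's output):

```python
import json

# Avro's decimal logical type annotates a bytes field, per the Avro spec;
# precision/scale here mirror the DECIMAL(20, 5) column discussed above.
decimal_field = {
    "name": "price",  # illustrative column name
    "type": {"type": "bytes", "logicalType": "decimal",
             "precision": 20, "scale": 5},
}
print(json.dumps(decimal_field, indent=2))
```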