decimal data type disabled

Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/f58d360e
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/f58d360e
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/f58d360e

Branch: refs/heads/gh-pages
Commit: f58d360e43331825ea7e9885c167078e2a1c4ff1
Parents: c38e6a1
Author: Kristine Hahn <[email protected]>
Authored: Thu May 14 14:40:08 2015 -0700
Committer: Kristine Hahn <[email protected]>
Committed: Thu May 14 14:40:08 2015 -0700

----------------------------------------------------------------------
 .../020-hive-to-drill-data-type-mapping.md      | 20 ++++++++++++++++----
 .../040-parquet-format.md                       |  4 +++-
 .../050-json-data-model.md                      |  8 ++++----
 _docs/getting-started/020-why-drill.md          |  2 +-
 .../040-tableau-examples.md                     |  2 +-
 .../060-querying-the-information-schema.md      |  1 +
 .../005-querying-complex-data-introduction.md   |  2 +-
 .../data-types/010-supported-data-types.md      | 15 ++++++++++++++-
 .../030-handling-different-data-types.md        |  9 ++++-----
 .../050-aggregate-and-aggregate-statistical.md  |  2 ++
 10 files changed, 47 insertions(+), 18 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/drill/blob/f58d360e/_docs/data-sources-and-file-formats/020-hive-to-drill-data-type-mapping.md
----------------------------------------------------------------------
diff --git 
a/_docs/data-sources-and-file-formats/020-hive-to-drill-data-type-mapping.md 
b/_docs/data-sources-and-file-formats/020-hive-to-drill-data-type-mapping.md
index ed66a7a..20a921f 100644
--- a/_docs/data-sources-and-file-formats/020-hive-to-drill-data-type-mapping.md
+++ b/_docs/data-sources-and-file-formats/020-hive-to-drill-data-type-mapping.md
@@ -11,8 +11,8 @@ Using Drill you can read tables created in Hive that use data 
types compatible w
 | BIGINT             | BIGINT    | 8-byte signed integer                       
               |
 | BOOLEAN            | BOOLEAN   | TRUE (1) or FALSE (0)                       
               |
 | CHAR               | CHAR      | Character string, fixed-length max 255      
               |
-| DATE               | DATE      | Years months and days in the form in the 
form YYYY-­MM-­DD |
-| DECIMAL            | DECIMAL   | 38-digit precision                          
               |
+| DATE               | DATE      | Years months and days in the form in the 
form YYYY-­MM-­DD   |
+| DECIMAL*           | DECIMAL   | 38-digit precision                          
               |
 | FLOAT              | FLOAT     | 4-byte single precision floating point 
number              |
 | DOUBLE             | DOUBLE    | 8-byte double precision floating point 
number              |
 | INT or INTEGER     | INT       | 4-byte signed integer                       
               |
@@ -25,6 +25,8 @@ Using Drill you can read tables created in Hive that use data 
types compatible w
 | None               | STRING    | Binary string (16)                          
               |
 | VARCHAR            | VARCHAR   | Character string variable length            
               |
 
+\* In this release, Drill disables the DECIMAL data type, including casting to 
DECIMAL and reading DECIMAL types from Parquet and Hive. To enable the DECIMAL 
type, set the `planner.enable_decimal_data_type` option to `true`.
+
 ## Unsupported Types
 Drill does not support the following Hive types:
 
@@ -41,8 +43,14 @@ This example demonstrates the mapping of Hive data types to 
Drill data types. Us
 
      
8223372036854775807,true,3.5,-1231.4,3.14,42,"SomeText",2015-03-25,2015-03-25 
01:23:15 
 
-The example assumes that the CSV resides on the MapR file system (MapRFS) in 
the Drill sandbox: `/mapr/demo.mapr.com/data/`
- 
+### Example Assumptions
+The example makes the following assumptions:
+
+* The CSV resides on the MapR file system (MapRFS) in the Drill sandbox: 
`/mapr/demo.mapr.com/data/`  
+* You [enabled the DECIMAL data 
type]({{site.baseurl}}/docs/supported-data-types#enabling-the-decimal-type) in 
Drill.  
+
+### Define an External Table in Hive
+
 In Hive, you define an external table using the following query:
 
     hive> CREATE EXTERNAL TABLE types_demo ( 
@@ -59,6 +67,8 @@ In Hive, you define an external table using the following 
query:
           LINES TERMINATED BY '\n' 
           STORED AS TEXTFILE LOCATION '/mapr/demo.mapr.com/data/mytypes.csv';
 
+\* In this release, Drill disables the DECIMAL data type, including casting to 
DECIMAL and reading DECIMAL types from Parquet and Hive. To enable the DECIMAL 
type, set the `planner.enable_decimal_data_type` option to `true`.
+
 You check that Hive mapped the data from the CSV to the typed values as as 
expected:
 
     hive> SELECT * FROM types_demo;
@@ -66,6 +76,8 @@ You check that Hive mapped the data from the CSV to the typed 
values as as expec
     8223372036854775807        true    3.5     -1231.4 3.14    42      
"SomeText"      2015-03-25   2015-03-25 01:23:15
     Time taken: 0.524 seconds, Fetched: 1 row(s)
 
+### Connect Drill to Hive and Query the Data
+
 In Drill, you use the Hive storage plugin that has the following definition.
 
        {

http://git-wip-us.apache.org/repos/asf/drill/blob/f58d360e/_docs/data-sources-and-file-formats/040-parquet-format.md
----------------------------------------------------------------------
diff --git a/_docs/data-sources-and-file-formats/040-parquet-format.md 
b/_docs/data-sources-and-file-formats/040-parquet-format.md
index 55c4e68..cd14359 100644
--- a/_docs/data-sources-and-file-formats/040-parquet-format.md
+++ b/_docs/data-sources-and-file-formats/040-parquet-format.md
@@ -144,11 +144,13 @@ Parquet also supports logical types, fully described on 
the [Apache Parquet site
 | None                         |                                               
                                 | UINT_16              | 16 bits, unsigned     
                                                                                
                                     |
 | None                         |                                               
                                 | UINT_32              | 32 bits, unsigned     
                                                                                
                                     |
 | None                         |                                               
                                 | UINT_64              | 64 bits, unsigned     
                                                                                
                                     |
-| DECIMAL                      | 38-digit precision                            
                                 | DECIMAL              | Arbitrary-precision 
signed decimal numbers of the form unscaledValue * 10^(-scale)                  
                                       |
+| DECIMAL*                     | 38-digit precision                            
                                 | DECIMAL              | Arbitrary-precision 
signed decimal numbers of the form unscaledValue * 10^(-scale)                  
                                       |
 | TIME                         | Hours, minutes, seconds, milliseconds; 
24-hour basis                           | TIME_MILLIS          | Logical time, 
not including the date. Annotates int32. Number of milliseconds after midnight. 
                                             |
 | TIMESTAMP                    | Year, month, day, and seconds                 
                                 | TIMESTAMP_MILLIS     | Logical date and 
time. Annotates an int64 that stores the number of milliseconds from the Unix 
epoch, 00:00:00.000 on 1 January 1970, UTC. |
 | INTERVALDAY and INTERVALYEAR | Integer fields representing a period of time 
depending on the type of interval | INTERVAL             | An interval of time. 
Annotates a fixed_len_byte_array of length 12. Months, days, and ms in unsigned 
little-endian format.                 |
 
+\* In this release, Drill disables the DECIMAL data type, including casting to 
DECIMAL and reading DECIMAL types from Parquet and Hive. To enable the DECIMAL 
type, set the `planner.enable_decimal_data_type` option to `true`.
+
 ## Data Description Language Support
 Parquet supports the following data description languages:
 

http://git-wip-us.apache.org/repos/asf/drill/blob/f58d360e/_docs/data-sources-and-file-formats/050-json-data-model.md
----------------------------------------------------------------------
diff --git a/_docs/data-sources-and-file-formats/050-json-data-model.md 
b/_docs/data-sources-and-file-formats/050-json-data-model.md
index 28ab921..82b8a3a 100644
--- a/_docs/data-sources-and-file-formats/050-json-data-model.md
+++ b/_docs/data-sources-and-file-formats/050-json-data-model.md
@@ -40,11 +40,11 @@ The following table shows SQL-JSON data type mapping:
 By default, Drill does not support JSON lists of different types. For example, 
JSON does not enforce types or distinguish between integers and floating point 
values. When reading numerical values from a JSON file, Drill distinguishes 
integers from floating point numbers by the presence or lack of a decimal 
point. If some numbers in a JSON map or array appear with and without a decimal 
point, such as 0 and 0.0, Drill throws a schema change error. You use the 
following options to read JSON lists of different types:
 
 * `store.json.read_numbers_as_double`  
-  Reads numbers from JSON files with or without a decimal point as DOUBLE.
-* `store.json.all_text_mode`
-  Reads all data from JSON files as VARCHAR.
+  Reads numbers from JSON files with or without a decimal point as DOUBLE. You 
need to cast numbers from VARCHAR to numerical data types, such as DOUBLE or 
INTEGER.
+* `store.json.all_text_mode`  
+  Reads all data from JSON files as VARCHAR. you need to cast numbers from 
DOUBLE to other numerical types only if you cannot use the numbers as DOUBLE.
 
-The following session/system options for `store.json.all_text_mode` and 
`store.json.read_numbers_as_double` options is false. Enable the latter if the 
JSON contains integers and floating point numbers. Using either option prevents 
schema errors, but using `store.json.read_numbers_as_double` has an advantage 
over `store.json.all_text_mode`: You do not have to cast every number from 
VARCHAR to DOUBLE or BIGINT when you query the JSON file.
+The default setting of `store.json.all_text_mode` and 
`store.json.read_numbers_as_double` options is false. Using either option 
prevents schema errors, but using `store.json.read_numbers_as_double` has an 
advantage over `store.json.all_text_mode`. Using 
`store.json.read_numbers_as_double` typically involves less explicit casting 
than using `store.json.all_text_mode` because you can often use the numerical 
data as is -\-DOUBLE.
 
 ### Handling Type Differences
 Set the `store.json.read_numbers_as_double` property to true.

http://git-wip-us.apache.org/repos/asf/drill/blob/f58d360e/_docs/getting-started/020-why-drill.md
----------------------------------------------------------------------
diff --git a/_docs/getting-started/020-why-drill.md 
b/_docs/getting-started/020-why-drill.md
index f7f4495..d00d882 100644
--- a/_docs/getting-started/020-why-drill.md
+++ b/_docs/getting-started/020-why-drill.md
@@ -39,7 +39,7 @@ Drill's schema-free JSON model allows you to query complex, 
semi-structured data
 
 
 ## 4. Real SQL -- not "SQL-like"
-Drill supports the standard SQL:2003 syntax. No need to learn a new "SQL-like" 
language or struggle with a semi-functional BI tool. Drill supports many data 
types including DATE, INTERVALDAY/INTERVALYEAR, TIMESTAMP, VARCHAR and DECIMAL, 
as well as complex query constructs such as correlated sub-queries and joins in 
WHERE clauses. Here is an example of a TPC-H standard query that runs in Drill 
"as is":
+Drill supports the standard SQL:2003 syntax. No need to learn a new "SQL-like" 
language or struggle with a semi-functional BI tool. Drill supports many data 
types including DATE, INTERVALDAY/INTERVALYEAR, TIMESTAMP, and VARCHAR, as well 
as complex query constructs such as correlated sub-queries and joins in WHERE 
clauses. Here is an example of a TPC-H standard query that runs in Drill "as 
is":
 
 ### TPC-H query 4
 

http://git-wip-us.apache.org/repos/asf/drill/blob/f58d360e/_docs/odbc-jdbc-interfaces/using-odbc-on-windows/040-tableau-examples.md
----------------------------------------------------------------------
diff --git 
a/_docs/odbc-jdbc-interfaces/using-odbc-on-windows/040-tableau-examples.md 
b/_docs/odbc-jdbc-interfaces/using-odbc-on-windows/040-tableau-examples.md
index 85e8c88..9b73096 100644
--- a/_docs/odbc-jdbc-interfaces/using-odbc-on-windows/040-tableau-examples.md
+++ b/_docs/odbc-jdbc-interfaces/using-odbc-on-windows/040-tableau-examples.md
@@ -14,7 +14,7 @@ This section includes the following examples:
 
 The steps and results of these examples assume pre-configured schemas and
 source data. You configure schemas as storage plugin instances on the Storage
-tab of the [Drill Web UI]({{ site.baseurl 
}}/docs/getting-to-know-the-drill-sandbox#storage-plugin-overview).
+tab of the [Drill Web UI]({{ site.baseurl 
}}/docs/getting-to-know-the-drill-sandbox#storage-plugin-overview). Also, the 
examples assume you [enabled the DECIMAL data 
type]({{site.baseurl}}/docs/supported-data-types#enabling-the-decimal-type) in 
Drill.  
 
 ## Example: Connect to a Hive Table in Tableau
 

http://git-wip-us.apache.org/repos/asf/drill/blob/f58d360e/_docs/query-data/060-querying-the-information-schema.md
----------------------------------------------------------------------
diff --git a/_docs/query-data/060-querying-the-information-schema.md 
b/_docs/query-data/060-querying-the-information-schema.md
index 590fad2..fddb194 100644
--- a/_docs/query-data/060-querying-the-information-schema.md
+++ b/_docs/query-data/060-querying-the-information-schema.md
@@ -107,3 +107,4 @@ of those columns:
     | OrderTotal  | Decimal    |
     +-------------+------------+
 
+In this release, Drill disables the DECIMAL data type, including casting to 
DECIMAL and reading DECIMAL types from Parquet and Hive. [Enable the DECIMAL 
data 
type]({{site.baseurl}}/docs/supported-data-types#enabling-the-decimal-type)) if 
performance is not an issue.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/f58d360e/_docs/query-data/querying-complex-data/005-querying-complex-data-introduction.md
----------------------------------------------------------------------
diff --git 
a/_docs/query-data/querying-complex-data/005-querying-complex-data-introduction.md
 
b/_docs/query-data/querying-complex-data/005-querying-complex-data-introduction.md
index 2f59219..a6a8c84 100644
--- 
a/_docs/query-data/querying-complex-data/005-querying-complex-data-introduction.md
+++ 
b/_docs/query-data/querying-complex-data/005-querying-complex-data-introduction.md
@@ -43,7 +43,7 @@ The examples in this section operate on JSON data files. In 
order to write
 your own queries, you need to be aware of the basic data types in these files:
 
   * string (all data inside double quotes), such as `"0001"` or `"Cake"`
-  * number: integers, decimals, and floats, such as `0.55` or `10`
+  * number: integers and floats, such as `0.55` or `10`
   * null values
   * boolean values: true, false
 

http://git-wip-us.apache.org/repos/asf/drill/blob/f58d360e/_docs/sql-reference/data-types/010-supported-data-types.md
----------------------------------------------------------------------
diff --git a/_docs/sql-reference/data-types/010-supported-data-types.md 
b/_docs/sql-reference/data-types/010-supported-data-types.md
index 5641dc4..c588a8b 100644
--- a/_docs/sql-reference/data-types/010-supported-data-types.md
+++ b/_docs/sql-reference/data-types/010-supported-data-types.md
@@ -22,10 +22,23 @@ Drill reads from and writes to data sources having a wide 
variety of types. Dril
 | CHARACTER VARYING, CHARACTER, CHAR, or VARCHAR*** | UTF8-encoded 
variable-length string. The default limit is 1 character. The maximum character 
limit is 2,147,483,647. | CHAR(30) casts data to a 30-character string maximum. 
                         |
 
 
-\* In this release, the NUMERIC data type is an alias for the DECIMAL data 
type.  
+\* In this release, Drill disables the DECIMAL data type, including casting to 
DECIMAL and reading DECIMAL types from Parquet and Hive. The NUMERIC data type 
is an alias for the DECIMAL data type.
 \*\* Not currently supported.  
 \*\*\* Currently, Drill supports only variable-length strings.  
 
+## Enabling the DECIMAL Type
+
+To enable the DECIMAL type, set the `planner.enable_decimal_data_type` option 
to `true`. Enable the DECIMAL data type if performance is not an issue.
+
+     ALTER SYSTEM SET `planner.enable_decimal_data_type` = true;
+
+     +------------+------------+
+     |     ok     |  summary   |
+     +------------+------------+
+     | true       | planner.enable_decimal_data_type updated. |
+     +------------+------------+
+     1 row selected (1.191 seconds)
+
 ## Casting and Converting Data Types
 
 In Drill, you cast or convert data to the required type for moving data from 
one data source to another or to make the data readable.

http://git-wip-us.apache.org/repos/asf/drill/blob/f58d360e/_docs/sql-reference/data-types/030-handling-different-data-types.md
----------------------------------------------------------------------
diff --git 
a/_docs/sql-reference/data-types/030-handling-different-data-types.md 
b/_docs/sql-reference/data-types/030-handling-different-data-types.md
index 64121dd..110d32d 100644
--- a/_docs/sql-reference/data-types/030-handling-different-data-types.md
+++ b/_docs/sql-reference/data-types/030-handling-different-data-types.md
@@ -11,8 +11,7 @@ In a textual file, such as CSV, Drill interprets every field 
as a VARCHAR, as pr
 ## Handling JSON and Parquet Data
 Complex and nested data structures in JSON and Parquet files are of map and 
array types.
 
-A map is a set of name/value pairs. A value in a map can be a scalar type, 
such as string or int, or a complex type, such as an array or another map.
-An array is a repeated list of values. A value in an array can be a scalar 
type, such as string or int, or an array can be a complex type, such as a map 
or another array.
+A map is a set of name/value pairs. A value in a map can be a scalar type, 
such as string or int, or a complex type, such as an array or another map. An 
array is a repeated list of values. A value in an array can be a scalar type, 
such as string or int, or an array can be a complex type, such as a map or 
another array.
 
 Drill reads/writes maps and arrays from/to JSON and Parquet files. In Drill, 
you do not cast a map or array to another type.
 
@@ -58,15 +57,15 @@ The following example shows a JSON array having complex 
type values:
 
 ## Reading numbers of different types from JSON
 
-The `store.json.read_numbers_as_double` and `store.json.all_text_mode` 
system/session options control how Drill implicitly casts JSON data. By 
default, when reading numerical values from a JSON file, Drill implicitly casts 
a number to the DOUBLE or BIGINT type depending on the presence or absence a 
decimal point. If some numbers in a JSON map or array appear with and without a 
decimal point, such as 0 and 0.0, Drill throws a schema change error. By 
default, Drill reads numbers without decimal point as BIGINT values by default. 
The range of BIGINT is -9223372036854775808 to 9223372036854775807. A BIGINT 
result outside this range produces an error. 
+The `store.json.read_numbers_as_double` and `store.json.all_text_mode` 
system/session options control how Drill implicitly casts JSON data. By 
default, when reading numerical values from a JSON file, Drill implicitly casts 
a number to the DOUBLE or BIGINT type depending on the presence or absence of a 
decimal point. If some numbers in a JSON map or array appear with and without a 
decimal point, such as 0 and 0.0, Drill throws a schema change error. By 
default, Drill reads numbers without decimal point as BIGINT values. The range 
of BIGINT is -9223372036854775808 to 9223372036854775807. A BIGINT result 
outside this range produces an error. 
 
-To prevent Drill from attempting to read such data, set 
`store.json.read_numbers_as_double` or `store.json.all_text_mode` to true. 
Using `store.json.all_text_mode` set to true, Drill implicitly casts JSON data 
to VARCHAR. You need to cast the VARCHAR values to other types you want the 
returned data to represent. Using `store.json.read_numbers_as_double` set to 
true, Drill casts numbers in the JSON file to DOUBLE. You need to cast the 
DOUBLE to any other types of numbers, such as FLOAT and INTEGER, you want the 
returned data to represent. Using `store.json.read_numbers_as_double` typically 
involves less casting on your part than using `store.json.all_text_mode`.
+To prevent Drill from attempting to read such data, set 
`store.json.read_numbers_as_double` or `store.json.all_text_mode` to true. 
Using `store.json.all_text_mode` set to true, Drill implicitly casts JSON data 
to VARCHAR. You need to cast the VARCHAR values to other types. Using 
`store.json.read_numbers_as_double` set to true, Drill implicitly casts numbers 
in the JSON file to DOUBLE. You need to cast the DOUBLE type to other types, 
such as FLOAT and INTEGER. Using `store.json.read_numbers_as_double` typically 
involves less explicit casting than using `store.json.all_text_mode` because 
you can often use the numerical data as is (DOUBLE).
 
 ## Guidelines for Using Float and Double
 
 FLOAT and DOUBLE yield approximate results. These are variable-precision 
numeric types. Drill does not cast/convert all values precisely to the internal 
format, but instead stores approximations. Slight differences can occur in the 
value stored and retrieved. The following guidelines are recommended:
 
-* For conversions involving monetary calculations, for example, that require 
precise results use the decimal type instead of float or double.
+* For conversions involving monetary calculations, for example, that require 
precise results use the DECIMAL type instead of FLOAT or DOUBLE. In this 
release, Drill disables the DECIMAL data type, including casting to DECIMAL and 
reading DECIMAL types from Parquet and Hive. [Enable the DECIMAL data 
type]({{site.baseurl}}/docs/supported-data-types#enabling-the-decimal-type)) if 
performance is not an issue.
 * For complex calculations or mission-critical applications, especially those 
involving infinity and underflow situations, carefully consider the limitations 
of type casting that involves FLOAT or DOUBLE.
 * Equality comparisons between floating-point values can produce unexpected 
results.
 

http://git-wip-us.apache.org/repos/asf/drill/blob/f58d360e/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
----------------------------------------------------------------------
diff --git 
a/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md 
b/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
index d6e66de..21e6e40 100644
--- 
a/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
+++ 
b/_docs/sql-reference/sql-functions/050-aggregate-and-aggregate-statistical.md
@@ -17,6 +17,8 @@ MAX(expression)| BINARY, DECIMAL, VARCHAR, DATE, TIME, or 
TIMESTAMP| same as arg
 MIN(expression)| BINARY, DECIMAL, VARCHAR, DATE, TIME, or TIMESTAMP| same as 
argument type
 SUM(expression)| SMALLINT, INTEGER, BIGINT, FLOAT, DOUBLE, DECIMAL, 
INTERVALDAY, or INTERVALYEAR| BIGINT for SMALLINT or INTEGER arguments, DECIMAL 
for BIGINT arguments, DOUBLE for floating-point arguments, otherwise the same 
as the argument data type
 
+\* In this release, Drill disables the DECIMAL data type, including casting to 
DECIMAL and reading DECIMAL types from Parquet and Hive. [Enable the DECIMAL 
data 
type]({{site.baseurl}}/docs/supported-data-types#enabling-the-decimal-type)) if 
performance is not an issue.
+
 MIN, MAX, COUNT, AVG, and SUM accept ALL and DISTINCT keywords. The default is 
ALL.
 
 ### Examples

Reply via email to