[ https://issues.apache.org/jira/browse/IMPALA-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17237555#comment-17237555 ]

Tim Armstrong commented on IMPALA-10350:
----------------------------------------

The above was slightly misleading because it returned a DECIMAL value. If you 
cast the literal to DOUBLE, you get the same value as you get when reading from 
the data file.
{noformat}
[localhost:21050] default> select cast(-0.43149576573887316 as double);
Query: select cast(-0.43149576573887316 as double)
Query submitted at: 2020-11-23 09:49:32 (Coordinator: http://tarmstrong-box:25000)
Query progress can be monitored at: http://tarmstrong-box:25000/query_plan?query_id=6c414c04b47d0897:10bb3cc500000000
+--------------------------------------+
| cast(-0.43149576573887316 as double) |
+--------------------------------------+
| -0.431495765739                      |
+--------------------------------------+
{noformat}
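
As a rough illustration of why the number of printed digits matters here, the sketch below formats the same double with 17, 16 and 12 significant digits and checks which strings parse back to the original value. It is plain Python and only shows the general IEEE 754 round-trip behaviour; it is not Impala's formatting code, and the variable names are made up for the example.

```python
# The literal from the reproduction, parsed as an IEEE 754 double.
x = -0.43149576573887316

# 17 significant digits are always enough to round-trip a double.
s17 = format(x, ".17g")
# 16 significant digits are not enough for this particular value.
s16 = format(x, ".16g")
# 12 significant digits give the same string as the shell output above.
s12 = format(x, ".12g")

for s in (s17, s16, s12):
    # Print each rendering and whether parsing it recovers x exactly.
    print(s, float(s) == x)
```

Only the 17-digit string parses back to the original double, so a text writer that emits fewer digits would store a different value.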

The expressions are:

{noformat}
                    [0] = TExprNode {
                      01: node_type (i32) = 14,
                      02: type (struct) = TColumnType {
                        01: types (list) = list<struct>[1] {
                          [0] = TTypeNode {
                            01: type (i32) = 0,
                            02: scalar_type (struct) = TScalarType {
                              01: type (i32) = 8,
                            },
                          },
                        },
                      },
                      03: num_children (i32) = 1,
                      04: is_constant (bool) = false,
                      05: fn (struct) = TFunction {
                        01: name (struct) = TFunctionName {
                          01: db_name (string) = "_impala_builtins",
                          02: function_name (string) = "casttodouble",
                        },
                        02: binary_type (i32) = 0,
                        03: arg_types (list) = list<struct>[1] {
                          [0] = TColumnType {
                            01: types (list) = list<struct>[1] {
                              [0] = TTypeNode {
                                01: type (i32) = 0,
                                02: scalar_type (struct) = TScalarType {
                                  01: type (i32) = 14,
                                  03: precision (i32) = -1,
                                  04: scale (i32) = -1,
                                },
                              },
                            },
                          },
                        },

...
                        [0] = TExpr {
                          01: nodes (list) = list<struct>[1] {
                            [0] = TExprNode {
                              01: node_type (i32) = 5,
                              02: type (struct) = TColumnType {
                                01: types (list) = list<struct>[1] {
                                  [0] = TTypeNode {
                                    01: type (i32) = 0,
                                    02: scalar_type (struct) = TScalarType {
                                      01: type (i32) = 14,
                                      03: precision (i32) = 17,
                                      04: scale (i32) = 17,
                                    },
                                  },
                                },
                              },
                              03: num_children (i32) = 0,
                              04: is_constant (bool) = true,
                              18: decimal_literal (struct) = TDecimalLiteral {
                                01: value (string) = "\xfff\xb3\xb0P\x1a\xdc\xac",
                              },
                            },
                          },
{noformat}
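
To sanity-check the decimal literal in the dump, here is a minimal sketch that decodes the `value` bytes by hand. It assumes the bytes are the unscaled value as a big-endian two's-complement integer, and it takes the scale of 17 from the `TScalarType` above; the names `raw`, `unscaled` and `literal` are just for this example.

```python
from decimal import Decimal

# Bytes copied from the decimal_literal value in the dump above.
raw = b"\xfff\xb3\xb0P\x1a\xdc\xac"
scale = 17  # from the TScalarType of the literal (precision 17, scale 17)

# Assumption: big-endian two's-complement encoding of the unscaled value.
unscaled = int.from_bytes(raw, byteorder="big", signed=True)

# Apply the scale to get the decimal literal back.
literal = Decimal(unscaled).scaleb(-scale)

print(unscaled)  # -43149576573887316
print(literal)   # -0.43149576573887316
```

Under that assumption the literal still carries all 17 digits when it reaches the cast.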



> Impala loses double precision on the write side
> -----------------------------------------------
>
>                 Key: IMPALA-10350
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10350
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Zoltán Borók-Nagy
>            Priority: Major
>              Labels: correctness
>
> Impala might lose precision of double values. Reproduction: 
> {noformat}
> create table double_tbl (d double) stored as textfile;
> insert into double_tbl values (-0.43149576573887316);
> {noformat}
> Then inspect the data file:
> {noformat}
> $ hdfs dfs -cat /test-warehouse/double_tbl/424097c644088674-c55b910100000000_175064830_data.0.txt
> -0.4314957657388731
> {noformat}
> The same happens if we store our data in Parquet.
> Hive writes don't lose precision. If the data was written by Hive then Impala 
> can read the values correctly:
> {noformat}
> $ bin/run-jdbc-client.sh -t NOSASL -q "select * from double_tbl;"
> Using JDBC Driver Name: org.apache.hive.jdbc.HiveDriver
> Connecting to: jdbc:hive2://localhost:21050/;auth=noSasl
> Executing: select * from double_tbl
> ----[START]----
> -0.43149576573887316
> ----[END]----
> {noformat}


