benj created DRILL-7595:
---------------------------
Summary: Data type changes from BIGINT to INT when Parquet is written with
multiple fragments
Key: DRILL-7595
URL: https://issues.apache.org/jira/browse/DRILL-7595
Project: Apache Drill
Issue Type: Bug
Components: Storage - Parquet
Affects Versions: 1.17.0
Reporter: benj
As in DRILL-7104, there is a bug that changes the type from BIGINT to INT
when a Parquet table is written by multiple fragments.
With a file containing only a few rows, all is fine: we store a BIGINT and
really have a BIGINT in the Parquet file.
{code:sql}
apache drill> CREATE TABLE dfs.tmp.`out_pqt` AS (SELECT CAST(0 as BIGINT) AS d
FROM dfs.tmp.`fewrowfile`);
+----------+---------------------------+
| Fragment | Number of records written |
+----------+---------------------------+
| 1_0      | 1500                      |
+----------+---------------------------+
apache drill> SELECT typeof(d) FROM dfs.tmp.`out_pqt`;
+--------+
| EXPR$0 |
+--------+
| BIGINT |
+--------+
{code}
With a file containing "enough" rows, there is a problem: we store a BIGINT
but unfortunately end up with an INT in the Parquet file.
{code:sql}
apache drill> CREATE TABLE dfs.tmp.`out_pqt` AS (SELECT CAST(0 as BIGINT) AS d
FROM dfs.tmp.`fewrowfile`);
+----------+---------------------------+
| Fragment | Number of records written |
+----------+---------------------------+
| 1_1      | 934111                    |
| 1_0      | 1488743                   |
+----------+---------------------------+
apache drill> SELECT typeof(d) FROM dfs.tmp.`out_pqt`;
+--------+
| EXPR$0 |
+--------+
| INT    |
+--------+
{code}
This is not really satisfactory, but note that there is a trick to avoid
this problem: use CAST('0' AS BIGINT) instead of CAST(0 AS BIGINT).
{code:sql}
apache drill> CREATE TABLE dfs.tmp.`out_pqt` AS (SELECT CAST('0' as BIGINT) AS
d FROM dfs.tmp.`fewrowfile`);
+----------+---------------------------+
| Fragment | Number of records written |
+----------+---------------------------+
| 1_1      | 934111                    |
| 1_0      | 1488743                   |
+----------+---------------------------+
apache drill> SELECT typeof(d) FROM dfs.tmp.`out_pqt`;
+--------+
| EXPR$0 |
+--------+
| BIGINT |
+--------+
{code}
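To narrow down which written file carries the wrong type, a query along the following lines might help. This is only a sketch: it assumes Drill's implicit `filename` column is available for this workspace, and the aliases `d_type` and `row_count` are illustrative names, not part of the original report.
{code:sql}
-- Sketch: report the type Drill sees for column d in each file written by the CTAS.
-- Assumes the implicit column `filename` is enabled (Drill's default setting).
SELECT filename, typeof(d) AS d_type, COUNT(*) AS row_count
FROM dfs.tmp.`out_pqt`
GROUP BY filename, typeof(d);
{code}
Comparing the rows for the files produced by fragments 1_0 and 1_1 should show whether one or both of them are affected.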