Raymond Wong created DRILL-5982:
-----------------------------------
Summary: CTAS creates parquet files with inconsistent nullable column
Key: DRILL-5982
URL: https://issues.apache.org/jira/browse/DRILL-5982
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.11.0
Environment: windows 10
Reporter: Raymond Wong
Create two Parquet tables with CTAS. One CTAS statement uses a MySQL table as
its data source; the other uses {{(Values(1))}}. Both files have the same
schema: the same column names and data types.
However, the Parquet file created from the MySQL data source has nullable
columns, while the file created from {{(Values(1))}} has non-nullable columns.
{quote}
DROP TABLE dfs.tmp.table1;
CREATE TABLE dfs.tmp.table1 AS
SELECT 'CA' AS state, CAST(1 AS BIGINT) AS id
FROM `mysql_dw_reporting.datawarehouse1`.DW_Qualbe_Cust_And_CustPay
LIMIT 1;
DROP TABLE dfs.tmp.table2;
CREATE TABLE dfs.tmp.table2 AS
SELECT 'NY' AS state, CAST(2 AS BIGINT) AS id
FROM (Values(1))
;
{quote}
This inconsistency prevents applying a SQL window function across the two
Parquet tables. Querying table1 and table2 together with a window function
fails with the following error:
{quote}
SELECT id, FIRST_VALUE(state) OVER( PARTITION BY id ) AS state
FROM dfs.tmp.`table*`
SQL Error: UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with changing schemas
{quote}
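To make the mismatch concrete, here is a minimal Python sketch of the two schemas as described above. It is only an illustrative model, not Drill code: the {{Column}} class and the {{schema_differences}} helper are hypothetical, and the VARCHAR type for the {{state}} literal is an assumption. It shows that the two tables agree on column names and types and differ only in nullability, which is presumably the difference the sort reports as a changing schema.

```python
from dataclasses import dataclass


# Hypothetical model of a column: name, SQL type, and whether NULLs are allowed.
@dataclass(frozen=True)
class Column:
    name: str
    sql_type: str
    nullable: bool


def schema_differences(a, b):
    """List differences for each column of schema `a` when looked up in `b`."""
    diffs = []
    cols_b = {c.name: c for c in b}
    for col in a:
        other = cols_b.get(col.name)
        if other is None:
            diffs.append(f"{col.name}: missing from second schema")
        elif col.sql_type != other.sql_type:
            diffs.append(f"{col.name}: type {col.sql_type} vs {other.sql_type}")
        elif col.nullable != other.nullable:
            diffs.append(
                f"{col.name}: nullable={col.nullable} vs nullable={other.nullable}"
            )
    return diffs


# table1 (MySQL source): nullable columns, as reported in this issue.
table1 = [Column("state", "VARCHAR", True), Column("id", "BIGINT", True)]
# table2 ((Values(1)) source): non-nullable columns, as reported in this issue.
table2 = [Column("state", "VARCHAR", False), Column("id", "BIGINT", False)]

for line in schema_differences(table1, table2):
    print(line)
```

Under this model, every reported difference is a nullability difference; no column differs in name or type.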
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)