Krystal created DRILL-1524:
------------------------------
Summary: Data from hive parquet table is displayed as "null" when select all columns
Key: DRILL-1524
URL: https://issues.apache.org/jira/browse/DRILL-1524
Project: Apache Drill
Issue Type: Bug
Affects Versions: 0.6.0
Reporter: Krystal
git.commit.id.abbrev=42f0a7e
From hive-13, I created a parquet table:
hive> create table voter_parquet(voter_id int,name string,age tinyint,
registration string,contributions float,voterzone smallint,create_time string)
stored as parquet;
hive> insert overwrite table voter_parquet select * from voter;
I can select against this table from hive:
hive> select * from voter_parquet limit 5;
OK
1 nick miller 68 green 717.12 13809 2014-05-25 03:41:54
2 ulysses white 48 green 840.06 19451 2014-07-30 08:03:11
3 holly garcia 18 democrat 128.2 8750 2014-09-15 02:33:11
4 victor thompson 61 independent 721.6 20462 2014-06-17 13:04:09
5 luke allen 39 socialist 800.22 25151 2015-02-01 02:02:37
I ran the same select from sqlline and got all nulls:
0: jdbc:drill:schema=hive> select * from voter_parquet limit 5;
+------------+------------+------------+--------------+---------------+------------+-------------+
| voter_id   | name       | age        | registration | contributions | voterzone  | create_time |
+------------+------------+------------+--------------+---------------+------------+-------------+
| null       | null       | null       | null         | null          | null       | null        |
| null       | null       | null       | null         | null          | null       | null        |
| null       | null       | null       | null         | null          | null       | null        |
| null       | null       | null       | null         | null          | null       | null        |
| null       | null       | null       | null         | null          | null       | null        |
+------------+------------+------------+--------------+---------------+------------+-------------+
Same if I explicitly specify all the columns:
0: jdbc:drill:schema=hive> select voter_id, name, age, registration, contributions, voterzone, create_time from voter_parquet limit 2;
+------------+------------+------------+--------------+---------------+------------+-------------+
| voter_id   | name       | age        | registration | contributions | voterzone  | create_time |
+------------+------------+------------+--------------+---------------+------------+-------------+
| null       | null       | null       | null         | null          | null       | null        |
| null       | null       | null       | null         | null          | null       | null        |
+------------+------------+------------+--------------+---------------+------------+-------------+
However, if I select a few columns, then the data displays correctly:
0: jdbc:drill:schema=hive> select voter_id, name, age, registration from voter_parquet limit 5;
+------------+-----------------+------------+--------------+
| voter_id   | name            | age        | registration |
+------------+-----------------+------------+--------------+
| 1          | nick miller     | 68         | green        |
| 2          | ulysses white   | 48         | green        |
| 3          | holly garcia    | 18         | democrat     |
| 4          | victor thompson | 61         | independent  |
| 5          | luke allen      | 39         | socialist    |
+------------+-----------------+------------+--------------+
0: jdbc:drill:schema=hive> describe voter_parquet;
+---------------+------------+-------------+
| COLUMN_NAME   | DATA_TYPE  | IS_NULLABLE |
+---------------+------------+-------------+
| voter_id      | INTEGER    | YES         |
| name          | VARCHAR    | YES         |
| age           | TINYINT    | YES         |
| registration  | VARCHAR    | YES         |
| contributions | FLOAT      | YES         |
| voterzone     | SMALLINT   | YES         |
| create_time   | VARCHAR    | YES         |
+---------------+------------+-------------+
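A possible way to narrow this down, as a sketch only: since the four-column select returns data and the seven-column select returns nulls, running one query per column prefix (first column, first two, and so on) would show at which added column the output turns to null. This assumes the behaviour depends on which columns are projected, which the report does not confirm. The helper name prefix_queries is hypothetical and not part of Drill or sqlline; it only prints statements to paste into sqlline.

```python
# Hypothetical helper (not part of Drill or sqlline): builds one SELECT per
# column prefix so the queries can be pasted into sqlline one at a time.

# Column order taken from the "describe voter_parquet" output in this report.
COLUMNS = [
    "voter_id",
    "name",
    "age",
    "registration",
    "contributions",
    "voterzone",
    "create_time",
]


def prefix_queries(columns, table="voter_parquet", limit=5):
    """Return a list of SELECT statements, one for each prefix of `columns`."""
    queries = []
    for n in range(1, len(columns) + 1):
        projected = ", ".join(columns[:n])
        queries.append(f"select {projected} from {table} limit {limit};")
    return queries


if __name__ == "__main__":
    for query in prefix_queries(COLUMNS):
        print(query)
```

The fourth generated statement matches the four-column query above that returned data, and the last one matches the explicit seven-column query that returned nulls (apart from the limit value).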