Wu Tong created SPARK-28019:
-------------------------------

             Summary: Hive query gets empty result
                 Key: SPARK-28019
                 URL: https://issues.apache.org/jira/browse/SPARK-28019
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.3
            Reporter: Wu Tong


Hi,

I ran into the same issue as https://issues.apache.org/jira/browse/SPARK-26663

Querying a Hive table in Spark 2.4.3 returns an empty result. Spark 2.3.3 and 
2.0.1 return the correct result.

The table was created using the following script:

Hive 1.2.1:
{code:java}
hive> create table recommend.test (`userid` BIGINT, `postid` BIGINT)
> PARTITIONED BY (`day` STRING)
> CLUSTERED BY (userid)
> INTO 32 BUCKETS
> ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS
> INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> TBLPROPERTIES (
> 'orc.compress' = 'SNAPPY',
> 'transient_lastDdlTime' = '1515057885',
> 'NO_AUTO_COMPACTION' = 'true',
> 'transactional' = 'true'
> );
OK
Time taken: 0.344 seconds

hive> insert into recommend.test partition(day="day1") values(1, 2);
...
hive> select * from recommend.test;
OK
1 2 day1{code}
Spark 2.4.3:
{code:java}
scala> sql("select * from recommend.test").show()
19/06/12 15:36:09 WARN HiveMetastoreCatalog: Unable to infer schema for table recommend.test from file format ORC (inference mode: INFER_AND_SAVE). Using metastore schema.
+------+------+---+
|userid|postid|day|
+------+------+---+
+------+------+---+
{code}
Spark 2.3.3:
{code:java}
scala> sql("select * from recommend.test").show()
+------+------+----+
|userid|postid| day|
+------+------+----+
|     1|     2|day1|
+------+------+----+
{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
