[GitHub] spark pull request: [SPARK-3676][Sql]spark sql hive test suite fai...

scwf Wed, 24 Sep 2014 00:56:59 -0700

GitHub user scwf opened a pull request:

    https://github.com/apache/spark/pull/2517


    [SPARK-3676][Sql]spark sql hive test suite failed in JDK 1.6

    https://issues.apache.org/jira/browse/SPARK-3676
    The Spark SQL Hive tests fail on JDK 1.6. To reproduce, set the JDK version to 1.6.0_31:
    [info] - division *** FAILED ***
    [info] Results do not match for division:
    [info] SELECT 2 / 1, 1 / 2, 1 / 3, 1 / COUNT FROM src LIMIT 1
    [info] == Parsed Logical Plan ==
    [info] Limit 1
    [info] Project (2 / 1) AS c_0#692,(1 / 2) AS c_1#693,(1 / 3) AS c_2#694,(1 / COUNT(1)) AS c_3#695
    [info] UnresolvedRelation None, src, None
    [info] 
    [info] == Analyzed Logical Plan ==
    [info] Limit 1
    [info] Aggregate [], [(CAST(2, DoubleType) / CAST(1, DoubleType)) AS c_0#692,(CAST(1, DoubleType) / CAST(2, DoubleType)) AS c_1#693,(CAST(1, DoubleType) / CAST(3, DoubleType)) AS c_2#694,(CAST(CAST(1, LongType), DoubleType) / CAST(COUNT(1), DoubleType)) AS c_3#695]
    [info] MetastoreRelation default, src, None
    [info] 
    [info] == Optimized Logical Plan ==
    [info] Limit 1
    [info] Aggregate [], 2.0 AS c_0#692,0.5 AS c_1#693,0.3333333333333333 AS c_2#694,(1.0 / CAST(COUNT(1), DoubleType)) AS c_3#695
    [info] Project []
    [info] MetastoreRelation default, src, None
    [info] 
    [info] == Physical Plan ==
    [info] Limit 1
    [info] Aggregate false, [], 2.0 AS c_0#692,0.5 AS c_1#693,0.3333333333333333 AS c_2#694,(1.0 / CAST(SUM(PartialCount#699L), DoubleType)) AS c_3#695
    [info] Exchange SinglePartition
    [info] Aggregate true, [], COUNT(1) AS PartialCount#699L
    [info] HiveTableScan [], (MetastoreRelation default, src, None), None
    [info] 
    [info] Code Generation: false
    [info] == RDD ==
    [info] c_0 c_1 c_2 c_3
    [info] !== HIVE - 1 row(s) == == CATALYST - 1 row(s) ==
    [info] !2.0 0.5 0.3333333333333333 0.002 2.0 0.5 0.3333333333333333 0.0020 (HiveComparisonTest.scala:370)
    [info] - timestamp cast #1 *** FAILED ***
    [info] Results do not match for timestamp cast #1:
    [info] SELECT CAST(CAST(1 AS TIMESTAMP) AS DOUBLE) FROM src LIMIT 1
    [info] == Parsed Logical Plan ==
    [info] Limit 1
    [info] Project CAST(CAST(1, TimestampType), DoubleType) AS c_0#995
    [info] UnresolvedRelation None, src, None
    [info] 
    [info] == Analyzed Logical Plan ==
    [info] Limit 1
    [info] Project CAST(CAST(1, TimestampType), DoubleType) AS c_0#995
    [info] MetastoreRelation default, src, None
    [info] 
    [info] == Optimized Logical Plan ==
    [info] Limit 1
    [info] Project 0.0010 AS c_0#995
    [info] MetastoreRelation default, src, None
    [info] 
    [info] == Physical Plan ==
    [info] Limit 1
    [info] Project 0.0010 AS c_0#995
    [info] HiveTableScan [], (MetastoreRelation default, src, None), None
    [info] 
    [info] Code Generation: false
    [info] == RDD ==
    [info] c_0
    [info] !== HIVE - 1 row(s) == == CATALYST - 1 row(s) ==
    [info] !0.001 0.0010 (HiveComparisonTest.scala:370)
    
    
    This happens because different JDK versions format ```double``` values 
    differently when converting them to strings. For example, 
    ```System.out.println(1/500d)``` prints:
    jdk 1.6.0(_31) ---- 0.0020
    jdk 1.7.0(_05) ---- 0.002
    As a result, HiveQuerySuite fails when the golden answers are generated on 
    JDK 1.7 and the tests are run on JDK 1.6, because the results do not match.
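
A minimal Java sketch of the discrepancy described above. It is not part of this pull request (whose second commit deletes the affected golden answers); it only illustrates that the two strings differ as text while denoting the same value, so a comparison on parsed or normalized values would not depend on the JDK's formatting. The class name `DoubleFormatCheck` and the helpers `normalize` and `sameValue` are hypothetical, introduced here for illustration only.

```java
import java.math.BigDecimal;

// Hypothetical illustration class; nothing here comes from the Spark code base.
class DoubleFormatCheck {

    // Canonicalize a decimal string so that trailing zeros do not matter,
    // e.g. "0.0020" and "0.002" both become "0.002".
    static String normalize(String decimal) {
        return new BigDecimal(decimal).stripTrailingZeros().toPlainString();
    }

    // Compare two decimal strings by the double value they parse to,
    // rather than by their textual form.
    static boolean sameValue(String a, String b) {
        return Double.compare(Double.parseDouble(a), Double.parseDouble(b)) == 0;
    }

    public static void main(String[] args) {
        String fromJdk6 = "0.0020"; // output reported for jdk 1.6.0_31
        String fromJdk7 = "0.002";  // output reported for jdk 1.7.0_05

        // A plain string comparison sees two different answers.
        System.out.println("string equal:     " + fromJdk6.equals(fromJdk7));
        // Parsing both back to double shows they are the same number.
        System.out.println("value equal:      " + sameValue(fromJdk6, fromJdk7));
        // Normalizing the text also removes the difference.
        System.out.println("normalized equal: "
                + normalize(fromJdk6).equals(normalize(fromJdk7)));
        // The expression from the description; its text depends on the JDK in use.
        System.out.println(1 / 500d);
    }
}
```

Comparing parsed values is the simpler check; the `BigDecimal` normalization is useful when the comparison has to stay string-based, as it does when answers are stored as text.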
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/scwf/spark HiveQuerySuite

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2517.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2517
    
----
commit 1df3964f1ff99aa93ed5f556675fe0d6d0285401
Author: w00228970 <[email protected]>
Date:   2014-09-24T06:44:54Z

    Jdk version leads to different query output for Double, this make HiveQuerySuite failed

commit 0cb5e8d6c45f6587497ec854353b96b2d6f536e8
Author: w00228970 <[email protected]>
Date:   2014-09-24T06:53:05Z

    delete golden answer of division-0 and timestamp cast #1

----


