[ 
https://issues.apache.org/jira/browse/FLINK-14900?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu updated FLINK-14900:
-------------------------------
    Description: 
the result of TPC-DS query 19 is incorrect, I analyze the simplify the sql as 
following:
{code:java}
// simplify query 19
select top 100 ca_zip,s_zip,substring(ca_zip,1,5), substring(s_zip,1,5) 
 from customer_address,store
   where substring(ca_zip,1,5) <> substring(s_zip,1,5) 
  order by ca_zip,s_zip{code}
 

flink output:
{code:java}
null|31904|null|31904
null|31904|null|31904
null|31904|null|31904
null|31904|null|31904{code}
SQL server output:
{code:java}
00601      31904      00601 31904
00601      31904      00601 31904
00601      31904      00601 31904
00601      31904      00601 31904
{code}
This issue may influent the result of tpcds query 46、68.

Another similar pattern "where a = b" in query31,  it'll will bring null value 
too.

 
{code:java}
// simplify query 31
select 
        ss1.ca_county
       ,ss1.d_year
       ,ss2.store_sales/ss1.store_sales store_q1_q2_increase   
 from
        ss ss1
       ,ss ss2  
 where
    ss1.d_qoy = 1
    and ss1.d_year = 2000
    and ss1.ca_county = ss2.ca_county
    and ss2.d_qoy = 2
    and ss2.d_year = 2000
  order by ss1.ca_county{code}
 

Flink output :

 
{code:java}
|2000|0.845635
Acadia Parish|2000|1.362160
Accomack County|2000|0.650251
{code}
 

SQL server output:

 
{code:java}
Acadia Parish|2000|1.362159
Accomack County|2000|0.650250
{code}
 

 

I tested expression test and sql ITcase,they will return correct value.
{code:java}
@Test
def testSubStr(): Unit = {
  checkQuery(
    Seq[(String, String, Double)]((null, "00601" , 100d), ("00501", "00601", 
50d)),
    "select f0, substring(f0,1,5) from  Table1 where substring(f0,1,5)  <> f1 ",
    Seq(("00501", "00501"))
  )
}
{code}
So, I need to dig more about this issue. 



 

 

  was:
the result of TPC-DS query 19 is incorrect, I analyze the simplify the sql as 
following:
{code:java}
// simplify query
select top 100 ca_zip,s_zip,substring(ca_zip,1,5), substring(s_zip,1,5) 
 from customer_address,store
   where substring(ca_zip,1,5) <> substring(s_zip,1,5) 
  order by ca_zip,s_zip{code}
 

flink output:
{code:java}
null|31904|null|31904
null|31904|null|31904
null|31904|null|31904
null|31904|null|31904{code}
 

SQL server output:
{code:java}
00601      31904      00601 31904
00601      31904      00601 31904
00601      31904      00601 31904
00601      31904      00601 31904
{code}
And this issue may influent the result of tpcds query 46、68.

I tested expression test and sql ITcase,they will return correct value.
{code:java}
@Test
def testSubStr(): Unit = {
  checkQuery(
    Seq[(String, String, Double)]((null, "00601" , 100d), ("00501", "00601", 
50d)),
    "select f0, substring(f0,1,5) from  Table1 where substring(f0,1,5)  <> f1 ",
    Seq(("00501", "00501"))
  )
}
{code}
So, I need to dig more about this issue. 



 

 


>  output contains null value  when a is null in pattern where a <> 'String' 
> ---------------------------------------------------------------------------
>
>                 Key: FLINK-14900
>                 URL: https://issues.apache.org/jira/browse/FLINK-14900
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Planner
>    Affects Versions: 1.9.1
>            Reporter: Leonard Xu
>            Priority: Major
>             Fix For: 1.10.0
>
>
> the result of TPC-DS query 19 is incorrect, I analyze the simplify the sql as 
> following:
> {code:java}
> // simplify query 19
> select top 100 ca_zip,s_zip,substring(ca_zip,1,5), substring(s_zip,1,5) 
>  from customer_address,store
>    where substring(ca_zip,1,5) <> substring(s_zip,1,5) 
>   order by ca_zip,s_zip{code}
>  
> flink output:
> {code:java}
> null|31904|null|31904
> null|31904|null|31904
> null|31904|null|31904
> null|31904|null|31904{code}
> SQL server output:
> {code:java}
> 00601      31904      00601 31904
> 00601      31904      00601 31904
> 00601      31904      00601 31904
> 00601      31904      00601 31904
> {code}
> This issue may influent the result of tpcds query 46、68.
> Another similar pattern "where a = b" in query31,  it'll will bring null 
> value too.
>  
> {code:java}
> // simplify query 31
> select 
>         ss1.ca_county
>        ,ss1.d_year
>        ,ss2.store_sales/ss1.store_sales store_q1_q2_increase   
>  from
>         ss ss1
>        ,ss ss2  
>  where
>     ss1.d_qoy = 1
>     and ss1.d_year = 2000
>     and ss1.ca_county = ss2.ca_county
>     and ss2.d_qoy = 2
>     and ss2.d_year = 2000
>   order by ss1.ca_county{code}
>  
> Flink output :
>  
> {code:java}
> |2000|0.845635
> Acadia Parish|2000|1.362160
> Accomack County|2000|0.650251
> {code}
>  
> SQL server output:
>  
> {code:java}
> Acadia Parish|2000|1.362159
> Accomack County|2000|0.650250
> {code}
>  
>  
> I tested expression test and sql ITcase,they will return correct value.
> {code:java}
> @Test
> def testSubStr(): Unit = {
>   checkQuery(
>     Seq[(String, String, Double)]((null, "00601" , 100d), ("00501", "00601", 
> 50d)),
>     "select f0, substring(f0,1,5) from  Table1 where substring(f0,1,5)  <> f1 
> ",
>     Seq(("00501", "00501"))
>   )
> }
> {code}
> So, I need to dig more about this issue. 
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to