[jira] [Comment Edited] (FLINK-4565) Support for SQL IN operator

Nikolay Vasilishin (JIRA) Tue, 08 Nov 2016 09:12:18 -0800

    [ 
https://issues.apache.org/jira/browse/FLINK-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15648140#comment-15648140
 ]


Nikolay Vasilishin edited comment on FLINK-4565 at 11/8/16 5:11 PM:
--------------------------------------------------------------------

Hi guys, I’ve run into some problems.
I now have the IN operator working for literals; subqueries are not supported yet.
You can find my code [on my 
github|https://github.com/NickolayVasilishin/flink/tree/FLINK-4565].
So, the problems are:
#       I’m using a HashSet to check membership. The code is generated in 
[ScalarOperators.scala|https://github.com/apache/flink/compare/master...NickolayVasilishin:FLINK-4565#diff-423fbbd7967ec8e9feee7c1b7062b884R106].
 However, the code that creates the HashSet and adds the elements to it is 
placed in the body of the public void flatMap(..) method, which, as I 
understand it, is invoked for every row. The comment above 
[CodeGenerator#generateResultExpression|https://github.com/NickolayVasilishin/flink/blob/FLINK-4565/flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenerator.scala#L305]
 says that reusable code will be reused internally, but how can I check that 
it works properly?
#       The second problem is in 
[ExpressionParser.scala|https://github.com/NickolayVasilishin/flink/blob/FLINK-4565/flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/ExpressionParser.scala].
 Since I’ve implemented a matching pattern for the IN operator, it conflicts 
with the initCap() function ([in this 
test|https://github.com/NickolayVasilishin/flink/blob/FLINK-4565/flink-libraries/flink-table/src/test/scala/org/apache/flink/api/table/expressions/ScalarFunctionsTest.scala#L156]).
 During expression parsing, the input goes through the 
[ExpressionParser#functionIdent|https://github.com/NickolayVasilishin/flink/blob/FLINK-4565/flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/ExpressionParser.scala#L79]
 method (where ‘not’ checks are applied for operators such as AS, COUNT, IF and 
“my” IN), from which it enters my [suffixIn 
method|https://github.com/NickolayVasilishin/flink/blob/FLINK-4565/flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/expressions/ExpressionParser.scala#L194]
 and fails with an exception: 
{noformat}
Could not parse expression at column 6: `(' expected but `i' found f0.initCap().
{noformat}
I expected the parser to fall through to the next check if the current one 
fails. Also, my check cannot be the last check in this chain.
So what are the ways to solve this problem? Is there a way to make the 
matcher less greedy? The easiest way, I think, is to rename the IN operator to 
ISIN, as Spark does.
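
The error message above is consistent with the "in" keyword being accepted as a prefix of "initCap", after which the parser expects "(" and finds "i". A minimal sketch of one way to make such a keyword match less greedy is shown below: require that the keyword is not followed by another identifier character. This is plain Java with regular expressions, not the Scala parser used in ExpressionParser.scala, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; this is not code from Flink's ExpressionParser.
final class KeywordMatchSketch {
    // Naive matcher: accepts "in" whenever the input merely starts with it.
    private static final Pattern NAIVE_IN = Pattern.compile("in");

    // Guarded matcher: "in" must not be followed by an identifier character,
    // so identifiers such as "initCap" are left for later alternatives.
    private static final Pattern GUARDED_IN = Pattern.compile("in(?![A-Za-z0-9_])");

    // Returns true if the text after the dot starts with the characters "in".
    static boolean naiveMatchesIn(String suffix) {
        return NAIVE_IN.matcher(suffix).lookingAt();
    }

    // Returns true only if the text after the dot starts with the whole word "in".
    static boolean guardedMatchesIn(String suffix) {
        return GUARDED_IN.matcher(suffix).lookingAt();
    }
}
```

With this guard, a suffix such as "initCap()" is rejected by the IN check while "in(1, 2)" is still accepted. Whether the same guard can be expressed directly in the existing parser rules is an assumption that would need to be checked against the actual code.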

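For the first problem, the intended shape can be sketched as follows, assuming the goal is to build the lookup set once per function instance and to leave only the membership test in the per-row method. This is a hand-written stand-in, not the code that CodeGenerator actually emits; the class name, the String element type, and the setBuilds counter are hypothetical and exist only to make the reuse observable.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a generated function; not Flink's generated code.
final class InFilterSketch {
    // Counts how often the lookup set is built, so reuse can be verified.
    int setBuilds = 0;

    // Member field: built once in the constructor, not once per row.
    private final Set<String> inSet;

    InFilterSketch(String... literals) {
        this.inSet = new HashSet<>(Arrays.asList(literals));
        setBuilds++;
    }

    // Analogue of the per-row method body: only the membership test runs here.
    void flatMap(String value, List<String> out) {
        if (inSet.contains(value)) {
            out.add(value);
        }
    }
}
```

One way to check the behaviour is then to call flatMap for several rows and confirm that the counter stays at 1. For real generated code, inspecting the generated source for where the HashSet is instantiated would serve the same purpose.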

I’d appreciate any help. Thanks in advance.





> Support for SQL IN operator
> ---------------------------
>
>                 Key: FLINK-4565
>                 URL: https://issues.apache.org/jira/browse/FLINK-4565
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>            Reporter: Timo Walther
>            Assignee: Nikolay Vasilishin
>
> It seems that Flink SQL supports the uncorrelated sub-query IN operator. But 
> it should also be available in the Table API and tested.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
