[jira] [Assigned] (FLINK-9664) FlinkML Quickstart Loading Data section example doesn't work as described
[ https://issues.apache.org/jira/browse/FLINK-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-9664: Assignee: Rong Rong > FlinkML Quickstart Loading Data section example doesn't work as described > - > > Key: FLINK-9664 > URL: https://issues.apache.org/jira/browse/FLINK-9664 > Project: Flink > Issue Type: Bug > Components: Documentation, Machine Learning Library > Affects Versions: 1.5.0 > Reporter: Mano Swerts > Assignee: Rong Rong > Priority: Major > Labels: documentation-update, machine_learning, ml > Original Estimate: 1h > Remaining Estimate: 1h > > The ML documentation example isn't complete: > [https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/libs/ml/quickstart.html#loading-data] > The referenced section loads data from an astroparticle binary classification > dataset to showcase SVM. The dataset uses 0 and 1 as labels, which doesn't > produce correct results. The SVM predictor expects -1 and 1 labels to > correctly predict the label. The documentation, however, doesn't mention > that. The example therefore doesn't work, with no clue as to why. > The documentation should be updated with an explicit mention of the -1 and 1 > labels and a mapping function that shows the conversion of the labels. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
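The label conversion the ticket asks for can be sketched in a few lines. This is a stdlib-only illustration of the mapping (in the actual FlinkML quickstart it would be applied inside a map over the loaded DataSet of labeled vectors); the class and method names here are hypothetical, not FlinkML API:

```java
import java.util.Arrays;

public class LabelFix {
    // FlinkML's SVM predictor expects labels -1.0 and +1.0, but the
    // astroparticle dataset labels examples with 0 and 1.
    // Map a 0/1 label to the -1/+1 convention the SVM expects.
    static double toSvmLabel(double label) {
        return label == 0.0 ? -1.0 : 1.0;
    }

    public static void main(String[] args) {
        double[] raw = {0.0, 1.0, 1.0, 0.0};
        // In Flink this mapping would run over LabeledVector labels instead.
        double[] mapped = Arrays.stream(raw).map(LabelFix::toSvmLabel).toArray();
        System.out.println(Arrays.toString(mapped)); // [-1.0, 1.0, 1.0, -1.0]
    }
}
```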
[jira] [Comment Edited] (FLINK-9430) Support Casting of Object to Primitive types for Flink SQL UDF
[ https://issues.apache.org/jira/browse/FLINK-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506191#comment-16506191 ] Rong Rong edited comment on FLINK-9430 at 6/8/18 4:07 PM: -- Hi here, I did some integration tests. There are two clues that make the current implementation unable to generate any Kryo serialization for this. (A) A UDF followed immediately by CAST will always get resolved to one RexCall during the logical planning phase (multiple rules can apply here) and thus will not create any serialization problem. (B) When a UDF returns an Object, the GenericTypeInfo return type is generally not accepted. Thus a UDF not followed immediately by CAST will certainly throw an exception. For now I think the implementation is pretty safe. Any thoughts on this? was (Author: walterddr): Hi here, I did some integration tests. There are two clues that make the current implementation unable to generate any Kryo serialization for this. (A) A UDF followed immediately by CAST will always get resolved to one RexCall during the logical planning phase (multiple rules can apply here) and thus will not create any serialization problem. (B) When a UDF returns an Object, the GenericTypeInfo return type is generally not accepted. For now I think the implementation is pretty safe. Any thoughts on this? > Support Casting of Object to Primitive types for Flink SQL UDF > -- > > Key: FLINK-9430 > URL: https://issues.apache.org/jira/browse/FLINK-9430 > Project: Flink > Issue Type: New Feature > Components: Table API SQL > Reporter: Shuyi Chen > Assignee: Shuyi Chen > Priority: Major > > We want to add a SQL UDF to access a specific element in a JSON string using > JSON path. However, the JSON element can be of different types, e.g. Int, > Float, Double, String, Boolean, etc. Since the return type is not part of the > method signature, we cannot use overloading. So we will end up writing a UDF > for each type, e.g. GetFloatFromJSON, GetIntFromJSON, etc., which leads to a > lot of duplication. 
> One way to unify all these UDF functions is to implement one UDF that returns > java.lang.Object and, in the SQL statement, use CAST AS to cast the returned > Object into the correct type. Below is an example: > > {code:java} > object JsonPathUDF extends ScalarFunction { > def eval(jsonStr: String, path: String): Object = { > JSONParser.parse(jsonStr).read(path) > } > }{code} > {code:java} > SELECT CAST(jsonpath(json, "$.store.book.title") AS VARCHAR(32)) as > bookTitle FROM table1{code} > The current Flink SQL cast implementation does not support casting from > GenericTypeInfo to another type; I already have a local > branch to fix this. Please comment if there are alternatives to the problem > above. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
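The cast-at-the-call-site pattern described above can be illustrated with plain Java. The nested-Map `read` helper is a hypothetical stand-in for `JSONParser.parse(jsonStr).read(path)` (not a real JSON library call), and the Java cast plays the role that SQL's `CAST ... AS VARCHAR` plays in the ticket:

```java
import java.util.Map;

public class JsonPathSketch {
    // Hypothetical stand-in for JSONParser.parse(json).read(path):
    // resolve a dotted path against a nested Map and return an untyped Object,
    // just like the single Object-returning UDF in the ticket.
    static Object read(Map<?, ?> doc, String path) {
        Object cur = doc;
        for (String key : path.split("\\.")) {
            cur = ((Map<?, ?>) cur).get(key);
        }
        return cur;
    }

    public static void main(String[] args) {
        Map<?, ?> doc =
            Map.of("store", Map.of("book", Map.of("title", "Flink in Action")));
        // One UDF, one Object result; each call site casts to the type it
        // needs -- this cast is what CAST(... AS VARCHAR(32)) does in SQL.
        String title = (String) read(doc, "store.book.title");
        System.out.println(title);
    }
}
```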
[jira] [Commented] (FLINK-9430) Support Casting of Object to Primitive types for Flink SQL UDF
[ https://issues.apache.org/jira/browse/FLINK-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506191#comment-16506191 ] Rong Rong commented on FLINK-9430: -- Hi here, I did some integration tests. There are two clues that make the current implementation unable to generate any Kryo serialization for this. (A) A UDF followed immediately by CAST will always get resolved to one RexCall during the logical planning phase (multiple rules can apply here) and thus will not create any serialization problem. (B) When a UDF returns an Object, the GenericTypeInfo return type is generally not accepted. For now I think the implementation is pretty safe. Any thoughts on this? > Support Casting of Object to Primitive types for Flink SQL UDF > -- > > Key: FLINK-9430 > URL: https://issues.apache.org/jira/browse/FLINK-9430 > Project: Flink > Issue Type: New Feature > Components: Table API SQL > Reporter: Shuyi Chen > Assignee: Shuyi Chen > Priority: Major > > We want to add a SQL UDF to access a specific element in a JSON string using > JSON path. However, the JSON element can be of different types, e.g. Int, > Float, Double, String, Boolean, etc. Since the return type is not part of the > method signature, we cannot use overloading. So we will end up writing a UDF > for each type, e.g. GetFloatFromJSON, GetIntFromJSON, etc., which leads to a > lot of duplication. > One way to unify all these UDF functions is to implement one UDF that returns > java.lang.Object and, in the SQL statement, use CAST AS to cast the returned > Object into the correct type. 
Below is an example: > > {code:java} > object JsonPathUDF extends ScalarFunction { > def eval(jsonStr: String, path: String): Object = { > JSONParser.parse(jsonStr).read(path) > } > }{code} > {code:java} > SELECT CAST(jsonpath(json, "$.store.book.title") AS VARCHAR(32)) as > bookTitle FROM table1{code} > The current Flink SQL cast implementation does not support casting from > GenericTypeInfo to another type; I already have a local > branch to fix this. Please comment if there are alternatives to the problem > above. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9502) Use generic parameter search for user-define functions when argument contains Object type
[ https://issues.apache.org/jira/browse/FLINK-9502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9502: - Description: This ticket is to enable the generic Object type as an element type by using SQL ANY in parameter / argument type mapping. This will enable support for use cases such as: {code:java} public String eval(Object objArg) { /* ... */ } {code} The changes required here: 1. Introduce wildcard search and search rules for SQL ANY type usage. 2. Enable wildcard search when the concrete type search returns no match. was: This ticket is to enable the generic Object type as an element type by using SQL ANY in parameter / argument type mapping. This will enable support for use cases such as: {code:java} public String eval(Object objArg) { // ... } {code} The changes required here: 1. Introduce wildcard search and search rules for SQL ANY type usage. 2. Enable wildcard search when the concrete type search returns no match. > Use generic parameter search for user-define functions when argument contains > Object type > - > > Key: FLINK-9502 > URL: https://issues.apache.org/jira/browse/FLINK-9502 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL > Reporter: Rong Rong > Assignee: Rong Rong > Priority: Major > > This ticket is to enable the generic Object type as an element type by using SQL ANY > in parameter / argument type mapping. > This will enable support for use cases such as: > {code:java} > public String eval(Object objArg) { /* ... */ } > {code} > The changes required here: > 1. Introduce wildcard search and search rules for SQL ANY type usage. > 2. Enable wildcard search when the concrete type search returns no match. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
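The "concrete type search first, wildcard fallback" lookup described in the ticket can be sketched with plain reflection. This is not Flink's FunctionCatalog code; the class, `eval` overloads, and `resolve`/`call` helpers are all hypothetical illustrations of the rule:

```java
import java.lang.reflect.Method;

public class WildcardLookup {
    // Two candidate eval methods: one with a concrete parameter type, one
    // with an Object (SQL ANY-like) parameter type.
    public String eval(String s) { return "exact:String"; }
    public String eval(Object o) { return "wildcard:Object"; }

    // Try the concrete-type match first; only if that fails, fall back to
    // the Object-typed wildcard overload.
    static Method resolve(Class<?> udf, Class<?> argType) {
        try {
            return udf.getMethod("eval", argType);      // concrete type search
        } catch (NoSuchMethodException e) {
            try {
                return udf.getMethod("eval", Object.class); // wildcard fallback
            } catch (NoSuchMethodException e2) {
                throw new IllegalStateException(e2);
            }
        }
    }

    static String call(WildcardLookup udf, Object arg) {
        try {
            return (String) resolve(udf.getClass(), arg.getClass()).invoke(udf, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        WildcardLookup udf = new WildcardLookup();
        System.out.println(call(udf, "x")); // exact:String
        System.out.println(call(udf, 42));  // wildcard:Object (no eval(Integer))
    }
}
```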
[jira] [Updated] (FLINK-9502) Use generic parameter search for user-define functions when argument contains Object type
[ https://issues.apache.org/jira/browse/FLINK-9502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9502: - Description: This ticket is to enable the generic Object type as an element type by using SQL ANY in parameter / argument type mapping. This will enable support for use cases such as: {code:java} public String eval(Object objArg) { // ... } {code} The changes required here: 1. Introduce wildcard search and search rules for SQL ANY type usage. 2. Enable wildcard search when the concrete type search returns no match. was: This ticket is to enable the generic Object type as an element type by using SQL ANY in parameter / argument type mapping. This will enable support for The changes required here: 1. Introduce wildcard search and search rules for SQL ANY type usage. 2. Enable wildcard search when the concrete type search returns no match. > Use generic parameter search for user-define functions when argument contains > Object type > - > > Key: FLINK-9502 > URL: https://issues.apache.org/jira/browse/FLINK-9502 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL > Reporter: Rong Rong > Assignee: Rong Rong > Priority: Major > > This ticket is to enable the generic Object type as an element type by using SQL ANY > in parameter / argument type mapping. > This will enable support for use cases such as: > {code:java} > public String eval(Object objArg) { // ... } > {code} > The changes required here: > 1. Introduce wildcard search and search rules for SQL ANY type usage. > 2. Enable wildcard search when the concrete type search returns no match. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
[ https://issues.apache.org/jira/browse/FLINK-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9294: - Description: Most UDF function signatures that include composite types such as *{{MAP}}*, *{{ARRAY}}*, etc. require the user to override the *{{getParameterType}}* or *{{getResultType}}* method explicitly. It should be possible to resolve the composite type based on the function signature, such as: {code:java} public String[] eval(Map mapArg) { /* ... */ } {code} should either 1. Automatically resolve that: - *{{ObjectArrayTypeInfo}}* is the result type. - *{{MapTypeInfo}}* is the parameter type. 2. Improve function mapping to find and locate functions with such signatures. was: Most UDF function signatures that include composite types such as *{{MAP}}*, *{{ARRAY}}*, etc. require the user to override the *{{getParameterType}}* or *{{getResultType}}* method explicitly. It should be possible to resolve the composite type based on the function signature, such as: {code:java} public String[] eval(Map mapArg) { //... } {code} should either 1. Automatically resolve that: - *{{ObjectArrayTypeInfo}}* is the result type. - *{{MapTypeInfo}}* is the parameter type. 2. Improve function mapping to find and locate functions with such signatures. > Improve type inference for UDFs with composite parameter or result type > > > Key: FLINK-9294 > URL: https://issues.apache.org/jira/browse/FLINK-9294 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL > Reporter: Rong Rong > Assignee: Rong Rong > Priority: Major > > Most UDF function signatures that include composite types such as > *{{MAP}}*, *{{ARRAY}}*, etc. require the user to override the > *{{getParameterType}}* or *{{getResultType}}* method explicitly. > It should be possible to resolve the composite type based on the function > signature, such as: > {code:java} > public String[] eval(Map mapArg) { /* ... */ } > {code} > should either > 1. Automatically resolve that: > - *{{ObjectArrayTypeInfo}}* is the result type. > - *{{MapTypeInfo}}* is the parameter type. > 2. Improve function mapping to find and locate functions with such signatures. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
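The information that explicit `getParameterType`/`getResultType` overrides supply by hand is in principle recoverable from the `eval` signature itself. A reflection-only sketch of that recovery (the class and its signature are illustrative; real type inference would then build `MapTypeInfo`/`ObjectArrayTypeInfo` from these reflected types):

```java
import java.lang.reflect.Method;
import java.lang.reflect.Type;
import java.util.Map;

public class SignatureIntrospection {
    // Illustrative UDF-style signature with a composite parameter and result.
    public String[] eval(Map<String, Integer> mapArg) { return new String[0]; }

    // Reflect the full generic parameter type of eval, e.g. the raw material
    // from which a MapTypeInfo could be derived.
    static String paramTypeName() {
        try {
            return SignatureIntrospection.class
                .getMethod("eval", Map.class)
                .getGenericParameterTypes()[0].getTypeName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reflect the result type, from which an ObjectArrayTypeInfo could follow.
    static String resultTypeName() {
        try {
            Type t = SignatureIntrospection.class
                .getMethod("eval", Map.class).getGenericReturnType();
            return t.getTypeName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(paramTypeName());  // the full Map<String, Integer> type
        System.out.println(resultTypeName()); // java.lang.String[]
    }
}
```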
[jira] [Updated] (FLINK-9501) Allow Object or Wildcard type in user-define functions as parameter types but not result types
[ https://issues.apache.org/jira/browse/FLINK-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9501: - Description: The idea here is to treat every Java Object parameter type as the SQL ANY type, while disallowing the SQL ANY type in the result object. This ticket is specifically to deal with composite generic erasure types such as {code:java} public String eval(Map mapArg) { /* ... */ } public String eval(Map mapArg) { /* ... */ } {code} The changes needed here that I can think of for now are: 1. Ensure the SQL ANY type is used for component/field types of composite TypeInformation with GenericTypeInfo nested fields 2. Modify the FunctionCatalog lookup to use the SQL ANY type if generic type erasure happens. was: The idea here is to treat every Java Object parameter type as the SQL ANY type, while disallowing the SQL ANY type in the result object. This ticket is specifically to deal with composite generic erasure types such as {code:java} public String eval(Map mapArg) { //...} public String eval(Map mapArg) { //... } {code} The changes needed here that I can think of for now are: 1. Ensure the SQL ANY type is used for component/field types of composite TypeInformation with GenericTypeInfo nested fields 2. Modify the FunctionCatalog lookup to use the SQL ANY type if generic type erasure happens. > Allow Object or Wildcard type in user-define functions as parameter types but > not result types > -- > > Key: FLINK-9501 > URL: https://issues.apache.org/jira/browse/FLINK-9501 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL > Reporter: Rong Rong > Assignee: Rong Rong > Priority: Major > > The idea here is to treat every Java Object parameter type as the SQL ANY type, > while disallowing the SQL ANY type in the result object. > This ticket is specifically to deal with composite generic erasure types such > as > {code:java} > public String eval(Map mapArg) { /* ... */ } > public String eval(Map mapArg) { /* ... */ } > {code} > The changes needed here that I can think of for now are: > 1. Ensure the SQL ANY type is used for component/field types of composite > TypeInformation with GenericTypeInfo nested fields > 2. Modify the FunctionCatalog lookup to use the SQL ANY type if generic type erasure > happens. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
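The asymmetry the ticket proposes -- Object (SQL ANY-like) is acceptable as a parameter type but must be rejected as a result type -- can be sketched as a simple validation rule. This is a hypothetical illustration, not Flink's actual FunctionCatalog check:

```java
public class AnyTypeRule {
    // Parameters may be Object (mapped to SQL ANY); results may not, because
    // downstream operators could not infer a concrete output type from it.
    static boolean isValidResultType(Class<?> t) {
        return t != Object.class;
    }

    // Allowed: Object parameter, concrete String result.
    public String ok(Object arg) { return "fine"; }

    // Rejected: Object result type.
    public Object bad(String arg) { return arg; }

    public static void main(String[] args) throws Exception {
        System.out.println(isValidResultType(
            AnyTypeRule.class.getMethod("ok", Object.class).getReturnType()));  // true
        System.out.println(isValidResultType(
            AnyTypeRule.class.getMethod("bad", String.class).getReturnType())); // false
    }
}
```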
[jira] [Updated] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
[ https://issues.apache.org/jira/browse/FLINK-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9294: - Description: Most of the UDF function signatures that includes composite types such as *{{MAP}}*, *{{ARRAY}}*, etc would require user to override *{{getParameterType}}* or *{{getResultType}}* method explicitly. It should be able to resolve the composite type based on the function signature, such as: {code:java} public String[] eval(Map mapArg) { //... } {code} should either 1. Automatically resolve that: - *{{ObjectArrayTypeInfo}}* to be the result type. - *{{MapTypeInfo}}* to be the parameter type. 2. Improved function mapping to find and locate function with such signatures. was: Most of the UDF function signatures that includes composite types such as *{{MAP}}*, *{{ARRAY}}*, etc would require user to override *{{getParameterType}}* or *{{getResultType}}* method explicitly. It should be able to resolve the composite type based on the function signature, such as: {code:java} public String[] eval(Map mapArg) { //... } {code} should automatically resolve that: - *{{ObjectArrayTypeInfo}}* to be the result type. - *{{MapTypeInfo}}* to be the parameter type. > Improve type inference for UDFs with composite parameter or result type > > > Key: FLINK-9294 > URL: https://issues.apache.org/jira/browse/FLINK-9294 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Most of the UDF function signatures that includes composite types such as > *{{MAP}}*, *{{ARRAY}}*, etc would require user to override > *{{getParameterType}}* or *{{getResultType}}* method explicitly. > It should be able to resolve the composite type based on the function > signature, such as: > {code:java} > public String[] eval(Map mapArg) { > //... > } > {code} > should either > 1. Automatically resolve that: > - *{{ObjectArrayTypeInfo}}* to be the result type. 
> - *{{MapTypeInfo}}* to be the > parameter type. > 2. Improved function mapping to find and locate function with such signatures. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9501) Allow Object or Wildcard type in user-define functions as parameter types but not result types
[ https://issues.apache.org/jira/browse/FLINK-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9501: - Description: Idea here is to treat every Java parameter objects type as SQL ANY type. While disallowing SQL ANY type in result object. This ticket is specifically to deal with composite generic erasure types such as {code:java} public String eval(Map mapArg) { //...} public String eval(Map mapArg) { //... } {code} The changes needed here I can think of for now are: 1. Ensure SQL ANY type is used for component/field types for composite TypeInformation with GenericTypeInfo nested fields 2. Modify FunctionCatalog lookup to use SQL ANY type if generic erasure types happens. was: Idea here is to treat every Java parameter objects type as SQL ANY type. While disallowing SQL ANY type in result object. This ticket is specifically to deal with composite generic erasure types such as - eval(Map ...) - eval(Map ...) The changes needed here is 1. Ensure SQL ANY type is used for component/field types for composite TypeInformation. 2. Modify FunctionCatalog lookup to use SQL ANY type if generic erasure types happens. 3. Add checks to disallow generic type erasure in output type. > Allow Object or Wildcard type in user-define functions as parameter types but > not result types > -- > > Key: FLINK-9501 > URL: https://issues.apache.org/jira/browse/FLINK-9501 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Idea here is to treat every Java parameter objects type as SQL ANY type. > While disallowing SQL ANY type in result object. > This ticket is specifically to deal with composite generic erasure types such > as > {code:java} > public String eval(Map mapArg) { //...} > public String eval(Map mapArg) { //... } > {code} > The changes needed here I can think of for now are: > 1. 
Ensure SQL ANY type is used for component/field types for composite > TypeInformation with GenericTypeInfo nested fields > 2. Modify FunctionCatalog lookup to use SQL ANY type if generic erasure types > happens. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9502) Use generic parameter search for user-define functions when argument contains Object type
[ https://issues.apache.org/jira/browse/FLINK-9502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9502: - Description: This ticket is to enable the generic Object type as an element type by using SQL ANY in parameter / argument type mapping. This will enable support for The changes required here: 1. Introduce wildcard search and search rules for SQL ANY type usage. 2. Enable wildcard search when the concrete type search returns no match. was: This ticket is to enable the generic Object type as an element type by using SQL ANY in parameter / argument type mapping. The changes required here: 1. Introduce wildcard search and search rules for SQL ANY type usage. 2. Enable wildcard search when the concrete type search returns no match. > Use generic parameter search for user-define functions when argument contains > Object type > - > > Key: FLINK-9502 > URL: https://issues.apache.org/jira/browse/FLINK-9502 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL > Reporter: Rong Rong > Assignee: Rong Rong > Priority: Major > > This ticket is to enable the generic Object type as an element type by using SQL ANY > in parameter / argument type mapping. > This will enable support for > The changes required here: > 1. Introduce wildcard search and search rules for SQL ANY type usage. > 2. Enable wildcard search when the concrete type search returns no match. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-7001) Improve performance of Sliding Time Window with pane optimization
[ https://issues.apache.org/jira/browse/FLINK-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16501829#comment-16501829 ] Rong Rong commented on FLINK-7001: -- Thanks for the support [~pgrulich], we can definitely collaborate on this. I will start a discussion doc first if that's OK with you guys [~jark] [~StephanEwen]? > Improve performance of Sliding Time Window with pane optimization > - > > Key: FLINK-7001 > URL: https://issues.apache.org/jira/browse/FLINK-7001 > Project: Flink > Issue Type: Improvement > Components: DataStream API > Reporter: Jark Wu > Assignee: Jark Wu > Priority: Major > > Currently, the implementation of time-based sliding windows treats each > window individually and replicates records to each window. For a window of 10 > minute size that slides by 1 second, the data is replicated 600-fold (10 > minutes / 1 second). We can optimize sliding windows by dividing them into > panes (aligned with the slide), so that we can avoid record duplication and > leverage checkpointing. > I will attach a more detailed design doc to the issue. > The following issues are similar to this issue: FLINK-5387, FLINK-6990 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
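The pane idea in the ticket above can be sketched without any Flink APIs: each record updates exactly one pane aggregate (panes are aligned to the slide), and a window's result combines the size/slide panes it covers, instead of keeping size/slide copies of every record. All names here are hypothetical; a real implementation would also handle pane eviction and checkpointed state:

```java
import java.util.Map;
import java.util.TreeMap;

public class PaneSketch {
    static final long SLIDE = 1_000;   // 1 s slide
    static final long SIZE = 10_000;   // 10 s window size (10 panes per window)

    // Each record lands in exactly one pane (keyed by pane start time), where
    // only an aggregate (here: a count) is kept -- no per-window record copies.
    static Map<Long, Long> buildPanes(long[] timestamps) {
        Map<Long, Long> panes = new TreeMap<>();
        for (long ts : timestamps) {
            panes.merge(ts - ts % SLIDE, 1L, Long::sum);
        }
        return panes;
    }

    // A window [windowStart, windowStart + SIZE) combines its SIZE/SLIDE panes.
    static long windowCount(Map<Long, Long> panes, long windowStart) {
        long total = 0;
        for (long p = windowStart; p < windowStart + SIZE; p += SLIDE) {
            total += panes.getOrDefault(p, 0L);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Long, Long> panes = buildPanes(new long[]{500, 1_500, 1_700, 12_000});
        System.out.println(windowCount(panes, 0));      // window [0s, 10s): 3 records
        System.out.println(windowCount(panes, 3_000));  // window [3s, 13s): 1 record
    }
}
```

With a count (or any combinable aggregate) per pane, adding a record is O(1) and evaluating a window is O(size/slide), versus replicating each record into size/slide windows.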
[jira] [Commented] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
[ https://issues.apache.org/jira/browse/FLINK-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16500307#comment-16500307 ] Rong Rong commented on FLINK-9294: -- Yes you are absolutely right, sorry for the typo. > Improve type inference for UDFs with composite parameter or result type > > > Key: FLINK-9294 > URL: https://issues.apache.org/jira/browse/FLINK-9294 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Most of the UDF function signatures that includes composite types such as > *{{MAP}}*, *{{ARRAY}}*, etc would require user to override > *{{getParameterType}}* or *{{getResultType}}* method explicitly. > It should be able to resolve the composite type based on the function > signature, such as: > {code:java} > public String[] eval(Map mapArg) { > //... > } > {code} > should automatically resolve that: > - *{{ObjectArrayTypeInfo}}* to be the result type. > - *{{MapTypeInfo}}* to be the > parameter type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
[ https://issues.apache.org/jira/browse/FLINK-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9294: - Description: Most of the UDF function signatures that includes composite types such as *{{MAP}}*, *{{ARRAY}}*, etc would require user to override *{{getParameterType}}* or *{{getResultType}}* method explicitly. It should be able to resolve the composite type based on the function signature, such as: {code:java} public String[] eval(Map mapArg) { //... } {code} should automatically resolve that: - *{{ObjectArrayTypeInfo}}* to be the result type. - *{{MapTypeInfo}}* to be the parameter type. was: Most of the UDF function signatures that includes composite types such as *{{MAP}}*, *{{ARRAY}}*, etc would require user to override *{{getParameterType}}* or *{{getResultType}}* method explicitly. It should be able to resolve the composite type based on the function signature, such as: {code:java} public List eval(Map mapArg) { //... } {code} should automatically resolve that: - *{{ObjectArrayTypeInfo}}* to be the result type. - *{{MapTypeInfo}}* to be the parameter type. > Improve type inference for UDFs with composite parameter or result type > > > Key: FLINK-9294 > URL: https://issues.apache.org/jira/browse/FLINK-9294 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Most of the UDF function signatures that includes composite types such as > *{{MAP}}*, *{{ARRAY}}*, etc would require user to override > *{{getParameterType}}* or *{{getResultType}}* method explicitly. > It should be able to resolve the composite type based on the function > signature, such as: > {code:java} > public String[] eval(Map mapArg) { > //... > } > {code} > should automatically resolve that: > - *{{ObjectArrayTypeInfo}}* to be the result type. > - *{{MapTypeInfo}}* to be the > parameter type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9484) Improve generic type inference for User-Defined Functions
[ https://issues.apache.org/jira/browse/FLINK-9484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9484: - Description: User-defined functions have been a great extension to the Flink SQL API for supporting much more complex logic. We experienced many inconveniences when dealing with UDFs with generic types; they are summarized in the following [doc|https://docs.google.com/document/d/1zKSY1z0lvtQdfOgwcLnCMSRHew3weeJ6QfQjSD0zWas/edit?usp=sharing]. Detailed implementation plans are listed in sub-tasks for the generic type inference / FunctionCatalog lookup. In-discussion topic: can we optimize codegen while providing flexibility for a generic output type, using the method in FLINK-9430 of chaining a generic-result-type UDF with a CAST operator? The initial thought here is to add a logical plan optimizer rule to create a UDFGenericResultTypeCast operator. was: User-defined functions have been a great extension to the Flink SQL API for supporting much more complex logic. We experienced many inconveniences when dealing with UDFs with generic types; they are summarized in the following [doc|https://docs.google.com/document/d/1zKSY1z0lvtQdfOgwcLnCMSRHew3weeJ6QfQjSD0zWas/edit?usp=sharing]. We are planning to implement the generic type inference / FunctionCatalog lookup in multiple phases. Detailed tickets will be created. > Improve generic type inference for User-Defined Functions > - > > Key: FLINK-9484 > URL: https://issues.apache.org/jira/browse/FLINK-9484 > Project: Flink > Issue Type: Improvement > Components: Table API SQL > Reporter: Rong Rong > Assignee: Rong Rong > Priority: Major > > User-defined functions have been a great extension to the Flink SQL API for supporting > much more complex logic. > We experienced many inconveniences when dealing with UDFs with generic types; > they are summarized in the following > [doc|https://docs.google.com/document/d/1zKSY1z0lvtQdfOgwcLnCMSRHew3weeJ6QfQjSD0zWas/edit?usp=sharing]. > Detailed implementation plans are listed in sub-tasks for the generic type > inference / FunctionCatalog lookup. > In-discussion topic: can we optimize codegen while providing flexibility for a > generic output type, using the method in FLINK-9430 of chaining a generic-result-type > UDF with a CAST operator? > The initial thought here is to add a logical plan optimizer rule to create a > UDFGenericResultTypeCast operator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9502) Use generic parameter search for user-define functions when argument contains Object type
[ https://issues.apache.org/jira/browse/FLINK-9502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9502: - Description: This ticket is to enable the generic Object type as an element type by using SQL ANY in parameter / argument type mapping. The changes required here: 1. Introduce wildcard search and search rules for SQL ANY type usage. 2. Enable wildcard search when the concrete type search returns no match. > Use generic parameter search for user-define functions when argument contains > Object type > - > > Key: FLINK-9502 > URL: https://issues.apache.org/jira/browse/FLINK-9502 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL > Reporter: Rong Rong > Assignee: Rong Rong > Priority: Major > > This ticket is to enable the generic Object type as an element type by using SQL ANY > in parameter / argument type mapping. > The changes required here: > 1. Introduce wildcard search and search rules for SQL ANY type usage. > 2. Enable wildcard search when the concrete type search returns no match. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9502) Use generic parameter search for user-define functions when argument contains Object type
Rong Rong created FLINK-9502: Summary: Use generic parameter search for user-define functions when argument contains Object type Key: FLINK-9502 URL: https://issues.apache.org/jira/browse/FLINK-9502 Project: Flink Issue Type: Sub-task Components: Table API SQL Reporter: Rong Rong Assignee: Rong Rong -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9501) Allow Object or Wildcard type in user-define functions as parameter types but not result types
[ https://issues.apache.org/jira/browse/FLINK-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9501: - Description: Idea here is to treat every Java parameter objects type as SQL ANY type. While disallowing SQL ANY type in result object. This ticket is specifically to deal with composite generic erasure types such as - eval(Map ...) - eval(Map ...) The changes needed here is 1. Ensure SQL ANY type is used for component/field types for composite TypeInformation. 2. Modify FunctionCatalog lookup to use SQL ANY type if generic erasure types happens. 3. Add checks to disallow generic type erasure in output type. was: Idea here is to treat every Java parameter objects type as SQL ANY type. While disallowing SQL ANY type in result object. This ticket is specifically to deal with composite generic erasure types such as - eval(Map ...) - eval(Map ...) The changes needed here is 1. Modify FunctionCatalog lookup to use SQL ANY type if generic erasure types happens. 2. Add checks to disallow generic type erasure in output type. > Allow Object or Wildcard type in user-define functions as parameter types but > not result types > -- > > Key: FLINK-9501 > URL: https://issues.apache.org/jira/browse/FLINK-9501 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Idea here is to treat every Java parameter objects type as SQL ANY type. > While disallowing SQL ANY type in result object. > This ticket is specifically to deal with composite generic erasure types such > as > - eval(Map ...) > - eval(Map ...) > The changes needed here is > 1. Ensure SQL ANY type is used for component/field types for composite > TypeInformation. > 2. Modify FunctionCatalog lookup to use SQL ANY type if generic erasure types > happens. > 3. Add checks to disallow generic type erasure in output type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9501) Allow Object.class type in user-define functions as parameter types but not result types
[ https://issues.apache.org/jira/browse/FLINK-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9501: - Description: Idea here is to treat every Java parameter objects type as SQL ANY type. While disallowing SQL ANY type in result object. This ticket is specifically to deal with composite generic erasure types such as - eval(Map ...) - eval(Map ...) The changes needed here is 1. Modify FunctionCatalog lookup to use SQL ANY type if generic erasure types happens. > Allow Object.class type in user-define functions as parameter types but not > result types > > > Key: FLINK-9501 > URL: https://issues.apache.org/jira/browse/FLINK-9501 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Idea here is to treat every Java parameter objects type as SQL ANY type. > While disallowing SQL ANY type in result object. > This ticket is specifically to deal with composite generic erasure types such > as > - eval(Map ...) > - eval(Map ...) > The changes needed here is > 1. Modify FunctionCatalog lookup to use SQL ANY type if generic erasure types > happens. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9501) Allow Object.class type in user-define functions as parameter types but not result types
[ https://issues.apache.org/jira/browse/FLINK-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9501: - Description: Idea here is to treat every Java parameter objects type as SQL ANY type. While disallowing SQL ANY type in result object. This ticket is specifically to deal with composite generic erasure types such as - eval(Map ...) - eval(Map ...) The changes needed here is 1. Modify FunctionCatalog lookup to use SQL ANY type if generic erasure types happens. 2. Add checks to disallow generic type erasure in output type. was: Idea here is to treat every Java parameter objects type as SQL ANY type. While disallowing SQL ANY type in result object. This ticket is specifically to deal with composite generic erasure types such as - eval(Map ...) - eval(Map ...) The changes needed here is 1. Modify FunctionCatalog lookup to use SQL ANY type if generic erasure types happens. > Allow Object.class type in user-define functions as parameter types but not > result types > > > Key: FLINK-9501 > URL: https://issues.apache.org/jira/browse/FLINK-9501 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Idea here is to treat every Java parameter objects type as SQL ANY type. > While disallowing SQL ANY type in result object. > This ticket is specifically to deal with composite generic erasure types such > as > - eval(Map ...) > - eval(Map ...) > The changes needed here is > 1. Modify FunctionCatalog lookup to use SQL ANY type if generic erasure types > happens. > 2. Add checks to disallow generic type erasure in output type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9501) Allow Object or Wildcard type in user-define functions as parameter types but not result types
[ https://issues.apache.org/jira/browse/FLINK-9501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9501: - Summary: Allow Object or Wildcard type in user-define functions as parameter types but not result types (was: Allow Object.class type in user-define functions as parameter types but not result types) > Allow Object or Wildcard type in user-define functions as parameter types but > not result types > -- > > Key: FLINK-9501 > URL: https://issues.apache.org/jira/browse/FLINK-9501 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Idea here is to treat every Java parameter objects type as SQL ANY type. > While disallowing SQL ANY type in result object. > This ticket is specifically to deal with composite generic erasure types such > as > - eval(Map ...) > - eval(Map ...) > The changes needed here is > 1. Modify FunctionCatalog lookup to use SQL ANY type if generic erasure types > happens. > 2. Add checks to disallow generic type erasure in output type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9501) Allow Object.class type in user-define functions as parameter types but not result types
Rong Rong created FLINK-9501: Summary: Allow Object.class type in user-define functions as parameter types but not result types Key: FLINK-9501 URL: https://issues.apache.org/jira/browse/FLINK-9501 Project: Flink Issue Type: Sub-task Components: Table API SQL Reporter: Rong Rong Assignee: Rong Rong -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
[ https://issues.apache.org/jira/browse/FLINK-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9294: - Issue Type: Sub-task (was: Improvement) Parent: FLINK-9484 > Improve type inference for UDFs with composite parameter or result type > > > Key: FLINK-9294 > URL: https://issues.apache.org/jira/browse/FLINK-9294 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Most of the UDF function signatures that includes composite types such as > *{{MAP}}*, *{{ARRAY}}*, etc would require user to override > *{{getParameterType}}* or *{{getResultType}}* method explicitly. > It should be able to resolve the composite type based on the function > signature, such as: > {code:java} > public List eval(Map mapArg) { > //... > } > {code} > should automatically resolve that: > - *{{ObjectArrayTypeInfo}}* to be the result type. > - *{{MapTypeInfo}}* to be the > parameter type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9484) Improve generic type inference for User-Defined Functions
Rong Rong created FLINK-9484: Summary: Improve generic type inference for User-Defined Functions Key: FLINK-9484 URL: https://issues.apache.org/jira/browse/FLINK-9484 Project: Flink Issue Type: Improvement Components: Table API SQL Reporter: Rong Rong Assignee: Rong Rong User-defined functions have been a great extension to the Flink SQL API for supporting complex logic. We experienced many inconveniences when dealing with UDFs with generic types; they are summarized in the following [doc|https://docs.google.com/document/d/1zKSY1z0lvtQdfOgwcLnCMSRHew3weeJ6QfQjSD0zWas/edit?usp=sharing]. We are planning to implement the generic type inference / FunctionCatalog lookup in multiple phases. Detailed tickets will be created. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9430) Support Casting of Object to Primitive types for Flink SQL UDF
[ https://issues.apache.org/jira/browse/FLINK-9430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495510#comment-16495510 ] Rong Rong commented on FLINK-9430: -- Hi [~suez1224], I think [~twalthr] and I had this discussion in a separate doc regarding generic type inference in UDFs: https://docs.google.com/document/d/1zKSY1z0lvtQdfOgwcLnCMSRHew3weeJ6QfQjSD0zWas/edit#heading=h.64s92ad5mb1. I am not exactly sure, but maybe we can add an ITCase to this JIRA? I think at some point the Object is serialized using Kryo, which will significantly reduce performance. > Support Casting of Object to Primitive types for Flink SQL UDF > -- > > Key: FLINK-9430 > URL: https://issues.apache.org/jira/browse/FLINK-9430 > Project: Flink > Issue Type: New Feature > Components: Table API SQL >Reporter: Shuyi Chen >Assignee: Shuyi Chen >Priority: Major > > We want to add a SQL UDF to access specific element in a JSON string using > JSON path. However, the JSON element can be of different types, e.g. Int, > Float, Double, String, Boolean and etc.. Since return type is not part of the > method signature, we can not use overload. So we will end up writing a UDF > for each type, e.g. GetFloatFromJSON, GetIntFromJSON and etc., which has a > lot of duplication. > One way to unify all these UDF functions is to implement one UDF and return > java.lang.Object, and in the SQL statement, use CAST AS to cast the returned > Object into the correct type. Below is an example: > > {code:java} > object JsonPathUDF extends ScalarFunction { > def eval(jsonStr: String, path: String): Object = { >JSONParser.parse(jsonStr).read(path) > } > }{code} > {code:java} > SELECT CAST(jsonpath(json, "$.store.book.title") AS VARCHAR(32)) as > bookTitle FROM table1{code} > The current Flink SQL cast implementation does not support casting from > GenericTypeInfo to another type, I have already got a local > branch to fix this. Please comment if there are alternatives to the problem > above. 
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
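The single-UDF-plus-CAST pattern from the description above can be sketched without any Flink or JSON dependencies. Below, JsonPathSketch and its lookup map are hypothetical stand-ins (a real UDF would actually parse the JSON string); the point is only that one accessor returning Object replaces the GetFloatFromJSON/GetIntFromJSON family, with the cast moved to the call site:

```java
import java.util.HashMap;
import java.util.Map;

public class JsonPathSketch {
    // Stand-in for parsed JSON: path -> element. Paths and values are made up.
    private static final Map<String, Object> DOC = new HashMap<>();
    static {
        DOC.put("$.store.book.title", "Flink in Action");
        DOC.put("$.store.book.price", 39.95);
    }

    /** One UDF-style accessor returning Object for every element type,
     *  instead of one strongly typed UDF per element type. */
    public static Object jsonpath(String path) {
        return DOC.get(path);
    }
}
```

At the call site the caller casts, mirroring `CAST(jsonpath(...) AS VARCHAR(32))` in SQL: `String title = (String) JsonPathSketch.jsonpath("$.store.book.title");`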
[jira] [Commented] (FLINK-7001) Improve performance of Sliding Time Window with pane optimization
[ https://issues.apache.org/jira/browse/FLINK-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16492070#comment-16492070 ] Rong Rong commented on FLINK-7001: -- Hi [~jark], [~StephanEwen], I was wondering if there has been any further development since the last discussion? On our side we are investigating multiple extreme use cases regarding sliding window performance (for example: customized aggregation on a 15-second sliding window over 7 days of data), and we would love to contribute to or take the lead on this effort. To summarize some of the discussions [~StephanEwen], [~pgrulich], and [~walterddr] had: - Splitting the current generic window operator into *aligned* window (e.g. sliding/tumble window) and *unaligned* window (e.g. session window) categories -- Further improve performance in each category individually. -- Having one timer per window instead of per window/key combination if possible. - Deduplication optimization through pane split and pane merging on *aligned window* operators: -- An algorithm that handles pane optimization efficiently, plus early/late firing compatibility. -- Needs to be compatible and work with the RocksDB state backend. -- Backward compatibility with savepoints. - An efficiency trade-off mechanism that selects optimization methods (pane split, traditional, etc.) depending on: -- Split accumulate operation vs. merge operation complexity. -- Latency vs. complexity vs. memory footprint. Do you guys think this could be a good starting point for some concrete solution? I tried to summarize and collect as much information as possible; any additional comments and suggestions are highly appreciated. 
Thanks, Rong > Improve performance of Sliding Time Window with pane optimization > - > > Key: FLINK-7001 > URL: https://issues.apache.org/jira/browse/FLINK-7001 > Project: Flink > Issue Type: Improvement > Components: DataStream API >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Major > > Currently, the implementation of time-based sliding windows treats each > window individually and replicates records to each window. For a window of 10 > minute size that slides by 1 second the data is replicated 600 fold (10 > minutes / 1 second). We can optimize sliding window by divide windows into > panes (aligned with slide), so that we can avoid record duplication and > leverage the checkpoint. > I will attach a more detail design doc to the issue. > The following issues are similar to this issue: FLINK-5387, FLINK-6990 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
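The pane idea summarized above can be illustrated with a few lines of plain Java (no Flink; the class name and the simplified event-time model are assumptions for the sketch). Each record is added to exactly one pane aligned with the slide, and each window result is the merge of size/slide consecutive panes, instead of replicating every record into size/slide overlapping windows:

```java
import java.util.ArrayList;
import java.util.List;

public class PaneSlidingSum {
    /**
     * Sliding-window sums via panes. Each record lands in exactly ONE pane
     * (panes are aligned with the slide), and each window of length `size`
     * is the sum of size/slide consecutive panes. This avoids the
     * size/slide-fold record replication of the naive approach.
     * Timestamps are event-time longs; windows are [start, start + size).
     */
    public static List<Long> slidingSums(long[] timestamps, long[] values,
                                         long size, long slide, long maxTime) {
        int paneCount = (int) (maxTime / slide) + 1;
        long[] panes = new long[paneCount];
        for (int i = 0; i < timestamps.length; i++) {
            panes[(int) (timestamps[i] / slide)] += values[i]; // one pane per record
        }
        int panesPerWindow = (int) (size / slide);
        List<Long> out = new ArrayList<>();
        for (int end = panesPerWindow - 1; end < paneCount; end++) {
            long sum = 0;
            for (int p = end - panesPerWindow + 1; p <= end; p++) {
                sum += panes[p]; // merge pre-aggregated panes into the window
            }
            out.add(sum); // result of the window ending at pane `end`
        }
        return out;
    }
}
```

For the 10-minute/1-second window from the issue description, this reduces per-record work from 600 window updates to 1 pane update plus 600 pane merges per emitted window (which the literature reduces further with running-sum tricks).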
[jira] [Commented] (FLINK-9422) Dedicated operator for UNION on streaming tables with time attributes
[ https://issues.apache.org/jira/browse/FLINK-9422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16487968#comment-16487968 ] Rong Rong commented on FLINK-9422: -- +1 on "can apply windowed aggregates on the result". This would make things much easier without the unnecessary "aggregation" > Dedicated operator for UNION on streaming tables with time attributes > - > > Key: FLINK-9422 > URL: https://issues.apache.org/jira/browse/FLINK-9422 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Fabian Hueske >Assignee: Ruidong Li >Priority: Minor > > We can implement a dedicated operator for a {{UNION}} operator on tables with > time attributes. Currently, {{UNION}} is translated into a {{UNION ALL}} and > a subsequent {{GROUP BY}} on all attributes without aggregation functions. > The state of the grouping operator is only clean up using state retention > timers. > The dedicated operator would leverage the monotonicity property of the time > attribute and watermarks to automatically clean up its state. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-9398) Flink CLI list running job returns all jobs except in CREATE state
[ https://issues.apache.org/jira/browse/FLINK-9398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-9398: Assignee: Rong Rong > Flink CLI list running job returns all jobs except in CREATE state > -- > > Key: FLINK-9398 > URL: https://issues.apache.org/jira/browse/FLINK-9398 > Project: Flink > Issue Type: Bug > Components: Client >Affects Versions: 1.5.0 >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > See: > https://github.com/apache/flink/blob/4922ced71a307a26b9f5070b41f72fd5d93b0ac8/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java#L437-L445 > Seems like CLI command: *flink list -r* returns all jobs except jobs in > *CREATE* state, which conflicts with the CLI description: *Running/Restarting > Jobs*. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
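To make the reported mismatch concrete, here is a tiny plain-Java sketch (the JobStatus enum below is a hypothetical subset of Flink's real job states, and runningOrRestarting is a made-up name). It shows the filter the CLI description implies, i.e. keeping only RUNNING/RESTARTING jobs rather than "everything not in CREATE state":

```java
import java.util.List;
import java.util.stream.Collectors;

public class RunningJobsFilter {
    // Hypothetical subset of job states, for illustration only.
    public enum JobStatus { CREATED, RUNNING, RESTARTING, FINISHED, CANCELED, FAILED }

    /** What "flink list -r" arguably should return per the ticket:
     *  only RUNNING/RESTARTING jobs, instead of "everything not CREATED". */
    public static List<JobStatus> runningOrRestarting(List<JobStatus> jobs) {
        return jobs.stream()
                .filter(s -> s == JobStatus.RUNNING || s == JobStatus.RESTARTING)
                .collect(Collectors.toList());
    }
}
```

The buggy behavior described in the issue would correspond to `s != JobStatus.CREATED` as the predicate, which also lets FINISHED, CANCELED, and FAILED jobs through.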
[jira] [Commented] (FLINK-9398) Flink CLI list running job returns all jobs except in CREATE state
[ https://issues.apache.org/jira/browse/FLINK-9398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16480670#comment-16480670 ] Rong Rong commented on FLINK-9398: -- Hi [~yanghua], have you started working on this task? I was hoping to get this in so it can work together with FLINK-8985. If you haven't started working on it, can I assign it to myself? Thanks, Rong > Flink CLI list running job returns all jobs except in CREATE state > -- > > Key: FLINK-9398 > URL: https://issues.apache.org/jira/browse/FLINK-9398 > Project: Flink > Issue Type: Bug > Components: Client >Affects Versions: 1.5.0 >Reporter: Rong Rong >Assignee: vinoyang >Priority: Major > > See: > https://github.com/apache/flink/blob/4922ced71a307a26b9f5070b41f72fd5d93b0ac8/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java#L437-L445 > Seems like CLI command: *flink list -r* returns all jobs except jobs in > *CREATE* state, which conflicts with the CLI description: *Running/Restarting > Jobs*. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9398) Flink CLI list running job returns all jobs except in CREATE state
[ https://issues.apache.org/jira/browse/FLINK-9398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9398: - Description: See: https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java#L437-L445 Seems like CLI command: *flink list -r* returns all jobs except jobs in *CREATE* state. which conflicts with the CLI description: *Running/Restarting Jobs*. was: See: https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java#L437-L445 Seems like CLI command: *flink list -r* returns all jobs except jobs in *CREATE* state. which conflicts with the CLI description: *--Running/Restarting Jobs--*. > Flink CLI list running job returns all jobs except in CREATE state > -- > > Key: FLINK-9398 > URL: https://issues.apache.org/jira/browse/FLINK-9398 > Project: Flink > Issue Type: Bug > Components: Client >Affects Versions: 1.5.0 >Reporter: Rong Rong >Priority: Major > > See: > https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java#L437-L445 > Seems like CLI command: *flink list -r* returns all jobs except jobs in > *CREATE* state. which conflicts with the CLI description: *Running/Restarting > Jobs*. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9398) Flink CLI list running job returns all jobs except in CREATE state
Rong Rong created FLINK-9398: Summary: Flink CLI list running job returns all jobs except in CREATE state Key: FLINK-9398 URL: https://issues.apache.org/jira/browse/FLINK-9398 Project: Flink Issue Type: Bug Components: Client Affects Versions: 1.5.0 Reporter: Rong Rong See: https://github.com/apache/flink/blob/master/flink-clients/src/main/java/org/apache/flink/client/cli/CliFrontend.java#L437-L445 Seems like CLI command: *flink list -r* returns all jobs except jobs in *CREATE* state. which conflicts with the CLI description: *--Running/Restarting Jobs--*. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8866) Create unified interfaces to configure and instatiate TableSinks
[ https://issues.apache.org/jira/browse/FLINK-8866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16475177#comment-16475177 ] Rong Rong commented on FLINK-8866: -- +1 on points (1) and (2); I found a huge chunk of components being copied when trying to create the TableSink[Factory/FactoryService] component with the current architecture. Regarding (3), I think there might be some SinkFunctions where fieldName and fieldType are necessary to validate during initialization of the sink function (such as the JDBC sink, where the underlying JDBC driver is loaded at runtime, I believe). What do you think? Should we still consider having them as an *optional* part of the configuration? > Create unified interfaces to configure and instatiate TableSinks > > > Key: FLINK-8866 > URL: https://issues.apache.org/jira/browse/FLINK-8866 > Project: Flink > Issue Type: New Feature > Components: Table API SQL >Reporter: Timo Walther >Assignee: Shuyi Chen >Priority: Major > > Similar to the efforts done in FLINK-8240. We need unified ways to configure > and instantiate TableSinks. Among other applications, this is necessary in > order to declare table sinks in an environment file of the SQL client. Such > that the sink can be used for {{INSERT INTO}} statements. > Below are a few major changes in mind. > 1) Add TableSinkFactory/TableSinkFactoryService similar to > TableSourceFactory/TableSourceFactoryService > 2) Add a common property called "type" with values (source, sink and both) > for both TableSource and TableSink. > 3) in yaml file, replace "sources" with "tables", and use tableType to > identify whether it's source or sink. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8866) Create unified interfaces to configure and instatiate TableSinks
[ https://issues.apache.org/jira/browse/FLINK-8866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16474463#comment-16474463 ] Rong Rong commented on FLINK-8866: -- I think they should be designed to be used in both source and sink cases. It doesn't really make sense to separate them, at least not in [~twalthr]'s Kafka example, and I don't really think there's any difference for others either. The only thing I can think of is a security/authorization concern: e.g. whether sharing configurations for read-only access and write access in the same location is safe or not. [~suez1224], what do you think? I haven't implemented the "both" part; this diff is merely trying to clear up some questions I had while implementing FLINK-8880. Also, there are some components which I think can definitely be unified at the Table API level, such as the Source/Sink factory service. That would make the APIs more consistent. What do you guys think? -- Rong > Create unified interfaces to configure and instatiate TableSinks > > > Key: FLINK-8866 > URL: https://issues.apache.org/jira/browse/FLINK-8866 > Project: Flink > Issue Type: New Feature > Components: Table API SQL >Reporter: Timo Walther >Assignee: Shuyi Chen >Priority: Major > > Similar to the efforts done in FLINK-8240. We need unified ways to configure > and instantiate TableSinks. Among other applications, this is necessary in > order to declare table sinks in an environment file of the SQL client. Such > that the sink can be used for {{INSERT INTO}} statements. > Below are a few major changes in mind. > 1) Add TableSinkFactory/TableSinkFactoryService similar to > TableSourceFactory/TableSourceFactoryService > 2) Add a common property called "type" with values (source, sink and both) > for both TableSource and TableSink. > 3) in yaml file, replace "sources" with "tables", and use tableType to > identify whether it's source or sink. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8866) Create unified interfaces to configure and instatiate TableSinks
[ https://issues.apache.org/jira/browse/FLINK-8866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16472571#comment-16472571 ] Rong Rong commented on FLINK-8866: -- Hi [~suez1224], I was working on FLINK-8880 and thought it would be great to have some understanding of how this JIRA is going to be implemented, since some of the functionality in the description is already implemented in FLINK-9059. I have a preliminary version that only supports "source" and "sink" and was wondering if this could be part of the solution: https://github.com/apache/flink/compare/master...walterddr:FLINK-8866 Please kindly take a look, many thanks Rong > Create unified interfaces to configure and instatiate TableSinks > > > Key: FLINK-8866 > URL: https://issues.apache.org/jira/browse/FLINK-8866 > Project: Flink > Issue Type: New Feature > Components: Table API SQL >Reporter: Timo Walther >Assignee: Shuyi Chen >Priority: Major > > Similar to the efforts done in FLINK-8240. We need unified ways to configure > and instantiate TableSinks. Among other applications, this is necessary in > order to declare table sinks in an environment file of the SQL client. Such > that the sink can be used for {{INSERT INTO}} statements. > Below are a few major changes in mind. > 1) Add TableSinkFactory/TableSinkFactoryService similar to > TableSourceFactory/TableSourceFactoryService > 2) Add a common property called "type" with values (source, sink and both) > for both TableSource and TableSink. > 3) in yaml file, replace "sources" with "tables", and use tableType to > identify whether it's source or sink. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-7001) Improve performance of Sliding Time Window with pane optimization
[ https://issues.apache.org/jira/browse/FLINK-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466014#comment-16466014 ] Rong Rong commented on FLINK-7001: -- Thanks [~pgrulich], this is definitely a great solution for handling high-frequency, long-length sliding windows. I briefly went over the paper and have a few questions regarding the use cases and compatibility. * The non-overlapping slice separator + slice manager approach is very elegant for saving memory buffer usage, and having a sole slice manager handle out-of-order messages by updating the slices in store is definitely great. I share [~StephanEwen]'s concern on this, especially the backward and RocksDB compatibility. * Another point is the partial aggregates vs. final aggregates complexity. There is little discussion in the paper regarding the "window manager", and the assumption seems to be that the final aggregate over the partial results has the same time/space complexity as the partial aggregates. Most of the built-in aggregate functions we currently have in Flink satisfy this assumption; however, there are some complex aggregate functions whose "merge" method might be much more expensive than their "accumulate" method. Would we have to consider the trade-off between these two approaches? * https://issues.apache.org/jira/browse/FLINK-5387 seems to suggest there are trade-offs when using aligned-window approaches. We can probably extend the discussions here. Thanks, Rong > Improve performance of Sliding Time Window with pane optimization > - > > Key: FLINK-7001 > URL: https://issues.apache.org/jira/browse/FLINK-7001 > Project: Flink > Issue Type: Improvement > Components: DataStream API >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Major > > Currently, the implementation of time-based sliding windows treats each > window individually and replicates records to each window. 
For a window of 10 > minute size that slides by 1 second the data is replicated 600 fold (10 > minutes / 1 second). We can optimize sliding window by divide windows into > panes (aligned with slide), so that we can avoid record duplication and > leverage the checkpoint. > I will attach a more detail design doc to the issue. > The following issues are similar to this issue: FLINK-5387, FLINK-6990 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-7001) Improve performance of Sliding Time Window with pane optimization
[ https://issues.apache.org/jira/browse/FLINK-7001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464239#comment-16464239 ] Rong Rong commented on FLINK-7001: -- Hi [~jark], Is there a FLIP being proposed based on the pain points discussed in here? This inefficiency in windowing has been observed more and more frequently in our day-to-day operations lately. We would like to contribute to the design and the implementation of this improvement if possible :-) Thanks, Rong > Improve performance of Sliding Time Window with pane optimization > - > > Key: FLINK-7001 > URL: https://issues.apache.org/jira/browse/FLINK-7001 > Project: Flink > Issue Type: Improvement > Components: DataStream API >Reporter: Jark Wu >Assignee: Jark Wu >Priority: Major > > Currently, the implementation of time-based sliding windows treats each > window individually and replicates records to each window. For a window of 10 > minute size that slides by 1 second the data is replicated 600 fold (10 > minutes / 1 second). We can optimize sliding window by divide windows into > panes (aligned with slide), so that we can avoid record duplication and > leverage the checkpoint. > I will attach a more detail design doc to the issue. > The following issues are similar to this issue: FLINK-5387, FLINK-6990 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
[ https://issues.apache.org/jira/browse/FLINK-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9294: - Description: Most of the UDF function signatures that include composite types such as *{{MAP}}*, *{{ARRAY}}*, etc. require the user to override the *{{getParameterType}}* or *{{getResultType}}* method explicitly. It should be possible to resolve the composite type based on the function signature, such as: {code:java} public List eval(Map mapArg) { //... } {code} should automatically resolve that: - *{{ObjectArrayTypeInfo}}* to be the result type. - *{{MapTypeInfo}}* to be the parameter type. was: Most of the UDF function signatures that include composite types such as *{{MAP}}*, *{{ARRAY}}*, etc. require the user to override the *{{getParameterType}}* or *{{getResultType}}* method explicitly. It should be possible to resolve the composite type based on the function signature, such as: {code:java} public List eval(Map mapArg) { //... } {code} should automatically resolve *{{ObjectArrayTypeInfo}}* & *{{MapTypeInfo}}* to be the result type and parameter type. > Improve type inference for UDFs with composite parameter or result type > > > Key: FLINK-9294 > URL: https://issues.apache.org/jira/browse/FLINK-9294 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Most of the UDF function signatures that include composite types such as > *{{MAP}}*, *{{ARRAY}}*, etc. require the user to override > the *{{getParameterType}}* or *{{getResultType}}* method explicitly. > It should be possible to resolve the composite type based on the function > signature, such as: > {code:java} > public List eval(Map mapArg) { > //... > } > {code} > should automatically resolve that: > - *{{ObjectArrayTypeInfo}}* to be the result type. > - *{{MapTypeInfo}}* to be the > parameter type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
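For readers curious how the automatic resolution could work mechanically: the generic parameter and return types the ticket wants (e.g. the key/value types behind MapTypeInfo and the element type behind ObjectArrayTypeInfo) are recoverable from the method signature with plain Java reflection. This is a dependency-free sketch, not Flink's actual implementation; the class and helper names are made up:

```java
import java.lang.reflect.Method;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;
import java.util.Map;

public class CompositeTypeInference {
    // Hypothetical eval signature, shaped like the ticket's example.
    public static List<Integer> eval(Map<String, Integer> mapArg) {
        return null; // body irrelevant; only the signature matters here
    }

    /** Declared key/value types of the Map parameter, i.e. the raw
     *  material from which a MapTypeInfo could be built. */
    public static Type[] mapKeyValueTypes() {
        try {
            Method m = CompositeTypeInference.class.getMethod("eval", Map.class);
            ParameterizedType p = (ParameterizedType) m.getGenericParameterTypes()[0];
            return p.getActualTypeArguments();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Element type of the List return type, from which an
     *  ObjectArrayTypeInfo could be built. */
    public static Type listElementType() {
        try {
            Method m = CompositeTypeInference.class.getMethod("eval", Map.class);
            ParameterizedType r = (ParameterizedType) m.getGenericReturnType();
            return r.getActualTypeArguments()[0];
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

This works only when the signature is fully parameterized; a raw `Map` argument erases the type arguments, which is exactly the ANY-type case handled by FLINK-9501.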
[jira] [Updated] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
[ https://issues.apache.org/jira/browse/FLINK-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9294: - Description: Most of the UDF function signatures that includes composite types such as *{{MAP}}*, *{{ARRAY}}*, etc would require user to override *{{getParameterType}}* or *{{getResultType}}* method explicitly. It should be able to resolve the composite type based on the function signature, such as: {code:java} public List eval(MapmapArg) { //... } {code} should automatically resolve *{{ObjectArrayTypeInfo}}* & *{{MapTypeInfo }}* to be the result type and parameter type. was: Most of the UDF function signatures that includes composite types such as *{{MAP}}*, *{{ARRAY}}*, etc would require user to override *{{getParameterType}}* or *{{getResultType}}* method explicitly. It should be able to resolve the composite type based on the function signature. > Improve type inference for UDFs with composite parameter or result type > > > Key: FLINK-9294 > URL: https://issues.apache.org/jira/browse/FLINK-9294 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Most of the UDF function signatures that includes composite types such as > *{{MAP}}*, *{{ARRAY}}*, etc would require user to override > *{{getParameterType}}* or *{{getResultType}}* method explicitly. > It should be able to resolve the composite type based on the function > signature, such as: > {code:java} > public List eval(Map mapArg) { > //... > } > {code} > should automatically resolve *{{ObjectArrayTypeInfo}}* > & *{{MapTypeInfo }}* to be the > result type and parameter type. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
[ https://issues.apache.org/jira/browse/FLINK-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9294: - Description: Most of the UDF function signatures that includes composite types such as *{{MAP}}*, *{{ARRAY}}*, etc would require user to override *{{getParameterType}}* or *{{getResultType}}* method explicitly. It should be able to resolve the composite type based on the function signature. was: Most of the UDF function signatures that includes composite types such as {quote}MAP{quote}, {quote}ARRAY{quote}, etc would require user to override {code:java}getParameterType{code} or {code:java}getResultType{code} method explicitly. It should be able to resolve the composite type based on the function signature. > Improve type inference for UDFs with composite parameter or result type > > > Key: FLINK-9294 > URL: https://issues.apache.org/jira/browse/FLINK-9294 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Rong Rong >Priority: Major > > Most of the UDF function signatures that includes composite types such as > *{{MAP}}*, *{{ARRAY}}*, etc would require user to override > *{{getParameterType}}* or *{{getResultType}}* method explicitly. > It should be able to resolve the composite type based on the function > signature. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
[ https://issues.apache.org/jira/browse/FLINK-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-9294: Assignee: Rong Rong > Improve type inference for UDFs with composite parameter or result type > > > Key: FLINK-9294 > URL: https://issues.apache.org/jira/browse/FLINK-9294 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Most of the UDF function signatures that includes composite types such as > *{{MAP}}*, *{{ARRAY}}*, etc would require user to override > *{{getParameterType}}* or *{{getResultType}}* method explicitly. > It should be able to resolve the composite type based on the function > signature. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
[ https://issues.apache.org/jira/browse/FLINK-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9294: - Description: Most of the UDF function signatures that includes composite types such as {quote}MAP{quote}, {quote}ARRAY{quote}, etc would require user to override {code:java}getParameterType{code} or {code:java}getResultType{code} method explicitly. It should be able to resolve the composite type based on the function signature. was: Most of the UDF function signatures that includes composite types such as {code:java}MAP{code}, {code:java}ARRAY{code}, etc would require user to override {code:java}getParameterType{code} or {code:java}getResultType{code} method explicitly. It should be able to resolve the composite type based on the function signature. > Improve type inference for UDFs with composite parameter or result type > > > Key: FLINK-9294 > URL: https://issues.apache.org/jira/browse/FLINK-9294 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Rong Rong >Priority: Major > > Most of the UDF function signatures that includes composite types such as > {quote}MAP{quote}, {quote}ARRAY{quote}, etc would require user to override > {code:java}getParameterType{code} or {code:java}getResultType{code} method > explicitly. > It should be able to resolve the composite type based on the function > signature. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
[ https://issues.apache.org/jira/browse/FLINK-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9294: - Description: Most of the UDF function signatures that includes composite types such as {code:java}MAP{code}, {code:java}ARRAY{code}, etc would require user to override {code:java}getParameterType{code} or {code:java}getResultType{code} method explicitly. It should be able to resolve the composite type based on the function signature. was: Most of the UDF function signatures that includes composite types such as {code}MAP{/code} {{ARRAY}}, etc would require user to override {{getParameterType}} or {{getResultType}} method explicitly. It should be able to resolve the composite type based on the function signature. > Improve type inference for UDFs with composite parameter or result type > > > Key: FLINK-9294 > URL: https://issues.apache.org/jira/browse/FLINK-9294 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Rong Rong >Priority: Major > > Most of the UDF function signatures that includes composite types such as > {code:java}MAP{code}, {code:java}ARRAY{code}, etc would require user to > override {code:java}getParameterType{code} or {code:java}getResultType{code} > method explicitly. > It should be able to resolve the composite type based on the function > signature. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
[ https://issues.apache.org/jira/browse/FLINK-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9294: - Description: Most of the UDF function signatures that includes composite types such as {code}MAP{/code} {{ARRAY}}, etc would require user to override {{getParameterType}} or {{getResultType}} method explicitly. It should be able to resolve the composite type based on the function signature. was: For now most of the UDF function signatures that includes composite types such as {{MAP}} {{ARRAY}}, etc would require user to override {{getParameterType}} or {{getResultType}} method explicitly. It should be able to resolve the composite type based on the function signature. > Improve type inference for UDFs with composite parameter or result type > > > Key: FLINK-9294 > URL: https://issues.apache.org/jira/browse/FLINK-9294 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Rong Rong >Priority: Major > > Most of the UDF function signatures that includes composite types such as > {code}MAP{/code} {{ARRAY}}, etc would require user to override > {{getParameterType}} or {{getResultType}} method explicitly. > It should be able to resolve the composite type based on the function > signature. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
[ https://issues.apache.org/jira/browse/FLINK-9294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9294: - Description: For now most of the UDF function signatures that includes composite types such as {{MAP}} {{ARRAY}}, etc would require user to override {{getParameterType}} or {{getResultType}} method explicitly. It should be able to resolve the composite type based on the function signature. was: For now most of the UDF function signatures that includes composite types such as `MAP` `ARRAY`, etc would require user to override `getParameterType` or `getResultType` method explicitly. It should be able to resolve the composite type based on the function signature. > Improve type inference for UDFs with composite parameter or result type > > > Key: FLINK-9294 > URL: https://issues.apache.org/jira/browse/FLINK-9294 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Rong Rong >Priority: Major > > For now most of the UDF function signatures that includes composite types > such as {{MAP}} {{ARRAY}}, etc would require user to override > {{getParameterType}} or {{getResultType}} method explicitly. > It should be able to resolve the composite type based on the function > signature. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9294) Improve type inference for UDFs with composite parameter or result type
Rong Rong created FLINK-9294: Summary: Improve type inference for UDFs with composite parameter or result type Key: FLINK-9294 URL: https://issues.apache.org/jira/browse/FLINK-9294 Project: Flink Issue Type: Improvement Components: Table API SQL Reporter: Rong Rong For now, most of the UDF function signatures that include composite types such as `MAP`, `ARRAY`, etc. require the user to override the `getParameterType` or `getResultType` method explicitly. It should be possible to resolve the composite type based on the function signature. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-7144) Optimize multiple LogicalAggregate into one
[ https://issues.apache.org/jira/browse/FLINK-7144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457685#comment-16457685 ] Rong Rong commented on FLINK-7144: -- FLINK-8689 has been merged and FLINK-8690 is on the way. Will fix this one as well :-) > Optimize multiple LogicalAggregate into one > --- > > Key: FLINK-7144 > URL: https://issues.apache.org/jira/browse/FLINK-7144 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Jark Wu >Assignee: Rong Rong >Priority: Major > > When applying multiple GROUP BYs, if there are no aggregates or expressions in the first > GROUP BY, and the second GROUP BY's fields are a subset of the first's, > then the first GROUP BY can be removed. > For example, the following SQL > {code} > SELECT a FROM (SELECT a,b,c FROM MyTable GROUP BY a, b, c) GROUP BY a > {code} > should be optimized into > {code} > DataStreamGroupAggregate(groupBy=[a], select=[a]) > DataStreamCalc(select=[a]) > DataStreamScan(table=[[_DataStreamTable_0]]) > {code} > but currently yields: > {code} > DataStreamGroupAggregate(groupBy=[a], select=[a]) > DataStreamCalc(select=[a]) > DataStreamGroupAggregate(groupBy=[a, b, c], select=[a, b, c]) > DataStreamScan(table=[[_DataStreamTable_0]]) > {code} > I looked for the Calcite built-in rules but couldn't find a matching one. So maybe > we should implement one, possibly in Calcite itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
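The removal condition described in FLINK-7144 can be sketched as a small predicate: the inner aggregate of a nested GROUP BY is removable when it computes no aggregate functions and every outer grouping key is already an inner grouping key, so the inner de-duplication is subsumed by the outer one. This is a hedged illustration with invented names, not actual Calcite rule code:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RedundantGroupBySketch {
    // True when the inner GROUP BY of a nested aggregation can be dropped:
    // it computes no aggregate calls, and the outer grouping keys are a
    // subset of the inner grouping keys.
    static boolean innerAggregateRemovable(Set<String> innerKeys,
                                           List<String> innerAggCalls,
                                           Set<String> outerKeys) {
        return innerAggCalls.isEmpty() && innerKeys.containsAll(outerKeys);
    }

    public static void main(String[] args) {
        // SELECT a FROM (SELECT a,b,c FROM MyTable GROUP BY a, b, c) GROUP BY a
        Set<String> inner = new HashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> outer = new HashSet<>(Collections.singleton("a"));
        System.out.println(innerAggregateRemovable(inner, Collections.emptyList(), outer)); // true
    }
}
```

A real planner rule would additionally rewire the plan (replacing the inner aggregate with its input), which this predicate deliberately leaves out.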
[jira] [Assigned] (FLINK-7144) Optimize multiple LogicalAggregate into one
[ https://issues.apache.org/jira/browse/FLINK-7144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-7144: Assignee: Rong Rong > Optimize multiple LogicalAggregate into one > --- > > Key: FLINK-7144 > URL: https://issues.apache.org/jira/browse/FLINK-7144 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Jark Wu >Assignee: Rong Rong >Priority: Major > > When applying multiple GROUP BYs, if there are no aggregates or expressions in the first > GROUP BY, and the second GROUP BY's fields are a subset of the first's, > then the first GROUP BY can be removed. > For example, the following SQL > {code} > SELECT a FROM (SELECT a,b,c FROM MyTable GROUP BY a, b, c) GROUP BY a > {code} > should be optimized into > {code} > DataStreamGroupAggregate(groupBy=[a], select=[a]) > DataStreamCalc(select=[a]) > DataStreamScan(table=[[_DataStreamTable_0]]) > {code} > but currently yields: > {code} > DataStreamGroupAggregate(groupBy=[a], select=[a]) > DataStreamCalc(select=[a]) > DataStreamGroupAggregate(groupBy=[a, b, c], select=[a, b, c]) > DataStreamScan(table=[[_DataStreamTable_0]]) > {code} > I looked for the Calcite built-in rules but couldn't find a matching one. So maybe > we should implement one, possibly in Calcite itself. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457679#comment-16457679 ] Rong Rong commented on FLINK-8690: -- Thanks for the suggestion [~fhueske]. I think this is much cleaner than having 2 logical plans, since they are supposed to be "logical" LOL. I created the PR based on your suggestions and it looks pretty darn good covering all the cases (and potentially it also clears the way to support distinct group aggregation w/o time window in the future). Please take a look when you have time :-) > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregates. > We are proposing to reuse the distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate* to support unbounded distinct aggregation on > DataStream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8986) End-to-end test: REST
[ https://issues.apache.org/jira/browse/FLINK-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16455771#comment-16455771 ] Rong Rong commented on FLINK-8986: -- Hi [~Zentol], I addressed most of your comments and created another test. Please see: https://github.com/walterddr/flink/compare/FLINK-8985...walterddr:FLINK-8986-test The idea is to get all the trigger IDs and all ingredients and then run through all of MessageHeaders with Path/Query param and RequestBody. It seems like all tests are passing. Please take another look when you have time. > End-to-end test: REST > - > > Key: FLINK-8986 > URL: https://issues.apache.org/jira/browse/FLINK-8986 > Project: Flink > Issue Type: Sub-task > Components: REST, Tests >Reporter: Till Rohrmann >Assignee: Rong Rong >Priority: Major > > We should add an end-to-end test which verifies that we can use the REST > interface to obtain information about a running job. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-9232) Add harness test for AggregationCodeGenerator
[ https://issues.apache.org/jira/browse/FLINK-9232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-9232: Assignee: Rong Rong > Add harness test for AggregationCodeGenerator > -- > > Key: FLINK-9232 > URL: https://issues.apache.org/jira/browse/FLINK-9232 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Instead of relying on ITCases to cover the codegen result, we should have a > direct test against it, for example using the harness test framework. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9232) Add harness test for AggregationCodeGenerator
Rong Rong created FLINK-9232: Summary: Add harness test for AggregationCodeGenerator Key: FLINK-9232 URL: https://issues.apache.org/jira/browse/FLINK-9232 Project: Flink Issue Type: Sub-task Reporter: Rong Rong Instead of relying on ITCases to cover the codegen result, we should have a direct test against it, for example using the harness test framework. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8739) Optimize runtime support for distinct filter
[ https://issues.apache.org/jira/browse/FLINK-8739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-8739: - Description: Possible optimizations: 1. Decouple the distinct map and the actual accumulator so that they can be created separately in codegen. 2. Reuse the same distinct accumulator for filtering, e.g. `SELECT COUNT(DISTINCT(a)), SUM(DISTINCT(a))` should reuse the same distinct map. > Optimize runtime support for distinct filter > > > Key: FLINK-8739 > URL: https://issues.apache.org/jira/browse/FLINK-8739 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Priority: Major > > Possible optimizations: > 1. Decouple the distinct map and the actual accumulator so that they can > be created separately in codegen. > 2. Reuse the same distinct accumulator for filtering, e.g. `SELECT > COUNT(DISTINCT(a)), SUM(DISTINCT(a))` should reuse the same distinct map. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
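Point 2 of FLINK-8739 — sharing one distinct map across `COUNT(DISTINCT(a))` and `SUM(DISTINCT(a))` — can be illustrated with a plain-Java accumulator that keeps reference counts per value and only updates the downstream aggregates on a value's first appearance (and last retraction). The class and field names are invented for illustration; this is not the actual Flink codegen output:

```java
import java.util.HashMap;
import java.util.Map;

public class SharedDistinctAccumulator {
    // One distinct map backs both aggregates instead of one map per aggregate.
    private final Map<Long, Integer> distinctCounts = new HashMap<>();
    long distinctCount; // COUNT(DISTINCT a)
    long distinctSum;   // SUM(DISTINCT a)

    void accumulate(long a) {
        // Downstream aggregates only fire the first time a value is seen.
        if (distinctCounts.merge(a, 1, Integer::sum) == 1) {
            distinctCount++;
            distinctSum += a;
        }
    }

    void retract(long a) {
        // Undo the value's contribution once its reference count hits zero.
        if (distinctCounts.merge(a, -1, Integer::sum) == 0) {
            distinctCounts.remove(a);
            distinctCount--;
            distinctSum -= a;
        }
    }

    public static void main(String[] args) {
        SharedDistinctAccumulator acc = new SharedDistinctAccumulator();
        for (long v : new long[]{1, 2, 2, 3, 3, 3}) acc.accumulate(v);
        System.out.println(acc.distinctCount + " " + acc.distinctSum); // 3 6
    }
}
```

The reference counts make retraction safe on streams: a duplicate retraction only drops the value once the last copy is retracted.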
[jira] [Updated] (FLINK-8739) Optimize runtime support for distinct filter
[ https://issues.apache.org/jira/browse/FLINK-8739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-8739: - Summary: Optimize runtime support for distinct filter (was: Optimize runtime support for distinct filter to reuse same distinct accumulator for filtering) > Optimize runtime support for distinct filter > > > Key: FLINK-8739 > URL: https://issues.apache.org/jira/browse/FLINK-8739 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-9172) Support external catalog factory that comes default with SQL-Client
[ https://issues.apache.org/jira/browse/FLINK-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-9172: Assignee: Rong Rong > Support external catalog factory that comes default with SQL-Client > --- > > Key: FLINK-9172 > URL: https://issues.apache.org/jira/browse/FLINK-9172 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > It would be great to have the SQL-Client support some external catalogs > out-of-the-box for SQL users to configure and utilize easily. I am currently > thinking of having an external catalog factory that spins up both streaming and > batch external catalog table sources and sinks. This could greatly unify and > provide easy access for SQL users. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9199) Malfunctioning URL target in some messageheaders
[ https://issues.apache.org/jira/browse/FLINK-9199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9199: - Description: So far the following have URL exceptions based on FLINK-8986 test results. `SubtaskExecutionAttemptAccumulatorsHeaders` -> missing `:` `SubtaskExecutionAttemptDetailsHeaders` -> missing `:` `AggregatedSubtaskMetricsHeaders` -> missing `:`, extra path param not used. > Malfunctioning URL target in some messageheaders > > > Key: FLINK-9199 > URL: https://issues.apache.org/jira/browse/FLINK-9199 > Project: Flink > Issue Type: Bug > Components: REST >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > So far the following have URL exceptions based on FLINK-8986 test results. > `SubtaskExecutionAttemptAccumulatorsHeaders` -> missing `:` > `SubtaskExecutionAttemptDetailsHeaders` -> missing `:` > `AggregatedSubtaskMetricsHeaders` -> missing `:`, extra path param not used. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
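The "missing `:`" failures listed in FLINK-9199 stem from how REST URL templates mark path parameters: a segment written as `:jobid` gets substituted with a concrete value at request time, while the same segment written without the leading colon is treated as a literal and is never resolved. A minimal sketch of that substitution rule (not the actual Flink handler code; the template strings are illustrative):

```java
public class PathTemplateSketch {
    // Replace a ':'-prefixed path parameter with its value. A template that
    // forgot the ':' is left untouched, which is the bug class the issue
    // describes: the URL still contains the literal parameter name.
    static String resolve(String template, String param, String value) {
        return template.replace(":" + param, value);
    }

    public static void main(String[] args) {
        // Correct template: the parameter is substituted.
        System.out.println(resolve("/jobs/:jobid/vertices/:vertexid", "jobid", "abc123"));
        // Missing ':' in the template: the parameter is never substituted.
        System.out.println(resolve("/jobs/jobid/vertices", "jobid", "abc123"));
    }
}
```

An end-to-end test like FLINK-8986's catches this immediately, because the unresolved literal segment produces a 404 instead of a match.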
[jira] [Updated] (FLINK-9199) Malfunctioning URL target in some messageheaders
[ https://issues.apache.org/jira/browse/FLINK-9199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9199: - Summary: Malfunctioning URL target in some messageheaders (was: SubtaskExecutionAttemptAccumulatorsHeaders & SubtaskExecutionAttemptDetailsHeaders has malfunctioning URL) > Malfunctioning URL target in some messageheaders > > > Key: FLINK-9199 > URL: https://issues.apache.org/jira/browse/FLINK-9199 > Project: Flink > Issue Type: Bug > Components: REST >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-9199) SubtaskExecutionAttemptAccumulatorsHeaders & SubtaskExecutionAttemptDetailsHeaders has malfunctioning URL
[ https://issues.apache.org/jira/browse/FLINK-9199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-9199: Assignee: Rong Rong > SubtaskExecutionAttemptAccumulatorsHeaders & > SubtaskExecutionAttemptDetailsHeaders has malfunctioning URL > - > > Key: FLINK-9199 > URL: https://issues.apache.org/jira/browse/FLINK-9199 > Project: Flink > Issue Type: Bug > Components: REST >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9199) SubtaskExecutionAttemptAccumulatorsHeaders & SubtaskExecutionAttemptDetailsHeaders has malfunctioning URL
Rong Rong created FLINK-9199: Summary: SubtaskExecutionAttemptAccumulatorsHeaders & SubtaskExecutionAttemptDetailsHeaders has malfunctioning URL Key: FLINK-9199 URL: https://issues.apache.org/jira/browse/FLINK-9199 Project: Flink Issue Type: Bug Components: REST Reporter: Rong Rong -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9157) Support Apache HCatalog in SQL client
[ https://issues.apache.org/jira/browse/FLINK-9157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16438151#comment-16438151 ] Rong Rong commented on FLINK-9157: -- Hi [~dmitry_kober], yes, that would be great! For now, feel free to take a look at the related tasks. It is currently blocked by FLINK-9170. Once the integration with the Table/SQL API is done, we can probably start with the integration. > Support Apache HCatalog in SQL client > - > > Key: FLINK-9157 > URL: https://issues.apache.org/jira/browse/FLINK-9157 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Priority: Major > > Having the SQL-Client support its first external catalog, Apache HCatalog, out of > the box. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9172) Support external catalog factory that comes default with SQL-Client
[ https://issues.apache.org/jira/browse/FLINK-9172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9172: - Issue Type: Sub-task (was: New Feature) Parent: FLINK-7594 > Support external catalog factory that comes default with SQL-Client > --- > > Key: FLINK-9172 > URL: https://issues.apache.org/jira/browse/FLINK-9172 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Priority: Major > > It would be great to have the SQL-Client support some external catalogs > out-of-the-box for SQL users to configure and utilize easily. I am currently > thinking of having an external catalog factory that spins up both streaming and > batch external catalog table sources and sinks. This could greatly unify and > provide easy access for SQL users. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9172) Support external catalog factory that comes default with SQL-Client
Rong Rong created FLINK-9172: Summary: Support external catalog factory that comes default with SQL-Client Key: FLINK-9172 URL: https://issues.apache.org/jira/browse/FLINK-9172 Project: Flink Issue Type: New Feature Reporter: Rong Rong It would be great to have the SQL-Client support some external catalogs out-of-the-box for SQL users to configure and utilize easily. I am currently thinking of having an external catalog factory that spins up both streaming and batch external catalog table sources and sinks. This could greatly unify and provide easy access for SQL users. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9157) Support Apache HCatalog in SQL client
[ https://issues.apache.org/jira/browse/FLINK-9157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9157: - Description: Having SQL-Client to support first external catalog: Apache HCatalog out of the box. (was: It will be great to have SQL-Client to support some external catalogs out-of-the-box for SQL users to configure and utilize easily. Such as Apache HCatalog. ) > Support Apache HCatalog in SQL client > - > > Key: FLINK-9157 > URL: https://issues.apache.org/jira/browse/FLINK-9157 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Priority: Major > > Having SQL-Client to support first external catalog: Apache HCatalog out of > the box. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-9157) Support for commonly used external catalog
[ https://issues.apache.org/jira/browse/FLINK-9157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-9157: - Summary: Support for commonly used external catalog (was: Create support for commonly used external catalog) > Support for commonly used external catalog > -- > > Key: FLINK-9157 > URL: https://issues.apache.org/jira/browse/FLINK-9157 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Rong Rong >Priority: Major > > It will be great to have SQL-Client to support some external catalogs > out-of-the-box for SQL users to configure and utilize easily. Such as Apache > HCatalog. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (FLINK-9157) Create support for commonly used external catalog
Rong Rong created FLINK-9157: Summary: Create support for commonly used external catalog Key: FLINK-9157 URL: https://issues.apache.org/jira/browse/FLINK-9157 Project: Flink Issue Type: Sub-task Components: Table API SQL Reporter: Rong Rong It would be great to have the SQL-Client support some external catalogs, such as Apache HCatalog, out-of-the-box for SQL users to configure and utilize easily. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-8880) Validate configurations for SQL Client
[ https://issues.apache.org/jira/browse/FLINK-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-8880: Assignee: Rong Rong > Validate configurations for SQL Client > -- > > Key: FLINK-8880 > URL: https://issues.apache.org/jira/browse/FLINK-8880 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Xingcan Cui >Assignee: Rong Rong >Priority: Major > > Currently, the configuration items for SQL client are stored in maps and > accessed with default values. They should be validated when creating the > client. Also, logger warnings should be shown when using default values. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-9104) Re-generate REST API documentation for FLIP-6
[ https://issues.apache.org/jira/browse/FLINK-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419211#comment-16419211 ] Rong Rong edited comment on FLINK-9104 at 3/29/18 3:37 PM: --- Seems like several handlers are missing. I will try going down the path [~till.rohrmann] suggested and track down all newly added APIs using derived classes from {{MessageHeaders}}. was (Author: walterddr): Seems like several handlers are missing. I will try going down the path [~till.rohrmann] suggested and track down all newly added APIs from {{MessageHeaders}}. > Re-generate REST API documentation for FLIP-6 > -- > > Key: FLINK-9104 > URL: https://issues.apache.org/jira/browse/FLINK-9104 > Project: Flink > Issue Type: Bug > Components: REST >Affects Versions: 1.5.0 >Reporter: Gary Yao >Assignee: Rong Rong >Priority: Blocker > Labels: flip-6 > > The API documentation is missing for several handlers, e.g., > {{SavepointHandlers}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-9104) Re-generate REST API documentation for FLIP-6
[ https://issues.apache.org/jira/browse/FLINK-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419211#comment-16419211 ] Rong Rong commented on FLINK-9104: -- Seems like several handlers are missing. I will try going down the path [~till.rohrmann] suggested and track down all newly added APIs from {{MessageHeaders}}. > Re-generate REST API documentation for FLIP-6 > -- > > Key: FLINK-9104 > URL: https://issues.apache.org/jira/browse/FLINK-9104 > Project: Flink > Issue Type: Bug > Components: REST >Affects Versions: 1.5.0 >Reporter: Gary Yao >Assignee: Rong Rong >Priority: Blocker > Labels: flip-6 > > The API documentation is missing for several handlers, e.g., > {{SavepointHandlers}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-9104) Re-generate REST API documentation for FLIP-6
[ https://issues.apache.org/jira/browse/FLINK-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-9104: Assignee: Rong Rong > Re-generate REST API documentation for FLIP-6 > -- > > Key: FLINK-9104 > URL: https://issues.apache.org/jira/browse/FLINK-9104 > Project: Flink > Issue Type: Bug > Components: REST >Affects Versions: 1.5.0 >Reporter: Gary Yao >Assignee: Rong Rong >Priority: Blocker > Labels: flip-6 > > The API documentation is missing for several handlers, e.g., > {{SavepointHandlers}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8986) End-to-end test: REST
[ https://issues.apache.org/jira/browse/FLINK-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419207#comment-16419207 ] Rong Rong commented on FLINK-8986: -- Thanks [~till.rohrmann] for the info. Yup I can upgrade the doc first then add e2e test over all REST calls. Much appreciate the pointers. > End-to-end test: REST > - > > Key: FLINK-8986 > URL: https://issues.apache.org/jira/browse/FLINK-8986 > Project: Flink > Issue Type: Sub-task > Components: REST, Tests >Reporter: Till Rohrmann >Assignee: Rong Rong >Priority: Major > > We should add an end-to-end test which verifies that we can use the REST > interface to obtain information about a running job. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8985) End-to-end test: CLI
[ https://issues.apache.org/jira/browse/FLINK-8985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419202#comment-16419202 ] Rong Rong commented on FLINK-8985: -- Thanks [~till.rohrmann] for the confirmation. Yes, I saw you already have the PR up for FLINK-9109. Will start working and incorporate changes accordingly. > End-to-end test: CLI > > > Key: FLINK-8985 > URL: https://issues.apache.org/jira/browse/FLINK-8985 > Project: Flink > Issue Type: Sub-task > Components: Client, Tests >Affects Versions: 1.5.0 >Reporter: Till Rohrmann >Assignee: Rong Rong >Priority: Major > > We should add an end-to-end test which verifies that all client commands are > working correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8880) Validate configurations for SQL Client
[ https://issues.apache.org/jira/browse/FLINK-8880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417893#comment-16417893 ] Rong Rong commented on FLINK-8880: -- Hi [~xccui]. I think I can help with this task. There are still some pending changes to the configurations, such as FLINK-8866 & FLINK-9059. I am planning to add those as blocking issues to this one and start working on the validation framework. What do you think? > Validate configurations for SQL Client > -- > > Key: FLINK-8880 > URL: https://issues.apache.org/jira/browse/FLINK-8880 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Xingcan Cui >Priority: Major > > Currently, the configuration items for the SQL client are stored in maps and > accessed with default values. They should be validated when creating the > client. Also, logger warnings should be shown when using default values. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
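The validation FLINK-8880 asks for — fail fast on malformed entries at client-creation time and log a warning whenever a default silently kicks in — might look roughly like the following. The helper and the configuration key names are illustrative assumptions, not the SQL Client's actual API:

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigValidationSketch {
    // Validate one integer entry up front instead of failing later at use time.
    static int validatedInt(Map<String, String> conf, String key, int defaultValue, int min) {
        String raw = conf.get(key);
        if (raw == null) {
            // The issue asks for a logger warning; System.err stands in here.
            System.err.println("WARN: '" + key + "' not set; falling back to default " + defaultValue);
            return defaultValue;
        }
        final int value;
        try {
            value = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("Invalid value for '" + key + "': " + raw, e);
        }
        if (value < min) {
            throw new IllegalArgumentException("'" + key + "' must be >= " + min + ", got " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("max-table-result-rows", "100");
        System.out.println(validatedInt(conf, "max-table-result-rows", 1_000_000, 1)); // 100
        System.out.println(validatedInt(conf, "result-polling-interval-ms", 100, 1));  // 100 (default, warns)
    }
}
```

Running every entry through such checks when the client starts surfaces typos and out-of-range values immediately, instead of mid-session.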
[jira] [Assigned] (FLINK-8985) End-to-end test: CLI
[ https://issues.apache.org/jira/browse/FLINK-8985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-8985: Assignee: Rong Rong > End-to-end test: CLI > > > Key: FLINK-8985 > URL: https://issues.apache.org/jira/browse/FLINK-8985 > Project: Flink > Issue Type: Sub-task > Components: Client, Tests >Affects Versions: 1.5.0 >Reporter: Till Rohrmann >Assignee: Rong Rong >Priority: Major > > We should add an end-to-end test which verifies that all client commands are > working correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8986) End-to-end test: REST
[ https://issues.apache.org/jira/browse/FLINK-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411958#comment-16411958 ] Rong Rong commented on FLINK-8986: -- Hi [~till.rohrmann], I can help with this e2e test. Is this specifically referring to the monitoring APIs discussed in the official docs: https://ci.apache.org/projects/flink/flink-docs-release-1.4/monitoring/rest_api.html? Any additional REST APIs we should be focusing on? > End-to-end test: REST > - > > Key: FLINK-8986 > URL: https://issues.apache.org/jira/browse/FLINK-8986 > Project: Flink > Issue Type: Sub-task > Components: REST, Tests >Reporter: Till Rohrmann >Priority: Major > > We should add an end-to-end test which verifies that we can use the REST > interface to obtain information about a running job. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-8986) End-to-end test: REST
[ https://issues.apache.org/jira/browse/FLINK-8986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-8986: Assignee: Rong Rong > End-to-end test: REST > - > > Key: FLINK-8986 > URL: https://issues.apache.org/jira/browse/FLINK-8986 > Project: Flink > Issue Type: Sub-task > Components: REST, Tests >Reporter: Till Rohrmann >Assignee: Rong Rong >Priority: Major > > We should add an end-to-end test which verifies that we can use the REST > interface to obtain information about a running job. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8985) End-to-end test: CLI
[ https://issues.apache.org/jira/browse/FLINK-8985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411952#comment-16411952 ] Rong Rong commented on FLINK-8985: -- Hi [~till.rohrmann]. I can help contribute to the REST & CLI e2e testing. Looking at some of the current scripts in the `flink-end-to-end-test` module, some CLI/REST commands are already utilized. Do you think starting with the list of commands in the official docs: https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/cli.html is a good idea? > End-to-end test: CLI > > > Key: FLINK-8985 > URL: https://issues.apache.org/jira/browse/FLINK-8985 > Project: Flink > Issue Type: Sub-task > Components: Client, Tests >Affects Versions: 1.5.0 >Reporter: Till Rohrmann >Priority: Major > > We should add an end-to-end test which verifies that all client commands are > working correctly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394806#comment-16394806 ] Rong Rong commented on FLINK-8690: -- Sweet. Thank you [~hequn8128]! I am still a bit skeptical about putting in a "LOGICAL_RULESET" that is not entirely logical; I guess I will receive more feedback once the PR is out. :-P I much appreciate the feedback and suggestions, Hequn. > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on > datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394379#comment-16394379 ] Rong Rong commented on FLINK-8690: -- Thanks [~hequn8128] for the valuable feedback :-) I did some research and came up with a change somewhat close to what you expected. Please see: https://github.com/walterddr/flink/commit/6c25d178e7acff57c598e70a8f4a2c0f3b74332e I had a hard time trying to find a place to put the two different LogicalAggregateConverter(s), so I put them separately in 2 different LOGICAL_OPT_RULESET, for Stream and Batch each. Any comments or suggestions are highly appreciated :-) > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on > datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-7144) Optimize multiple LogicalAggregate into one
[ https://issues.apache.org/jira/browse/FLINK-7144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394377#comment-16394377 ] Rong Rong commented on FLINK-7144: -- Hi [~jark], I am working on FLINK-8688 and I think I can use the fix for this JIRA first before I consolidate all DISTINCT AGG. If you haven't started, may I try implementing the optimization for this JIRA as well? > Optimize multiple LogicalAggregate into one > --- > > Key: FLINK-7144 > URL: https://issues.apache.org/jira/browse/FLINK-7144 > Project: Flink > Issue Type: Improvement > Components: Table API SQL >Reporter: Jark Wu >Priority: Major > > When applying multiple GROUP BYs, with no aggregates or expressions in the first > GROUP BY, and the second GROUP BY's fields a subset of the first's, > the first GROUP BY can be removed. > For example, the following SQL, > {code} > SELECT a FROM (SELECT a,b,c FROM MyTable GROUP BY a, b, c) GROUP BY a > {code} > should be optimized into > {code} > DataStreamGroupAggregate(groupBy=[a], select=[a]) > DataStreamCalc(select=[a]) > DataStreamScan(table=[[_DataStreamTable_0]]) > {code} > but we get: > {code} > DataStreamGroupAggregate(groupBy=[a], select=[a]) > DataStreamCalc(select=[a]) > DataStreamGroupAggregate(groupBy=[a, b, c], select=[a, b, c]) > DataStreamScan(table=[[_DataStreamTable_0]]) > {code} > I looked for the Calcite built-in rules, but can't find a matching one. So maybe > we should implement one, and maybe we should implement it in Calcite. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394068#comment-16394068 ] Rong Rong commented on FLINK-8690: -- You are right. I hadn't thought about that. I will verify the plan cost. However, I am less worried about the DataStream side, as the Calcite rule will generate two operations that require state backends while using MapView generates only one. My worry is on the DataSet side: in my initial implementation of splitting FlinkLogicalAggregate into two, both of them ignore the plan generated by {{AggregateExpandDistinctAggregatesRule}} > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on > datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394046#comment-16394046 ] Rong Rong edited comment on FLINK-8690 at 3/10/18 5:54 AM: --- That should resolve our problem partially. The real reason why I introduced another node before logical plan is because **AggregateExpandDistinctAggregatesRule** is actually calcite specific and will apply globally regardless of whether it is on DataSet or DataStream and that's the rule we want to avoid applying in DataStream API. Basically it converts {code:java} COUNT (DISTINCT f1) {code} Into {code:java} COUNT (DIST_f1) FROM (SELECT f1 AS DIST_f1 GROUP BY f1) {code} There's always a way to unite these two operators together using FlinkLogicalAggregateConverter. But it seems counter-intuitive first split then unite them together. Another possibility is to introduce logical plan RuleSet based on whether it's on stream or batch. But that seems to disagree with the purpose of "logical" plan was (Author: walterddr): That should resolve our problem partially. The real reason why I introduced another node before logical plan is because **AggregateExpandDistinctAggregatesRule** is actually calcite specific and will apply globally regardless of whether it is on DataSet or DataStream and that's the rule we want to avoid applying in DataStream API. Basically it converts {code:java} COUNT (DISTINCT f1) {code} Into {code:java} COUNT (DIST_f1) FROM (SELECT f1 AS DIST_f1 GROUP BY f1) {code} There's always a way to unite these two operators together using FlinkLogicalAggregateConverter. But it seems counter-intuitive first split then unite them together. 
> Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on > datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394046#comment-16394046 ] Rong Rong edited comment on FLINK-8690 at 3/10/18 5:52 AM: --- That should resolve our problem partially. The real reason why I introduced another node before logical plan is because **AggregateExpandDistinctAggregatesRule** is actually calcite specific and will apply globally regardless of whether it is on DataSet or DataStream and that's the rule we want to avoid applying in DataStream API. Basically it converts {code:java} COUNT (DISTINCT f1) {code} Into {code:java} COUNT (DIST_f1) FROM (SELECT f1 AS DIST_f1 GROUP BY f1) {code} There's always a way to unite these two operators together using FlinkLogicalAggregateConverter. But it seems counter-intuitive first split then unite them together. was (Author: walterddr): That should resolve our problem partially. The real reason why I introduced another node before logical plan is because **AggregateExpandDistinctAggregatesRule** is actually calcite specific and will apply globally regardless of whether it is on DataSet or DataStream and that's the rule we want to avoid applying in DataStream API. Basically it converts {code:java} /COUNT (DISTINCT f1) {code} Into {code:java} COUNT (DIST_A) FROM (SELECT A AS DIST_A GROUP BY A) {code} There's always a way to unite these two operators together using FlinkLogicalAggregateConverter. But it seems counter-intuitive first split then unite them together. > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregate. 
> We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on > datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394046#comment-16394046 ] Rong Rong commented on FLINK-8690: -- That should resolve our problem partially. The real reason why I introduced another node before the logical plan is that **AggregateExpandDistinctAggregatesRule** is actually Calcite-specific and will apply globally, regardless of whether it is on DataSet or DataStream, and that's the rule we want to avoid applying in the DataStream API. Basically it converts {code:java} COUNT (DISTINCT f1) {code} Into {code:java} COUNT (DIST_f1) FROM (SELECT f1 AS DIST_f1 GROUP BY f1) {code} There's always a way to unite these two operators together using FlinkLogicalAggregateConverter. But it seems counter-intuitive to first split them and then unite them again. > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on > datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
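The semantics of the rewrite described in that comment can be illustrated in plain Java (illustrative only; the real Calcite rule operates on RelNodes, not collections): COUNT(DISTINCT f1) over the input equals a plain COUNT over the result of the inner GROUP BY f1, i.e. over the set of distinct f1 values.

```java
import java.util.HashSet;
import java.util.List;

// Sketch of what AggregateExpandDistinctAggregatesRule's rewrite computes:
// COUNT(DISTINCT f1)  ==  COUNT(DIST_f1) FROM (SELECT f1 AS DIST_f1 GROUP BY f1)
public class DistinctExpand {
    public static long countDistinct(List<String> f1Values) {
        // inner aggregate: SELECT f1 ... GROUP BY f1 -> one row per distinct value
        HashSet<String> grouped = new HashSet<>(f1Values);
        // outer aggregate: plain COUNT over the grouped result
        return grouped.size();
    }
}
```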
[jira] [Commented] (FLINK-8863) Add user-defined function support in SQL Client
[ https://issues.apache.org/jira/browse/FLINK-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16394042#comment-16394042 ] Rong Rong commented on FLINK-8863: -- Sounds good. We can deal with that in future. Thanks for the quick reply > Add user-defined function support in SQL Client > --- > > Key: FLINK-8863 > URL: https://issues.apache.org/jira/browse/FLINK-8863 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Timo Walther >Assignee: Xingcan Cui >Priority: Major > > This issue is a subtask of part two "Full Embedded SQL Client" of the > implementation plan mentioned in > [FLIP-24|https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client]. > It should be possible to declare user-defined functions in the SQL client. > For now, we limit the registration to classes that implement > {{ScalarFunction}}, {{TableFunction}}, {{AggregateFunction}}. Functions that > are implemented in SQL are not part of this issue. > I would suggest to introduce a {{functions}} top-level property. The > declaration could look similar to: > {code} > functions: > - name: testFunction > from: class <-- optional, default: class > class: org.my.MyScalarFunction > constructor: <-- optional, needed for > certain types of functions > - 42.0 > - class: org.my.Class <-- possibility to create > objects via properties > constructor: > - 1 > - true > - false > - "whatever" > - type: INT > value: 1 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8863) Add user-defined function support in SQL Client
[ https://issues.apache.org/jira/browse/FLINK-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16393966#comment-16393966 ] Rong Rong commented on FLINK-8863: -- Thanks [~xccui] for the quick reply. I am assuming users would have to provide the JAR files when launching the SQL client. One question I have is what happens if multiple versions of the same class are found, or multiple function signature lists are found, in several different JARs. This can clearly be avoided if we limit the function search to a single UDF JAR file. Another question is related to our use case in FLINK-7373, where we have dynamic UDF / JAR file declaration in SQL itself, so I was wondering whether there could be an optional JAR file field to specify which JAR file the function should be loaded from. But this use case could also be categorized as "Functions that are implemented in SQL" as per [~twalthr]'s description of the JIRA. What do you think? > Add user-defined function support in SQL Client > --- > > Key: FLINK-8863 > URL: https://issues.apache.org/jira/browse/FLINK-8863 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Timo Walther >Assignee: Xingcan Cui >Priority: Major > > This issue is a subtask of part two "Full Embedded SQL Client" of the > implementation plan mentioned in > [FLIP-24|https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client]. > It should be possible to declare user-defined functions in the SQL client. > For now, we limit the registration to classes that implement > {{ScalarFunction}}, {{TableFunction}}, {{AggregateFunction}}. Functions that > are implemented in SQL are not part of this issue. > I would suggest to introduce a {{functions}} top-level property.
The > declaration could look similar to: > {code} > functions: > - name: testFunction > from: class <-- optional, default: class > class: org.my.MyScalarFunction > constructor: <-- optional, needed for > certain types of functions > - 42.0 > - class: org.my.Class <-- possibility to create > objects via properties > constructor: > - 1 > - true > - false > - "whatever" > - type: INT > value: 1 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8863) Add user-defined function support in SQL Client
[ https://issues.apache.org/jira/browse/FLINK-8863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16393942#comment-16393942 ] Rong Rong commented on FLINK-8863: -- Hi [~twalthr], in the task description there's no specification regarding where the UDF is loaded from. According to FLIP-24, it seems only MyToolBox.jar is described as containing UDFs. Are we always going to assume UDFs are contained in a pre-defined JAR file? > Add user-defined function support in SQL Client > --- > > Key: FLINK-8863 > URL: https://issues.apache.org/jira/browse/FLINK-8863 > Project: Flink > Issue Type: Sub-task > Components: Table API SQL >Reporter: Timo Walther >Assignee: Xingcan Cui >Priority: Major > > This issue is a subtask of part two "Full Embedded SQL Client" of the > implementation plan mentioned in > [FLIP-24|https://cwiki.apache.org/confluence/display/FLINK/FLIP-24+-+SQL+Client]. > It should be possible to declare user-defined functions in the SQL client. > For now, we limit the registration to classes that implement > {{ScalarFunction}}, {{TableFunction}}, {{AggregateFunction}}. Functions that > are implemented in SQL are not part of this issue. > I would suggest to introduce a {{functions}} top-level property. The > declaration could look similar to: > {code} > functions: > - name: testFunction > from: class <-- optional, default: class > class: org.my.MyScalarFunction > constructor: <-- optional, needed for > certain types of functions > - 42.0 > - class: org.my.Class <-- possibility to create > objects via properties > constructor: > - 1 > - true > - false > - "whatever" > - type: INT > value: 1 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16390467#comment-16390467 ] Rong Rong commented on FLINK-8690: -- I created an initial (not at all perfect) version of the support, trying to create special treatment by adding a distinct rule to DataStreamNormRuleSet, so the logical plan for DataStream directly converts distinct LogicalAggregates to FlinkLogicalAggregates, while DataSet still goes through AggregateExpandDistinctAggregatesRule. Please see it here: https://github.com/walterddr/flink/commit/ef9777cd8859180f38900393967deaabf39b7453 [~hequn8128] mentioned that we can override the match() function to achieve this. However, in order to separately generate logical plans for DataSet and DataStream, the current solution I can think of is to add a new rule to the NormRuleSet. Do you think this is the right path? Any comments or suggestions are highly appreciated :-) > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on > datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16376938#comment-16376938 ] Rong Rong commented on FLINK-8690: -- Yes, changed the description > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on > datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-8690: - Description: Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not allow distinct aggregate. We are proposing to reuse distinct aggregate codegen work designed for *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on datastream as well. was: Currently*, FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not allow distinct aggregate. We are proposing to reuse distinct aggregate codegen work designed for *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on datastream as well. > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently, *FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on > datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-8690: - Description: Currently*, FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not allow distinct aggregate. We are proposing to reuse distinct aggregate codegen work designed for *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on datastream as well. was: **Currently, *FlinkLogicalAggregate* does not allow distinct aggregate. We are proposing to reuse distinct aggregate codegen work designed for *FlinkLogicalOverAggregate* / *FlinkLogicalWindowAggregate*, to support unbounded distinct aggregation on datastream as well. > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > Currently*, FlinkLogicalAggregate / FlinkLogicalWindowAggregate* does not > allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate*, to support unbounded distinct aggregation on > datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-8690: - Description: **Currently, *FlinkLogicalAggregate* does not allow distinct aggregate. We are proposing to reuse distinct aggregate codegen work designed for *FlinkLogicalOverAggregate* / *FlinkLogicalWindowAggregate*, to support unbounded distinct aggregation on datastream as well. was: **Currently, *FlinkLogicalAggregate* does not allow distinct aggregate. We are proposing to reuse distinct aggregate codegen work designed for *FlinkLogicalOverAggregate* / *FlinkLogicalWindowAggregate*, to support unbounded distinct aggregation as well. > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > **Currently, *FlinkLogicalAggregate* does not allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate* / *FlinkLogicalWindowAggregate*, to support > unbounded distinct aggregation on datastream as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8689) Add runtime support of distinct filter using MapView
[ https://issues.apache.org/jira/browse/FLINK-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-8689: - Summary: Add runtime support of distinct filter using MapView (was: Add runtime support of distinct filter using MapView for GenerateAggregation) > Add runtime support of distinct filter using MapView > - > > Key: FLINK-8689 > URL: https://issues.apache.org/jira/browse/FLINK-8689 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > This ticket should cover distinct aggregate function support to codegen for > *AggregateCall*, where *isDistinct* fields is set to true. > This can be verified using the following SQL, which is not currently > producing correct results. > {code:java} > SELECT > a, > SUM(b) OVER (PARTITION BY a ORDER BY proctime ROWS BETWEEN 5 PRECEDING AND > CURRENT ROW) > FROM > MyTable{code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
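As a rough illustration of what a map-backed distinct filter for FLINK-8689 would do (plain Java, not the actual Flink MapView API; class and method names are hypothetical): reference-count each incoming value so the wrapped accumulator only accumulates the first occurrence of a value, and only retracts when the last occurrence goes away.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a distinct filter. Flink's real implementation would
// back the counts with a MapView so they live in the state backend.
public class DistinctFilter<T> {
    private final Map<T, Integer> counts = new HashMap<>();

    /** Returns true iff the value is seen for the first time,
     *  i.e. the underlying accumulator should process it. */
    public boolean add(T value) {
        return counts.merge(value, 1, Integer::sum) == 1;
    }

    /** Returns true iff the last occurrence of the value was removed,
     *  i.e. the underlying accumulator should retract it. */
    public boolean retract(T value) {
        Integer c = counts.get(value);
        if (c == null) {
            return false; // retraction of an unseen value: nothing to do
        }
        if (c == 1) {
            counts.remove(value);
            return true;
        }
        counts.put(value, c - 1);
        return false;
    }
}
```

With such a wrapper, a SUM(DISTINCT b) over the quoted OVER window would forward each b to the inner SUM accumulator at most once per distinct value.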
[jira] [Created] (FLINK-8739) Optimize runtime support for distinct filter to reuse same distinct accumulator for filtering
Rong Rong created FLINK-8739: Summary: Optimize runtime support for distinct filter to reuse same distinct accumulator for filtering Key: FLINK-8739 URL: https://issues.apache.org/jira/browse/FLINK-8739 Project: Flink Issue Type: Sub-task Reporter: Rong Rong -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-8690: - Description: **Currently, *FlinkLogicalAggregate* does not allow distinct aggregate. We are propose to reuse distinct aggregate codegen work designed for *FlinkLogicalOverAggregate* / *FlinkLogicalWindowAggregate*, to support unbounded distinct aggregation as well. > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > **Currently, *FlinkLogicalAggregate* does not allow distinct aggregate. > We are propose to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate* / *FlinkLogicalWindowAggregate*, to support > unbounded distinct aggregation as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-8690: - Description: **Currently, *FlinkLogicalAggregate* does not allow distinct aggregate. We are proposing to reuse distinct aggregate codegen work designed for *FlinkLogicalOverAggregate* / *FlinkLogicalWindowAggregate*, to support unbounded distinct aggregation as well. was: **Currently, *FlinkLogicalAggregate* does not allow distinct aggregate. We are propose to reuse distinct aggregate codegen work designed for *FlinkLogicalOverAggregate* / *FlinkLogicalWindowAggregate*, to support unbounded distinct aggregation as well. > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > **Currently, *FlinkLogicalAggregate* does not allow distinct aggregate. > We are proposing to reuse distinct aggregate codegen work designed for > *FlinkLogicalOverAggregate* / *FlinkLogicalWindowAggregate*, to support > unbounded distinct aggregation as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8689) Add runtime support of distinct filter using MapView for GenerateAggregation
[ https://issues.apache.org/jira/browse/FLINK-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-8689: - Description: This ticket should cover distinct aggregate function support to codegen for *AggregateCall*, where *isDistinct* fields is set to true. This can be verified using the following SQL, which is not currently producing correct results. {code:java} SELECT a, SUM(b) OVER (PARTITION BY a ORDER BY proctime ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) FROM MyTable{code} was: This ticket should cover distinct aggregate function support to codegen for `AggregateCall`, where `isDistinct` fields is set to true. This can be verified using the following SQL, which is not currently producing correct results. {code:java} SELECT a, SUM(b) OVER (PARTITION BY a ORDER BY proctime ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) FROM MyTable{code} > Add runtime support of distinct filter using MapView for GenerateAggregation > > > Key: FLINK-8689 > URL: https://issues.apache.org/jira/browse/FLINK-8689 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > This ticket should cover distinct aggregate function support to codegen for > *AggregateCall*, where *isDistinct* fields is set to true. > This can be verified using the following SQL, which is not currently > producing correct results. > {code:java} > SELECT > a, > SUM(b) OVER (PARTITION BY a ORDER BY proctime ROWS BETWEEN 5 PRECEDING AND > CURRENT ROW) > FROM > MyTable{code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-8690: - Summary: Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct agg on DataStream (was: Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct operator ) > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct agg on DataStream > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8689) Add runtime support of distinct filter using MapView for GenerateAggregation
[ https://issues.apache.org/jira/browse/FLINK-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-8689: - Description: This should add distinct aggregate function support to codegen for `AggregateCall`, where the `isDistinct` field is set to true. This can be verified using the following SQL, which is not currently producing correct results. {code:java} SELECT a, SUM(b) OVER (PARTITION BY a ORDER BY proctime ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) FROM MyTable{code} > Add runtime support of distinct filter using MapView for GenerateAggregation > > > Key: FLINK-8689 > URL: https://issues.apache.org/jira/browse/FLINK-8689 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > This should add distinct aggregate function support to codegen for > `AggregateCall`, where the `isDistinct` field is set to true. > This can be verified using the following SQL, which is not currently > producing correct results. > > {code:java} > SELECT > a, > SUM(b) OVER (PARTITION BY a ORDER BY proctime ROWS BETWEEN 5 PRECEDING AND > CURRENT ROW) > FROM > MyTable{code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8689) Add runtime support of distinct filter using MapView for GenerateAggregation
[ https://issues.apache.org/jira/browse/FLINK-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-8689: - Description: This ticket should cover distinct aggregate function support to codegen for `AggregateCall`, where the `isDistinct` field is set to true. This can be verified using the following SQL, which is not currently producing correct results. {code:java} SELECT a, SUM(b) OVER (PARTITION BY a ORDER BY proctime ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) FROM MyTable{code} was: This should add distinct aggregate function support to codegen for `AggregateCall`, where the `isDistinct` field is set to true. This can be verified using the following SQL, which is not currently producing correct results. {code:java} SELECT a, SUM(b) OVER (PARTITION BY a ORDER BY proctime ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) FROM MyTable{code} > Add runtime support of distinct filter using MapView for GenerateAggregation > > > Key: FLINK-8689 > URL: https://issues.apache.org/jira/browse/FLINK-8689 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > This ticket should cover distinct aggregate function support to codegen for > `AggregateCall`, where the `isDistinct` field is set to true. > This can be verified using the following SQL, which is not currently > producing correct results. > {code:java} > SELECT > a, > SUM(b) OVER (PARTITION BY a ORDER BY proctime ROWS BETWEEN 5 PRECEDING AND > CURRENT ROW) > FROM > MyTable{code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (FLINK-8689) Add runtime support of distinct filter using MapView for GenerateAggregation
[ https://issues.apache.org/jira/browse/FLINK-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong updated FLINK-8689: - Description: This should add distinct aggregate function support to codegen for `AggregateCall`, where the `isDistinct` field is set to true. This can be verified using the following SQL, which is not currently producing correct results. {code:java} SELECT a, SUM(b) OVER (PARTITION BY a ORDER BY proctime ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) FROM MyTable{code} was: This should add distinct aggregate function support to codegen for `AggregateCall`, where the `isDistinct` field is set to true. This can be verified using the following SQL, which is not currently producing correct results. {code:java} SELECT a, SUM(b) OVER (PARTITION BY a ORDER BY proctime ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) FROM MyTable{code} > Add runtime support of distinct filter using MapView for GenerateAggregation > > > Key: FLINK-8689 > URL: https://issues.apache.org/jira/browse/FLINK-8689 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > > This should add distinct aggregate function support to codegen for > `AggregateCall`, where the `isDistinct` field is set to true. > This can be verified using the following SQL, which is not currently > producing correct results. > {code:java} > SELECT > a, > SUM(b) OVER (PARTITION BY a ORDER BY proctime ROWS BETWEEN 5 PRECEDING AND > CURRENT ROW) > FROM > MyTable{code} > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
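[Editor's note: the verification query in FLINK-8689 uses a plain SUM; the distinct-filtered variant that a true `isDistinct` flag on `AggregateCall` corresponds to would presumably read as the following sketch, using the same illustrative table and columns.]

{code:sql}
-- Hypothetical distinct variant of the ticket's verification query:
-- the DISTINCT qualifier is what sets isDistinct = true on the
-- AggregateCall that the codegen must handle.
SELECT
  a,
  SUM(DISTINCT b) OVER (
    PARTITION BY a
    ORDER BY proctime
    ROWS BETWEEN 5 PRECEDING AND CURRENT ROW)
FROM
  MyTable
{code}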
[jira] [Assigned] (FLINK-8689) Add runtime support of distinct filter using MapView for GenerateAggregation
[ https://issues.apache.org/jira/browse/FLINK-8689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-8689: Assignee: Rong Rong > Add runtime support of distinct filter using MapView for GenerateAggregation > > > Key: FLINK-8689 > URL: https://issues.apache.org/jira/browse/FLINK-8689 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (FLINK-8690) Update logical rule set to generate FlinkLogicalAggregate explicitly allow distinct operator
[ https://issues.apache.org/jira/browse/FLINK-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rong Rong reassigned FLINK-8690: Assignee: Rong Rong > Update logical rule set to generate FlinkLogicalAggregate explicitly allow > distinct operator > - > > Key: FLINK-8690 > URL: https://issues.apache.org/jira/browse/FLINK-8690 > Project: Flink > Issue Type: Sub-task >Reporter: Rong Rong >Assignee: Rong Rong >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)