[jira] [Resolved] (BEAM-4723) Enhance Datetime*Expression Datetime Type
[ https://issues.apache.org/jira/browse/BEAM-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang resolved BEAM-4723. - Resolution: Fixed Fix Version/s: 2.7.0 > Enhance Datetime*Expression Datetime Type > - > > Key: BEAM-4723 > URL: https://issues.apache.org/jira/browse/BEAM-4723 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang >Priority: Major > Fix For: 2.7.0 > > Time Spent: 7h 20m > Remaining Estimate: 0h > > Datetime*Expression only supports timestamp type for first operand now. We > should let it accept all Datetime_Types -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-2466) Add Kafka Streams runner
[ https://issues.apache.org/jira/browse/BEAM-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16587801#comment-16587801 ] Kai Jiang commented on BEAM-2466: - I have started working on a PoC of the Kafka Streams runner. > Add Kafka Streams runner > > > Key: BEAM-2466 > URL: https://issues.apache.org/jira/browse/BEAM-2466 > Project: Beam > Issue Type: Wish > Components: runner-ideas >Reporter: Lorand Peter Kasler >Assignee: Kai Jiang >Priority: Minor > > Kafka Streams (https://kafka.apache.org/documentation/streams) has more and > more features that could make it a viable candidate for a streaming runner. > It uses DataFlow-like model
[jira] [Assigned] (BEAM-2466) Add Kafka Streams runner
[ https://issues.apache.org/jira/browse/BEAM-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang reassigned BEAM-2466: --- Assignee: Kai Jiang > Add Kafka Streams runner > > > Key: BEAM-2466 > URL: https://issues.apache.org/jira/browse/BEAM-2466 > Project: Beam > Issue Type: Wish > Components: runner-ideas >Reporter: Lorand Peter Kasler >Assignee: Kai Jiang >Priority: Minor > > Kafka Streams (https://kafka.apache.org/documentation/streams) has more and > more features that could make it a viable candidate for a streaming runner. > It uses DataFlow-like model -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3558) aggregation expression can't apply to math or arithmetic expressions
[ https://issues.apache.org/jira/browse/BEAM-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16581748#comment-16581748 ] Kai Jiang commented on BEAM-3558: - Calcite's logical optimization rule set has a rule that solves this. > aggregation expression can't apply to math or arithmetic expressions > > > Key: BEAM-3558 > URL: https://issues.apache.org/jira/browse/BEAM-3558 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Kai Jiang >Priority: Major > > Fails when executing the SQL > 'select sum(c1)+2 from PCOLLECTION group by c2'
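For readers unfamiliar with the failing query in BEAM-3558, the result it should produce can be read as an aggregation followed by a projection. The sketch below is plain Java, not Beam or Calcite code; the class name, method name, and the `{c1, c2}` row layout are hypothetical, and whether the Calcite rule mentioned in the comment rewrites the plan exactly this way is an assumption.

```java
import java.util.Map;
import java.util.TreeMap;

final class AggregateThenProjectSketch {
    /**
     * Evaluates 'select sum(c1)+2 ... group by c2' over rows of the form {c1, c2}.
     * The result is keyed by c2 only to make the groups visible; the SQL query
     * itself returns just the computed expression.
     */
    static Map<Integer, Integer> sumPlusTwoByC2(int[][] rows) {
        // Step 1 (aggregate): SUM(c1) for each c2 group.
        Map<Integer, Integer> sums = new TreeMap<>();
        for (int[] row : rows) {
            sums.merge(row[1], row[0], Integer::sum);
        }
        // Step 2 (project): apply "+ 2" to each aggregated value.
        Map<Integer, Integer> result = new TreeMap<>();
        for (Map.Entry<Integer, Integer> entry : sums.entrySet()) {
            result.put(entry.getKey(), entry.getValue() + 2);
        }
        return result;
    }

    public static void main(String[] args) {
        int[][] rows = {{1, 10}, {2, 10}, {5, 20}};
        System.out.println(sumPlusTwoByC2(rows)); // prints {10=5, 20=7}
    }
}
```

Keeping the two steps separate mirrors the idea that the arithmetic is applied to the aggregate's output rather than inside the aggregate.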
[jira] [Resolved] (BEAM-3518) Support regr_* functions
[ https://issues.apache.org/jira/browse/BEAM-3518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang resolved BEAM-3518. - Resolution: Won't Fix Fix Version/s: Not applicable reopen if needed > Support regr_* functions > > > Key: BEAM-3518 > URL: https://issues.apache.org/jira/browse/BEAM-3518 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang >Priority: Major > Fix For: Not applicable > > > Support the standard regr_* functions, regr_slope, regr_intercept, regr_r2, > regr_sxx, regr_syy, regr_sxy, regr_avgx, regr_avgy, regr_count. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
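As background for the regr_* functions listed in BEAM-3518, the sketch below shows how several of them relate to one another using the usual least-squares definitions. It is plain Java rather than Beam SQL; the class and method names are hypothetical, the inputs are assumed to be already filtered to non-NULL (y, x) pairs, and degenerate cases (for example, all x values equal) are not handled.

```java
final class RegrSketch {
    static double mean(double[] values) {
        double total = 0;
        for (double v : values) {
            total += v;
        }
        return total / values.length;
    }

    /** Sum of (a_i - avg(a)) * (b_i - avg(b)); shared by regr_sxx, regr_syy, regr_sxy. */
    static double centeredProduct(double[] a, double[] b) {
        double avgA = mean(a);
        double avgB = mean(b);
        double total = 0;
        for (int i = 0; i < a.length; i++) {
            total += (a[i] - avgA) * (b[i] - avgB);
        }
        return total;
    }

    static long regrCount(double[] y, double[] x) { return x.length; }
    static double regrAvgX(double[] y, double[] x) { return mean(x); }
    static double regrAvgY(double[] y, double[] x) { return mean(y); }
    static double regrSxx(double[] y, double[] x) { return centeredProduct(x, x); }
    static double regrSyy(double[] y, double[] x) { return centeredProduct(y, y); }
    static double regrSxy(double[] y, double[] x) { return centeredProduct(x, y); }

    /** Slope of the least-squares line y = slope * x + intercept. */
    static double regrSlope(double[] y, double[] x) {
        return regrSxy(y, x) / regrSxx(y, x);
    }

    static double regrIntercept(double[] y, double[] x) {
        return mean(y) - regrSlope(y, x) * mean(x);
    }

    /** Coefficient of determination: sxy^2 / (sxx * syy). */
    static double regrR2(double[] y, double[] x) {
        double sxy = regrSxy(y, x);
        return (sxy * sxy) / (regrSxx(y, x) * regrSyy(y, x));
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3};
        double[] y = {3, 5, 7}; // lies exactly on y = 2x + 1
        System.out.println(regrSlope(y, x));     // prints 2.0
        System.out.println(regrIntercept(y, x)); // prints 1.0
        System.out.println(regrR2(y, x));        // prints 1.0
    }
}
```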
[jira] [Updated] (BEAM-4509) Support ROW_NUMBER() over
[ https://issues.apache.org/jira/browse/BEAM-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-4509: Summary: Support ROW_NUMBER() over (was: Implement ROW_NUMBER) > Support ROW_NUMBER() over > - > > Key: BEAM-4509 > URL: https://issues.apache.org/jira/browse/BEAM-4509 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Anton Kedin >Priority: Major > > Design and implement ROW_NUMBER() OVER window. It is supported by Calcite and > we should look at feasibility of supporting it in Beam SQL > [StackOverflow > Post|https://stackoverflow.com/questions/50724531/implement-row-number-in-beamsql] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-4509) Implement ROW_NUMBER
[ https://issues.apache.org/jira/browse/BEAM-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-4509: Issue Type: Sub-task (was: Improvement) Parent: BEAM-5046 > Implement ROW_NUMBER > > > Key: BEAM-4509 > URL: https://issues.apache.org/jira/browse/BEAM-4509 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Anton Kedin >Priority: Major > > Design and implement ROW_NUMBER() OVER window. It is supported by Calcite and > we should look at feasibility of supporting it in Beam SQL > [StackOverflow > Post|https://stackoverflow.com/questions/50724531/implement-row-number-in-beamsql] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-5111) SUM0/SUM
[ https://issues.apache.org/jira/browse/BEAM-5111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16575067#comment-16575067 ] Kai Jiang commented on BEAM-5111: - I can investigate and fix this. > SUM0/SUM > > > Key: BEAM-5111 > URL: https://issues.apache.org/jira/browse/BEAM-5111 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Rui Wang >Assignee: Rui Wang >Priority: Major > > SUM and SUM0 share the same code, either one could be wrong. Should fix it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
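To make the SUM/SUM0 distinction in BEAM-5111 concrete: SUM yields NULL when there is no non-NULL input, while SUM0 yields 0 in that case. The sketch below illustrates only that difference in plain Java; the class and method names are hypothetical and this is not the Beam implementation.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SumVersusSum0Sketch {
    /** SUM: skips NULL inputs and yields NULL when no non-NULL input exists. */
    static Long sum(List<Long> values) {
        Long total = null;
        for (Long value : values) {
            if (value == null) {
                continue;
            }
            total = (total == null) ? value : total + value;
        }
        return total;
    }

    /** SUM0: same as SUM, except that it yields 0 instead of NULL. */
    static long sum0(List<Long> values) {
        Long total = sum(values);
        return total == null ? 0L : total;
    }

    public static void main(String[] args) {
        List<Long> empty = Collections.emptyList();
        System.out.println(sum(empty));  // prints null
        System.out.println(sum0(empty)); // prints 0
        System.out.println(sum(Arrays.asList(1L, null, 2L))); // prints 3
    }
}
```

If both functions share one code path, at least one of the empty-input results above would be wrong, which appears to be the concern raised in the issue.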
[jira] [Assigned] (BEAM-5050) [SQL] NULLs are aggregated incorrectly
[ https://issues.apache.org/jira/browse/BEAM-5050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang reassigned BEAM-5050: --- Assignee: Xu Mingmin > [SQL] NULLs are aggregated incorrectly > -- > > Key: BEAM-5050 > URL: https://issues.apache.org/jira/browse/BEAM-5050 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Anton Kedin >Assignee: Xu Mingmin >Priority: Major > Fix For: Not applicable > > Time Spent: 50m > Remaining Estimate: 0h > > For example, COUNT(field) should not count records with NULL field. We also > should handle and test on other aggregation functions (like AVG, SUM, MIN, > MAX, VAR_POP, VAR_SAMP, etc.) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-5050) [SQL] NULLs are aggregated incorrectly
[ https://issues.apache.org/jira/browse/BEAM-5050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-5050: Description: For example, COUNT(field) should not count records with NULL field. We also should handle and test on other aggregation functions (like AVG, SUM, MIN, MAX, VAR_POP, VAR_SAMP, etc.) (was: For example, COUNT(field) should not count records with NULL field) > [SQL] NULLs are aggregated incorrectly > -- > > Key: BEAM-5050 > URL: https://issues.apache.org/jira/browse/BEAM-5050 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Reporter: Anton Kedin >Priority: Major > > For example, COUNT(field) should not count records with NULL field. We also > should handle and test on other aggregation functions (like AVG, SUM, MIN, > MAX, VAR_POP, VAR_SAMP, etc.) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
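The NULL handling described in BEAM-5050 can be illustrated with a small plain-Java sketch: COUNT(*) counts every row, COUNT(field) counts only rows with a non-NULL field, and MAX ignores NULLs and yields NULL when nothing remains. The class and method names are hypothetical and this is not Beam code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

final class NullAwareAggregatesSketch {
    /** COUNT(*): counts every row, whether or not the field is NULL. */
    static long countStar(List<Integer> field) {
        return field.size();
    }

    /** COUNT(field): counts only the rows whose field is not NULL. */
    static long countField(List<Integer> field) {
        return field.stream().filter(Objects::nonNull).count();
    }

    /** MAX(field): ignores NULLs; yields NULL when every input is NULL. */
    static Integer max(List<Integer> field) {
        Integer best = null;
        for (Integer value : field) {
            if (value != null && (best == null || value > best)) {
                best = value;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<Integer> field = Arrays.asList(4, null, 7, null);
        System.out.println(countStar(field));  // prints 4
        System.out.println(countField(field)); // prints 2
        System.out.println(max(field));        // prints 7
    }
}
```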
[jira] [Commented] (BEAM-2277) IllegalArgumentException when using Hadoop file system for WordCount example.
[ https://issues.apache.org/jira/browse/BEAM-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561070#comment-16561070 ] Kai Jiang commented on BEAM-2277: - I also encountered this issue in 2.6.0. > IllegalArgumentException when using Hadoop file system for WordCount example. > - > > Key: BEAM-2277 > URL: https://issues.apache.org/jira/browse/BEAM-2277 > Project: Beam > Issue Type: Bug > Components: z-do-not-use-sdk-java-extensions >Reporter: Aviem Zur >Assignee: Aviem Zur >Priority: Blocker > Fix For: 2.0.0 > > > IllegalArgumentException when using Hadoop file system for WordCount example. > Occurred when running WordCount example using Spark runner on a YARN cluster. > Command-line arguments: > {code:none} > --runner=SparkRunner --inputFile=hdfs:///user/myuser/kinglear.txt > --output=hdfs:///user/myuser/wc/wc > {code} > Stack trace: > {code:none} > java.lang.IllegalArgumentException: Expect srcResourceIds and destResourceIds > have the same scheme, but received file, hdfs. > at > org.apache.beam.sdk.repackaged.com.google.common.base.Preconditions.checkArgument(Preconditions.java:122) > at > org.apache.beam.sdk.io.FileSystems.validateSrcDestLists(FileSystems.java:394) > at org.apache.beam.sdk.io.FileSystems.copy(FileSystems.java:236) > at > org.apache.beam.sdk.io.FileBasedSink$WriteOperation.copyToOutputFiles(FileBasedSink.java:626) > at > org.apache.beam.sdk.io.FileBasedSink$WriteOperation.finalize(FileBasedSink.java:516) > at > org.apache.beam.sdk.io.WriteFiles$2.processElement(WriteFiles.java:592) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-3386) Dependency conflict when Calcite is included in a project.
[ https://issues.apache.org/jira/browse/BEAM-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang reassigned BEAM-3386: --- Assignee: Kai Jiang > Dependency conflict when Calcite is included in a project. > -- > > Key: BEAM-3386 > URL: https://issues.apache.org/jira/browse/BEAM-3386 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0 >Reporter: Austin Haas >Assignee: Kai Jiang >Priority: Critical > > When Calcite (v. 1.13.0) is included in a project that also includes Beam and > the Beam SQL extension, then the following error is thrown when trying to run > Beam code. > ClassCastException > org.apache.beam.sdk.extensions.sql.impl.planner.BeamRelDataTypeSystem cannot > be cast to org.apache.calcite.rel.type.RelDataTypeSystem > org.apache.calcite.jdbc.CalciteConnectionImpl. > (CalciteConnectionImpl.java:120) > > org.apache.calcite.jdbc.CalciteJdbc41Factory$CalciteJdbc41Connection. > (CalciteJdbc41Factory.java:114) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:59) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:44) > org.apache.calcite.jdbc.CalciteFactory.newConnection > (CalciteFactory.java:53) > org.apache.calcite.avatica.UnregisteredDriver.connect > (UnregisteredDriver.java:138) > java.sql.DriverManager.getConnection (DriverManager.java:664) > java.sql.DriverManager.getConnection (DriverManager.java:208) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPrepare > (Frameworks.java:145) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPlanner > (Frameworks.java:106) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.ready > (PlannerImpl.java:140) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.parse > 
(PlannerImpl.java:170)
[jira] [Updated] (BEAM-2466) Add Kafka Streams runner
[ https://issues.apache.org/jira/browse/BEAM-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-2466: Description: Kafka Streams (https://kafka.apache.org/documentation/streams) has more and more features that could make it a viable candidate for a streaming runner. It uses DataFlow-like model (was: _emphasized text_Kafka Streams (https://kafka.apache.org/documentation/streams) has more and more features that could make it a viable candidate for a streaming runner. ) > Add Kafka Streams runner > > > Key: BEAM-2466 > URL: https://issues.apache.org/jira/browse/BEAM-2466 > Project: Beam > Issue Type: Wish > Components: runner-ideas >Reporter: Lorand Peter Kasler >Priority: Minor > > Kafka Streams (https://kafka.apache.org/documentation/streams) has more and > more features that could make it a viable candidate for a streaming runner. > It uses DataFlow-like model -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-2466) Add Kafka Streams runner
[ https://issues.apache.org/jira/browse/BEAM-2466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-2466: Description: _emphasized text_Kafka Streams (https://kafka.apache.org/documentation/streams) has more and more features that could make it a viable candidate for a streaming runner. (was: Kafka Streams (https://kafka.apache.org/documentation/streams) has more and more features that could make it a viable candidate for a streaming runner. ) > Add Kafka Streams runner > > > Key: BEAM-2466 > URL: https://issues.apache.org/jira/browse/BEAM-2466 > Project: Beam > Issue Type: Wish > Components: runner-ideas >Reporter: Lorand Peter Kasler >Priority: Minor > > _emphasized text_Kafka Streams > (https://kafka.apache.org/documentation/streams) has more and more features > that could make it a viable candidate for a streaming runner. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3386) Dependency conflict when Calcite is included in a project.
[ https://issues.apache.org/jira/browse/BEAM-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553880#comment-16553880 ] Kai Jiang commented on BEAM-3386: - An investigation was sent to dev list. https://lists.apache.org/thread.html/3c762fe6fa547a35e550f1c7f5aab2d35ae4e7531d8ffccd7920@%3Cdev.beam.apache.org%3E > Dependency conflict when Calcite is included in a project. > -- > > Key: BEAM-3386 > URL: https://issues.apache.org/jira/browse/BEAM-3386 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0 >Reporter: Austin Haas >Priority: Critical > > When Calcite (v. 1.13.0) is included in a project that also includes Beam and > the Beam SQL extension, then the following error is thrown when trying to run > Beam code. > ClassCastException > org.apache.beam.sdk.extensions.sql.impl.planner.BeamRelDataTypeSystem cannot > be cast to org.apache.calcite.rel.type.RelDataTypeSystem > org.apache.calcite.jdbc.CalciteConnectionImpl. > (CalciteConnectionImpl.java:120) > > org.apache.calcite.jdbc.CalciteJdbc41Factory$CalciteJdbc41Connection. 
> (CalciteJdbc41Factory.java:114) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:59) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:44) > org.apache.calcite.jdbc.CalciteFactory.newConnection > (CalciteFactory.java:53) > org.apache.calcite.avatica.UnregisteredDriver.connect > (UnregisteredDriver.java:138) > java.sql.DriverManager.getConnection (DriverManager.java:664) > java.sql.DriverManager.getConnection (DriverManager.java:208) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPrepare > (Frameworks.java:145) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPlanner > (Frameworks.java:106) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.ready > (PlannerImpl.java:140) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.parse > (PlannerImpl.java:170) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-3386) Dependency conflict when Calcite is included in a project.
[ https://issues.apache.org/jira/browse/BEAM-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-3386: Affects Version/s: 2.6.0 > Dependency conflict when Calcite is included in a project. > -- > > Key: BEAM-3386 > URL: https://issues.apache.org/jira/browse/BEAM-3386 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0 >Reporter: Austin Haas >Priority: Critical > > When Calcite (v. 1.13.0) is included in a project that also includes Beam and > the Beam SQL extension, then the following error is thrown when trying to run > Beam code. > ClassCastException > org.apache.beam.sdk.extensions.sql.impl.planner.BeamRelDataTypeSystem cannot > be cast to org.apache.calcite.rel.type.RelDataTypeSystem > org.apache.calcite.jdbc.CalciteConnectionImpl. > (CalciteConnectionImpl.java:120) > > org.apache.calcite.jdbc.CalciteJdbc41Factory$CalciteJdbc41Connection. > (CalciteJdbc41Factory.java:114) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:59) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:44) > org.apache.calcite.jdbc.CalciteFactory.newConnection > (CalciteFactory.java:53) > org.apache.calcite.avatica.UnregisteredDriver.connect > (UnregisteredDriver.java:138) > java.sql.DriverManager.getConnection (DriverManager.java:664) > java.sql.DriverManager.getConnection (DriverManager.java:208) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPrepare > (Frameworks.java:145) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPlanner > (Frameworks.java:106) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.ready > (PlannerImpl.java:140) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.parse > (PlannerImpl.java:170) -- This message 
was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4807) Upgrade calcite to 1.17.0
[ https://issues.apache.org/jira/browse/BEAM-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551230#comment-16551230 ] Kai Jiang commented on BEAM-4807: - I also want to see if BEAM-3386 is still a problem when we upgrade to 1.17.0 > Upgrade calcite to 1.17.0 > - > > Key: BEAM-4807 > URL: https://issues.apache.org/jira/browse/BEAM-4807 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Andrew Pilloud >Assignee: Andrew Pilloud >Priority: Major > > We should upgrade calcite. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (BEAM-4807) Upgrade calcite to 1.17.0
[ https://issues.apache.org/jira/browse/BEAM-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551230#comment-16551230 ] Kai Jiang edited comment on BEAM-4807 at 7/20/18 8:29 PM: -- I also want to see if BEAM-3386 is still a problem after we upgrade to 1.17.0 was (Author: vectorijk): I also want to see if BEAM-3386 is still a problem when we upgrade to 1.17.0 > Upgrade calcite to 1.17.0 > - > > Key: BEAM-4807 > URL: https://issues.apache.org/jira/browse/BEAM-4807 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Andrew Pilloud >Assignee: Andrew Pilloud >Priority: Major > > We should upgrade calcite. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-4547) Implement sum0 aggregation function
[ https://issues.apache.org/jira/browse/BEAM-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-4547: Summary: Implement sum0 aggregation function (was: implement sum0 aggregation function) > Implement sum0 aggregation function > --- > > Key: BEAM-4547 > URL: https://issues.apache.org/jira/browse/BEAM-4547 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kenneth Knowles >Priority: Major > Fix For: 2.6.0 > > Time Spent: 1h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4547) Implement sum0 aggregation function
[ https://issues.apache.org/jira/browse/BEAM-4547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16531795#comment-16531795 ] Kai Jiang commented on BEAM-4547: - Sounds good! We should get it tested. > Implement sum0 aggregation function > --- > > Key: BEAM-4547 > URL: https://issues.apache.org/jira/browse/BEAM-4547 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kenneth Knowles >Priority: Major > Fix For: 2.6.0 > > Time Spent: 1h > Remaining Estimate: 0h >
[jira] [Created] (BEAM-4723) Enhance Datetime*Expression Datetime Type
Kai Jiang created BEAM-4723: --- Summary: Enhance Datetime*Expression Datetime Type Key: BEAM-4723 URL: https://issues.apache.org/jira/browse/BEAM-4723 Project: Beam Issue Type: Bug Components: dsl-sql Reporter: Kai Jiang Assignee: Kai Jiang Datetime*Expression only supports timestamp type for first operand now. We should let it accept all Datetime_Types
[jira] [Created] (BEAM-4663) Implement Cost calculations for Cost-Based Optimization (CBO)
Kai Jiang created BEAM-4663: --- Summary: Implement Cost calculations for Cost-Based Optimization (CBO) Key: BEAM-4663 URL: https://issues.apache.org/jira/browse/BEAM-4663 Project: Beam Issue Type: Sub-task Components: dsl-sql Reporter: Kai Jiang To support CBO, we should, as a first step, implement computeSelfCost(...) in each Beam*Rel.java.
[jira] [Resolved] (BEAM-4602) Implement Date Comparison in BeamSqlCompareExpression
[ https://issues.apache.org/jira/browse/BEAM-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang resolved BEAM-4602. - Resolution: Fixed Fix Version/s: 2.6.0 > Implement Date Comparison in BeamSqlCompareExpression > - > > Key: BEAM-4602 > URL: https://issues.apache.org/jira/browse/BEAM-4602 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang >Priority: Major > Fix For: 2.6.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4602) Implement Date Comparison in BeamSqlCompareExpression
[ https://issues.apache.org/jira/browse/BEAM-4602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang reassigned BEAM-4602: --- Assignee: Kai Jiang > Implement Date Comparison in BeamSqlCompareExpression > - > > Key: BEAM-4602 > URL: https://issues.apache.org/jira/browse/BEAM-4602 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4602) Implement Date Comparison in BeamSqlCompareExpression
Kai Jiang created BEAM-4602: --- Summary: Implement Date Comparison in BeamSqlCompareExpression Key: BEAM-4602 URL: https://issues.apache.org/jira/browse/BEAM-4602 Project: Beam Issue Type: Sub-task Components: dsl-sql Reporter: Kai Jiang
[jira] [Commented] (BEAM-3386) Dependency conflict when Calcite is included in a project.
[ https://issues.apache.org/jira/browse/BEAM-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517522#comment-16517522 ] Kai Jiang commented on BEAM-3386: - I tried Class.forName("org.apache.beam.repackaged.beam_sdks_java_extensions_sql.org.apache.calcite.rel.type.RelDataTypeSystem").cast(`BeamRelDataTypeSystem object`); the cast works when the relocation prefix is added. > Dependency conflict when Calcite is included in a project. > -- > > Key: BEAM-3386 > URL: https://issues.apache.org/jira/browse/BEAM-3386 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.2.0, 2.3.0, 2.4.0, 2.5.0 >Reporter: Austin Haas >Priority: Critical > > When Calcite (v. 1.13.0) is included in a project that also includes Beam and > the Beam SQL extension, then the following error is thrown when trying to run > Beam code. > ClassCastException > org.apache.beam.sdk.extensions.sql.impl.planner.BeamRelDataTypeSystem cannot > be cast to org.apache.calcite.rel.type.RelDataTypeSystem > org.apache.calcite.jdbc.CalciteConnectionImpl. > (CalciteConnectionImpl.java:120) > > org.apache.calcite.jdbc.CalciteJdbc41Factory$CalciteJdbc41Connection. 
> (CalciteJdbc41Factory.java:114) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:59) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:44) > org.apache.calcite.jdbc.CalciteFactory.newConnection > (CalciteFactory.java:53) > org.apache.calcite.avatica.UnregisteredDriver.connect > (UnregisteredDriver.java:138) > java.sql.DriverManager.getConnection (DriverManager.java:664) > java.sql.DriverManager.getConnection (DriverManager.java:208) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPrepare > (Frameworks.java:145) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPlanner > (Frameworks.java:106) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.ready > (PlannerImpl.java:140) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.parse > (PlannerImpl.java:170) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
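The observation in the comment above (the cast works only against the relocated class name) follows from how Java treats two types that share a simple name but live under different fully qualified names: they are unrelated types. The sketch below imitates that situation in plain Java with nested classes standing in for packages. Every name in it is a hypothetical stand-in; it does not load Beam or Calcite classes.

```java
final class RelocationCastSketch {
    /** Stand-in for the unshaded package (as if org.apache.calcite.rel.type). */
    static final class Unshaded {
        interface RelDataTypeSystem {}
    }

    /** Stand-in for the relocated (repackaged) copy of the same interface. */
    static final class Relocated {
        interface RelDataTypeSystem {}
    }

    /** Stand-in for a type system compiled against the relocated interface. */
    static final class BeamLikeTypeSystem implements Relocated.RelDataTypeSystem {}

    /** Returns true when value can be cast to target, false on ClassCastException. */
    static boolean castSucceeds(Class<?> target, Object value) {
        try {
            target.cast(value);
            return true;
        } catch (ClassCastException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Object typeSystem = new BeamLikeTypeSystem();
        // Same simple name, different qualified name: only the relocated cast works.
        System.out.println(castSucceeds(Relocated.RelDataTypeSystem.class, typeSystem)); // prints true
        System.out.println(castSucceeds(Unshaded.RelDataTypeSystem.class, typeSystem));  // prints false
    }
}
```

This is consistent with the reported ClassCastException, where an unshaded Calcite on the classpath attempts the cast to its own `RelDataTypeSystem`.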
[jira] [Commented] (BEAM-3386) Dependency conflict when Calcite is included in a project.
[ https://issues.apache.org/jira/browse/BEAM-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517508#comment-16517508 ] Kai Jiang commented on BEAM-3386: - While working on NexmarkLauncher with Beam SQL included, I came across this error, and it is blocking me. I have investigated it, and I think it is a repackaging issue that occurs when casting `BeamRelDataTypeSystem` to `RelDataTypeSystem` in Calcite. > Dependency conflict when Calcite is included in a project. > -- > > Key: BEAM-3386 > URL: https://issues.apache.org/jira/browse/BEAM-3386 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.2.0, 2.3.0, 2.4.0, 2.5.0 >Reporter: Austin Haas >Priority: Critical > > When Calcite (v. 1.13.0) is included in a project that also includes Beam and > the Beam SQL extension, then the following error is thrown when trying to run > Beam code. > ClassCastException > org.apache.beam.sdk.extensions.sql.impl.planner.BeamRelDataTypeSystem cannot > be cast to org.apache.calcite.rel.type.RelDataTypeSystem > org.apache.calcite.jdbc.CalciteConnectionImpl. > (CalciteConnectionImpl.java:120) > > org.apache.calcite.jdbc.CalciteJdbc41Factory$CalciteJdbc41Connection. 
> (CalciteJdbc41Factory.java:114) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:59) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:44) > org.apache.calcite.jdbc.CalciteFactory.newConnection > (CalciteFactory.java:53) > org.apache.calcite.avatica.UnregisteredDriver.connect > (UnregisteredDriver.java:138) > java.sql.DriverManager.getConnection (DriverManager.java:664) > java.sql.DriverManager.getConnection (DriverManager.java:208) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPrepare > (Frameworks.java:145) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPlanner > (Frameworks.java:106) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.ready > (PlannerImpl.java:140) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.parse > (PlannerImpl.java:170) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-3386) Dependency conflict when Calcite is included in a project.
[ https://issues.apache.org/jira/browse/BEAM-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-3386: Affects Version/s: 2.5.0 2.4.0 > Dependency conflict when Calcite is included in a project. > -- > > Key: BEAM-3386 > URL: https://issues.apache.org/jira/browse/BEAM-3386 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.2.0, 2.3.0, 2.4.0, 2.5.0 >Reporter: Austin Haas >Priority: Critical > > When Calcite (v. 1.13.0) is included in a project that also includes Beam and > the Beam SQL extension, then the following error is thrown when trying to run > Beam code. > ClassCastException > org.apache.beam.sdk.extensions.sql.impl.planner.BeamRelDataTypeSystem cannot > be cast to org.apache.calcite.rel.type.RelDataTypeSystem > org.apache.calcite.jdbc.CalciteConnectionImpl. > (CalciteConnectionImpl.java:120) > > org.apache.calcite.jdbc.CalciteJdbc41Factory$CalciteJdbc41Connection. > (CalciteJdbc41Factory.java:114) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:59) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:44) > org.apache.calcite.jdbc.CalciteFactory.newConnection > (CalciteFactory.java:53) > org.apache.calcite.avatica.UnregisteredDriver.connect > (UnregisteredDriver.java:138) > java.sql.DriverManager.getConnection (DriverManager.java:664) > java.sql.DriverManager.getConnection (DriverManager.java:208) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPrepare > (Frameworks.java:145) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPlanner > (Frameworks.java:106) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.ready > (PlannerImpl.java:140) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.parse > (PlannerImpl.java:170) -- This message 
was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4325) Enforce ErrorProne analysis in the SQL project
[ https://issues.apache.org/jira/browse/BEAM-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516590#comment-16516590 ] Kai Jiang commented on BEAM-4325: - [~cademarkegard] I think following the ErrorProne patterns is a good approach. Looking forward to your PR. > Enforce ErrorProne analysis in the SQL project > -- > > Key: BEAM-4325 > URL: https://issues.apache.org/jira/browse/BEAM-4325 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Scott Wegner >Assignee: Cade Markegard >Priority: Minor > Labels: errorprone, starter > Time Spent: 1h > Remaining Estimate: 0h > > Java ErrorProne static analysis was [recently > enabled|https://github.com/apache/beam/pull/5161] in the Gradle build > process, but only as warnings. ErrorProne errors are generally useful and > easy to fix. Some work was done to [make sdks-java-core > ErrorProne-clean|https://github.com/apache/beam/pull/5319] and add > enforcement. This task is to clean up ErrorProne warnings and add enforcement in > {{beam-sdks-java-extensions-sql}}. Additional context discussed on the [dev > list|https://lists.apache.org/thread.html/95aae2785c3cd728c2d3378cbdff2a7ba19caffcd4faa2049d2e2f46@%3Cdev.beam.apache.org%3E]. > Fixing this issue will involve: > # Follow instructions in the [Contribution > Guide|https://beam.apache.org/contribute/] to set up a {{beam}} development > environment. > # Run the following command to compile and run ErrorProne analysis on the > project: {{./gradlew :beam-sdks-java-extensions-sql:assemble}} > # Fix each ErrorProne warning from the {{sdks/java/extensions/sql}} project. > # In {{sdks/java/extensions/sql/build.gradle}}, add {{failOnWarning: true}} > to the call to {{applyJavaNature()}} > ([example|https://github.com/apache/beam/pull/5319/files#diff-9390c20635aed5f42f83b97506a87333R20]). > This starter issue is sponsored by [~swegner]. 
Feel free to [reach > out|https://beam.apache.org/community/contact-us/] with questions or code > review: > * JIRA: [~swegner] > * GitHub: [@swegner|https://github.com/swegner] > * Slack: [@Scott Wegner|https://s.apache.org/beam-slack-channel] > * Email: swegner at google dot com -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-3386) Dependency conflict when Calcite is included in a project.
[ https://issues.apache.org/jira/browse/BEAM-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516569#comment-16516569 ] Kai Jiang commented on BEAM-3386: - I came across this error also. > Dependency conflict when Calcite is included in a project. > -- > > Key: BEAM-3386 > URL: https://issues.apache.org/jira/browse/BEAM-3386 > Project: Beam > Issue Type: Bug > Components: dsl-sql >Affects Versions: 2.2.0, 2.3.0 >Reporter: Austin Haas >Priority: Critical > > When Calcite (v. 1.13.0) is included in a project that also includes Beam and > the Beam SQL extension, then the following error is thrown when trying to run > Beam code. > ClassCastException > org.apache.beam.sdk.extensions.sql.impl.planner.BeamRelDataTypeSystem cannot > be cast to org.apache.calcite.rel.type.RelDataTypeSystem > org.apache.calcite.jdbc.CalciteConnectionImpl. > (CalciteConnectionImpl.java:120) > > org.apache.calcite.jdbc.CalciteJdbc41Factory$CalciteJdbc41Connection. > (CalciteJdbc41Factory.java:114) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:59) > org.apache.calcite.jdbc.CalciteJdbc41Factory.newConnection > (CalciteJdbc41Factory.java:44) > org.apache.calcite.jdbc.CalciteFactory.newConnection > (CalciteFactory.java:53) > org.apache.calcite.avatica.UnregisteredDriver.connect > (UnregisteredDriver.java:138) > java.sql.DriverManager.getConnection (DriverManager.java:664) > java.sql.DriverManager.getConnection (DriverManager.java:208) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPrepare > (Frameworks.java:145) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.tools.Frameworks.withPlanner > (Frameworks.java:106) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.ready > (PlannerImpl.java:140) > > org.apache.beam.sdks.java.extensions.sql.repackaged.org.apache.calcite.prepare.PlannerImpl.parse > 
(PlannerImpl.java:170) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4325) Enforce ErrorProne analysis in the SQL project
[ https://issues.apache.org/jira/browse/BEAM-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515288#comment-16515288 ] Kai Jiang commented on BEAM-4325: - [~cademarkegard] Any progress on this one? Let me know if you have any questions. > Enforce ErrorProne analysis in the SQL project > -- > > Key: BEAM-4325 > URL: https://issues.apache.org/jira/browse/BEAM-4325 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Scott Wegner >Assignee: Cade Markegard >Priority: Minor > Labels: errorprone, starter > > Java ErrorProne static analysis was [recently > enabled|https://github.com/apache/beam/pull/5161] in the Gradle build > process, but only as warnings. ErrorProne errors are generally useful and > easy to fix. Some work was done to [make sdks-java-core > ErrorProne-clean|https://github.com/apache/beam/pull/5319] and add > enforcement. This task is to clean up ErrorProne warnings and add enforcement in > {{beam-sdks-java-extensions-sql}}. Additional context was discussed on the [dev > list|https://lists.apache.org/thread.html/95aae2785c3cd728c2d3378cbdff2a7ba19caffcd4faa2049d2e2f46@%3Cdev.beam.apache.org%3E]. > Fixing this issue will involve: > # Follow instructions in the [Contribution > Guide|https://beam.apache.org/contribute/] to set up a {{beam}} development > environment. > # Run the following command to compile and run ErrorProne analysis on the > project: {{./gradlew :beam-sdks-java-extensions-sql:assemble}} > # Fix each ErrorProne warning from the {{sdks/java/extensions/sql}} project. > # In {{sdks/java/extensions/sql/build.gradle}}, add {{failOnWarning: true}} > to the call to {{applyJavaNature()}} > ([example|https://github.com/apache/beam/pull/5319/files#diff-9390c20635aed5f42f83b97506a87333R20]). > This starter issue is sponsored by [~swegner].
Feel free to [reach > out|https://beam.apache.org/community/contact-us/] with questions or code > review: > * JIRA: [~swegner] > * GitHub: [@swegner|https://github.com/swegner] > * Slack: [@Scott Wegner|https://s.apache.org/beam-slack-channel] > * Email: swegner at google dot com -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4561) Create unit tests for BeamQueryPlanner.convertToBeamRel
Kai Jiang created BEAM-4561: --- Summary: Create unit tests for BeamQueryPlanner.convertToBeamRel Key: BEAM-4561 URL: https://issues.apache.org/jira/browse/BEAM-4561 Project: Beam Issue Type: Sub-task Components: dsl-sql Reporter: Kai Jiang Assignee: Kai Jiang As discussed on PR#5481, we should add a concrete unit test for [BeamQueryPlanner.convertToBeamRel|https://github.com/apache/beam/blob/master/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/BeamQueryPlanner.java#L116] that covers applying BeamRuleSets to optimize the logical plan. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-4388) Support optimized logical plan
[ https://issues.apache.org/jira/browse/BEAM-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-4388: Issue Type: New Feature (was: Sub-task) Parent: (was: BEAM-3783) > Support optimized logical plan > -- > > Key: BEAM-4388 > URL: https://issues.apache.org/jira/browse/BEAM-4388 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang >Priority: Major > Time Spent: 5h 10m > Remaining Estimate: 0h > > Before converting into Beam Pipeline physical plan, logical plan should be > optimized and it will be super helpful for efficiently executing Beam > PTransforms pipeline. > Calcite has two ways for optimizing logical plan (HepPlanner and > VolcanoPlanner). We can support VolcanoPlanner first and apply calcite > builtin optimize rules (like > FilterJoinRule.FILTER_ON_JOIN) to sql query optimize plans. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (BEAM-4537) CASE expression output type mismatch
[ https://issues.apache.org/jira/browse/BEAM-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510614#comment-16510614 ] Kai Jiang edited comment on BEAM-4537 at 6/13/18 7:50 AM: -- It should not be a problem after [PR#5544|https://github.com/apache/beam/pull/5544] (BEAM-4449) was (Author: vectorijk): It should not be problem after [PR#5544|https://github.com/apache/beam/pull/5544] (BEAM-4449) > CASE expression output type mismatch > > > Key: BEAM-4537 > URL: https://issues.apache.org/jira/browse/BEAM-4537 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang >Priority: Major > Fix For: 2.6.0 > > > TPC-DS query 84 uses coalesce(), which expands into a > CASE expression. > The output type of the CASE expression should belong to the same type family as its input types. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-4537) CASE expression output type mismatch
[ https://issues.apache.org/jira/browse/BEAM-4537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang resolved BEAM-4537. - Resolution: Not A Bug Fix Version/s: 2.6.0 It should not be a problem after [PR#5544|https://github.com/apache/beam/pull/5544] (BEAM-4449) > CASE expression output type mismatch > > > Key: BEAM-4537 > URL: https://issues.apache.org/jira/browse/BEAM-4537 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang >Priority: Major > Fix For: 2.6.0 > > > TPC-DS query 84 uses coalesce(), which expands into a > CASE expression. > The output type of the CASE expression should belong to the same type family as its input types. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4476) Syntax Features Unsupported
[ https://issues.apache.org/jira/browse/BEAM-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16510610#comment-16510610 ] Kai Jiang commented on BEAM-4476: - part of BEAM-2281 > Syntax Features Unsupported > --- > > Key: BEAM-4476 > URL: https://issues.apache.org/jira/browse/BEAM-4476 > Project: Beam > Issue Type: Task > Components: dsl-sql >Reporter: Kai Jiang >Priority: Major > > Based on the current version (56dc4cf), a coverage test was run with TPC-DS > and TPC-H queries. All tests ran on the DirectRunner. We > noticed that Beam SQL does not yet support some features. > This issue serves as an umbrella ticket to keep track of the features we need > to implement and support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-4385) Support LIKE operator
[ https://issues.apache.org/jira/browse/BEAM-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang resolved BEAM-4385. - Resolution: Fixed Fix Version/s: 2.6.0 > Support LIKE operator > - > > Key: BEAM-4385 > URL: https://issues.apache.org/jira/browse/BEAM-4385 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kenneth Knowles >Assignee: Rui Wang >Priority: Major > Fix For: 2.6.0 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Currently the LIKE operator is not supported. It is pretty important. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4547) implement sum0 aggregation function
Kai Jiang created BEAM-4547: --- Summary: implement sum0 aggregation function Key: BEAM-4547 URL: https://issues.apache.org/jira/browse/BEAM-4547 Project: Beam Issue Type: Sub-task Components: dsl-sql Reporter: Kai Jiang Assignee: Kai Jiang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
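The BEAM-4547 notice above carries no description. As background, Calcite's SUM0 aggregate is understood to behave like SUM except that it returns 0 rather than NULL when there are no rows to sum. The following is a minimal sketch of that difference in plain Java, not Beam's implementation; the class and method names (`Sum0Sketch`, `sum`, `sum0`) are hypothetical and exist only for illustration.

```java
import java.util.List;

final class Sum0Sketch {
  /** SUM-style result: null when there is nothing to add up. Null elements are skipped. */
  static Long sum(List<Long> values) {
    Long total = null;
    for (Long v : values) {
      if (v == null) {
        continue; // SQL aggregates ignore NULL inputs
      }
      total = (total == null) ? v : total + v;
    }
    return total;
  }

  /** SUM0-style result: same as sum(), except that "no input" yields 0 instead of null. */
  static long sum0(List<Long> values) {
    Long total = sum(values);
    return total == null ? 0L : total;
  }
}
```

Returning a primitive `long` from `sum0` makes the "never null" contract visible in the signature, which is one reason such a function is convenient as a building block for other aggregates.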
[jira] [Created] (BEAM-4537) CASE expression output type mismatch
Kai Jiang created BEAM-4537: --- Summary: CASE expression output type mismatch Key: BEAM-4537 URL: https://issues.apache.org/jira/browse/BEAM-4537 Project: Beam Issue Type: Sub-task Components: dsl-sql Reporter: Kai Jiang Assignee: Kai Jiang TPC-DS query 84 uses coalesce(), which expands into a CASE expression. The output type of the CASE expression should belong to the same type family as its input types. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (BEAM-4477) Support EXISTS operator
[ https://issues.apache.org/jira/browse/BEAM-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang resolved BEAM-4477. - Resolution: Fixed Fix Version/s: 2.6.0 > Support EXISTS operator > --- > > Key: BEAM-4477 > URL: https://issues.apache.org/jira/browse/BEAM-4477 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang >Priority: Major > Fix For: 2.6.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (BEAM-4509) Implement ROW_NUMBER
[ https://issues.apache.org/jira/browse/BEAM-4509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16504917#comment-16504917 ] Kai Jiang commented on BEAM-4509: - It would also be interesting to see whether we could support `rank() OVER window`. > Implement ROW_NUMBER > > > Key: BEAM-4509 > URL: https://issues.apache.org/jira/browse/BEAM-4509 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Anton Kedin >Priority: Major > > Design and implement ROW_NUMBER() OVER window. It is supported by Calcite and > we should look at the feasibility of supporting it in Beam SQL > [StackOverflow > Post|https://stackoverflow.com/questions/50724531/implement-row-number-in-beamsql] > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4477) Support EXISTS operator
Kai Jiang created BEAM-4477: --- Summary: Support EXISTS operator Key: BEAM-4477 URL: https://issues.apache.org/jira/browse/BEAM-4477 Project: Beam Issue Type: Sub-task Components: dsl-sql Reporter: Kai Jiang Assignee: Kai Jiang -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (BEAM-4476) Syntax Features Unsupported
[ https://issues.apache.org/jira/browse/BEAM-4476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang reassigned BEAM-4476: --- Assignee: (was: Xu Mingmin) > Syntax Features Unsupported > --- > > Key: BEAM-4476 > URL: https://issues.apache.org/jira/browse/BEAM-4476 > Project: Beam > Issue Type: Task > Components: dsl-sql >Reporter: Kai Jiang >Priority: Major > > Based on the current version (56dc4cf), a coverage test was run with TPC-DS > and TPC-H queries. All tests ran on the DirectRunner. We > noticed that Beam SQL does not yet support some features. > This issue serves as an umbrella ticket to keep track of the features we need > to implement and support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4476) Syntax Features Unsupported
Kai Jiang created BEAM-4476: --- Summary: Syntax Features Unsupported Key: BEAM-4476 URL: https://issues.apache.org/jira/browse/BEAM-4476 Project: Beam Issue Type: Task Components: dsl-sql Reporter: Kai Jiang Assignee: Xu Mingmin Based on the current version (56dc4cf), a coverage test was run with TPC-DS and TPC-H queries. All tests ran on the DirectRunner. We noticed that Beam SQL does not yet support some features. This issue serves as an umbrella ticket to keep track of the features we need to implement and support. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-4388) Support optimized logical plan
[ https://issues.apache.org/jira/browse/BEAM-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-4388: Description: Before converting into Beam Pipeline physical plan, logical plan should be optimized and it will be super helpful for efficiently executing Beam PTransforms pipeline. Calcite has two ways for optimizing logical plan (HepPlanner and VolcanoPlanner). We can support VolcanoPlanner first and apply calcite builtin optimize rules (like FilterJoinRule.FILTER_ON_JOIN) to sql query optimize plans. was: Before converting into Beam Pipeline physical plan, logical plan should be optimized and it will be super helpful for efficiently executing Beam PTransforms pipeline. Calcite has two way for optimizing logical plan (HepPlanner and VolcanoPlanner). We can support VolcanoPlanner first and apply calcite builtin optimize rules (like FilterJoinRule.FILTER_ON_JOIN) to optimize plans. > Support optimized logical plan > -- > > Key: BEAM-4388 > URL: https://issues.apache.org/jira/browse/BEAM-4388 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang >Priority: Blocker > > Before converting into Beam Pipeline physical plan, logical plan should be > optimized and it will be super helpful for efficiently executing Beam > PTransforms pipeline. > Calcite has two ways for optimizing logical plan (HepPlanner and > VolcanoPlanner). We can support VolcanoPlanner first and apply calcite > builtin optimize rules (like > FilterJoinRule.FILTER_ON_JOIN) to sql query optimize plans. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-4388) Support optimized logical plan
Kai Jiang created BEAM-4388: --- Summary: Support optimized logical plan Key: BEAM-4388 URL: https://issues.apache.org/jira/browse/BEAM-4388 Project: Beam Issue Type: Sub-task Components: dsl-sql Reporter: Kai Jiang Assignee: Kai Jiang Before converting into Beam Pipeline physical plan, logical plan should be optimized and it will be super helpful for efficiently executing Beam PTransforms pipeline. Calcite has two way for optimizing logical plan (HepPlanner and VolcanoPlanner). We can support VolcanoPlanner first and apply calcite builtin optimize rules (like FilterJoinRule.FILTER_ON_JOIN) to optimize plans. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (BEAM-3558) aggregation expression can't apply to math or arithmetic expressions
[ https://issues.apache.org/jira/browse/BEAM-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-3558: Issue Type: Sub-task (was: Bug) Parent: BEAM-3517 > aggregation expression can't apply to math or arithmetic expressions > > > Key: BEAM-3558 > URL: https://issues.apache.org/jira/browse/BEAM-3558 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Xu Mingmin >Priority: Major > > fails when executing sql > 'select sum(c1)+2 from PCOLLECTION group by c2' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-3558) aggregation expression can't apply to math or arithmetic expressions
Kai Jiang created BEAM-3558: --- Summary: aggregation expression can't apply to math or arithmetic expressions Key: BEAM-3558 URL: https://issues.apache.org/jira/browse/BEAM-3558 Project: Beam Issue Type: Bug Components: dsl-sql Reporter: Kai Jiang Assignee: Xu Mingmin fails when executing sql 'select sum(c1)+2 from PCOLLECTION group by c2' -- This message was sent by Atlassian JIRA (v7.6.3#76005)
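To make the expected behavior of the failing query in BEAM-3558 concrete, here is a small sketch of what `select sum(c1)+2 from PCOLLECTION group by c2` should compute: the aggregate is evaluated per group first, and the arithmetic is applied to its result. This is plain Java for illustration only, not Beam SQL code; the class name, the method name, and the row layout (`{c1, c2}` pairs of longs) are assumptions, and the result is keyed by `c2` purely so the groups are easy to tell apart.

```java
import java.util.Map;
import java.util.TreeMap;

final class SumPlusConstantSketch {
  /** Each row is {c1, c2}. Returns, per distinct c2, the value of sum(c1) + 2. */
  static Map<Long, Long> sumPlusTwoByGroup(long[][] rows) {
    Map<Long, Long> sums = new TreeMap<>();
    for (long[] row : rows) {
      // Step 1: the aggregation, sum(c1) grouped by c2.
      sums.merge(row[1], row[0], Long::sum);
    }
    // Step 2: the arithmetic expression applied on top of the aggregate.
    sums.replaceAll((group, sum) -> sum + 2);
    return sums;
  }
}
```

For rows `{1, 10}`, `{2, 10}`, `{5, 20}` the two groups sum to 3 and 5, so the query should yield 5 and 7.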
[jira] [Updated] (BEAM-2612) support variance builtin aggregation function
[ https://issues.apache.org/jira/browse/BEAM-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-2612: Issue Type: Sub-task (was: New Feature) Parent: BEAM-3517 > support variance builtin aggregation function > - > > Key: BEAM-2612 > URL: https://issues.apache.org/jira/browse/BEAM-2612 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang >Priority: Major > Fix For: Not applicable > > > two builtin aggregate functions > VAR_POP > the population variance (square of the population standard deviation) > VAR_SAMP > the sample variance (square of the sample standard deviation) > https://calcite.apache.org/docs/reference.html#aggregate-functions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
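The two functions named in BEAM-2612 differ only in the divisor: VAR_POP divides the sum of squared deviations by n, VAR_SAMP by n - 1. A minimal sketch of the definitions follows, assuming a plain in-memory array of doubles; `VarianceSketch` and its method names are hypothetical, and an actual Beam aggregation would more likely keep a mergeable accumulator than make two passes over the data.

```java
final class VarianceSketch {
  /** VAR_POP: sum of squared deviations from the mean, divided by n. */
  static double varPop(double[] xs) {
    if (xs.length < 1) {
      throw new IllegalArgumentException("VAR_POP needs at least one value");
    }
    return sumSquaredDeviations(xs) / xs.length;
  }

  /** VAR_SAMP: sum of squared deviations from the mean, divided by n - 1. */
  static double varSamp(double[] xs) {
    if (xs.length < 2) {
      throw new IllegalArgumentException("VAR_SAMP needs at least two values");
    }
    return sumSquaredDeviations(xs) / (xs.length - 1);
  }

  private static double sumSquaredDeviations(double[] xs) {
    double mean = 0;
    for (double x : xs) {
      mean += x;
    }
    mean /= xs.length;
    double sum = 0;
    for (double x : xs) {
      double d = x - mean;
      sum += d * d;
    }
    return sum;
  }
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the squared deviations sum to 32, so VAR_POP is 32/8 = 4 and VAR_SAMP is 32/7.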
[jira] [Updated] (BEAM-3476) Implement Covariance built-in aggregation functions
[ https://issues.apache.org/jira/browse/BEAM-3476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-3476: Issue Type: Sub-task (was: New Feature) Parent: BEAM-3517 > Implement Covariance built-in aggregation functions > --- > > Key: BEAM-3476 > URL: https://issues.apache.org/jira/browse/BEAM-3476 > Project: Beam > Issue Type: Sub-task > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang >Priority: Major > > implement covar_pop(x,y) and covar_samp(x,y) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-3518) Support regr_* functions
Kai Jiang created BEAM-3518: --- Summary: Support regr_* functions Key: BEAM-3518 URL: https://issues.apache.org/jira/browse/BEAM-3518 Project: Beam Issue Type: Sub-task Components: dsl-sql Reporter: Kai Jiang Assignee: Kai Jiang Support the standard regr_* functions, regr_slope, regr_intercept, regr_r2, regr_sxx, regr_syy, regr_sxy, regr_avgx, regr_avgy, regr_count. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-3517) Support Built-in Aggregation Functions
Kai Jiang created BEAM-3517: --- Summary: Support Built-in Aggregation Functions Key: BEAM-3517 URL: https://issues.apache.org/jira/browse/BEAM-3517 Project: Beam Issue Type: Task Components: dsl-sql Reporter: Kai Jiang Assignee: Kai Jiang Support UDAF listed in Calcite. https://calcite.apache.org/docs/reference.html#aggregate-functions -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (BEAM-3476) Implement Covariance built-in aggregation functions
Kai Jiang created BEAM-3476: --- Summary: Implement Covariance built-in aggregation functions Key: BEAM-3476 URL: https://issues.apache.org/jira/browse/BEAM-3476 Project: Beam Issue Type: New Feature Components: dsl-sql Reporter: Kai Jiang Assignee: Kai Jiang implement covar_pop(x,y) and covar_samp(x,y) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
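One way covar_pop(x,y) and covar_samp(x,y) could be computed in a distributed setting is with an accumulator that can be merged across partial results. The sketch below is an illustration under that assumption and is not taken from the Beam code base; `CovarianceAccumulator` and its methods are hypothetical names. It uses the textbook running-sums formula, which is simple but can lose precision on large or badly scaled inputs.

```java
final class CovarianceAccumulator {
  private long count;
  private double sumX;
  private double sumY;
  private double sumXY;

  /** Folds one (x, y) pair into the running sums. */
  void add(double x, double y) {
    count++;
    sumX += x;
    sumY += y;
    sumXY += x * y;
  }

  /** Combines another partial result into this one; order of merging does not matter. */
  void merge(CovarianceAccumulator other) {
    count += other.count;
    sumX += other.sumX;
    sumY += other.sumY;
    sumXY += other.sumXY;
  }

  /** covar_pop: co-moment divided by n. */
  double covarPop() {
    if (count < 1) {
      throw new IllegalStateException("covar_pop needs at least one pair");
    }
    return coMoment() / count;
  }

  /** covar_samp: co-moment divided by n - 1. */
  double covarSamp() {
    if (count < 2) {
      throw new IllegalStateException("covar_samp needs at least two pairs");
    }
    return coMoment() / (count - 1);
  }

  /** Sum over (x - mean(x)) * (y - mean(y)), expressed through the running sums. */
  private double coMoment() {
    return sumXY - sumX * sumY / count;
  }
}
```

With the pairs (1,2), (2,4), (3,6), (4,8) the co-moment is 60 - 10 * 20 / 4 = 10, giving covar_pop = 2.5 and covar_samp = 10/3, regardless of how the pairs are split across accumulators before merging.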
[jira] [Assigned] (BEAM-2852) Add support for Kafka as source/sink on Nexmark
[ https://issues.apache.org/jira/browse/BEAM-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang reassigned BEAM-2852: --- Assignee: Kai Jiang > Add support for Kafka as source/sink on Nexmark > --- > > Key: BEAM-2852 > URL: https://issues.apache.org/jira/browse/BEAM-2852 > Project: Beam > Issue Type: Improvement > Components: testing >Reporter: Ismaël Mejía >Assignee: Kai Jiang >Priority: Minor > Labels: newbie, starter > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-2804) support TIMESTAMP in Sort
[ https://issues.apache.org/jira/browse/BEAM-2804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16159226#comment-16159226 ] Kai Jiang commented on BEAM-2804: - [~mingmxu] I think the scope of this issue should be broader. Besides supporting timestamps in [BeamSortRel|https://github.com/apache/beam/blob/97ad2f8609230861f6804ae2cba6cada42f4c865/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamSortRel.java#L77], we should also map {code:java} SQL_TYPE_TO_JAVA_CLASS.put(Types.TIMESTAMP, Timestamp.class); {code} in [BeamRecordSqlType|https://github.com/apache/beam/blob/c2acb54f6756786bcfe12e9faf32950fd6260c5e/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/BeamRecordSqlType.java#L71]. Correct me if I am wrong. > support TIMESTAMP in Sort > - > > Key: BEAM-2804 > URL: https://issues.apache.org/jira/browse/BEAM-2804 > Project: Beam > Issue Type: Improvement > Components: dsl-sql >Reporter: Xu Mingmin >Assignee: Shayang Zang >Priority: Minor > Labels: beginner > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
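The mapping proposed in the comment above can be sketched in isolation as follows. This is a standalone illustration, not the contents of the linked Beam class: `SqlTypeMappingSketch` and `javaClassFor` are hypothetical names, and the INTEGER and VARCHAR entries are included only to give the TIMESTAMP entry some context. Only `java.sql.Types` and `java.sql.Timestamp` come from the JDK.

```java
import java.sql.Timestamp;
import java.sql.Types;
import java.util.HashMap;
import java.util.Map;

final class SqlTypeMappingSketch {
  // Lookup table from JDBC type codes to the Java class used to hold the value.
  static final Map<Integer, Class<?>> SQL_TYPE_TO_JAVA_CLASS = new HashMap<>();

  static {
    SQL_TYPE_TO_JAVA_CLASS.put(Types.INTEGER, Integer.class);
    SQL_TYPE_TO_JAVA_CLASS.put(Types.VARCHAR, String.class);
    // The entry suggested in the comment.
    SQL_TYPE_TO_JAVA_CLASS.put(Types.TIMESTAMP, Timestamp.class);
  }

  /** Resolves the Java class for a JDBC type code, failing loudly for unmapped types. */
  static Class<?> javaClassFor(int sqlType) {
    Class<?> javaClass = SQL_TYPE_TO_JAVA_CLASS.get(sqlType);
    if (javaClass == null) {
      throw new UnsupportedOperationException("No Java class mapped for SQL type " + sqlType);
    }
    return javaClass;
  }
}
```

Since `java.sql.Timestamp` implements `Comparable`, values of the mapped class can be ordered directly, which is what a sort over a TIMESTAMP column needs.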
[jira] [Commented] (BEAM-1630) Add Splittable DoFn to Python SDK
[ https://issues.apache.org/jira/browse/BEAM-1630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149391#comment-16149391 ] Kai Jiang commented on BEAM-1630: - Any progress on this? > Add Splittable DoFn to Python SDK > - > > Key: BEAM-1630 > URL: https://issues.apache.org/jira/browse/BEAM-1630 > Project: Beam > Issue Type: Improvement > Components: sdk-py >Reporter: Chamikara Jayalath >Assignee: Chamikara Jayalath > > Splittable DoFn [1] is currently being implemented for Java SDK [2]. We > should add this to Python SDK as well. > Following document proposes an API for this. > https://docs.google.com/document/d/1h_zprJrOilivK2xfvl4L42vaX4DMYGfH1YDmi-s_ozM/edit?usp=sharing > [1] https://s.apache.org/splittable-do-fn > [2] > https://lists.apache.org/thread.html/0ce61ac162460a149d5c93cdface37cc383f8030fe86ca09e5699b18@%3Cdev.beam.apache.org%3E -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-2760) Disable testMergingCustomWindows* validatesRunner tests in Gearpump runner
[ https://issues.apache.org/jira/browse/BEAM-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16142236#comment-16142236 ] Kai Jiang commented on BEAM-2760: - I tested `testMergingCustomWindows` on the Spark runner, and it fails there as well. It passes on the Flink runner. > Disable testMergingCustomWindows* validatesRunner tests in Gearpump runner > -- > > Key: BEAM-2760 > URL: https://issues.apache.org/jira/browse/BEAM-2760 > Project: Beam > Issue Type: Test > Components: runner-gearpump >Reporter: Etienne Chauchot >Assignee: Etienne Chauchot > Fix For: Not applicable > > > Disable these tests until it is supported by the runner -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (BEAM-1440) Create a BigQuery source (that implements iobase.BoundedSource) for Python SDK
[ https://issues.apache.org/jira/browse/BEAM-1440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-1440: Description: Currently we have a BigQuery native source for Python SDK [1]. This can only be used by Dataflow runner. We should implement a Beam BigQuery source that implements iobase.BoundedSource [2] interface so that other runners that try to use Python SDK can read from BigQuery as well. Java SDK already has a Beam BigQuery source [3]. [1] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py [2] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/iobase.py#L70 [3] https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1189 was: Currently we have a BigQuery native source for Python SDK [1]. This can only be used by Dataflow runner. We should implement a Beam BigQuery source that implements iobase.BoundedSource [2] interface so that other runners that try to use Python SDK can read from BigQuery as well. Java SDK already has a Beam BigQuery source [3]. [1] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/bigquery.py [2] https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/iobase.py#L70 [3] https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1189 > Create a BigQuery source (that implements iobase.BoundedSource) for Python SDK > -- > > Key: BEAM-1440 > URL: https://issues.apache.org/jira/browse/BEAM-1440 > Project: Beam > Issue Type: New Feature > Components: sdk-py >Reporter: Chamikara Jayalath > > Currently we have a BigQuery native source for Python SDK [1]. > This can only be used by Dataflow runner. 
> We should implement a Beam BigQuery source that implements > iobase.BoundedSource [2] interface so that other runners that try to use > Python SDK can read from BigQuery as well. Java SDK already has a Beam > BigQuery source [3]. > [1] > https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/gcp/bigquery.py > [2] > https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/iobase.py#L70 > [3] > https://github.com/apache/beam/blob/master/sdks/java/io/google-cloud-platform/src/main/java/org/apache/beam/sdk/io/gcp/bigquery/BigQueryIO.java#L1189 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-1628) Flink runner: logic around --flinkMaster is error-prone
[ https://issues.apache.org/jira/browse/BEAM-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16138763#comment-16138763 ] Kai Jiang commented on BEAM-1628: - [~mxm] I am off this issue. Go ahead. > Flink runner: logic around --flinkMaster is error-prone > --- > > Key: BEAM-1628 > URL: https://issues.apache.org/jira/browse/BEAM-1628 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Davor Bonaci >Priority: Minor > Labels: newbie, starter > > The logic for handling {{--flinkMaster}} seems not particularly user-friendly. > https://github.com/apache/beam/blob/fbcde4cdc7d68de8734bf540c079b2747631a854/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineExecutionEnvironment.java#L132 > {code} > if (masterUrl.equals("[local]")) { > } else if (masterUrl.equals("[collection]")) { > } else if (masterUrl.equals("[auto]")) { > } else if (masterUrl.matches(".*:\\d*")) { > } else { > // use auto. > } > {code} > The options are constructed with "auto" set as default. > I think we should do the following: > * I assume there's a default port for the Flink master. We should default to > it. > * We should treat a string without a colon as a host name. (Not default to > local execution.) > This is super easy fix, hopefully someone can pick it up quickly ;-) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
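The two suggestions in the BEAM-1628 description (fall back to a default port, and treat a string without a colon as a host name) could be sketched roughly as below. This is a hypothetical illustration and not the Flink runner's code: `MasterUrlSketch`, `resolve`, and `DEFAULT_PORT` are invented names, and the value 6123 is an assumption based on Flink's customary JobManager RPC port that should be checked against the Flink configuration in use. Unlike the quoted regex `.*:\\d*`, the sketch also rejects an empty port, which is a choice made here rather than something the issue asks for.

```java
final class MasterUrlSketch {
  // Assumed default port (hypothetical constant); verify against the Flink configuration.
  static final int DEFAULT_PORT = 6123;

  /** Returns the special values unchanged, otherwise a normalized "host:port" string. */
  static String resolve(String masterUrl) {
    if (masterUrl == null || masterUrl.isEmpty()) {
      return "[auto]";
    }
    if (masterUrl.equals("[local]")
        || masterUrl.equals("[collection]")
        || masterUrl.equals("[auto]")) {
      return masterUrl;
    }
    int colon = masterUrl.lastIndexOf(':');
    if (colon < 0) {
      // No colon: treat the whole string as a host name and use the default port.
      return masterUrl + ":" + DEFAULT_PORT;
    }
    String host = masterUrl.substring(0, colon);
    String port = masterUrl.substring(colon + 1);
    if (host.isEmpty() || !port.matches("\\d+")) {
      // Fail loudly instead of silently falling back to "auto".
      throw new IllegalArgumentException("Invalid master URL: " + masterUrl);
    }
    return host + ":" + port;
  }
}
```

The main design point is that an unrecognized value raises an error rather than quietly selecting automatic or local execution, so a mistyped host name cannot run the pipeline somewhere unexpected.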
[jira] [Assigned] (BEAM-1628) Flink runner: logic around --flinkMaster is error-prone
[ https://issues.apache.org/jira/browse/BEAM-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang reassigned BEAM-1628: --- Assignee: Aljoscha Krettek (was: Kai Jiang) > Flink runner: logic around --flinkMaster is error-prone > --- > > Key: BEAM-1628 > URL: https://issues.apache.org/jira/browse/BEAM-1628 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Davor Bonaci >Assignee: Aljoscha Krettek >Priority: Minor > Labels: newbie, starter > > The logic for handling {{--flinkMaster}} seems not particularly user-friendly. > https://github.com/apache/beam/blob/fbcde4cdc7d68de8734bf540c079b2747631a854/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineExecutionEnvironment.java#L132 > {code} > if (masterUrl.equals("[local]")) { > } else if (masterUrl.equals("[collection]")) { > } else if (masterUrl.equals("[auto]")) { > } else if (masterUrl.matches(".*:\\d*")) { > } else { > // use auto. > } > {code} > The options are constructed with "auto" set as default. > I think we should do the following: > * I assume there's a default port for the Flink master. We should default to > it. > * We should treat a string without a colon as a host name. (Not default to > local execution.) > This is super easy fix, hopefully someone can pick it up quickly ;-) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-1628) Flink runner: logic around --flinkMaster is error-prone
[ https://issues.apache.org/jira/browse/BEAM-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang reassigned BEAM-1628: --- Assignee: (was: Aljoscha Krettek) > Flink runner: logic around --flinkMaster is error-prone > --- > > Key: BEAM-1628 > URL: https://issues.apache.org/jira/browse/BEAM-1628 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Davor Bonaci >Priority: Minor > Labels: newbie, starter > > The logic for handling {{--flinkMaster}} seems not particularly user-friendly. > https://github.com/apache/beam/blob/fbcde4cdc7d68de8734bf540c079b2747631a854/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineExecutionEnvironment.java#L132 > {code} > if (masterUrl.equals("[local]")) { > } else if (masterUrl.equals("[collection]")) { > } else if (masterUrl.equals("[auto]")) { > } else if (masterUrl.matches(".*:\\d*")) { > } else { > // use auto. > } > {code} > The options are constructed with "auto" set as default. > I think we should do the following: > * I assume there's a default port for the Flink master. We should default to > it. > * We should treat a string without a colon as a host name. (Not default to > local execution.) > This is super easy fix, hopefully someone can pick it up quickly ;-) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-2779) PipelineOptionsFactory should prevent non PipelineOptions interfaces from being constructed.
[ https://issues.apache.org/jira/browse/BEAM-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16136000#comment-16136000 ] Kai Jiang commented on BEAM-2779: - I would like to take a look at this. Working on this. > PipelineOptionsFactory should prevent non PipelineOptions interfaces from > being constructed. > > > Key: BEAM-2779 > URL: https://issues.apache.org/jira/browse/BEAM-2779 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.0.0 >Reporter: Luke Cwik >Priority: Minor > Labels: starter > > PipelineOptions currently serializes information about all getter/setter > pairs on interfaces which don't extend PipelineOptions. > For example: > {code:java} > interface Foo extends PipelineOptions, Bar { > String getFoo(); > void setFoo(String value); > } > interface Bar { > String getBar(); > void setBar(String value); > } > {code} > The serialization of the above (when both *foo* and *bar* are set) will > produce JSON that includes display data only for *foo* but data for both > *foo* and *bar*. During validation of an interface in > *PipelineOptionsFactory*, we should throw an error if one of the user's > interfaces doesn't extend *PipelineOptions* (note that we should ignore the > HasDisplayData interface). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
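The validation described in BEAM-2779 could look roughly like the reflection-based check below. It is a self-contained sketch, not the change that went into `PipelineOptionsFactory`: the nested `PipelineOptions` and `HasDisplayData` interfaces are stand-ins declared only so the example compiles without Beam on the classpath, `Good` is an extra example interface added here, and `OptionsInterfaceCheck`, `offendingInterfaces`, and `validate` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class OptionsInterfaceCheck {
  // Stand-ins for the Beam types (assumption: declared locally for illustration only).
  interface PipelineOptions {}
  interface HasDisplayData {}

  // The example from the issue description.
  interface Bar { String getBar(); void setBar(String value); }
  interface Foo extends PipelineOptions, Bar { String getFoo(); void setFoo(String value); }
  // A well-formed counterpart that should pass validation.
  interface Good extends PipelineOptions, HasDisplayData { String getGood(); void setGood(String value); }

  /** Collects every super-interface (transitively) that does not extend PipelineOptions. */
  static List<Class<?>> offendingInterfaces(Class<?> iface) {
    List<Class<?>> offenders = new ArrayList<>();
    collect(iface, offenders);
    return offenders;
  }

  private static void collect(Class<?> iface, List<Class<?>> offenders) {
    for (Class<?> parent : iface.getInterfaces()) {
      if (parent == HasDisplayData.class) {
        continue; // explicitly ignored, as the issue suggests
      }
      if (!PipelineOptions.class.isAssignableFrom(parent) && !offenders.contains(parent)) {
        offenders.add(parent);
      }
      collect(parent, offenders);
    }
  }

  /** Throws if the interface inherits from anything that is not a PipelineOptions. */
  static void validate(Class<?> iface) {
    List<Class<?>> offenders = offendingInterfaces(iface);
    if (!offenders.isEmpty()) {
      throw new IllegalArgumentException(
          iface.getName() + " extends interfaces that do not extend PipelineOptions: " + offenders);
    }
  }
}
```

Calling `validate(Foo.class)` fails because `Bar` does not extend `PipelineOptions`, while `validate(Good.class)` passes since its only other parent is the ignored `HasDisplayData`.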
[jira] [Assigned] (BEAM-2779) PipelineOptionsFactory should prevent non PipelineOptions interfaces from being constructed.
[ https://issues.apache.org/jira/browse/BEAM-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang reassigned BEAM-2779: --- Assignee: Kai Jiang > PipelineOptionsFactory should prevent non PipelineOptions interfaces from > being constructed. > > > Key: BEAM-2779 > URL: https://issues.apache.org/jira/browse/BEAM-2779 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.0.0 >Reporter: Luke Cwik >Assignee: Kai Jiang >Priority: Minor > Labels: starter > > PipelineOptions currently serializes information about all getter/setter > pairs on interfaces which don't extend PipelineOptions. > For example: > {code:java} > interface Foo extends PipelineOptions, Bar { > String getFoo(); > void setFoo(String value); > } > interface Bar { > String getBar(); > void setBar(String value); > } > {code} > The serialization of the above (when both *foo* and *bar* are set) will > produce JSON where we only include display data for *foo* but data for both > *foo* and *bar*. During validation of an interface in > *PipelineOptionsFactory*, we should throw an error if one of the users > interfaces doesn't extend *PipelineOptions* (note that we should ignore the > HasDisplayData interface). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (BEAM-2779) PipelineOptionsFactory should prevent non PipelineOptions interfaces from being constructed.
[ https://issues.apache.org/jira/browse/BEAM-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-2779: Description: PipelineOptions currently serializes information about all getter/setter pairs on interfaces which don't extend PipelineOptions. For example: {code:java} interface Foo extends PipelineOptions, Bar { String getFoo(); void setFoo(String value); } interface Bar { String getBar(); void setBar(String value); } {code} The serialization of the above (when both *foo* and *bar* are set) will produce JSON where we only include display data for *foo* but data for both *foo* and *bar*. During validation of an interface in *PipelineOptionsFactory*, we should throw an error if one of the users interfaces doesn't extend *PipelineOptions* (note that we should ignore the HasDisplayData interface). was: PipelineOptions currently serializes information about all getter/setter pairs on interfaces which don't extend PipelineOptions. For example: {code:java} interface Foo extends PipelineOptions, Bar { String getFoo(); void setFoo(String value); } interface Bar { String getBar(); void setBar(); } {code} The serialization of the above (when both *foo* and *bar* are set) will produce JSON where we only include display data for *foo* but data for both *foo* and *bar*. During validation of an interface in *PipelineOptionsFactory*, we should throw an error if one of the users interfaces doesn't extend *PipelineOptions* (note that we should ignore the HasDisplayData interface). > PipelineOptionsFactory should prevent non PipelineOptions interfaces from > being constructed. > > > Key: BEAM-2779 > URL: https://issues.apache.org/jira/browse/BEAM-2779 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Affects Versions: 2.0.0 >Reporter: Luke Cwik >Priority: Minor > Labels: starter > > PipelineOptions currently serializes information about all getter/setter > pairs on interfaces which don't extend PipelineOptions. 
> For example: > {code:java} > interface Foo extends PipelineOptions, Bar { > String getFoo(); > void setFoo(String value); > } > interface Bar { > String getBar(); > void setBar(String value); > } > {code} > The serialization of the above (when both *foo* and *bar* are set) will > produce JSON where we only include display data for *foo* but data for both > *foo* and *bar*. During validation of an interface in > *PipelineOptionsFactory*, we should throw an error if one of the users > interfaces doesn't extend *PipelineOptions* (note that we should ignore the > HasDisplayData interface). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
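The validation proposed in BEAM-2779 can be sketched roughly as follows. This is only an illustration and not the actual PipelineOptionsFactory code: `OptionsValidationSketch`, `validate`, `collect`, `Good`, and the nested stand-in interfaces `PipelineOptions` and `HasDisplayData` are hypothetical names, declared locally so the example compiles without Beam on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; the nested interfaces are local stand-ins, not Beam's real types.
final class OptionsValidationSketch {
  interface PipelineOptions {}
  interface HasDisplayData {}

  // The interfaces from the issue description.
  interface Bar {
    String getBar();
    void setBar(String value);
  }

  interface Foo extends PipelineOptions, Bar {
    String getFoo();
    void setFoo(String value);
  }

  // A well-formed options interface; HasDisplayData is explicitly allowed.
  interface Good extends PipelineOptions, HasDisplayData {}

  /** Throws if iface, or any interface it inherits from, does not extend PipelineOptions. */
  static void validate(Class<?> iface) {
    List<Class<?>> offending = new ArrayList<>();
    collect(iface, offending);
    if (!offending.isEmpty()) {
      throw new IllegalArgumentException(
          "Interfaces that do not extend PipelineOptions: " + offending);
    }
  }

  // Walks the interface hierarchy and records every interface that is not a PipelineOptions.
  private static void collect(Class<?> iface, List<Class<?>> offending) {
    if (iface == HasDisplayData.class) {
      return; // ignored, as the issue description requests
    }
    if (!PipelineOptions.class.isAssignableFrom(iface)) {
      offending.add(iface);
    }
    for (Class<?> parent : iface.getInterfaces()) {
      collect(parent, offending);
    }
  }
}
```

With this sketch, `validate(Foo.class)` fails because `Bar` is reachable from `Foo` but does not extend `PipelineOptions`, while `validate(Good.class)` passes.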
[jira] [Commented] (BEAM-2612) support variance builtin aggregation function
[ https://issues.apache.org/jira/browse/BEAM-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086340#comment-16086340 ] Kai Jiang commented on BEAM-2612: - Yes, I am working on this file. > support variance builtin aggregation function > - > > Key: BEAM-2612 > URL: https://issues.apache.org/jira/browse/BEAM-2612 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang > > two builtin aggregate functions > VAR_POP > the population variance (square of the population standard deviation) > VAR_SAMP > the sample variance (square of the sample standard deviation) > https://calcite.apache.org/docs/reference.html#aggregate-functions -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (BEAM-2612) support variance builtin aggregation function
[ https://issues.apache.org/jira/browse/BEAM-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085347#comment-16085347 ] Kai Jiang edited comment on BEAM-2612 at 7/13/17 8:17 AM: -- cc [~mingmxu] After this ticket, I can work on STDDEV_POP, STDDEV_SAMP, COVAR_POP, COVAR_SAMP was (Author: vectorijk): After this ticket, I can work on STDDEV_POP, STDDEV_SAMP, COVAR_POP, COVAR_SAMP > support variance builtin aggregation function > - > > Key: BEAM-2612 > URL: https://issues.apache.org/jira/browse/BEAM-2612 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang > > two builtin aggregate functions > VAR_POP > the population variance (square of the population standard deviation) > VAR_SAMP > the sample variance (square of the sample standard deviation) > https://calcite.apache.org/docs/reference.html#aggregate-functions -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-2612) support variance builtin aggregation function
[ https://issues.apache.org/jira/browse/BEAM-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085347#comment-16085347 ] Kai Jiang commented on BEAM-2612: - After this ticket, I can work on STDDEV_POP, STDDEV_SAMP, COVAR_POP, COVAR_SAMP > support variance builtin aggregation function > - > > Key: BEAM-2612 > URL: https://issues.apache.org/jira/browse/BEAM-2612 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang > > two builtin aggregate functions > VAR_POP > the population variance (square of the population standard deviation) > VAR_SAMP > the sample variance (square of the sample standard deviation) > https://calcite.apache.org/docs/reference.html#aggregate-functions -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-2612) support variance builtin aggregation function
[ https://issues.apache.org/jira/browse/BEAM-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085345#comment-16085345 ] Kai Jiang commented on BEAM-2612: - I have been working on this ticket. I will open a PR very soon. > support variance builtin aggregation function > - > > Key: BEAM-2612 > URL: https://issues.apache.org/jira/browse/BEAM-2612 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang > > two builtin aggregate functions > VAR_POP > the population variance (square of the population standard deviation) > VAR_SAMP > the sample variance (square of the sample standard deviation) > https://calcite.apache.org/docs/reference.html#aggregate-functions -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (BEAM-2612) support variance builtin aggregation function
Kai Jiang created BEAM-2612: --- Summary: support variance builtin aggregation function Key: BEAM-2612 URL: https://issues.apache.org/jira/browse/BEAM-2612 Project: Beam Issue Type: New Feature Components: dsl-sql Reporter: Kai Jiang Assignee: Xu Mingmin two builtin aggregate functions VAR_POP the population variance (square of the population standard deviation) VAR_SAMP the sample variance (square of the sample standard deviation) https://calcite.apache.org/docs/reference.html#aggregate-functions -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (BEAM-2612) support variance builtin aggregation function
[ https://issues.apache.org/jira/browse/BEAM-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang reassigned BEAM-2612: --- Assignee: Kai Jiang (was: Xu Mingmin) > support variance builtin aggregation function > - > > Key: BEAM-2612 > URL: https://issues.apache.org/jira/browse/BEAM-2612 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kai Jiang >Assignee: Kai Jiang > > two builtin aggregate functions > VAR_POP > the population variance (square of the population standard deviation) > VAR_SAMP > the sample variance (square of the sample standard deviation) > https://calcite.apache.org/docs/reference.html#aggregate-functions -- This message was sent by Atlassian JIRA (v6.4.14#64029)
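For reference, the two aggregates requested in BEAM-2612 can be computed in a single pass. The sketch below uses Welford's online algorithm, which is an assumption made for illustration and not necessarily what the Beam SQL implementation uses; `VarianceSketch` and its methods are hypothetical names. A real combiner would also need to merge partial accumulators, which is left out here.

```java
// Hypothetical single-pass accumulator using Welford's online algorithm.
final class VarianceSketch {
  private long count;
  private double mean;
  private double m2; // running sum of squared deviations from the mean

  void add(double x) {
    count++;
    double delta = x - mean;
    mean += delta / count;
    m2 += delta * (x - mean);
  }

  /** VAR_POP: population variance, dividing by n. */
  double varPop() {
    if (count == 0) {
      throw new IllegalStateException("VAR_POP needs at least one value");
    }
    return m2 / count;
  }

  /** VAR_SAMP: sample variance, dividing by n - 1. */
  double varSamp() {
    if (count < 2) {
      throw new IllegalStateException("VAR_SAMP needs at least two values");
    }
    return m2 / (count - 1);
  }
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the squared deviations sum to 32, so the population variance is 32/8 and the sample variance is 32/7.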
[jira] [Commented] (BEAM-2203) Arithmetic operators: support DATETIME & DATETIME_INTERVAL
[ https://issues.apache.org/jira/browse/BEAM-2203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16079517#comment-16079517 ] Kai Jiang commented on BEAM-2203: - [~xumingming] I would like to give this one a try. Should we implement these two first? TIMESTAMPADD(timeUnit, integer, datetime) and TIMESTAMPDIFF(timeUnit, datetime, datetime2) > Arithmetic operators: support DATETIME & DATETIME_INTERVAL > -- > > Key: BEAM-2203 > URL: https://issues.apache.org/jira/browse/BEAM-2203 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: James Xu > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
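To make the semantics of the two functions named in the comment concrete, here is a small sketch built on java.time. It is not the Beam SQL implementation: `TimestampFunctionsSketch` and both method names are hypothetical, the time unit is modeled with `ChronoUnit`, and the sign convention for the difference (datetime2 minus datetime, in whole units) is an assumption.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helpers illustrating TIMESTAMPADD / TIMESTAMPDIFF with java.time.
final class TimestampFunctionsSketch {

  /** TIMESTAMPADD(timeUnit, integer, datetime): shifts datetime by amount units. */
  static LocalDateTime timestampAdd(ChronoUnit unit, long amount, LocalDateTime datetime) {
    return datetime.plus(amount, unit);
  }

  /**
   * TIMESTAMPDIFF(timeUnit, datetime, datetime2): whole units from datetime to datetime2.
   * Assumed sign convention: positive when datetime2 is later than datetime.
   */
  static long timestampDiff(ChronoUnit unit, LocalDateTime datetime, LocalDateTime datetime2) {
    return unit.between(datetime, datetime2);
  }
}
```

Partial units are truncated by `ChronoUnit.between`, so a gap of 30 hours and 30 minutes counts as 30 hours.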
[jira] [Updated] (BEAM-1927) Add tfrecord io to built in IO list
[ https://issues.apache.org/jira/browse/BEAM-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-1927: Labels: docuentation newbie starter (was: newbie starter) > Add tfrecord io to built in IO list > --- > > Key: BEAM-1927 > URL: https://issues.apache.org/jira/browse/BEAM-1927 > Project: Beam > Issue Type: Bug > Components: sdk-py, website >Reporter: Ahmet Altay >Priority: Minor > Labels: docuentation, newbie, starter > > Add tfrecordio > (https://github.com/apache/beam/blob/master/sdks/python/apache_beam/io/tfrecordio.py) > to built-in io list (https://beam.apache.org/documentation/io/built-in/). > cc: [~melap] -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (BEAM-1628) Flink runner: logic around --flinkMaster is error-prone
[ https://issues.apache.org/jira/browse/BEAM-1628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15899307#comment-15899307 ] Kai Jiang commented on BEAM-1628: - [~davor] I would like to take this issue. After investigating, I think the default port should be 6123 (https://ci.apache.org/projects/flink/flink-docs-release-0.8/config.html#common-options). However, I am not sure what the default host name should be: localhost or a remote host name? > Flink runner: logic around --flinkMaster is error-prone > --- > > Key: BEAM-1628 > URL: https://issues.apache.org/jira/browse/BEAM-1628 > Project: Beam > Issue Type: Bug > Components: runner-flink >Reporter: Davor Bonaci >Assignee: Aljoscha Krettek >Priority: Minor > Labels: newbie, starter > > The logic for handling {{--flinkMaster}} seems not particularly user-friendly. > https://github.com/apache/beam/blob/fbcde4cdc7d68de8734bf540c079b2747631a854/runners/flink/runner/src/main/java/org/apache/beam/runners/flink/FlinkPipelineExecutionEnvironment.java#L132 > {code} > if (masterUrl.equals("[local]")) { > } else if (masterUrl.equals("[collection]")) { > } else if (masterUrl.equals("[auto]")) { > } else if (masterUrl.matches(".*:\\d*")) { > } else { > // use auto. > } > {code} > The options are constructed with "auto" set as default. > I think we should do the following: > * I assume there's a default port for the Flink master. We should default to > it. > * We should treat a string without a colon as a host name. (Not default to > local execution.) > This is super easy fix, hopefully someone can pick it up quickly ;-) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
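The two bullet points in the issue description could be sketched along these lines. This is a hypothetical helper, not code from the Flink runner: `FlinkMasterSketch` and `resolve` are made-up names, and the default port of 6123 is taken from the comment above as an assumption. Unlike the quoted snippet, input that is neither a special value nor a plausible host or host:port is rejected rather than silently treated as "[auto]".

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Beam Flink runner.
final class FlinkMasterSketch {
  // Assumption: 6123 is the Flink master's default port.
  static final int DEFAULT_PORT = 6123;

  private static final List<String> SPECIAL_VALUES =
      Arrays.asList("[local]", "[collection]", "[auto]");

  /** Returns a special value unchanged, or a "host:port" string with the port filled in. */
  static String resolve(String masterUrl) {
    if (SPECIAL_VALUES.contains(masterUrl)) {
      return masterUrl;
    }
    int colon = masterUrl.lastIndexOf(':');
    if (colon < 0) {
      if (masterUrl.isEmpty()) {
        throw new IllegalArgumentException("Empty Flink master");
      }
      // No colon: treat the whole string as a host name and use the default port.
      return masterUrl + ":" + DEFAULT_PORT;
    }
    String host = masterUrl.substring(0, colon);
    String port = masterUrl.substring(colon + 1);
    if (host.isEmpty() || !port.matches("\\d+")) {
      // Fail loudly instead of falling back to another execution mode.
      throw new IllegalArgumentException("Invalid Flink master: " + masterUrl);
    }
    return masterUrl;
  }
}
```

Note that the original pattern `".*:\\d*"` also accepts an empty port such as `host:`; the sketch requires at least one digit.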
[jira] [Commented] (BEAM-842) dependency.py: package not found when running on Windows
[ https://issues.apache.org/jira/browse/BEAM-842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851475#comment-15851475 ] Kai Jiang commented on BEAM-842: [~matthiasa4] Could you paste more of the crash error here? I built the Juliaset example on Windows and ran it on Dataflow, and it worked fine for me. My setuptools version is 34.1.0. > dependency.py: package not found when running on Windows > > > Key: BEAM-842 > URL: https://issues.apache.org/jira/browse/BEAM-842 > Project: Beam > Issue Type: Bug > Components: sdk-py > Environment: Windows 10, Python 2.7.11 >Reporter: Matthias Baetens >Assignee: Ahmet Altay >Priority: Minor > Labels: newbie > > When splitting your pipeline into multiple files and configuring your > project according to the Juliaset example > (https://cloud.google.com/dataflow/pipelines/dependencies-python#multiple-file-dependencies), > the Pipeline still crashes when using Windows. > This is caused by setuptools defaulting to a .zip on Windows, and the current > Beam code looks for a .tar.gz (dependency.py, line 400). When changing this > line to: output_files = glob.glob(os.path.join(temp_dir, '*.zip')), it works. > Suggestion: checking the OS would probably solve this issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (BEAM-1248) Combine with side inputs API should match ParDo
[ https://issues.apache.org/jira/browse/BEAM-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang reassigned BEAM-1248: --- Assignee: Kai Jiang > Combine with side inputs API should match ParDo > --- > > Key: BEAM-1248 > URL: https://issues.apache.org/jira/browse/BEAM-1248 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Kai Jiang >Priority: Minor > Labels: easy, starter > Fix For: 0.5.0 > > > From user@beam, the methods for adding side inputs to a Combine transform do > not fully match those for adding side inputs to ParDo. > "ParDo has the .withSideInputs(Iterable<? extends PCollectionView<?>>) version but > also a varargs version withSideInputs(PCollectionView<?>...) but Combine only > has the Iterable version." -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (BEAM-1248) Combine with side inputs API should match ParDo
[ https://issues.apache.org/jira/browse/BEAM-1248?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kai Jiang updated BEAM-1248: Fix Version/s: 0.5.0 > Combine with side inputs API should match ParDo > --- > > Key: BEAM-1248 > URL: https://issues.apache.org/jira/browse/BEAM-1248 > Project: Beam > Issue Type: Bug > Components: sdk-java-core >Reporter: Kenneth Knowles >Assignee: Kai Jiang >Priority: Minor > Labels: easy, starter > Fix For: 0.5.0 > > > From user@beam, the methods for adding side inputs to a Combine transform do > not fully match those for adding side inputs to ParDo. > "ParDo has the .withSideInputs(Iterable<? extends PCollectionView<?>>) version but > also a varargs version withSideInputs(PCollectionView<?>...) but Combine only > has the Iterable version." -- This message was sent by Atlassian JIRA (v6.3.4#6332)
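The API gap described in BEAM-1248 is usually closed by adding a varargs overload that delegates to the Iterable one. The following is only a sketch of that pattern: `SideInputsSketch`, `View` (a stand-in for PCollectionView), and `CombineLike` (a stand-in for the Combine transform) are hypothetical and do not reflect Beam's actual classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of matching Iterable and varargs overloads.
final class SideInputsSketch {

  // Stand-in for Beam's PCollectionView so the example is self-contained.
  static final class View<T> {
    final String name;

    View(String name) {
      this.name = name;
    }
  }

  // Stand-in for a Combine-like transform that carries side inputs.
  static final class CombineLike {
    private final List<View<?>> sideInputs;

    private CombineLike(List<View<?>> sideInputs) {
      this.sideInputs = sideInputs;
    }

    static CombineLike create() {
      return new CombineLike(new ArrayList<View<?>>());
    }

    /** The existing Iterable-based method: returns a copy with the extra views appended. */
    CombineLike withSideInputs(Iterable<? extends View<?>> views) {
      List<View<?>> copy = new ArrayList<>(sideInputs);
      for (View<?> view : views) {
        copy.add(view);
      }
      return new CombineLike(copy);
    }

    /** The missing varargs convenience overload, delegating to the Iterable version. */
    CombineLike withSideInputs(View<?>... views) {
      return withSideInputs(Arrays.asList(views));
    }

    List<View<?>> getSideInputs() {
      return Collections.unmodifiableList(sideInputs);
    }
  }
}
```

Delegating keeps the two entry points behaviorally identical, so callers can write `withSideInputs(a, b)` or pass a collection and get the same result.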
[jira] [Commented] (BEAM-519) fileio.CompressionType requires a __ne__ method
[ https://issues.apache.org/jira/browse/BEAM-519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15814156#comment-15814156 ] Kai Jiang commented on BEAM-519: This has already been implemented: https://github.com/apache/beam/blob/81e44b833ce54e44e6506eb028afe8564c46cf18/sdks/python/apache_beam/io/fileio.py#L59 This issue can be marked as 'RESOLVED'. > fileio.CompressionType requires a __ne__ method > > > Key: BEAM-519 > URL: https://issues.apache.org/jira/browse/BEAM-519 > Project: Beam > Issue Type: Bug > Components: sdk-py >Reporter: Ahmet Altay >Priority: Minor > Labels: starter > > This code: > https://github.com/apache/incubator-beam/blob/python-sdk/sdks/python/apache_beam/io/fileio.py#L279 > Without the __ne__ operator instances of this class cannot be used in != > expressions (only ==). -- This message was sent by Atlassian JIRA (v6.3.4#6332)