[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=329993=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329993 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 17/Oct/19 16:54
Start Date: 17/Oct/19 16:54
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#issuecomment-543266218

LGTM.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 329993)
Time Spent: 8h 10m (was: 8h)

> Add project push-down capability to IO APIs
> ---
>
> Key: BEAM-8365
> URL: https://issues.apache.org/jira/browse/BEAM-8365
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql
> Reporter: Kirill Kozlov
> Assignee: Kirill Kozlov
> Priority: Major
> Time Spent: 8h 10m
> Remaining Estimate: 0h
>
> * InMemoryTable should implement the following method:
> {code:java}
> public PCollection buildIOReader(
> PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code}
> Which should return a `PCollection` with the fields specified in the
> `fieldNames` list.
> * Create a rule to push fields used by a Calc (in projects and in a
> condition) down into TestTable IO.
> * Update that same Calc (from the previous step) to have proper input and
> output schemas, and remove unused fields.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
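The three bullets in the issue description can be illustrated with a minimal plain-Java sketch. It does not use Beam or Calcite: `ProjectPushDownSketch`, `usedFields`, and `read` are hypothetical names invented for this illustration, rows are modelled as plain lists, and `read` only stands in for what `buildIOReader(begin, filters, fieldNames)` is asked to do (return rows restricted to `fieldNames`). Filters are ignored here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for an in-memory table; this is not a Beam class.
final class ProjectPushDownSketch {
  private final List<String> schema;
  private final List<List<Object>> rows;

  ProjectPushDownSketch(List<String> schema, List<List<Object>> rows) {
    this.schema = schema;
    this.rows = rows;
  }

  // Fields a Calc needs from its input: those referenced by its projects plus
  // those referenced by its condition, de-duplicated in first-seen order.
  static List<String> usedFields(List<String> projectFields, List<String> conditionFields) {
    Set<String> used = new LinkedHashSet<>(projectFields);
    used.addAll(conditionFields);
    return new ArrayList<>(used);
  }

  // Assumed analogue of buildIOReader(begin, filters, fieldNames): returns every
  // row restricted to the requested fields, in the requested order.
  List<List<Object>> read(List<String> fieldNames) {
    int[] indexes = new int[fieldNames.size()];
    for (int i = 0; i < indexes.length; i++) {
      int index = schema.indexOf(fieldNames.get(i));
      if (index < 0) {
        throw new IllegalArgumentException("Unknown field: " + fieldNames.get(i));
      }
      indexes[i] = index;
    }
    List<List<Object>> projected = new ArrayList<>();
    for (List<Object> row : rows) {
      List<Object> out = new ArrayList<>(indexes.length);
      for (int index : indexes) {
        out.add(row.get(index));
      }
      projected.add(out);
    }
    return projected;
  }

  public static void main(String[] args) {
    ProjectPushDownSketch table =
        new ProjectPushDownSketch(
            Arrays.asList("id", "name", "price"),
            Arrays.asList(
                Arrays.<Object>asList(1, "apple", 0.5),
                Arrays.<Object>asList(2, "pear", 0.75)));

    // SELECT name ... WHERE price > 0.6: the Calc uses "name" and "price" only.
    List<String> needed = usedFields(Arrays.asList("name"), Arrays.asList("price"));
    System.out.println(needed);             // [name, price]
    System.out.println(table.read(needed)); // [[apple, 0.5], [pear, 0.75]]
  }
}
```

After such a read, the Calc sitting on top of the source would need its input references remapped to the narrower schema (here `name` moves to index 0 and `price` to index 1), which is what the third bullet asks for.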
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=329994=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329994 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 17/Oct/19 16:54
Start Date: 17/Oct/19 16:54
Worklog Time Spent: 10m
Work Description: apilloud commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764

Issue Time Tracking
---
Worklog Id: (was: 329994)
Time Spent: 8h 20m (was: 8h 10m)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=329977=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329977 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 17/Oct/19 16:22
Start Date: 17/Oct/19 16:22
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#issuecomment-543253309

Run Sql Postcommit

Issue Time Tracking
---
Worklog Id: (was: 329977)
Time Spent: 8h (was: 7h 50m)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=329461=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329461 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 16/Oct/19 22:32
Start Date: 16/Oct/19 22:32
Worklog Time Spent: 10m
Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#discussion_r335738488

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ##
@@ -0,0 +1,248 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.impl.rule;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import java.util.ArrayDeque;
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Queue;
+import java.util.Set;
+import java.util.stream.Collectors;
+import org.apache.beam.sdk.extensions.sql.impl.rel.BeamIOSourceRel;
+import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils;
+import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTable;
+import org.apache.beam.sdk.schemas.FieldAccessDescriptor;
+import org.apache.beam.sdk.schemas.FieldAccessDescriptor.FieldDescriptor;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.utils.SelectHelpers;
+import org.apache.beam.vendor.calcite.v1_20_0.com.google.common.collect.ImmutableList;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRule;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.RelNode;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.Calc;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.RelFactories;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataType;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataTypeField;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelRecordType;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLiteral;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLocalRef;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexProgram;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilder;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilderFactory;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Pair;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+
+public class BeamIOPushDownRule extends RelOptRule {
+  // ~ Static fields/initializers -
+
+  public static final BeamIOPushDownRule INSTANCE =
+      new BeamIOPushDownRule(RelFactories.LOGICAL_BUILDER);
+
+  // ~ Constructors ---
+
+  public BeamIOPushDownRule(RelBuilderFactory relBuilderFactory) {
+    super(operand(Calc.class, operand(BeamIOSourceRel.class, any())), relBuilderFactory, null);
+  }
+
+  // ~ Methods
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final BeamIOSourceRel ioSourceRel = call.rel(1);
+    final BeamSqlTable beamSqlTable = ioSourceRel.getBeamSqlTable();
+
+    if (!beamSqlTable.supportsProjects()) {
+      return;
+    }
+
+    // Nested rows are not supported at the moment
+    for (RelDataTypeField field : ioSourceRel.getRowType().getFieldList()) {
+      if
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=329456=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329456 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 16/Oct/19 22:15
Start Date: 16/Oct/19 22:15
Worklog Time Spent: 10m
Work Description: apilloud commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#discussion_r335733407

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ##
@@ -0,0 +1,248 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.impl.rule;
+
+import static org.apache.beam.vendor.guava.v26_0_jre.com.google.common.base.Preconditions.checkArgument;
+
+import java.util.ArrayDeque;
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Queue;
+import java.util.Set;
+import java.util.stream.Collectors;
+import org.apache.beam.sdk.extensions.sql.impl.rel.BeamIOSourceRel;
+import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils;
+import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTable;
+import org.apache.beam.sdk.schemas.FieldAccessDescriptor;
+import org.apache.beam.sdk.schemas.FieldAccessDescriptor.FieldDescriptor;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.utils.SelectHelpers;
+import org.apache.beam.vendor.calcite.v1_20_0.com.google.common.collect.ImmutableList;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRule;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.RelNode;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.Calc;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.RelFactories;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataType;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataTypeField;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelRecordType;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLiteral;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLocalRef;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexProgram;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilder;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilderFactory;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Pair;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+
+public class BeamIOPushDownRule extends RelOptRule {
+  // ~ Static fields/initializers -
+
+  public static final BeamIOPushDownRule INSTANCE =
+      new BeamIOPushDownRule(RelFactories.LOGICAL_BUILDER);
+
+  // ~ Constructors ---
+
+  public BeamIOPushDownRule(RelBuilderFactory relBuilderFactory) {
+    super(operand(Calc.class, operand(BeamIOSourceRel.class, any())), relBuilderFactory, null);
+  }
+
+  // ~ Methods
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final BeamIOSourceRel ioSourceRel = call.rel(1);
+    final BeamSqlTable beamSqlTable = ioSourceRel.getBeamSqlTable();
+
+    if (!beamSqlTable.supportsProjects()) {
+      return;
+    }
+
+    // Nested rows are not supported at the moment
+    for (RelDataTypeField field : ioSourceRel.getRowType().getFieldList()) {
+      if
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=329453=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-329453 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 16/Oct/19 22:11
Start Date: 16/Oct/19 22:11
Worklog Time Spent: 10m
Work Description: apilloud commented on issue #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#issuecomment-542912906

cc: @amaliujia

Issue Time Tracking
---
Worklog Id: (was: 329453)
Time Spent: 7.5h (was: 7h 20m)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328875=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328875 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 16/Oct/19 00:17
Start Date: 16/Oct/19 00:17
Worklog Time Spent: 10m
Work Description: 11moon11 commented on issue #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#issuecomment-542456459

Run Java_Examples_Dataflow PreCommit

Issue Time Tracking
---
Worklog Id: (was: 328875)
Time Spent: 7h 20m (was: 7h 10m)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328874=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328874 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 16/Oct/19 00:17
Start Date: 16/Oct/19 00:17
Worklog Time Spent: 10m
Work Description: 11moon11 commented on issue #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#issuecomment-542456459

Run Java_Examples_Dataflow PreCommit

Issue Time Tracking
---
Worklog Id: (was: 328874)
Time Spent: 7h 10m (was: 7h)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328858=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328858 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 16/Oct/19 00:00
Start Date: 16/Oct/19 00:00
Worklog Time Spent: 10m
Work Description: 11moon11 commented on issue #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#issuecomment-542452951

Run JavaPortabilityApi PreCommit

Issue Time Tracking
---
Worklog Id: (was: 328858)
Time Spent: 6h 50m (was: 6h 40m)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328860=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328860 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 16/Oct/19 00:00
Start Date: 16/Oct/19 00:00
Worklog Time Spent: 10m
Work Description: 11moon11 commented on issue #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#issuecomment-542452951

Run JavaPortabilityApi PreCommit

Issue Time Tracking
---
Worklog Id: (was: 328860)
Time Spent: 7h (was: 6h 50m)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328846=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328846 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 15/Oct/19 23:32
Start Date: 15/Oct/19 23:32
Worklog Time Spent: 10m
Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#discussion_r335221391

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ##
@@ -0,0 +1,236 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.sdk.extensions.sql.impl.rule;
+
+import java.util.ArrayDeque;
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Queue;
+import java.util.Set;
+import java.util.stream.Collectors;
+import org.apache.beam.sdk.extensions.sql.impl.rel.BeamIOSourceRel;
+import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils;
+import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTable;
+import org.apache.beam.sdk.schemas.FieldAccessDescriptor;
+import org.apache.beam.sdk.schemas.FieldAccessDescriptor.FieldDescriptor;
+import org.apache.beam.sdk.schemas.Schema;
+import org.apache.beam.sdk.schemas.utils.SelectHelpers;
+import org.apache.beam.vendor.calcite.v1_20_0.com.google.common.collect.ImmutableList;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRule;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRuleCall;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.RelNode;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.Calc;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.RelFactories;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataType;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataTypeField;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelRecordType;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLocalRef;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexProgram;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilder;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilderFactory;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Pair;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+
+public class BeamIOPushDownRule extends RelOptRule {
+  // ~ Static fields/initializers -
+
+  public static final BeamIOPushDownRule INSTANCE =
+      new BeamIOPushDownRule(RelFactories.LOGICAL_BUILDER);
+
+  // ~ Constructors ---
+
+  public BeamIOPushDownRule(RelBuilderFactory relBuilderFactory) {
+    super(operand(Calc.class, operand(BeamIOSourceRel.class, any())), relBuilderFactory, null);
+  }
+
+  // ~ Methods
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final Calc calc = call.rel(0);
+    final BeamIOSourceRel ioSourceRel = call.rel(1);
+    final BeamSqlTable beamSqlTable = ioSourceRel.getBeamSqlTable();
+    final RexProgram program = calc.getProgram();
+    final Pair<ImmutableList<RexNode>, ImmutableList<RexNode>> projectFilter = program.split();
+    final RelDataType calcInputRowType = program.getInputRowType();
+    RelBuilder relBuilder = call.builder();
+
+    if (!beamSqlTable.supportsProjects()) {
+      return;
+    }
+
+    // Nested rows are not supported at the moment
+    for
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328842=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328842 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 15/Oct/19 23:19
Start Date: 15/Oct/19 23:19
Worklog Time Spent: 10m
Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#discussion_r335218379

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamIOSourceRel.java ##
@@ -109,10 +137,11 @@ public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
   @Override
   public BeamCostModel beamComputeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
     NodeStats estimates = BeamSqlRelUtils.getNodeStats(this, mq);
-    return BeamCostModel.FACTORY.makeCost(estimates.getRowCount(), estimates.getRate());
+    return BeamCostModel.FACTORY.makeCost(
+        estimates.getRowCount() * getRowType().getFieldCount(), estimates.getRate());
   }

Review comment:
Updated to multiply the total cost by `getRowType().getFieldCount()` instead of using just `getRowCount`.

Issue Time Tracking
---
Worklog Id: (was: 328842)
Time Spent: 6.5h (was: 6h 20m)
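To see why scaling the cost by the field count matters for push-down, here is a minimal plain-Java sketch. It is an assumption-laden illustration, not Beam's cost model: `ScanCostSketch` and `scanCost` are hypothetical names, and the only thing taken from the diff above is the expression "row count times field count".

```java
// Hypothetical illustration only; BeamCostModel is not used here.
final class ScanCostSketch {
  // Mirrors the first argument in the diff: rowCount * fieldCount.
  static double scanCost(double rowCount, int fieldCount) {
    return rowCount * fieldCount;
  }

  public static void main(String[] args) {
    double fullScan = scanCost(1000, 10);     // source reads all 10 fields
    double projectedScan = scanCost(1000, 2); // source reads only the 2 used fields
    System.out.println(fullScan);      // 10000.0
    System.out.println(projectedScan); // 2000.0
  }
}
```

With a cost based on row count alone, both scans would cost the same, so a cost-based planner would have no reason to prefer the plan in which the projection was pushed into the source.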
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328836=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328836 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 15/Oct/19 23:15 Start Date: 15/Oct/19 23:15 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#discussion_r335217549 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamIOSourceRel.java ## @@ -94,7 +116,19 @@ public NodeStats estimateNodeStats(RelMetadataQuery mq) { "Should not have received input for %s: %s", BeamIOSourceRel.class.getSimpleName(), input); - return beamTable.buildIOReader(input.getPipeline().begin()); + + PBegin begin = input.getPipeline().begin(); Review comment: Changed local variables to be `final`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328836) Time Spent: 6h 20m (was: 6h 10m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 6h 20m > Remaining Estimate: 0h > > * InMemoryTable should implement the following method: > {code:java} > public PCollection<Row> buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List<String> fieldNames);{code} > which should return a `PCollection<Row>` containing only the fields specified in the `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. 
> * Update that same Calc (from the previous step) to have proper input and > output schemas, removing unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
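The first bullet of the issue description above asks for a reader that returns only the fields named in `fieldNames`. As a minimal sketch of that idea, assuming plain Java lists in place of Beam's `PCollection<Row>`, the projection step could look like the following. The class `ProjectPushDownSketch` and the method `readWithProjection` are hypothetical names chosen for illustration; they are not part of Beam or of pull request #9764.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an in-memory table reader; this is NOT Beam's InMemoryTable.
public class ProjectPushDownSketch {

  /**
   * Returns the rows restricted to the requested fields, in the order given by fieldNames.
   * An empty fieldNames list is treated as "no projection" and returns the rows unchanged
   * (an assumption of this sketch).
   */
  public static List<List<Object>> readWithProjection(
      List<String> schema, List<List<Object>> rows, List<String> fieldNames) {
    if (fieldNames.isEmpty()) {
      return rows;
    }
    // Resolve each requested field name to its column index once, up front.
    List<Integer> indexes = new ArrayList<>();
    for (String name : fieldNames) {
      int index = schema.indexOf(name);
      if (index < 0) {
        throw new IllegalArgumentException("Unknown field: " + name);
      }
      indexes.add(index);
    }
    // Copy only the selected columns of every row.
    List<List<Object>> projected = new ArrayList<>();
    for (List<Object> row : rows) {
      List<Object> out = new ArrayList<>();
      for (int index : indexes) {
        out.add(row.get(index));
      }
      projected.add(out);
    }
    return projected;
  }

  public static void main(String[] args) {
    List<String> schema = Arrays.asList("id", "name", "price");
    List<List<Object>> rows =
        Arrays.asList(
            Arrays.<Object>asList(1, "apple", 0.5),
            Arrays.<Object>asList(2, "pear", 0.75));
    // Only "price" and "id" are read; "name" is dropped at the source.
    System.out.println(readWithProjection(schema, rows, Arrays.asList("price", "id")));
  }
}
```

In this sketch the source drops unused columns before any downstream step sees them, which is the effect the push-down rule aims for. The Calc that consumes the result would then need its input references remapped to the narrower schema, as the third bullet of the description notes.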
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328831=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328831 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 15/Oct/19 23:15 Start Date: 15/Oct/19 23:15 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#discussion_r335217375 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ## @@ -0,0 +1,236 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.extensions.sql.impl.rule; + +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.stream.Collectors; +import org.apache.beam.sdk.extensions.sql.impl.rel.BeamIOSourceRel; +import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils; +import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTable; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor.FieldDescriptor; +import org.apache.beam.sdk.schemas.Schema; +import org.apache.beam.sdk.schemas.utils.SelectHelpers; +import org.apache.beam.vendor.calcite.v1_20_0.com.google.common.collect.ImmutableList; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRule; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRuleCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.RelNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.Calc; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.RelFactories; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataTypeField; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelRecordType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLocalRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexProgram; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilder; +import 
org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilderFactory; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Pair; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting; + +public class BeamIOPushDownRule extends RelOptRule { + // ~ Static fields/initializers - + + public static final BeamIOPushDownRule INSTANCE = + new BeamIOPushDownRule(RelFactories.LOGICAL_BUILDER); + + // ~ Constructors --- + + public BeamIOPushDownRule(RelBuilderFactory relBuilderFactory) { +super(operand(Calc.class, operand(BeamIOSourceRel.class, any())), relBuilderFactory, null); + } + + // ~ Methods + + @Override + public void onMatch(RelOptRuleCall call) { +final Calc calc = call.rel(0); +final BeamIOSourceRel ioSourceRel = call.rel(1); +final BeamSqlTable beamSqlTable = ioSourceRel.getBeamSqlTable(); +final RexProgram program = calc.getProgram(); +final Pair<ImmutableList<RexNode>, ImmutableList<RexNode>> projectFilter = program.split(); +final RelDataType calcInputRowType = program.getInputRowType(); +RelBuilder relBuilder = call.builder(); + +if (!beamSqlTable.supportsProjects()) { + return; +} + +// Nested rows are not supported at the moment +for
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328833=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328833 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 15/Oct/19 23:15 Start Date: 15/Oct/19 23:15 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#discussion_r335217402 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ## @@ -0,0 +1,236 @@
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328834=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328834 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 15/Oct/19 23:15 Start Date: 15/Oct/19 23:15 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#discussion_r335217433 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ## @@ -0,0 +1,236 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.extensions.sql.impl.rule; + +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.stream.Collectors; +import org.apache.beam.sdk.extensions.sql.impl.rel.BeamIOSourceRel; +import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils; +import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTable; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor.FieldDescriptor; +import org.apache.beam.sdk.schemas.Schema; +import org.apache.beam.sdk.schemas.utils.SelectHelpers; +import org.apache.beam.vendor.calcite.v1_20_0.com.google.common.collect.ImmutableList; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRule; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRuleCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.RelNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.Calc; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.RelFactories; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataTypeField; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelRecordType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLocalRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexProgram; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilder; +import 
org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilderFactory; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Pair; +import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting; + +public class BeamIOPushDownRule extends RelOptRule { + // ~ Static fields/initializers - + + public static final BeamIOPushDownRule INSTANCE = + new BeamIOPushDownRule(RelFactories.LOGICAL_BUILDER); + + // ~ Constructors --- + + public BeamIOPushDownRule(RelBuilderFactory relBuilderFactory) { +super(operand(Calc.class, operand(BeamIOSourceRel.class, any())), relBuilderFactory, null); + } + + // ~ Methods + + @Override + public void onMatch(RelOptRuleCall call) { +final Calc calc = call.rel(0); +final BeamIOSourceRel ioSourceRel = call.rel(1); +final BeamSqlTable beamSqlTable = ioSourceRel.getBeamSqlTable(); +final RexProgram program = calc.getProgram(); +final Pair<ImmutableList<RexNode>, ImmutableList<RexNode>> projectFilter = program.split(); Review comment: Done, moved before the first use. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328830=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328830 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 15/Oct/19 23:15 Start Date: 15/Oct/19 23:15 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#discussion_r335217355 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/test/TestTableProvider.java ## @@ -157,16 +169,18 @@ public BeamTableStatistics getTableStatistics(PipelineOptions options) { public PCollection<Row> buildIOReader( PBegin begin, BeamSqlTableFilter filters, List<String> fieldNames) { PCollection<Row> withAllFields = buildIOReader(begin); - if (fieldNames.isEmpty() && filters instanceof DefaultTableFilter) { + if (options == PushDownOptions.NONE) { // needed for testing purposes return withAllFields; } Review comment: Added checks that throw a `RuntimeException` when an invalid scenario is encountered. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328830) Time Spent: 5h 40m (was: 5.5h) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 5h 40m > Remaining Estimate: 0h > > * InMemoryTable should implement the following method: > {code:java} > public PCollection<Row> buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List<String> fieldNames);{code} > which should return a `PCollection<Row>` containing only the fields specified in the `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Update that same Calc (from the previous step) to have proper input and > output schemas, removing unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328828=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328828 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 15/Oct/19 23:11 Start Date: 15/Oct/19 23:11 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#discussion_r335213030 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ## @@ -0,0 +1,236 @@
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328829=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328829 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 15/Oct/19 23:11 Start Date: 15/Oct/19 23:11 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#discussion_r335213030 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ## @@ -0,0 +1,236 @@
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328827=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328827 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 15/Oct/19 23:10 Start Date: 15/Oct/19 23:10 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#discussion_r335213030 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ## @@ -0,0 +1,236 @@
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328820=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328820 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 15/Oct/19 22:56 Start Date: 15/Oct/19 22:56 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#discussion_r335213030 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ## @@ -0,0 +1,236 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.extensions.sql.impl.rule; + +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.stream.Collectors; +import org.apache.beam.sdk.extensions.sql.impl.rel.BeamIOSourceRel; +import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils; +import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTable; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor.FieldDescriptor; +import org.apache.beam.sdk.schemas.Schema; +import org.apache.beam.sdk.schemas.utils.SelectHelpers; +import org.apache.beam.vendor.calcite.v1_20_0.com.google.common.collect.ImmutableList; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRule; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRuleCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.RelNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.Calc; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.RelFactories; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataTypeField; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelRecordType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLocalRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexProgram; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilder; +import 
org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilderFactory;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Pair;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+
+public class BeamIOPushDownRule extends RelOptRule {
+  // ~ Static fields/initializers -
+
+  public static final BeamIOPushDownRule INSTANCE =
+      new BeamIOPushDownRule(RelFactories.LOGICAL_BUILDER);
+
+  // ~ Constructors ---
+
+  public BeamIOPushDownRule(RelBuilderFactory relBuilderFactory) {
+    super(operand(Calc.class, operand(BeamIOSourceRel.class, any())), relBuilderFactory, null);
+  }
+
+  // ~ Methods
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final Calc calc = call.rel(0);
+    final BeamIOSourceRel ioSourceRel = call.rel(1);
+    final BeamSqlTable beamSqlTable = ioSourceRel.getBeamSqlTable();
+    final RexProgram program = calc.getProgram();
+    final Pair<ImmutableList<RexNode>, ImmutableList<RexNode>> projectFilter = program.split();
+    final RelDataType calcInputRowType = program.getInputRowType();
+    RelBuilder relBuilder = call.builder();
+
+    if (!beamSqlTable.supportsProjects()) {
+      return;
+    }
+
+    // Nested rows are not supported at the moment
+    for
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328727=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328727 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 15/Oct/19 18:26 Start Date: 15/Oct/19 18:26 Worklog Time Spent: 10m Work Description: apilloud commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#discussion_r335101512 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ## @@ -0,0 +1,236 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.extensions.sql.impl.rule; + +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.stream.Collectors; +import org.apache.beam.sdk.extensions.sql.impl.rel.BeamIOSourceRel; +import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils; +import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTable; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor.FieldDescriptor; +import org.apache.beam.sdk.schemas.Schema; +import org.apache.beam.sdk.schemas.utils.SelectHelpers; +import org.apache.beam.vendor.calcite.v1_20_0.com.google.common.collect.ImmutableList; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRule; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRuleCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.RelNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.Calc; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.RelFactories; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataTypeField; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelRecordType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLocalRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexProgram; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilder; +import 
org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilderFactory;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Pair;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+
+public class BeamIOPushDownRule extends RelOptRule {
+  // ~ Static fields/initializers -
+
+  public static final BeamIOPushDownRule INSTANCE =
+      new BeamIOPushDownRule(RelFactories.LOGICAL_BUILDER);
+
+  // ~ Constructors ---
+
+  public BeamIOPushDownRule(RelBuilderFactory relBuilderFactory) {
+    super(operand(Calc.class, operand(BeamIOSourceRel.class, any())), relBuilderFactory, null);
+  }
+
+  // ~ Methods
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final Calc calc = call.rel(0);
+    final BeamIOSourceRel ioSourceRel = call.rel(1);
+    final BeamSqlTable beamSqlTable = ioSourceRel.getBeamSqlTable();
+    final RexProgram program = calc.getProgram();
+    final Pair<ImmutableList<RexNode>, ImmutableList<RexNode>> projectFilter = program.split();
+    final RelDataType calcInputRowType = program.getInputRowType();
+    RelBuilder relBuilder = call.builder();
+
+    if (!beamSqlTable.supportsProjects()) {
+      return;
+    }
+
+    // Nested rows are not supported at the moment
+    for
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328726=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328726 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 15/Oct/19 18:26 Start Date: 15/Oct/19 18:26 Worklog Time Spent: 10m Work Description: apilloud commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#discussion_r335099291 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ## @@ -0,0 +1,236 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.extensions.sql.impl.rule; + +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.stream.Collectors; +import org.apache.beam.sdk.extensions.sql.impl.rel.BeamIOSourceRel; +import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils; +import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTable; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor.FieldDescriptor; +import org.apache.beam.sdk.schemas.Schema; +import org.apache.beam.sdk.schemas.utils.SelectHelpers; +import org.apache.beam.vendor.calcite.v1_20_0.com.google.common.collect.ImmutableList; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRule; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRuleCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.RelNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.Calc; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.RelFactories; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataTypeField; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelRecordType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLocalRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexProgram; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilder; +import 
org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilderFactory;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Pair;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+
+public class BeamIOPushDownRule extends RelOptRule {
+  // ~ Static fields/initializers -
+
+  public static final BeamIOPushDownRule INSTANCE =
+      new BeamIOPushDownRule(RelFactories.LOGICAL_BUILDER);
+
+  // ~ Constructors ---
+
+  public BeamIOPushDownRule(RelBuilderFactory relBuilderFactory) {
+    super(operand(Calc.class, operand(BeamIOSourceRel.class, any())), relBuilderFactory, null);
+  }
+
+  // ~ Methods
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final Calc calc = call.rel(0);
+    final BeamIOSourceRel ioSourceRel = call.rel(1);
+    final BeamSqlTable beamSqlTable = ioSourceRel.getBeamSqlTable();
+    final RexProgram program = calc.getProgram();
+    final Pair<ImmutableList<RexNode>, ImmutableList<RexNode>> projectFilter = program.split();

Review comment:
nit: This could be somewhat expensive, I'd move it to just before the first use. (Same with `calcInputRowType` and `relBuilder`.)

This is an automated message from the
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328729=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328729 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 15/Oct/19 18:26 Start Date: 15/Oct/19 18:26 Worklog Time Spent: 10m Work Description: apilloud commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#discussion_r335102598 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ## @@ -0,0 +1,236 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.extensions.sql.impl.rule; + +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.stream.Collectors; +import org.apache.beam.sdk.extensions.sql.impl.rel.BeamIOSourceRel; +import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils; +import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTable; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor.FieldDescriptor; +import org.apache.beam.sdk.schemas.Schema; +import org.apache.beam.sdk.schemas.utils.SelectHelpers; +import org.apache.beam.vendor.calcite.v1_20_0.com.google.common.collect.ImmutableList; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRule; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRuleCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.RelNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.Calc; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.RelFactories; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataTypeField; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelRecordType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLocalRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexProgram; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilder; +import 
org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilderFactory;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Pair;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+
+public class BeamIOPushDownRule extends RelOptRule {
+  // ~ Static fields/initializers -
+
+  public static final BeamIOPushDownRule INSTANCE =
+      new BeamIOPushDownRule(RelFactories.LOGICAL_BUILDER);
+
+  // ~ Constructors ---
+
+  public BeamIOPushDownRule(RelBuilderFactory relBuilderFactory) {
+    super(operand(Calc.class, operand(BeamIOSourceRel.class, any())), relBuilderFactory, null);
+  }
+
+  // ~ Methods
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final Calc calc = call.rel(0);
+    final BeamIOSourceRel ioSourceRel = call.rel(1);
+    final BeamSqlTable beamSqlTable = ioSourceRel.getBeamSqlTable();
+    final RexProgram program = calc.getProgram();
+    final Pair<ImmutableList<RexNode>, ImmutableList<RexNode>> projectFilter = program.split();
+    final RelDataType calcInputRowType = program.getInputRowType();
+    RelBuilder relBuilder = call.builder();
+
+    if (!beamSqlTable.supportsProjects()) {
+      return;
+    }
+
+    // Nested rows are not supported at the moment
+    for
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328730&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328730 ]

ASF GitHub Bot logged work on BEAM-8365:

    Author: ASF GitHub Bot
    Created on: 15/Oct/19 18:26
    Start Date: 15/Oct/19 18:26
    Worklog Time Spent: 10m
    Work Description: apilloud commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider
    URL: https://github.com/apache/beam/pull/9764#discussion_r335087133

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamIOSourceRel.java ##
@@ -94,7 +116,19 @@ public NodeStats estimateNodeStats(RelMetadataQuery mq) {
         "Should not have received input for %s: %s",
         BeamIOSourceRel.class.getSimpleName(),
         input);
-    return beamTable.buildIOReader(input.getPipeline().begin());
+
+    PBegin begin = input.getPipeline().begin();

Review comment:
This is probably worth making `final`.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

    Worklog Id: (was: 328730)
    Time Spent: 4h 50m (was: 4h 40m)

> Add project push-down capability to IO APIs
> ---
>
> Key: BEAM-8365
> URL: https://issues.apache.org/jira/browse/BEAM-8365
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql
> Reporter: Kirill Kozlov
> Assignee: Kirill Kozlov
> Priority: Major
> Time Spent: 4h 50m
> Remaining Estimate: 0h
>
> * InMemoryTable should implement the following method:
> {code:java}
> public PCollection<Row> buildIOReader(
>     PBegin begin, BeamSqlTableFilter filters, List<String> fieldNames);{code}
> which should return a `PCollection<Row>` with the fields specified in the `fieldNames` list.
> * Create a rule to push fields used by a Calc (in projects and in a condition) down into TestTable IO.
> * Update that same Calc (from the previous step) to have proper input and output schemas, removing unused fields.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
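The first bullet of the issue description asks the reader to return only the fields named in `fieldNames`. As a rough illustration of that contract, here is a minimal sketch in plain Java that does not use Beam at all: rows are modeled as maps, and the class name `ProjectPushDownSketch` and the method `project` are hypothetical names introduced only for this example. It also assumes, in line with the diff quoted in this thread, that an empty field list means "return all fields".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a table reader; not part of Beam.
public class ProjectPushDownSketch {

  /**
   * Returns rows restricted to the requested fields, in the requested order.
   * An empty fieldNames list is treated as "no projection" (assumption).
   */
  public static List<Map<String, Object>> project(
      List<Map<String, Object>> rows, List<String> fieldNames) {
    if (fieldNames.isEmpty()) {
      return rows;
    }
    List<Map<String, Object>> projected = new ArrayList<>();
    for (Map<String, Object> row : rows) {
      Map<String, Object> out = new LinkedHashMap<>();
      for (String name : fieldNames) {
        if (!row.containsKey(name)) {
          throw new IllegalArgumentException("Unknown field: " + name);
        }
        out.put(name, row.get(name));
      }
      projected.add(out);
    }
    return projected;
  }

  public static void main(String[] args) {
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("id", 1);
    row.put("name", "a");
    row.put("price", 2.5);
    // Only "price" and "id" survive, in that order.
    System.out.println(project(Collections.singletonList(row), Arrays.asList("price", "id")));
  }
}
```

In the real change the projection would happen inside the IO so that unused columns are never read; the sketch only shows the expected shape of the result.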
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328731=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328731 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 15/Oct/19 18:26 Start Date: 15/Oct/19 18:26 Worklog Time Spent: 10m Work Description: apilloud commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#discussion_r335099406 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rule/BeamIOPushDownRule.java ## @@ -0,0 +1,236 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.beam.sdk.extensions.sql.impl.rule; + +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.stream.Collectors; +import org.apache.beam.sdk.extensions.sql.impl.rel.BeamIOSourceRel; +import org.apache.beam.sdk.extensions.sql.impl.utils.CalciteUtils; +import org.apache.beam.sdk.extensions.sql.meta.BeamSqlTable; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor; +import org.apache.beam.sdk.schemas.FieldAccessDescriptor.FieldDescriptor; +import org.apache.beam.sdk.schemas.Schema; +import org.apache.beam.sdk.schemas.utils.SelectHelpers; +import org.apache.beam.vendor.calcite.v1_20_0.com.google.common.collect.ImmutableList; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRule; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.plan.RelOptRuleCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.RelNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.Calc; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.core.RelFactories; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelDataTypeField; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rel.type.RelRecordType; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexCall; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexInputRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexLocalRef; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexNode; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexProgram; +import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilder; +import 
org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.tools.RelBuilderFactory;
+import org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.util.Pair;
+import org.apache.beam.vendor.guava.v26_0_jre.com.google.common.annotations.VisibleForTesting;
+
+public class BeamIOPushDownRule extends RelOptRule {
+  // ~ Static fields/initializers -
+
+  public static final BeamIOPushDownRule INSTANCE =
+      new BeamIOPushDownRule(RelFactories.LOGICAL_BUILDER);
+
+  // ~ Constructors ---
+
+  public BeamIOPushDownRule(RelBuilderFactory relBuilderFactory) {
+    super(operand(Calc.class, operand(BeamIOSourceRel.class, any())), relBuilderFactory, null);
+  }
+
+  // ~ Methods
+
+  @Override
+  public void onMatch(RelOptRuleCall call) {
+    final Calc calc = call.rel(0);
+    final BeamIOSourceRel ioSourceRel = call.rel(1);
+    final BeamSqlTable beamSqlTable = ioSourceRel.getBeamSqlTable();
+    final RexProgram program = calc.getProgram();
+    final Pair<ImmutableList<RexNode>, ImmutableList<RexNode>> projectFilter = program.split();
+    final RelDataType calcInputRowType = program.getInputRowType();
+    RelBuilder relBuilder = call.builder();
+
+    if (!beamSqlTable.supportsProjects()) {
+      return;
+    }
+
+    // Nested rows are not supported at the moment
+    for
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328728&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328728 ]

ASF GitHub Bot logged work on BEAM-8365:

    Author: ASF GitHub Bot
    Created on: 15/Oct/19 18:26
    Start Date: 15/Oct/19 18:26
    Worklog Time Spent: 10m
    Work Description: apilloud commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider
    URL: https://github.com/apache/beam/pull/9764#discussion_r335108780

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/test/TestTableProvider.java ##
@@ -157,16 +169,18 @@ public BeamTableStatistics getTableStatistics(PipelineOptions options) {
     public PCollection<Row> buildIOReader(
         PBegin begin, BeamSqlTableFilter filters, List<String> fieldNames) {
       PCollection<Row> withAllFields = buildIOReader(begin);
-      if (fieldNames.isEmpty() && filters instanceof DefaultTableFilter) {
+      if (options == PushDownOptions.NONE) { // needed for testing purposes
         return withAllFields;
       }

Review comment:
In this function you should throw an exception if the filter is set but filter push-down isn't enabled. The same applies to project push-down.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

    Worklog Id: (was: 328728)
    Time Spent: 4h 40m (was: 4.5h)

> Add project push-down capability to IO APIs
> ---
>
> Key: BEAM-8365
> URL: https://issues.apache.org/jira/browse/BEAM-8365
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql
> Reporter: Kirill Kozlov
> Assignee: Kirill Kozlov
> Priority: Major
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> * InMemoryTable should implement the following method:
> {code:java}
> public PCollection<Row> buildIOReader(
>     PBegin begin, BeamSqlTableFilter filters, List<String> fieldNames);{code}
> which should return a `PCollection<Row>` with the fields specified in the `fieldNames` list.
> * Create a rule to push fields used by a Calc (in projects and in a condition) down into TestTable IO.
> * Update that same Calc (from the previous step) to have proper input and output schemas, removing unused fields.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
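The review comment above asks the reader to fail loudly when it receives a filter or a field list that the configured push-down mode does not support. A minimal sketch of such a guard follows. Only `PushDownOptions.NONE` appears in the quoted diff; the other enum constants, the `supports*` helpers, the class name `PushDownValidationSketch`, and the boolean `filterIsSet` (standing in for "the filter is not a `DefaultTableFilter`") are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical guard, modeled loosely on the review suggestion; not Beam code.
public class PushDownValidationSketch {

  // Only NONE is confirmed by the quoted diff; the rest are assumed.
  public enum PushDownOptions {
    NONE,
    PROJECT,
    FILTER,
    BOTH;

    boolean supportsProject() {
      return this == PROJECT || this == BOTH;
    }

    boolean supportsFilter() {
      return this == FILTER || this == BOTH;
    }
  }

  /** Throws if the caller pushed down something the table was not configured to accept. */
  public static void validate(
      PushDownOptions options, boolean filterIsSet, List<String> fieldNames) {
    if (filterIsSet && !options.supportsFilter()) {
      throw new UnsupportedOperationException(
          "Filter push-down is not enabled, but a filter was provided");
    }
    if (!fieldNames.isEmpty() && !options.supportsProject()) {
      throw new UnsupportedOperationException(
          "Project push-down is not enabled, but a field list was provided: " + fieldNames);
    }
  }

  public static void main(String[] args) {
    // Accepted: projection requested and projection enabled.
    validate(PushDownOptions.PROJECT, false, Arrays.asList("id", "name"));
    // Accepted: nothing pushed down.
    validate(PushDownOptions.NONE, false, Collections.<String>emptyList());
    try {
      validate(PushDownOptions.NONE, false, Collections.singletonList("id"));
    } catch (UnsupportedOperationException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

Throwing here, rather than silently returning all fields, makes a planner bug visible in tests instead of hiding it behind a result that merely looks correct.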
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328725&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328725 ]

ASF GitHub Bot logged work on BEAM-8365:

    Author: ASF GitHub Bot
    Created on: 15/Oct/19 18:26
    Start Date: 15/Oct/19 18:26
    Worklog Time Spent: 10m
    Work Description: apilloud commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider
    URL: https://github.com/apache/beam/pull/9764#discussion_r335092961

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamIOSourceRel.java ##
@@ -109,10 +137,11 @@ public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
   @Override
   public BeamCostModel beamComputeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
     NodeStats estimates = BeamSqlRelUtils.getNodeStats(this, mq);
-    return BeamCostModel.FACTORY.makeCost(estimates.getRowCount(), estimates.getRate());
+    return BeamCostModel.FACTORY.makeCost(
+        estimates.getRowCount() * getRowType().getFieldCount(), estimates.getRate());
   }

Review comment:
This is a good question. This needs to apply to the rate as well. I think this should probably be passed into `makeCost` as `dIo` and factored into the cost model. See here:
https://github.com/apache/beam/blob/031b3789c4191bc82d0e97f4cabd0ccbee6c9902/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/planner/BeamCostModel.java#L121

If just multiplying it into the sum in `getCostCombination` passes all the tests, that is probably good enough for now. I would expect we actually want it to be a smaller factor, but we can figure that out later.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

    Worklog Id: (was: 328725)
    Time Spent: 4h 40m (was: 4.5h)

> Add project push-down capability to IO APIs
> ---
>
> Key: BEAM-8365
> URL: https://issues.apache.org/jira/browse/BEAM-8365
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql
> Reporter: Kirill Kozlov
> Assignee: Kirill Kozlov
> Priority: Major
> Time Spent: 4h 40m
> Remaining Estimate: 0h
>
> * InMemoryTable should implement the following method:
> {code:java}
> public PCollection<Row> buildIOReader(
>     PBegin begin, BeamSqlTableFilter filters, List<String> fieldNames);{code}
> which should return a `PCollection<Row>` with the fields specified in the `fieldNames` list.
> * Create a rule to push fields used by a Calc (in projects and in a condition) down into TestTable IO.
> * Update that same Calc (from the previous step) to have proper input and output schemas, removing unused fields.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
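To make the cost discussion above concrete, the sketch below multiplies the number of fields read into a combined cost, so that it scales both the row-count term and the rate term, as the reviewer suggests. This is not `BeamCostModel`: the class `FieldCountCostSketch`, the method `costCombination`, and the constant `RATE_WEIGHT` are hypothetical, and the real combination formula may differ.

```java
// Hypothetical cost combination; the actual BeamCostModel formula is not reproduced here.
public class FieldCountCostSketch {

  // Assumed weight that converts a per-second rate into the same unit as a row count.
  static final double RATE_WEIGHT = 3600.0;

  /**
   * Combines row count and rate into a single number and scales the sum by the
   * number of fields read, so that reading fewer fields yields a cheaper plan.
   */
  public static double costCombination(double rowCount, double rate, int fieldCount) {
    return (rowCount + rate * RATE_WEIGHT) * fieldCount;
  }

  public static void main(String[] args) {
    double fullScan = costCombination(1000.0, 0.0, 10);
    double projected = costCombination(1000.0, 0.0, 2);
    // The projected read should be cheaper, which is what lets the planner prefer it.
    System.out.println(projected < fullScan);
  }
}
```

A linear factor treats every field as equally expensive; as the reviewer notes, a smaller factor may be more realistic, and that tuning is left open here.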
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328710&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328710 ]

ASF GitHub Bot logged work on BEAM-8365:

    Author: ASF GitHub Bot
    Created on: 15/Oct/19 18:00
    Start Date: 15/Oct/19 18:00
    Worklog Time Spent: 10m
    Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider
    URL: https://github.com/apache/beam/pull/9764#discussion_r335097020

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamIOSourceRel.java ##
@@ -94,7 +116,19 @@ public NodeStats estimateNodeStats(RelMetadataQuery mq) {
         "Should not have received input for %s: %s",
         BeamIOSourceRel.class.getSimpleName(),
         input);
-    return beamTable.buildIOReader(input.getPipeline().begin());
+
+    PBegin begin = input.getPipeline().begin();
+    BeamSqlTableFilter filters = beamTable.constructFilter(ImmutableList.of());
+
+    if (usedFields.isEmpty() && filters instanceof DefaultTableFilter) {
+      return beamTable.buildIOReader(begin);
+    }
+
+    Schema newBeamSchema = CalciteUtils.toSchema(getRowType());
+
+    return beamTable
+        .buildIOReader(input.getPipeline().begin(), filters, usedFields)

Review comment:
Nice catch! Updated to use the local variable `begin`.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---

    Worklog Id: (was: 328710)
    Time Spent: 4.5h (was: 4h 20m)

> Add project push-down capability to IO APIs
> ---
>
> Key: BEAM-8365
> URL: https://issues.apache.org/jira/browse/BEAM-8365
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql
> Reporter: Kirill Kozlov
> Assignee: Kirill Kozlov
> Priority: Major
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
> * InMemoryTable should implement the following method:
> {code:java}
> public PCollection<Row> buildIOReader(
>     PBegin begin, BeamSqlTableFilter filters, List<String> fieldNames);{code}
> which should return a `PCollection<Row>` with the fields specified in the `fieldNames` list.
> * Create a rule to push fields used by a Calc (in projects and in a condition) down into TestTable IO.
> * Update that same Calc (from the previous step) to have proper input and output schemas, removing unused fields.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328699=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328699 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 15/Oct/19 17:37
Start Date: 15/Oct/19 17:37
Worklog Time Spent: 10m
Work Description: apilloud commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#discussion_r335086826

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamIOSourceRel.java

@@ -94,7 +116,19 @@ public NodeStats estimateNodeStats(RelMetadataQuery mq) {
           "Should not have received input for %s: %s",
           BeamIOSourceRel.class.getSimpleName(),
           input);
-      return beamTable.buildIOReader(input.getPipeline().begin());
+
+      PBegin begin = input.getPipeline().begin();
+      BeamSqlTableFilter filters = beamTable.constructFilter(ImmutableList.of());
+
+      if (usedFields.isEmpty() && filters instanceof DefaultTableFilter) {
+        return beamTable.buildIOReader(begin);
+      }
+
+      Schema newBeamSchema = CalciteUtils.toSchema(getRowType());
+
+      return beamTable
+          .buildIOReader(input.getPipeline().begin(), filters, usedFields)

Review comment:
Replace `input.getPipeline().begin()` with `begin` (or drop the variable).

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328699) Time Spent: 4h 20m (was: 4h 10m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 4h 20m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328176=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328176 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 14/Oct/19 22:04
Start Date: 14/Oct/19 22:04
Worklog Time Spent: 10m
Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#discussion_r334680521

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamIOSourceRel.java

@@ -109,10 +137,11 @@ public RelOptCost computeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
   @Override
   public BeamCostModel beamComputeSelfCost(RelOptPlanner planner, RelMetadataQuery mq) {
     NodeStats estimates = BeamSqlRelUtils.getNodeStats(this, mq);
-    return BeamCostModel.FACTORY.makeCost(estimates.getRowCount(), estimates.getRate());
+    return BeamCostModel.FACTORY.makeCost(
+        estimates.getRowCount() * getRowType().getFieldCount(), estimates.getRate());
   }

Review comment:
Is this the most optimal way to ensure that the table with projects pushed-down is favored?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328176) Time Spent: 4h 10m (was: 4h) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 4h 10m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
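As a rough illustration of the cost-model change questioned in the review comment above (this is not Beam's implementation), the sketch below shows why scaling the row count by the number of projected fields makes a scan that carries fewer columns look cheaper to a planner that prefers the lowest cost. The class `ProjectionCostSketch` and the method `scanCost` are hypothetical names; only the multiplication mirrors the argument passed to `makeCost` in the diff.

```java
/** Hypothetical illustration only; not part of Beam's BeamCostModel. */
public class ProjectionCostSketch {

  /** Cost proxy: rows read multiplied by the number of fields carried per row. */
  public static double scanCost(double rowCount, int fieldCount) {
    return rowCount * fieldCount;
  }

  public static void main(String[] args) {
    double rows = 1_000.0;
    double fullScan = scanCost(rows, 5);      // every column of a five-column table
    double projectedScan = scanCost(rows, 2); // only the two columns the query uses
    System.out.println("full scan cost:       " + fullScan);
    System.out.println("projected scan cost:  " + projectedScan);
    System.out.println("projected is cheaper: " + (projectedScan < fullScan));
  }
}
```

One property of this simple proxy is that a scan projecting zero fields gets a cost of zero regardless of row count, which may be one reason to ask whether the multiplication is the best way to favor push-down.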
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328162=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328162 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 14/Oct/19 21:24 Start Date: 14/Oct/19 21:24 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9764: [BEAM-8365] [WIP] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#issuecomment-541930227 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328162) Time Spent: 4h (was: 3h 50m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 4h > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328161=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328161 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 14/Oct/19 21:23 Start Date: 14/Oct/19 21:23 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9764: [BEAM-8365] [WIP] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#issuecomment-541930227 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328161) Time Spent: 3h 50m (was: 3h 40m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 3h 50m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328106=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328106 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 14/Oct/19 20:15
Start Date: 14/Oct/19 20:15
Worklog Time Spent: 10m
Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] [WIP] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#discussion_r334645199

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/test/TestTableProvider.java

@@ -150,12 +165,40 @@ public BeamTableStatistics getTableStatistics(PipelineOptions options) {
       return begin.apply(Create.of(tableWithRows.rows).withCoder(rowCoder()));
     }

+    @Override
+    public PCollection<Row> buildIOReader(
+        PBegin begin, BeamSqlTableFilter filters, List<String> fieldNames) {
+      PCollection<Row> withAllFields = buildIOReader(begin);
+      if (options == PushDownOptions.NONE
+          || (fieldNames.isEmpty() && filters instanceof DefaultTableFilter)) {
+        return withAllFields;
+      }

Review comment:
Moved the default case to `Transform#expand`.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328106) Time Spent: 3h 40m (was: 3.5h) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 3h 40m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
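The behaviour described for `buildIOReader(begin, filters, fieldNames)` (return all fields when nothing is pushed down, otherwise only the fields named in `fieldNames`) can be sketched without any Beam dependency. The example below is a minimal stand-in under stated assumptions: plain `List<Map<String, Object>>` replaces `PCollection<Row>`, filters are ignored, and the class `ProjectPushDownSketch` and method `read` are hypothetical names that do not exist in Beam.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a table reader; plain collections replace PCollection<Row>. */
public class ProjectPushDownSketch {

  /**
   * Returns rows restricted to fieldNames, in the requested order.
   * An empty fieldNames list means no projection was pushed down, so rows come back unchanged.
   */
  public static List<Map<String, Object>> read(
      List<Map<String, Object>> rows, List<String> fieldNames) {
    if (fieldNames.isEmpty()) {
      return rows; // default path: every field is returned
    }
    List<Map<String, Object>> projected = new ArrayList<>();
    for (Map<String, Object> row : rows) {
      Map<String, Object> kept = new LinkedHashMap<>();
      for (String name : fieldNames) {
        if (!row.containsKey(name)) {
          throw new IllegalArgumentException("Unknown field: " + name);
        }
        kept.put(name, row.get(name));
      }
      projected.add(kept);
    }
    return projected;
  }

  public static void main(String[] args) {
    Map<String, Object> row = new LinkedHashMap<>();
    row.put("id", 1);
    row.put("name", "a");
    row.put("unused", 3.5);
    List<Map<String, Object>> rows = Collections.singletonList(row);

    // Projection pushed down: only the requested fields survive, in request order.
    System.out.println(read(rows, Arrays.asList("name", "id")));
    // Nothing pushed down: the row is returned with all of its fields.
    System.out.println(read(rows, Collections.<String>emptyList()));
  }
}
```

Keeping the "nothing to push down" check in one place, as the review thread suggests doing in `Transform#expand`, means each table implementation only has to handle the projecting path.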
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328008=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328008 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 14/Oct/19 17:46 Start Date: 14/Oct/19 17:46 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9764: [BEAM-8365] [WIP] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#issuecomment-541820040 Run JavaBeamZetaSQL PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328008) Time Spent: 3h 20m (was: 3h 10m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 3h 20m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=328010=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-328010 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 14/Oct/19 17:46 Start Date: 14/Oct/19 17:46 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9764: [BEAM-8365] [WIP] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#issuecomment-541820040 Run JavaBeamZetaSQL PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 328010) Time Spent: 3.5h (was: 3h 20m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 3.5h > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=327988=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327988 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 14/Oct/19 17:28 Start Date: 14/Oct/19 17:28 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9764: [BEAM-8365] [WIP] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#issuecomment-541810496 Run JavaBeamZetaSQL PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327988) Time Spent: 3h 10m (was: 3h) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 3h 10m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=327987=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327987 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 14/Oct/19 17:27 Start Date: 14/Oct/19 17:27 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9764: [BEAM-8365] [WIP] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#issuecomment-541810496 Run JavaBeamZetaSQL PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327987) Time Spent: 3h (was: 2h 50m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 3h > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=327981=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327981 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 14/Oct/19 17:08 Start Date: 14/Oct/19 17:08 Worklog Time Spent: 10m Work Description: apilloud commented on pull request #9743: [BEAM-8365] Project push-down for test table provider URL: https://github.com/apache/beam/pull/9743 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327981) Time Spent: 2h 50m (was: 2h 40m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 2h 50m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=327980=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327980 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 14/Oct/19 17:07
Start Date: 14/Oct/19 17:07
Worklog Time Spent: 10m
Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] [WIP] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#discussion_r334568107

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/test/TestTableProvider.java

@@ -150,12 +165,40 @@ public BeamTableStatistics getTableStatistics(PipelineOptions options) {
       return begin.apply(Create.of(tableWithRows.rows).withCoder(rowCoder()));
     }

+    @Override
+    public PCollection<Row> buildIOReader(
+        PBegin begin, BeamSqlTableFilter filters, List<String> fieldNames) {
+      PCollection<Row> withAllFields = buildIOReader(begin);
+      if (options == PushDownOptions.NONE
+          || (fieldNames.isEmpty() && filters instanceof DefaultTableFilter)) {
+        return withAllFields;
+      }

Review comment:
Part of this `if` statement can be moved to `BeamIOSourceRel`, more specifically into the `Transform#expand` method.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327980) Time Spent: 2h 40m (was: 2.5h) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 2h 40m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=327963=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327963 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 14/Oct/19 16:40
Start Date: 14/Oct/19 16:40
Worklog Time Spent: 10m
Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] [WIP] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764#discussion_r334568107

## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/test/TestTableProvider.java

@@ -150,12 +165,40 @@ public BeamTableStatistics getTableStatistics(PipelineOptions options) {
       return begin.apply(Create.of(tableWithRows.rows).withCoder(rowCoder()));
     }

+    @Override
+    public PCollection<Row> buildIOReader(
+        PBegin begin, BeamSqlTableFilter filters, List<String> fieldNames) {
+      PCollection<Row> withAllFields = buildIOReader(begin);
+      if (options == PushDownOptions.NONE
+          || (fieldNames.isEmpty() && filters instanceof DefaultTableFilter)) {
+        return withAllFields;
+      }

Review comment:
Part of this `if` statement can be moved to `BeamIOSourceRel`, more specifically to the `Transform#expand` method.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327963) Time Spent: 2.5h (was: 2h 20m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 2.5h > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=327220=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327220 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 12/Oct/19 06:23 Start Date: 12/Oct/19 06:23 Worklog Time Spent: 10m Work Description: amaliujia commented on issue #9743: [BEAM-8365] Project push-down for test table provider URL: https://github.com/apache/beam/pull/9743#issuecomment-541291027 I don't have further comment. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327220) Time Spent: 2h 20m (was: 2h 10m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 2h 20m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=327114=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327114 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 11/Oct/19 22:46 Start Date: 11/Oct/19 22:46 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9764: [BEAM-8365] [WIP] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#issuecomment-541248215 R: @apilloud This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327114) Time Spent: 2h 10m (was: 2h) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 2h 10m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=327054=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327054 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 11/Oct/19 20:19 Start Date: 11/Oct/19 20:19 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9764: [BEAM-8365] [WIP] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#issuecomment-541210027 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327054) Time Spent: 1h 50m (was: 1h 40m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 1h 50m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=327055=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327055 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 11/Oct/19 20:19 Start Date: 11/Oct/19 20:19 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9764: [BEAM-8365] [WIP] Project push-down for TestTableProvider URL: https://github.com/apache/beam/pull/9764#issuecomment-541210027 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 327055) Time Spent: 2h (was: 1h 50m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 2h > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=326640=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326640 ]

ASF GitHub Bot logged work on BEAM-8365:

Author: ASF GitHub Bot
Created on: 10/Oct/19 23:48
Start Date: 10/Oct/19 23:48
Worklog Time Spent: 10m
Work Description: 11moon11 commented on pull request #9764: [BEAM-8365] [WIP] Project push-down for TestTableProvider
URL: https://github.com/apache/beam/pull/9764

1. Added an option to `InMemoryTable` that selects what to push down: none (default), projects, filters, or both. Its purpose is to simplify unit testing.
2. Created a rule to push the fields used by a `Calc` (in projects and in a condition) down into the `InMemoryTable` IO.
3. Updated that same `Calc` (from the previous step) to have proper input and output schemas, with unused fields removed.
   - Remove the `Calc` completely when it only renames fields, and update the `RowType` of the `IOSourceRel`.
4. Updated the cost model to favor IO with projects pushed down.
   - Right now this is accomplished by multiplying the row count by the number of projected fields.

Still needs to be done:
1. Refactoring (currently in progress).
2. Add JavaDoc comments.
3. Potentially add more tests (e.g. `select id+1`).
4. Break this PR into two or more (it is currently pretty large).

Based on top of #9743.

Design doc [link](https://docs.google.com/document/d/1-ysD7U7qF3MAmSfkbXZO_5PLJBevAL9bktlLCerd_jE/edit?usp=sharing).

Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:

- [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
- [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
- [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). Post-Commit Tests Status (on master branch) Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark --- | --- | --- | --- | --- | --- | --- | --- Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)[![Build 
Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build
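The cost-model change in item 4 of the pull request description above can be illustrated with a minimal sketch. The class and method names here are hypothetical and are not Beam or Calcite APIs; the sketch only shows why scaling the row count by the number of projected fields makes a scan that reads fewer columns look cheaper to a planner.

```java
// Hypothetical sketch; none of these names come from Beam or Calcite.
final class ProjectedScanCost {
  private ProjectedScanCost() {}

  /** Returns a cost estimate that grows with the number of fields the scan reads. */
  static double estimate(double rowCount, int projectedFieldCount) {
    if (rowCount < 0 || projectedFieldCount < 0) {
      throw new IllegalArgumentException(
          "rowCount and projectedFieldCount must be non-negative");
    }
    // Same row count, fewer projected fields => lower estimated cost.
    return rowCount * projectedFieldCount;
  }
}
```

With this estimate, a scan of 1000 rows that reads 2 fields costs less than the same scan reading 5 fields, so a planner comparing the two alternatives would prefer the one with projects pushed down.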
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=326498=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326498 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 10/Oct/19 18:41 Start Date: 10/Oct/19 18:41 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9743: [BEAM-8365] Project push-down for test table provider URL: https://github.com/apache/beam/pull/9743#issuecomment-540719113 Run Java PreCommit This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 326498) Time Spent: 1.5h (was: 1h 20m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 1.5h > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=326470=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326470 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 10/Oct/19 17:30 Start Date: 10/Oct/19 17:30 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9743: [BEAM-8365] Project push-down for test table provider URL: https://github.com/apache/beam/pull/9743#issuecomment-540690358 > You probably want a few more test cases here: > > 1. Empty list. > 2. All the columns. > 3. Duplicate columns. > 4. Invalid columns. > > Otherwise LGTM Added tests for 1-3. Passing invalid columns breaks things and is hard to test. A list of selected columns should be generated by the rule and passed to the table provider via BeamIOSourceRel. It should never get into an invalid state, where false column names are being extracted. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 326470) Time Spent: 1h 20m (was: 1h 10m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Major > Time Spent: 1h 20m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO.
> * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
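The test cases discussed in the comment above (empty list, all columns, duplicate columns, invalid columns) can be sketched against a simplified projection helper. This is a sketch under stated assumptions: a row is modeled as a plain `Map` rather than a Beam `Row`, `ProjectionSketch` is a hypothetical name, and an empty field list is treated as "no fields", which may differ from what the actual PR does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in: a row is a map from field name to value, not a Beam Row.
final class ProjectionSketch {
  private ProjectionSketch() {}

  /** Keeps only the requested fields, in the requested order (duplicates allowed). */
  static List<Object> project(Map<String, Object> row, List<String> fieldNames) {
    List<Object> projected = new ArrayList<>();
    for (String name : fieldNames) {
      if (!row.containsKey(name)) {
        // The thread argues the planner rule should never produce this state.
        throw new IllegalArgumentException("Unknown field: " + name);
      }
      projected.add(row.get(name));
    }
    return projected;
  }
}
```

In this sketch an invalid column fails fast with an exception, which is one way to make case 4 testable; the comment above instead relies on the rule never generating such a list.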
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=326446=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-326446 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 10/Oct/19 16:54 Start Date: 10/Oct/19 16:54 Worklog Time Spent: 10m Work Description: apilloud commented on issue #9743: [BEAM-8365] Project push-down for test table provider URL: https://github.com/apache/beam/pull/9743#issuecomment-540677027 @amaliujia You marked as changes required. Can you take another look? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 326446) Time Spent: 1h 10m (was: 1h) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=325976=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325976 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 09/Oct/19 22:52 Start Date: 09/Oct/19 22:52 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9743: [BEAM-8365] Project push-down for test table provider URL: https://github.com/apache/beam/pull/9743#discussion_r332278680 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/test/TestTableProvider.java ## @@ -90,7 +94,38 @@ public void dropTable(String tableName) { @Override public synchronized BeamSqlTable buildBeamSqlTable(Table table) { -return new InMemoryTable(tables().get(table.getName())); +BeamSqlTable beamSqlTable; + +if (table.getProperties().containsKey(PUSH_DOWN)) { Review comment: > It seems like table should have a property with if it supports PUSH_DOWN or not? Why it can not be a filterable interface detection and has a BeamIOFilterableSource? The reason I decided to make it configurable with a property is to not break existing code while working on push-down. I have updated the code to utilize the existing `InMemoryTable` class. I am not sure if interface detection would still be applicable here. Could you please take another look at the updated file? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 325976) Time Spent: 1h (was: 50m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Minor > Time Spent: 1h > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Add a property "push_down" to TestTableProvider, which should allow user > to select version of InMemoryProvider to use: with project push-down, with > predicate push-down (will be implemented later), both (will be implemented > later), or without (default). > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
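The property-based configuration described above (a "push_down" property with project, predicate, both, or no push-down as the default) could be modeled roughly as follows. This is a minimal sketch: the enum, its constant names, and the use of a `Map<String, String>` for table properties are assumptions for illustration, not the types used in the PR.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical names; the real TestTableProvider may model this differently.
enum PushDownMode {
  NONE,
  PROJECT,
  FILTER,
  BOTH;

  static final String PROPERTY = "push_down";

  /** Reads the mode from table properties, defaulting to NONE when the key is absent. */
  static PushDownMode fromProperties(Map<String, String> properties) {
    String value = properties.get(PROPERTY);
    if (value == null) {
      // Existing tables without the property keep their current behavior.
      return NONE;
    }
    // valueOf throws IllegalArgumentException for unrecognized values.
    return valueOf(value.toUpperCase(Locale.ROOT));
  }
}
```

Defaulting to `NONE` reflects the stated goal of not breaking existing code while push-down is still being developed.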
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=325929=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325929 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 09/Oct/19 21:17 Start Date: 09/Oct/19 21:17 Worklog Time Spent: 10m Work Description: apilloud commented on pull request #9743: [BEAM-8365] Project push-down for test table provider URL: https://github.com/apache/beam/pull/9743#discussion_r333237487 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/test/TestTableProvider.java ## @@ -150,12 +153,30 @@ public BeamTableStatistics getTableStatistics(PipelineOptions options) { return begin.apply(Create.of(tableWithRows.rows).withCoder(rowCoder())); } +@Override +public PCollection buildIOReader( +PBegin begin, BeamSqlTableFilter filters, List fieldNames) { + Preconditions.checkNotNull(fieldNames); Review comment: In `BaseBeamTable` it is acceptable for fieldNames to be null. You should be consistent in what you accept. Should that interface also be checking that these are not null? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 325929) Time Spent: 50m (was: 40m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Add a property "push_down" to TestTableProvider, which should allow user > to select version of InMemoryProvider to use: with project push-down, with > predicate push-down (will be implemented later), both (will be implemented > later), or without (default). > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
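The review comment above asks for consistent handling of a null `fieldNames` across implementations. One possible way to get that consistency, shown here only as an assumption and not as what the PR ended up doing, is to route every implementation through a single normalizing helper. `FieldNames.orEmpty` is a hypothetical name.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper; not part of Beam.
final class FieldNames {
  private FieldNames() {}

  /** Treats a null field list the same as an empty one, so callers see one contract. */
  static List<String> orEmpty(List<String> fieldNames) {
    return fieldNames == null ? Collections.<String>emptyList() : fieldNames;
  }
}
```

The alternative is to reject null everywhere (for example with a precondition check in the shared base class); either choice works as long as all implementations agree.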
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=325930=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-325930 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 09/Oct/19 21:17 Start Date: 09/Oct/19 21:17 Worklog Time Spent: 10m Work Description: apilloud commented on pull request #9743: [BEAM-8365] Project push-down for test table provider URL: https://github.com/apache/beam/pull/9743#discussion_r333236734 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/test/TestTableProvider.java ## @@ -150,12 +153,30 @@ public BeamTableStatistics getTableStatistics(PipelineOptions options) { return begin.apply(Create.of(tableWithRows.rows).withCoder(rowCoder())); } +@Override +public PCollection buildIOReader( +PBegin begin, BeamSqlTableFilter filters, List fieldNames) { Review comment: It would be good to add an assert that `filters instanceof DefaultTableFilter` here as well. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 325930) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Minor > Time Spent: 50m > Remaining Estimate: 0h > > * InMemoryTable should implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list.
> * Add a property "push_down" to TestTableProvider, which should allow user > to select version of InMemoryProvider to use: with project push-down, with > predicate push-down (will be implemented later), both (will be implemented > later), or without (default). > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
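The guard suggested in the review comment above (checking that `filters instanceof DefaultTableFilter` before reading) might look roughly like this. The types below are hypothetical stand-ins for `BeamSqlTableFilter` and `DefaultTableFilter` so the sketch compiles on its own. It uses an explicit check rather than the `assert` keyword, because Java assertions are disabled unless the JVM is started with `-ea`.

```java
// Hypothetical stand-ins for BeamSqlTableFilter and DefaultTableFilter.
interface TableFilter {}

final class DefaultFilter implements TableFilter {}

final class FilterGuard {
  private FilterGuard() {}

  /** Fails fast when a reader without predicate push-down receives a real filter. */
  static void requireDefaultFilter(TableFilter filters) {
    if (!(filters instanceof DefaultFilter)) {
      String actual = filters == null ? "null" : filters.getClass().getName();
      throw new IllegalArgumentException(
          "Predicate push-down is not supported by this reader, got: " + actual);
    }
  }
}
```

A reader that only supports project push-down could call `FilterGuard.requireDefaultFilter(filters)` as its first statement, so a misconfigured rule surfaces as an immediate error instead of silently ignored predicates.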
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=324689=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324689 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 07/Oct/19 23:20 Start Date: 07/Oct/19 23:20 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9743: [BEAM-8365] Project push-down for test table provider URL: https://github.com/apache/beam/pull/9743#discussion_r332278680 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/test/TestTableProvider.java ## @@ -90,7 +94,38 @@ public void dropTable(String tableName) { @Override public synchronized BeamSqlTable buildBeamSqlTable(Table table) { -return new InMemoryTable(tables().get(table.getName())); +BeamSqlTable beamSqlTable; + +if (table.getProperties().containsKey(PUSH_DOWN)) { Review comment: The reason I decided to make it configurable with a property is to not break existing code while working on push-down. I have updated the code to utilize the existing `InMemoryTable` class. I am not sure if interface detection would still be applicable here. Could you please take another look at the updated file? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 324689) Time Spent: 40m (was: 0.5h) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Minor > Time Spent: 40m > Remaining Estimate: 0h > > * Create a class extending InMemoryTable and implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Add a property "push_down" to TestTableProvider, which should allow user > to select version of InMemoryProvider to use: with project push-down, with > predicate push-down (will be implemented later), both (will be implemented > later), or without (default). > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=324672=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324672 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 07/Oct/19 22:37 Start Date: 07/Oct/19 22:37 Worklog Time Spent: 10m Work Description: amaliujia commented on pull request #9743: [BEAM-8365] Project push-down for test table provider URL: https://github.com/apache/beam/pull/9743#discussion_r332267954 ## File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/test/TestTableProvider.java ## @@ -90,7 +94,38 @@ public void dropTable(String tableName) { @Override public synchronized BeamSqlTable buildBeamSqlTable(Table table) { -return new InMemoryTable(tables().get(table.getName())); +BeamSqlTable beamSqlTable; + +if (table.getProperties().containsKey(PUSH_DOWN)) { Review comment: It seems like table should have a property with if it supports PUSH_DOWN or not? Why it can not be a filterable interface detection and has a BeamIOFilterableSource? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 324672) Time Spent: 0.5h (was: 20m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Minor > Time Spent: 0.5h > Remaining Estimate: 0h > > * Create a class extending InMemoryTable and implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. 
> * Add a property "push_down" to TestTableProvider, which should allow user > to select version of InMemoryProvider to use: with project push-down, with > predicate push-down (will be implemented later), both (will be implemented > later), or without (default). > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=324626=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324626 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 07/Oct/19 21:38 Start Date: 07/Oct/19 21:38 Worklog Time Spent: 10m Work Description: 11moon11 commented on pull request #9743: [BEAM-8365] Project push-down for test table provider URL: https://github.com/apache/beam/pull/9743 - Create a class extending InMemoryTable. - Add a property "push_down" to TestTableProvider, which should allow user to select version of InMemoryProvider to use: with project push-down, with predicate push-down (will be implemented later), both (will be implemented later), or none (default). Based on top of #9731 Design doc [link](https://docs.google.com/document/d/1-ysD7U7qF3MAmSfkbXZO_5PLJBevAL9bktlLCerd_jE/edit?usp=sharing). Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily: - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`). - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf). 
[jira] [Work logged] (BEAM-8365) Add project push-down capability to IO APIs
[ https://issues.apache.org/jira/browse/BEAM-8365?focusedWorklogId=324628=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-324628 ] ASF GitHub Bot logged work on BEAM-8365: Author: ASF GitHub Bot Created on: 07/Oct/19 21:38 Start Date: 07/Oct/19 21:38 Worklog Time Spent: 10m Work Description: 11moon11 commented on issue #9743: [BEAM-8365] Project push-down for test table provider URL: https://github.com/apache/beam/pull/9743#issuecomment-539215707 R: @apilloud This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 324628) Time Spent: 20m (was: 10m) > Add project push-down capability to IO APIs > --- > > Key: BEAM-8365 > URL: https://issues.apache.org/jira/browse/BEAM-8365 > Project: Beam > Issue Type: New Feature > Components: dsl-sql >Reporter: Kirill Kozlov >Assignee: Kirill Kozlov >Priority: Minor > Time Spent: 20m > Remaining Estimate: 0h > > * Create a class extending InMemoryTable and implement a following method: > {code:java} > public PCollection buildIOReader( > PBegin begin, BeamSqlTableFilter filters, List fieldNames);{code} > Which should return a `PCollection` with fields specified in `fieldNames` > list. > * Add a property "push_down" to TestTableProvider, which should allow user > to select version of InMemoryProvider to use: with project push-down, with > predicate push-down (will be implemented later), both (will be implemented > later), or without (default). > * Create a rule to push fields used by a Calc (in projects and in a > condition) down into TestTable IO. > * Updating that same Calc (from previous step) to have a proper input and > output schemes, remove unused fields. -- This message was sent by Atlassian Jira (v8.3.4#803005)