[
https://issues.apache.org/jira/browse/BEAM-7545?focusedWorklogId=276964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-276964
]
ASF GitHub Bot logged work on BEAM-7545:
----------------------------------------
Author: ASF GitHub Bot
Created on: 15/Jul/19 20:18
Start Date: 15/Jul/19 20:18
Worklog Time Spent: 10m
Work Description: akedin commented on pull request #9040: [BEAM-7545]
Reordering Beam Joins
URL: https://github.com/apache/beam/pull/9040#discussion_r303617190
##########
File path:
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/planner/BeamRuleSets.java
##########
@@ -103,6 +106,10 @@
// join rules
JoinPushExpressionsRule.INSTANCE,
+ JoinCommuteRule.INSTANCE,
Review comment:
In general we should have tests. However in this case the inputs/outputs are
not changing depending on the inputs. We should make sure that we have a couple
of tests that we know trigger these rules but we cannot enforce it, i.e. I
don't know how we can ensure that we even hit the rule by looking at the
results. If we don't know what a rule breaks, can we come up with a meaningful
test for it?
I also don't think we should check the plans. It will be too brittle. Rules
can be enabled and disabled independently and probably interact in
hard-to-predict ways. If VolcanoPlanner (or in general Calcite stuff) is
tested, then we should not re-test it and accept that it applies the rules
correctly. We add some rules that are applied the same way as all other rules
and inherit from them. In this case the rules behave the same way as built-in
Calcite rules, and reject the plans that we already don't support in the join
rel.
It feels like the only issue can be is the applicability of the reordering
operation itself. Looking at what join operations we support and how it works
(e.g. window is inherited for non-merging windows, or the input has to be
explicitly re-windowed for merging windows, and that we only allow the default
trigger that has itself as a continuation trigger), it looks safe and from the
top of the head I cannot come up with a test case that we accept but that would
break with the rule / without the rule.
Probably a unit test of the join rejection would make sense but I don't
think it's really valuable as we literally call the same code that we already
do in the existing join rel and that probably already have tests (this should
be confirmed) that trigger that logic.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 276964)
Time Spent: 7h 20m (was: 7h 10m)
> Row Count Estimation for CSV TextTable
> --------------------------------------
>
> Key: BEAM-7545
> URL: https://issues.apache.org/jira/browse/BEAM-7545
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql
> Reporter: Alireza Samadianzakaria
> Assignee: Alireza Samadianzakaria
> Priority: Major
> Fix For: Not applicable
>
> Time Spent: 7h 20m
> Remaining Estimate: 0h
>
> Implementing Row Count Estimation for CSV Tables by reading the first few
> lines of the file and estimating the number of records based on the length of
> these lines and the total length of the file.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)