[ 
https://issues.apache.org/jira/browse/BEAM-7545?focusedWorklogId=276964&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-276964
 ]

ASF GitHub Bot logged work on BEAM-7545:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Jul/19 20:18
            Start Date: 15/Jul/19 20:18
    Worklog Time Spent: 10m 
      Work Description: akedin commented on pull request #9040: [BEAM-7545] 
Reordering Beam Joins
URL: https://github.com/apache/beam/pull/9040#discussion_r303617190
 
 

 ##########
 File path: 
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/planner/BeamRuleSets.java
 ##########
 @@ -103,6 +106,10 @@
 
           // join rules
           JoinPushExpressionsRule.INSTANCE,
+          JoinCommuteRule.INSTANCE,
 
 Review comment:
   In general we should have tests. However in this case the inputs/outputs are 
not changing depending on the inputs. We should make sure that we have a couple 
of tests that we know trigger these rules but we cannot enforce it, i.e. I 
don't know how we can ensure that we even hit the rule by looking at the 
results. If we don't know what a rule breaks, can we come up with a meaningful 
test for it?
   
   I also don't think we should check the plans. It will be too brittle. Rules 
can be enabled and disabled independently and probably interact in 
hard-to-predict ways. If VolcanoPlanner (or in general Calcite stuff) is 
tested, then we should not re-test it and accept that it applies the rules 
correctly. We add some rules that are applied the same way as all other rules 
and inherit from them. In this case the rules behave the same way as built-in 
Calcite rules, and reject the plans that we already don't support in the join 
rel.
   
   It feels like the only issue can be is the applicability of the reordering 
operation itself. Looking at what join operations we support and how it works 
(e.g. window is inherited for non-merging windows, or the input has to be 
explicitly re-windowed for merging windows, and that we only allow the default 
trigger that has itself as a continuation trigger), it looks safe and from the 
top of the head I cannot come up with a test case that we accept but that would 
break with the rule / without the rule.
   
   Probably a unit test of the join rejection would make sense but I don't 
think it's  really valuable as we literally call the same code that we already 
do in the existing join rel and that probably already have tests (this should 
be confirmed) that trigger that logic.
   
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 276964)
    Time Spent: 7h 20m  (was: 7h 10m)

> Row Count Estimation for CSV TextTable
> --------------------------------------
>
>                 Key: BEAM-7545
>                 URL: https://issues.apache.org/jira/browse/BEAM-7545
>             Project: Beam
>          Issue Type: New Feature
>          Components: dsl-sql
>            Reporter: Alireza Samadianzakaria
>            Assignee: Alireza Samadianzakaria
>            Priority: Major
>             Fix For: Not applicable
>
>          Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Implementing Row Count Estimation for CSV Tables by reading the first few 
> lines of the file and estimating the number of records based on the length of 
> these lines and the total length of the file.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

Reply via email to