Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/6351#issuecomment-105712920
That is not fundamentally a problem. Honestly, some more thought probably
needs to be put into the batches. Really, the only reasons for splitting
rules into separate batches are the following:
- Large batches are inherently more costly, as you must go through every rule on each iteration, even if only a small number are making changes. So if rules will never interact, they can be in separate batches.
- However, large batches are more powerful, as there is more opportunity for rules to interact.
- It's possible for rules to undo the result of other rules. In this case they *must* be in separate batches, or the plan will go back and forth until the iteration limit is reached.
- Another reason for batches is satisfying preconditions (e.g. a plan must be analyzed before it is optimized).
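To make the trade-offs above concrete, here is a minimal sketch of a batched rule executor. This is not Spark's actual `RuleExecutor`; the names `Batch`, `Rule`, and `execute` are hypothetical, but the loop shows why every rule in a batch is visited on each pass, and where the iteration limit stops rules that undo each other.

```scala
object BatchSketch {
  // A rule rewrites a "plan" of type P (for Catalyst, P would be a query plan).
  type Rule[P] = P => P

  // A batch runs its rules repeatedly until the plan stops changing
  // or maxIterations is hit (the back-and-forth limit mentioned above).
  case class Batch[P](name: String, maxIterations: Int, rules: Rule[P]*)

  def execute[P](batches: Seq[Batch[P]], plan: P): P =
    batches.foldLeft(plan) { (batchInput, batch) =>
      var current = batchInput
      var iteration = 0
      var changed = true
      while (changed && iteration < batch.maxIterations) {
        // Every rule in the batch runs on every iteration,
        // even if only a few of them actually change the plan.
        val next = batch.rules.foldLeft(current)((p, rule) => rule(p))
        changed = next != current
        current = next
        iteration += 1
      }
      current
    }

  def main(args: Array[String]): Unit = {
    // Two toy rules over Int "plans": double odd values, then cap at 100.
    val doubleOdd: Rule[Int] = p => if (p % 2 == 1) p * 2 else p
    val cap: Rule[Int] = p => math.min(p, 100)
    val result = execute(Seq(Batch("toy", 10, doubleOdd, cap)), 7)
    println(result) // 14: 7 is doubled once, then the batch reaches a fixed point
  }
}
```

If `doubleOdd` were paired with a rule that halved even values, the two would undo each other and the loop would only stop at `maxIterations`, which is why such rules must go in separate batches.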