Hi Julian,

Thanks for the reply.
I will have to think about that.

But as I understand the Spool operator, it is meant to factor out multiple 
calculations of the same intermediate result.
In our situation we aim more at collapsing "overlapping", but not equal, requests.

Consider 8 bits which physically form a byte.
If I read 8 BOOLEANs, I issue 8 different requests, each of which masks one bit 
and returns it (padded) as a byte. So: 8 requests and 8 bytes of data 
transferred (plus masking on the PLC).
If I instead optimized this to read the whole byte in one request and do the 
masking afterwards, I would have one request and only 1 byte transferred (and 
no masking on the PLC, which keeps the load there low).
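To make the byte-vs-bits trade concrete, here is a minimal sketch (plain Java with made-up names; this is not the PLC4X API) of deriving the eight BOOLEANs locally from a single byte read:

```java
import java.util.Arrays;

public class BitUnpack {
    // Extract a single bit from a byte that was fetched in ONE request,
    // instead of issuing eight separate one-bit requests to the PLC.
    static boolean bit(byte value, int index) {
        // The byte is sign-extended on promotion to int, but "& 1" after
        // the shift makes the result independent of the sign bits.
        return ((value >> index) & 1) != 0;
    }

    public static void main(String[] args) {
        byte raw = (byte) 0b1010_0110; // pretend this came from one byte-read request
        boolean[] bits = new boolean[8];
        for (int i = 0; i < 8; i++) {
            bits[i] = bit(raw, i); // bit 0 = least significant bit
        }
        System.out.println(Arrays.toString(bits));
    }
}
```

The network cost drops from 8 requests / 8 bytes to 1 request / 1 byte, and the masking moves off the PLC onto the client.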

This could be modelled by introducing appropriate "RelNodes" and planner rules, 
I think, but I do not fully understand how Spool fits in here.
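As a rough sketch of what such a rewrite rule could do (all types below are hypothetical illustrations, not Calcite or PLC4X classes), coalescing overlapping bit reads into one byte read per address might look like:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical request model, only for illustration.
record BitRequest(int byteAddress, int bitIndex) {}
record ByteRequest(int byteAddress) {}

public class CoalesceRule {
    // Collapse overlapping bit reads into one byte read per address,
    // analogous to a planner rule matching "logical" nodes and
    // emitting a cheaper "physical" plan.
    static List<ByteRequest> coalesce(List<BitRequest> bits) {
        Set<Integer> addresses = new TreeSet<>();
        for (BitRequest b : bits) {
            addresses.add(b.byteAddress()); // dedupe by physical address
        }
        List<ByteRequest> out = new ArrayList<>();
        for (int addr : addresses) {
            out.add(new ByteRequest(addr));
        }
        return out;
    }

    public static void main(String[] args) {
        List<BitRequest> batch = List.of(
            new BitRequest(0x10, 0), new BitRequest(0x10, 3),
            new BitRequest(0x10, 7), new BitRequest(0x11, 2));
        // Four bit requests collapse to two byte requests.
        System.out.println(coalesce(batch));
    }
}
```

A cost function comparing "n bit requests" against "1 byte request plus local masking" would then decide whether the rewrite fires, per driver.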

Julian

On 19.08.19 at 20:42, "Julian Hyde" <[email protected]> wrote:

    One tricky aspect is to optimize a *batch* of requests.
    
    The trick is to tie together the batch so that it is costed as one request. 
We don’t have an operator specifically for that, but you could for instance use 
UNION ALL. E.g. given Q1 and Q2, you could generate a plan for
    
      select count(*) from Q1 union all select count(*) from Q2
    
    If the plan for the batch is a DAG (i.e. sharing work between the 
components of the batch by creating something akin to “temporary tables”) then 
you are in the territory for which we created the Spool operator (see 
discussion in https://issues.apache.org/jira/browse/CALCITE-481).
    
    Julian
    
    
    > On Aug 19, 2019, at 6:34 AM, Julian Feinauer 
<[email protected]> wrote:
    > 
    > Hi Danny,
    > 
    > thanks for the quick reply.
    > Cost calculation we can of course provide (though it may look a bit 
different, as we have not only CPU and memory but also network cost and the like).
    > 
    > Something like the RelNodes could be provided as well. In our case these 
would be "Requests", which start out "logical" and are then transformed into 
"physical" requests. For example, the API allows you to request many fields per 
single request, but some PLCs only allow one field per request; splitting these 
would be one task of this layer.
    > 
    > Julian
    > 
    > On 19.08.19 at 14:44, "Danny Chan" <[email protected]> wrote:
    > 
    >    Cool idea ! Julian Feinauer ~
    > 
    >    I think the volcano model can be used as the base of the cost 
algorithm, as long as you define all the metadata that you care about. Another 
thing is that you should have a struct like RelNode and a method like #computeSelfCost.
    > 
    >    Best,
    >    Danny Chan
    >    On 19 Aug 2019 at 5:20 PM +0800, Julian Feinauer 
<[email protected]> wrote:
    >> Hi folks,
    >> 
    >> I’m here again with another PLC4X related question 
(https://plc4x.apache.org).
    >> As we have more and more use cases, we encounter situations where we 
send LOTS of requests to PLCs, which one could sometimes optimize.
    >> This has multiple upstream causes (e.g. multiple different services 
sending requests, or wanting two logically different addresses which are 
physically equal).
    >> 
    >> So, we are considering adding some kind of optimizer which takes a batch 
of requests and tries to arrange them in an “optimal” way with regard to some 
cost function.
    >> The cost functions would of course be provided by each driver, but the 
optimizer could / should be rather general (possibly with pluggable rules).
    >> 
    >> As Calcite's planner already includes all of that, I ask myself whether 
it would be possible (and make sense) to use it in PLC4X.
    >> Generally speaking, this raises the question of whether the Volcano 
approach is suitable for such problems.
    >> The other alternative would be to start with some kind of 
heuristic-based planning or with other optimization algorithms (genetic 
algorithms, cross entropy, …).
    >> 
    >> Any thoughts or feedback are welcome!
    >> 
    >> Julian
    > 
    > 
    
    
