One tricky aspect is to optimize a *batch* of requests.

The trick is to tie together the batch so that it is costed as one request. We 
don’t have an operator specifically for that, but you could for instance use 
UNION ALL. E.g. given Q1 and Q2, you could generate a plan for

  select count(*) from Q1 union all select count(*) from Q2
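
As a minimal sketch of that wrapping step, here is some Python. It does not call Calcite; the standard-library sqlite3 module is used only to show that the combined statement is valid SQL. The helper name `batch_as_one_query`, the `emp` table, and the two stand-in queries are hypothetical.

```python
import sqlite3

def batch_as_one_query(queries):
    """Wrap each query in count(*) and glue the batch together with UNION ALL."""
    if not queries:
        raise ValueError("batch must contain at least one query")
    parts = [f"select count(*) from ({q})" for q in queries]
    return " union all ".join(parts)

# Hypothetical stand-ins for Q1 and Q2.
q1 = "select * from emp where deptno = 10"
q2 = "select * from emp where sal > 2000"
batch_sql = batch_as_one_query([q1, q2])

# Run the combined statement once against a throwaway in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("create table emp (empno integer, deptno integer, sal integer)")
conn.executemany(
    "insert into emp values (?, ?, ?)",
    [(1, 10, 800), (2, 10, 1500), (3, 20, 3000)],
)
counts = sorted(row[0] for row in conn.execute(batch_sql))
```

The point is only that the planner sees one statement, so the whole batch gets a single plan and a single cost.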

If the plan for the batch is to be a DAG (i.e. sharing work between the components 
of the batch by creating something akin to “temporary tables”) then you are in 
the territory for which we created the Spool operator (see the discussion in 
https://issues.apache.org/jira/browse/CALCITE-481).
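
To illustrate why sharing matters for costing, here is a toy sketch in Python. It is not Calcite's Spool API; the `Node` class, the cost numbers, and both cost functions are assumptions made up for this example. A tree-style cost charges a shared sub-plan once per consumer, while a DAG-style cost charges it once, as if its result were kept and re-read.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # keep identity-based hashing so nodes can go in a set
class Node:
    name: str
    self_cost: float
    inputs: list = field(default_factory=list)

def tree_cost(node):
    """Cost when every consumer recomputes its inputs."""
    return node.self_cost + sum(tree_cost(i) for i in node.inputs)

def dag_cost(roots):
    """Cost when each distinct node is computed once and shared."""
    seen = set()
    total = 0.0
    stack = list(roots)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        total += node.self_cost
        stack.extend(node.inputs)
    return total

# Two requests in a batch that read the same (hypothetical) scan.
scan = Node("scan", 100.0)
q1 = Node("filter_a", 10.0, [scan])
q2 = Node("filter_b", 20.0, [scan])

unshared = tree_cost(q1) + tree_cost(q2)  # scan charged twice
shared = dag_cost([q1, q2])               # scan charged once
```

A real plan would also charge for writing and re-reading the shared result, which this sketch ignores.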

Julian


> On Aug 19, 2019, at 6:34 AM, Julian Feinauer <j.feina...@pragmaticminds.de> 
> wrote:
> 
> Hi Danny,
> 
> thanks for the quick reply.
> We can of course provide a cost calculation (though it may look a bit 
> different, as we have not only CPU and memory but also network cost).
> 
> Something like RelNodes could be provided as well. In our case this 
> would be "Requests" which are at first "Logical" and are then transformed to 
> "Physical" Requests. For example the API allows you to request many fields 
> per single request but some PLCs only allow one field per request. So this 
> would be one task of this layer.
> 
> Julian
> 
> On 19.08.19, 14:44, "Danny Chan" <yuzhao....@gmail.com> wrote:
> 
>    Cool idea, Julian Feinauer!
> 
>    I think the Volcano model can be used as the basis of the cost algorithm, as 
> long as you define all the metadata that you care about. You would also need 
> a structure like RelNode and a method like #computeSelfCost.
> 
>    Best,
>    Danny Chan
>    On Aug 19, 2019, 5:20 PM +0800, Julian Feinauer 
> <j.feina...@pragmaticminds.de> wrote:
>> Hi folks,
>> 
>> I’m here again with another PLC4X related question 
>> (https://plc4x.apache.org).
>> As we have more and more use cases, we encounter situations where we send LOTS 
>> of requests to PLCs, which one could sometimes optimize.
>> This has multiple reasons upstream (like multiple different Services 
>> sending, or you want two logically different addresses which could be 
>> physically equal).
>> 
>> So, we are considering adding some kind of optimizer which takes a batch of 
>> requests and tries to arrange them in an “optimal” way with regard to some 
>> cost function.
>> The cost functions would of course be given by each Driver but the optimizer 
>> could / should be rather general (possibly with pluggable rules).
>> 
>> As Calcite's planner already includes all of that, I wonder whether it would be 
>> possible (and make sense) to use it in PLC4X.
>> Generally speaking, this raises the question of whether the Volcano approach is 
>> suitable for such problems.
>> The alternative would be to start with some kind of heuristic-based 
>> planning or with other optimization algorithms (genetic algs, cross 
>> entropy,…).
>> 
>> Any thoughts or feedback are welcome!
>> 
>> Julian
> 
> 
