Re: H2O integration - intermediate progress update

Dmitriy Lyubimov Wed, 18 Jun 2014 18:30:25 -0700

Also, if something is not supported (your example, say, if it is indeed
unsupported), the optimizer would simply say so by rejecting it. But if it
accepts the expression, then I am pretty sure it will do the right job (or
at least there's a unit test for that case that asserts on a trivial example).


Here, by trivial I mean local pipelines for 2-split inputs; that's the
general rule I used.
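
To make the accept-or-reject behavior concrete, here is a toy sketch in
Scala. Everything in it is hypothetical: the names (KeyType, Expr, plan) and
the rules themselves are assumptions made for illustration, not the actual
optimizer code. It only encodes the cases stated in this thread: A.t times A
is fine for a string-keyed A, A times B is legal when B is int-keyed, and
anything that would need non-int keys to denote columns is refused.

```scala
// Hypothetical toy model; none of these names come from the real code base.
object KeyingSketch {
  sealed trait KeyType
  case object IntKey extends KeyType
  case object StringKey extends KeyType

  sealed trait Expr
  case class Leaf(name: String, rowKey: KeyType) extends Expr
  case class Transpose(a: Expr) extends Expr
  case class Times(a: Expr, b: Expr) extends Expr

  // Right(row key type of the result) if the toy optimizer accepts the
  // expression, Left(reason) if it rejects it.
  def plan(e: Expr): Either[String, KeyType] = e match {
    case Leaf(_, key) => Right(key)

    // A.t times A: the result is indexed by columns of A on both sides,
    // so it is int-keyed no matter how the rows of A are keyed.
    case Times(Transpose(a), b) if a == b =>
      plan(a) match {
        case Right(_)  => Right(IntKey)
        case Left(err) => Left(err)
      }

    // A standalone transpose turns row keys into column keys, which this
    // toy model (by assumption) only allows for int keys.
    case Transpose(a) =>
      plan(a) match {
        case Right(IntKey)    => Right(IntKey)
        case Right(StringKey) => Left("transpose would need non-int column keys")
        case Left(err)        => Left(err)
      }

    // General product: rows of the right operand line up with columns of
    // the left one, so the right operand must be int-keyed (assumption).
    case Times(a, b) =>
      (plan(a), plan(b)) match {
        case (Right(ka), Right(IntKey)) => Right(ka)
        case (Right(_), Right(_))       => Left("right operand must be int-keyed")
        case (Left(err), _)             => Left(err)
        case (_, Left(err))             => Left(err)
      }
  }
}
```

The toy rejects anything it has no rule for. That says nothing about what
the real optimizer does with a case such as B.t times A for two string-keyed
inputs; as noted in the quoted messages, that is best checked in a shell or a
unit test.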


On Wed, Jun 18, 2014 at 6:26 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

> A little bit of additional information: for the rewriting rules stage, the
> optimizer does 3 passes over the semantic tree, each pass matching a tree
> fragment using Scala case class matching and rewriting it. This allows
> matching and rewriting pretty elaborate tree structure fragments, although
> at the moment I don't think we dig farther than immediate children, and
> perhaps some of their known attributes, in most cases.
>
> A more detailed description than that is, I think, only in reading the source.
>
>
> On Wed, Jun 18, 2014 at 6:19 PM, Dmitriy Lyubimov <dlie...@gmail.com>
> wrote:
>
>> E.g. I know for sure A %.% B is legal where A is string-keyed and B is
>> int-keyed.
>>
>> This is kind of not the point, though. The point is that you can easily
>> modify rewriting rules and operators to cover misses. (There shouldn't be
>> many, since we've already written quite a few expressions out there.)
>>
>>
>> On Wed, Jun 18, 2014 at 6:15 PM, Dmitriy Lyubimov <dlie...@gmail.com>
>> wrote:
>>
>>> I am not sure. There are more rewriting rules than I can remember, and I
>>> did not write an algorithm (I think) that would involve this combination.
>>> I guess the best thing is to try it in a shell or a unit test. If it falls
>>> through, perhaps a new plan element needs to be added (although I am not
>>> very sure there isn't one already). I know that there are join-based
>>> multiplicative operators there.
>>>
>>>
>>> On Wed, Jun 18, 2014 at 6:11 PM, Ted Dunning <ted.dunn...@gmail.com>
>>> wrote:
>>>
>>>> On Wed, Jun 18, 2014 at 6:07 PM, Dmitriy Lyubimov <dlie...@gmail.com>
>>>> wrote:
>>>>
>>>> > In simple terms, if non-integer row keying is used anywhere, the
>>>> > optimizer tries to rewrite pipelines so that product orientations
>>>> > never require non-int keys to denote columns. In case the pipeline
>>>> > makes that impossible, the optimizer will refuse to produce a plan.
>>>> >
>>>> > e.g. suppose A is distributed string-keyed.
>>>> >
>>>> > (A.t %.% A) collect  // ok
>>>> >
>>>>
>>>> What happens with the important case of B.t %.% A where both A and B
>>>> are string keyed?
>>>>
>>>
>>>
>>
>
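
The rewriting stage described in the quoted messages (passes over the
semantic tree, each matching a tree fragment with Scala case class matching)
can be sketched roughly as follows. This is a minimal illustration under
stated assumptions, not the actual optimizer: the node types (Mat, T, Times,
AtA), the two rules, and the pass order are all made up for the example, and
it runs two passes for brevity where the thread mentions three.

```scala
// Hypothetical sketch; node and rule names are illustrative only.
object RewriteSketch {
  sealed trait Op
  case class Mat(name: String) extends Op    // leaf matrix
  case class T(a: Op) extends Op             // logical transpose
  case class Times(a: Op, b: Op) extends Op  // logical product
  case class AtA(a: Op) extends Op           // assumed fused operator for A.t times A

  type Rule = PartialFunction[Op, Op]

  // One pass over the tree: rewrite the children first, then try the rule
  // on the rebuilt node and keep the node unchanged if the rule does not match.
  def pass(rule: Rule)(op: Op): Op = {
    val rebuilt: Op = op match {
      case T(a)        => T(pass(rule)(a))
      case Times(a, b) => Times(pass(rule)(a), pass(rule)(b))
      case AtA(a)      => AtA(pass(rule)(a))
      case leaf: Mat   => leaf
    }
    rule.lift(rebuilt).getOrElse(rebuilt)
  }

  // Matches a node together with its immediate child.
  val dropDoubleTranspose: Rule = { case T(T(a)) => a }

  // Matches a node, its immediate children, and one known attribute
  // (here: structural equality of the two operands).
  val fuseAtA: Rule = { case Times(T(a), b) if a == b => AtA(a) }

  // Two illustrative passes applied in sequence.
  def optimize(op: Op): Op = pass(fuseAtA)(pass(dropDoubleTranspose)(op))
}
```

Each rule is an ordinary partial function over case classes, so covering a
missed expression amounts to adding another case, which is the kind of easy
modification the thread refers to.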
