Congratulations, btw, to the Drill workshop attendees.

Here are some surprising statistics:

30 signups.  28 or 29 attendees showed up.  (unheard of in my experience)

Essentially all attendees had actually prepared for the workshop at a very
high level.  Almost all had Drill downloaded and compiled and were ready to
rock and roll.

Two separate groups found this bug and a third group seemed to sense that
something was amiss, but they didn't quite connect the dots.

If this is the average calibre of the CWI community, then I expect some
interesting questions tomorrow at the CWI talk on clustering.

On Wed, Mar 20, 2013 at 4:56 AM, Alejandro Bellogin Kouki <
[email protected]> wrote:

> Hi all,
>
> This morning I attended the Drill workshop in Amsterdam and, like a couple
> of other people, my colleagues and I found a bug in the simple_plan.json
> query. Its original output was:
> {
>  "sales" : 109.71,
>  "typeCount" : 2,
>  "quantity" : 159,
>  "ppu" : 0.55
> }
> {
>  "sales" : 184.25,
>  "typeCount" : 2,
>  "quantity" : 335,
>  "ppu" : 0.55
> }
>
> Notice that both "ppu" values are the same, whereas the value for the
> first should be 0.69 (109.71/159) and for the second 0.55.
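
The arithmetic is easy to double-check. A minimal sketch in Python, assuming
only that ppu is meant to equal sales / quantity for each output record (the
record values are copied from the output quoted above; the helper name
expected_ppu is made up for illustration):

```python
# Records copied from the simple_plan.json output quoted above.
records = [
    {"sales": 109.71, "quantity": 159, "ppu": 0.55},
    {"sales": 184.25, "quantity": 335, "ppu": 0.55},
]


def expected_ppu(record):
    """Price per unit implied by the aggregated sales and quantity."""
    return round(record["sales"] / record["quantity"], 2)


for record in records:
    # Print the reported ppu next to the value implied by sales / quantity.
    print(record["ppu"], expected_ppu(record))
```

Only the second record's reported ppu agrees with the implied value.
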
>
> So, after digging a little bit (maybe too much, considering the time of
> writing this email) into the source code, I managed to generate the desired
> output. For that, I have changed both the simple_plan.json and some code in
> CollapsingAggregateROP that is incompatible with the current description in
> the Apache Drill Plan Syntax. Mainly because of the latter, I preferred to
> start some discussion here instead of in the JIRA ticket, but if you want
> me to file the JIRA first, I will do it (please, take into account I am a
> complete newbie).
>
> Basically, my solution involves changing the aggregate operation as
> follows (notice that it now has a target):
>          op: "collapsingaggregate",
>          within: "ppusegment",
>          carryovers: [ "donuts.ppu" ],
>          target: "donuts.ppu",
>          aggregations: [
>            { ref: "donuts.typeCount",  expr: "count(1)" },
>            { ref: "donuts.quantity",  expr: "sum(quantity)" },
>            { ref: "donuts.sales",  expr: "sum(donuts.ppu * quantity)" }
>          ]
> To make this work, I have had to ignore the fact that "*will draw the
> carryover variables from a record where the target field references has a
> true value*" [ADPS]. When no target is used, the carryover contains a
> pointer to the wrong register in method writeOutputRecord (whereas in the
> method consumeCurrent -- where the condition for target is checked -- the
> instance of the register is the one of the current segment).
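
One way to picture this kind of failure is a carryover that holds a reference
to a record which keeps changing, instead of a copy of the value. The sketch
below is a toy model in Python under that assumption; it is not Drill's
CollapsingAggregateROP code, and the function collapse, the flag
snapshot_carryover and the input rows are all invented for illustration (only
the segment totals 159 and 335 and the ppu values come from the report).

```python
# Toy model only: not Drill's actual CollapsingAggregateROP code.
# Made-up input rows, already ordered by segment (ppu).
ROWS = [
    {"ppu": 0.69, "quantity": 100},
    {"ppu": 0.69, "quantity": 59},
    {"ppu": 0.55, "quantity": 300},
    {"ppu": 0.55, "quantity": 35},
]


def emit(carry, quantity, snapshot_carryover):
    """Build one output record for a finished segment."""
    # With a snapshot, carry is the saved value; otherwise it is a reference
    # to the shared record, which may already describe the next segment.
    ppu = carry if snapshot_carryover else carry["ppu"]
    return {"ppu": ppu, "quantity": quantity}


def collapse(rows, snapshot_carryover):
    current = {}   # one mutable record, reused for every input row
    results = []
    carry = None   # carryover for the segment being aggregated
    quantity = 0
    for row in rows:
        previous_ppu = current.get("ppu")
        current.update(row)  # the shared record now holds the new row
        if previous_ppu is not None and row["ppu"] != previous_ppu:
            # Segment boundary: write out the segment that just ended.
            results.append(emit(carry, quantity, snapshot_carryover))
            carry, quantity = None, 0
        if carry is None:
            # Either copy the value now or keep a reference to the record.
            carry = current["ppu"] if snapshot_carryover else current
        quantity += row["quantity"]
    if carry is not None:
        results.append(emit(carry, quantity, snapshot_carryover))
    return results


print(collapse(ROWS, snapshot_carryover=False))
print(collapse(ROWS, snapshot_carryover=True))
```

In this model, keeping the reference makes the first segment report the second
segment's ppu, which is the symptom described above; copying the value when the
segment starts avoids it.
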
>
> I acknowledge that this solution is just a workaround, since it does not
> comply with the ADPS, but I hope it at least gives some indication of
> where (and how) a real solution could be found (i.e., the carryovers
> should be computed while they still point to the correct register).
>
> Regards,
> Alejandro
>
> PS: an alternative solution would be to drop the carryovers from the
> initial query plan and (somehow) also be able to print the ppu field in
> the projection stage.
>
> --
>  Alejandro Bellogin Kouki
>  
> http://rincon.uam.es/dir?cw=435275268554687
>
>
