[influxdb] Re: Join not working, union `processed` does not add up to two parent batch results

nathaniel Wed, 05 Oct 2016 08:38:28 -0700

First, the union is working as expected. The script has three parents to the 
union node: expected_instances, expected_instances, and running_instances. 
Because expected_instances is given twice, the expected total is 
72985 + 38927 + 38927 = 150839 > 150782, which just means that some messages 
are still in flight.
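
If the intent was to union each parent only once, a sketch of that would be 
the following (the calling node already counts as a parent, so it does not 
need to be passed as an argument as well):

expected_instances
    |union(running_instances)
        .rename('union_result')
    |httpOut('union_result')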


For the join, the reason it is not processing any results is that no points 
are matching up. In order for two points to be joined they must have the 
same timestamp (or be within a specified tolerance) and they must be in the 
same group*. In the example above, expected_instances is grouped by 
'instance' while running_instances is grouped by 'instance' and 
'type_instance'. 
As a result, none of the points in the expected_instances parent are in the 
same group as the points in the running_instances parent.


* It is possible to join values that are not in the same group. In this 
case the join "on" property must be specified. The "on" property must be a 
list of tags to join on and must be a subset of the parents' tags. 

For example:

running_instances
    |join(expected_instances)
        .as('running', 'expected')
        .on('instance')
    |httpOut('join_result')

The above will join a point from expected_instances with each of the points 
from running_instances that have the same instance tag. The result will be 
joined points grouped by instance and type_instance.


Since it's not clear what you are trying to accomplish, I am not sure whether 
you should change the groups of the parents or use the join "on" property.
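
For the first option, a sketch of what regrouping could look like, assuming 
you only need one value per instance. It keeps your query as is and only 
changes the groupBy so that both parents are grouped by 'instance':

var running_instances = batch
    |query('''select first(value), instance as app_name, type_instance 
from "collectd_db"."default".marathon_tasks_value where value > 60000 ''')
        .period(1s)
        .every(2s)
        // same grouping as expected_instances
        .groupBy('instance')

running_instances
    |join(expected_instances)
        .as('running', 'expected')
    |httpOut('join_result')

Note that with this grouping first(value) returns a single value per 
instance rather than one per type_instance, which may or may not be what you 
want.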


On Tuesday, October 4, 2016 at 5:38:39 PM UTC-6, Vinit wrote:
>
> Join is not working, it processes zero records. Union works, but it 
> processes more than the two parent batches' `processed` counts combined. 
>
> Since nobody is complaining, it is likely that I am doing something wrong, 
> but I am out of ideas. Also, both of these tables have the exact same 
> timestamps, down to the millisecond. 
>  
>
> // Get number of instances that should be running
> var expected_instances = batch
>     |query('''select first(value) as desired, instance as app_name from 
> "collectd_db"."default".marathon_apps_value where type_instance='expected' 
> ''')
>         .period(2s)
>         .every(5s)
>         .groupBy('instance')
>
> var running_instances = batch
>     |query('''select first(value), instance as app_name, type_instance 
> from "collectd_db"."default".marathon_tasks_value where value > 60000 ''')
>         .period(1s)
>         .every(2s)
>         .groupBy('instance', 'type_instance')
>
> expected_instances
>     |union(expected_instances, running_instances)
>         .rename('union_result')
>     |httpOut('union_result')
>
> // join these two tables, the app name is the same for the same types of columns
> running_instances
>     |join(expected_instances)
>         .as('running', 'expected')
>     |httpOut('join_result')
>
>
>
> kapacitor show flapping_tasks
>
> DOT:
> digraph flapping_tasks {
> graph [throughput="0.00 batches/s"];
>
> batch2 [avg_exec_time_ns="856.130728ms" connect_errors="0" 
> query_errors="0" ];
> batch2 -> join8 [processed="72985"];
> batch2 -> union5 [processed="*72985*"];
>
> batch1 [avg_exec_time_ns="273.525217ms" connect_errors="0" 
> query_errors="0" ];
> batch1 -> join8 [processed="38927"];
> batch1 -> union5 [processed="38927"];
> batch1 -> union5 [processed="*38927*"];
>
> join8 [avg_exec_time_ns="30.272µs" ];
> join8 -> http_out9 [processed="0"];
>
> http_out9 [avg_exec_time_ns="0" ];
>
> union5 [avg_exec_time_ns="2.023µs" ];
> union5 -> http_out6 [processed="150782"];
>
> http_out6 [avg_exec_time_ns="9.795µs" ];
> } 
>
>
