> On Feb. 11, 2016, 12:57 a.m., Bill Farner wrote:
> > It would be nice to hear how this change jibes with the opposite change 
> > made in https://reviews.apache.org/r/42882
> 
> Maxim Khutornenko wrote:
>     I thought about that too. I think there are 2 major differences: the 
> total number of rows generated by the multi-join select statement and the 
> number of required subselects. In that RB, lowering the row count from 500k 
> to just under 100, plus the low number of required subselects, helped to 
> unlock perf gains.
>     
>     In this particular scenario, it appears that the frequency of subselects 
> trumps everything else.
>     
>     Zameer, what's the overall number of rows returned by the select 
> statement for a single task in your case?
> 
> Zameer Manji wrote:
>     Running:
>     ````
>         SELECT
>           t.id AS row_id,
>           t.task_config_row_id AS task_config_row_id,
>           t.task_id AS task_id,
>           t.instance_id AS instance_id,
>           t.status AS status,
>           t.failure_count AS failure_count,
>           t.ancestor_task_id AS ancestor_id,
>           j.role AS c_j_role,
>           j.environment AS c_j_environment,
>           j.name AS c_j_name,
>           h.slave_id AS slave_id,
>           h.host AS slave_host,
>           tp.name as tp_name,
>           tp.port as tp_port,
>           te.timestamp_ms as te_timestamp,
>           te.status as te_status,
>           te.message as te_message,
>           te.scheduler_host as te_scheduler
>         FROM tasks AS t
>         INNER JOIN task_configs as c ON c.id = t.task_config_row_id
>         INNER JOIN job_keys AS j ON j.id = c.job_key_id
>         LEFT OUTER JOIN task_ports as tp ON tp.task_row_id = t.id
>         LEFT OUTER JOIN task_events as te ON te.task_row_id = t.id
>         LEFT OUTER JOIN host_attributes AS h ON h.id = t.slave_row_id
>         WHERE task_id = '1454546771388-zmanji-devel-labrat-237-0e52b4a9-a8da-4958-997f-7bbe3db6b5d2'
>     ````
>     
>     On a test cluster, this returns 4 rows for a task in the RUNNING state.
>     
>     This makes sense: a job typically does not allocate many ports, and a 
> task will have fewer than 8 events.
>     
>     Further, running:
>     ````
>         SELECT
>           c.id AS id,
>           c.creator_user AS creator_user,
>           c.service AS is_service,
>           c.num_cpus AS num_cpus,
>           c.ram_mb AS ram_mb,
>           c.disk_mb AS disk_mb,
>           c.priority AS priority,
>           c.max_task_failures AS max_task_failures,
>           c.production AS production,
>           c.contact_email AS contact_email,
>           c.executor_name AS executor_name,
>           c.executor_data AS executor_data,
>           c.tier AS tier,
>           j.role AS j_role,
>           j.environment AS j_environment,
>           j.name AS j_name,
>           p.port_name AS p_port_name,
>           d.id AS c_id,
>           d.image AS c_image,
>           m.id AS m_id,
>           m.key AS m_key,
>           m.value AS m_value,
>           tc.id AS constraint_id,
>           tc.name AS constraint_name,
>           tlc.id AS constraint_l_id,
>           tlc.value AS constraint_l_limit,
>           tvc.id AS constraint_v_id,
>           tvc.negated AS constraint_v_negated,
>           tvcv.value as constraint_v_v_value
>         FROM task_configs AS c
>         INNER JOIN job_keys AS j ON j.id = c.job_key_id
>         LEFT OUTER JOIN task_config_requested_ports AS p ON p.task_config_id = c.id
>         LEFT OUTER JOIN task_config_docker_containers AS d ON d.task_config_id = c.id
>         LEFT OUTER JOIN task_config_metadata AS m ON m.task_config_id = c.id
>         LEFT OUTER JOIN task_constraints AS tc ON tc.task_config_id = c.id
>         LEFT OUTER JOIN limit_constraints as tlc ON tlc.constraint_id = tc.id
>         LEFT OUTER JOIN value_constraints as tvc ON tvc.constraint_id = tc.id
>         LEFT OUTER JOIN value_constraint_values AS tvcv ON tvcv.value_constraint_id = tvc.id
>         WHERE c.id = 1
>     ````
>     
>     returns 2 rows for a task in the above job.
>     
>     I think this is because a typical job doesn't have that many constraints.

Thanks Zameer. This confirms my assumptions about row count vs. sub-select 
chattiness.
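
To make the row-count side of this concrete, here is a minimal sketch using Python's built-in sqlite3 module. The schema is a simplified, hypothetical subset of the tables in Zameer's first query, and the data (one port, four events, and the event statuses) is assumed purely for illustration; it is not taken from his test cluster. It shows that two independent LEFT OUTER JOINs return ports x events rows per task.

```python
import sqlite3

# Simplified, hypothetical subset of the schema used in the queries above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, task_id TEXT);
    CREATE TABLE task_ports (task_row_id INTEGER, name TEXT, port INTEGER);
    CREATE TABLE task_events (task_row_id INTEGER, status TEXT);
""")

# Assumed data: a single task with 1 port and 4 events.
conn.execute("INSERT INTO tasks VALUES (1, 'example-task')")
conn.execute("INSERT INTO task_ports VALUES (1, 'http', 31000)")
conn.executemany(
    "INSERT INTO task_events VALUES (1, ?)",
    [("PENDING",), ("ASSIGNED",), ("STARTING",), ("RUNNING",)],
)

# Same join shape as the task query, trimmed to the two child tables.
JOINED = """
    SELECT t.task_id, tp.name, te.status
    FROM tasks AS t
    LEFT OUTER JOIN task_ports AS tp ON tp.task_row_id = t.id
    LEFT OUTER JOIN task_events AS te ON te.task_row_id = t.id
    WHERE t.task_id = ?
"""


def joined_row_count(task_id):
    """Return how many rows the multi-join select yields for one task."""
    return len(conn.execute(JOINED, (task_id,)).fetchall())


row_count = joined_row_count("example-task")
print(row_count)  # 1 port x 4 events = 4 rows
```

With only a handful of ports and events per task, the joined result stays small, while a subselect-based mapping would issue extra statements for every task fetched. That is the trade-off I had in mind.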


- Maxim


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118790
-----------------------------------------------------------


On Feb. 11, 2016, 8:03 p.m., Zameer Manji wrote:
> 
> (Updated Feb. 11, 2016, 8:03 p.m.)
> 
> 
> Review request for Aurora, John Sirois and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating 
> ResultSets and then populating the resulting objects. This patch removes 
> subselects, which reduces the number of ResultSets and avoids populating 
> objects via constructors, which is slower than populating them via setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>