> On Feb. 10, 2016, 4:57 p.m., Bill Farner wrote:
> > It would be nice to hear how this change jibes with the opposite change 
> > made in https://reviews.apache.org/r/42882
> 
> Maxim Khutornenko wrote:
>     I thought about that too. I think there are 2 major differences: the 
> total number of rows generated by the multi-join select statement and number 
> of required subselects. In that RB, lowering the row count from 500k to just 
> under 100 plus the low number of required subselects helped to unlock perf 
> gains.
>     
>     In this particular scenario, it appears that the frequency of subselects 
> trumps everything else.
>     
>     Zameer, what's the overall number of rows returned by the select 
> statement for a single task in your case?

Running:
````
    SELECT
      t.id AS row_id,
      t.task_config_row_id AS task_config_row_id,
      t.task_id AS task_id,
      t.instance_id AS instance_id,
      t.status AS status,
      t.failure_count AS failure_count,
      t.ancestor_task_id AS ancestor_id,
      j.role AS c_j_role,
      j.environment AS c_j_environment,
      j.name AS c_j_name,
      h.slave_id AS slave_id,
      h.host AS slave_host,
      tp.name as tp_name,
      tp.port as tp_port,
      te.timestamp_ms as te_timestamp,
      te.status as te_status,
      te.message as te_message,
      te.scheduler_host as te_scheduler
    FROM tasks AS t
    INNER JOIN task_configs as c ON c.id = t.task_config_row_id
    INNER JOIN job_keys AS j ON j.id = c.job_key_id
    LEFT OUTER JOIN task_ports as tp ON tp.task_row_id = t.id
    LEFT OUTER JOIN task_events as te ON te.task_row_id = t.id
    LEFT OUTER JOIN host_attributes AS h ON h.id = t.slave_row_id
    WHERE task_id = '1454546771388-zmanji-devel-labrat-237-0e52b4a9-a8da-4958-997f-7bbe3db6b5d2'
````

On a test cluster, this returns 4 rows for a task in the RUNNING state.

That seems reasonable: a job typically does not allocate many ports, and a 
task will usually have fewer than 8 events.
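
As a rough illustration of why the row count stays small, here is a minimal sketch using Python's built-in sqlite3 module. The three-table schema is a hypothetical simplification (not the real Aurora schema), and the ports, events and task ids are made up. It shows that the two independent LEFT OUTER JOINs multiply: a task yields roughly max(1, ports) * max(1, events) rows.

```python
import sqlite3

# Hypothetical, heavily simplified schema: only the columns needed to show
# how the two LEFT OUTER JOINs multiply rows. This is NOT the Aurora schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tasks (id INTEGER PRIMARY KEY, task_id TEXT);
    CREATE TABLE task_ports (task_row_id INTEGER, name TEXT, port INTEGER);
    CREATE TABLE task_events (task_row_id INTEGER, status TEXT);
""")


def insert_task(row_id, task_id, ports, events):
    """Insert one task with its (name, port) pairs and event statuses."""
    conn.execute("INSERT INTO tasks VALUES (?, ?)", (row_id, task_id))
    conn.executemany(
        "INSERT INTO task_ports VALUES (?, ?, ?)",
        [(row_id, name, port) for name, port in ports],
    )
    conn.executemany(
        "INSERT INTO task_events VALUES (?, ?)",
        [(row_id, status) for status in events],
    )


def joined_row_count(task_id):
    """Count the rows the multi-join select produces for a single task."""
    return conn.execute(
        """
        SELECT COUNT(*)
        FROM tasks AS t
        LEFT OUTER JOIN task_ports AS tp ON tp.task_row_id = t.id
        LEFT OUTER JOIN task_events AS te ON te.task_row_id = t.id
        WHERE t.task_id = ?
        """,
        (task_id,),
    ).fetchone()[0]


# Made-up data: one port and four events, then two ports and eight events,
# then no ports at all (the LEFT OUTER JOIN still keeps the task row).
lifecycle = ["PENDING", "ASSIGNED", "STARTING", "RUNNING"]
insert_task(1, "task-a", [("http", 31000)], lifecycle)
insert_task(2, "task-b", [("http", 31001), ("admin", 31002)], lifecycle * 2)
insert_task(3, "task-c", [], lifecycle)

for tid in ("task-a", "task-b", "task-c"):
    print(tid, joined_row_count(tid))
```

Even the deliberately inflated case (2 ports, 8 events) yields only 16 rows, which is far from the 500k rows discussed in the earlier review.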

Further, running:
````
    SELECT
      c.id AS id,
      c.creator_user AS creator_user,
      c.service AS is_service,
      c.num_cpus AS num_cpus,
      c.ram_mb AS ram_mb,
      c.disk_mb AS disk_mb,
      c.priority AS priority,
      c.max_task_failures AS max_task_failures,
      c.production AS production,
      c.contact_email AS contact_email,
      c.executor_name AS executor_name,
      c.executor_data AS executor_data,
      c.tier AS tier,
      j.role AS j_role,
      j.environment AS j_environment,
      j.name AS j_name,
      p.port_name AS p_port_name,
      d.id AS c_id,
      d.image AS c_image,
      m.id AS m_id,
      m.key AS m_key,
      m.value AS m_value,
      tc.id AS constraint_id,
      tc.name AS constraint_name,
      tlc.id AS constraint_l_id,
      tlc.value AS constraint_l_limit,
      tvc.id AS constraint_v_id,
      tvc.negated AS constraint_v_negated,
      tvcv.value as constraint_v_v_value
    FROM task_configs AS c
    INNER JOIN job_keys AS j ON j.id = c.job_key_id
    LEFT OUTER JOIN task_config_requested_ports AS p ON p.task_config_id = c.id
    LEFT OUTER JOIN task_config_docker_containers AS d ON d.task_config_id = c.id
    LEFT OUTER JOIN task_config_metadata AS m ON m.task_config_id = c.id
    LEFT OUTER JOIN task_constraints AS tc ON tc.task_config_id = c.id
    LEFT OUTER JOIN limit_constraints as tlc ON tlc.constraint_id = tc.id
    LEFT OUTER JOIN value_constraints as tvc ON tvc.constraint_id = tc.id
    LEFT OUTER JOIN value_constraint_values AS tvcv ON tvcv.value_constraint_id = tvc.id
    WHERE c.id = 1
````

Returns 2 rows for a task in the above job.

I think this is because a typical job doesn't have many constraints.


- Zameer


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/43457/#review118790
-----------------------------------------------------------


On Feb. 10, 2016, 4:35 p.m., Zameer Manji wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/43457/
> -----------------------------------------------------------
> 
> (Updated Feb. 10, 2016, 4:35 p.m.)
> 
> 
> Review request for Aurora, John Sirois and Maxim Khutornenko.
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Profiling master indicated that the bottleneck was MyBatis populating 
> ResultSets and populating the resulting objects. This patch removes 
> subselects, which reduces the number of ResultSets and removes the population 
> of an object via a constructor which is slower than populating an object via 
> setters.
> 
> 
> Diffs
> -----
> 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssginedPort.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbAssignedTask.java 93722395ed9fcd22dcb12e34e648e6e410952d43 
>   src/main/java/org/apache/aurora/scheduler/storage/db/views/DbScheduledTask.java 502a1fa6fc141df498f0f09af292ce24e269731d 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskConfigMapper.xml b1394cf44b7ddafcbc47bb1968306d0b33293380 
>   src/main/resources/org/apache/aurora/scheduler/storage/db/TaskMapper.xml ea469cce31544221c34ae05a1c65f71271985655 
> 
> Diff: https://reviews.apache.org/r/43457/diff/
> 
> 
> Testing
> -------
> 
> Master:
> Benchmark                                      (numTasks)   Mode  Cnt   Score    Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  44.052 ± 14.689  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   0.179 ±  0.052  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   0.087 ±  0.022  ops/s
> 
> This Patch:
> Benchmark                                      (numTasks)   Mode  Cnt   Score   Error  Units
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       10000  thrpt    5  51.531 ± 7.236  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run       50000  thrpt    5   7.370 ± 1.320  ops/s
> TaskStoreBenchmarks.DBFetchTasksBenchmark.run      100000  thrpt    5   2.143 ± 1.234  ops/s
> 
> 
> Thanks,
> 
> Zameer Manji
> 
>
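
For reference, the relative throughput change implied by the two benchmark tables above can be worked out with a short sketch like the one below. The scores are copied from the tables; the variable names are only illustrative, and the ratios ignore the reported error margins, which are wide (for example ± 14.689 on 44.052 ops/s), so they should be read as rough estimates.

```python
# Scores in ops/s, copied from the "Master" and "This Patch" tables above.
master = {10000: 44.052, 50000: 0.179, 100000: 0.087}
patch = {10000: 51.531, 50000: 7.370, 100000: 2.143}

# Ratio of patched throughput to master throughput for each task count.
speedup = {n: patch[n] / master[n] for n in master}

for n, ratio in speedup.items():
    print(f"{n:>6} tasks: {ratio:.1f}x")
```

The ratio is close to 1 at 10000 tasks and much larger at 50000 and 100000 tasks, which matches the observation that the subselect cost dominates as the task count grows.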
