[ 
https://issues.apache.org/jira/browse/AIRAVATA-3372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17222955#comment-17222955
 ] 

Marcus Christie commented on AIRAVATA-3372:
-------------------------------------------

With the most recent read time for getJobDetails, this is the slow query logged 
in mysql, taking ~20 seconds:
{code:sql}
SELECT
    DISTINCT t2.JOB_ID,
    t2.TASK_ID,
    t3.JOB_ID,
    t3.STATUS_ID,
    t3.TASK_ID,
    t3.STATE,
    t3.TIME_OF_STATE_CHANGE
FROM
    PROCESS t0
    INNER JOIN TASK t1 ON t0.PROCESS_ID = t1.PARENT_PROCESS_ID
    INNER JOIN JOB t2 ON t1.TASK_ID = t2.TASK_ID
    INNER JOIN JOB_STATUS t3 ON t2.JOB_ID = t3.JOB_ID
    AND t2.TASK_ID = t3.TASK_ID
WHERE
    (
        t0.EXPERIMENT_ID LIKE 
'C_Base_in_the_Solvent_Phase_Gaussian16_on_Oct_29,_2020_1:15_AM_5c1f991f-fc86-4666-a13f-2c5e09597f68'
 ESCAPE '\\'
    )
ORDER BY
    t2.JOB_ID ASC,
    t2.TASK_ID ASC,
    t3.TIME_OF_STATE_CHANGE ASC;
{code}

This is the EXPLAIN PLAN (note: this is after I added index 
job_task_id_job_id_idx, but it didn't change the plan)
{noformat}
MariaDB [experiment_catalog]> explain SELECT DISTINCT t2.JOB_ID, t2.TASK_ID, 
t3.JOB_ID, t3.STATUS_ID, t3.TASK_ID, t3.STATE, t3.TIME_OF_STATE_CHANGE FROM 
PROCESS t0 INNER JOIN TASK t1 ON t0.PROCESS_ID = t1.PARENT_PROCESS_ID INNER 
JOIN JOB t2 ON t1.TASK_ID = t2.TASK_ID INNER JOIN JOB_STATUS t3 ON t2.JOB_ID = 
t3.JOB_ID AND t2.TASK_ID = t3.TASK_ID WHERE (t0.EXPERIMENT_ID LIKE 
'C_Base_in_the_Solvent_Phase_Gaussian16_on_Oct_29,_2020_1:15_AM_5c1f991f-fc86-4666-a13f-2c5e09597f68'
 ESCAPE '\\') ORDER BY t2.JOB_ID ASC, t2.TASK_ID ASC, t3.TIME_OF_STATE_CHANGE 
ASC;
+------+-------------+-------+--------+-------------------------------------------------------+---------+---------+------------------------------------------------------------+--------+----------------------------------------------+
| id   | select_type | table | type   | possible_keys                           
              | key     | key_len | ref                                         
               | rows   | Extra                                        |
+------+-------------+-------+--------+-------------------------------------------------------+---------+---------+------------------------------------------------------------+--------+----------------------------------------------+
|    1 | SIMPLE      | t2    | index  | 
PRIMARY,TASK_ID,job_job_id_idx,job_task_id_job_id_idx | TASK_ID | 259     | 
NULL                                                       | 110489 | Using 
index; Using temporary; Using filesort |
|    1 | SIMPLE      | t3    | ref    | JOB_ID                                  
              | JOB_ID  | 514     | 
experiment_catalog.t2.JOB_ID,experiment_catalog.t2.TASK_ID |      1 |           
                                   |
|    1 | SIMPLE      | t1    | eq_ref | PRIMARY,PARENT_PROCESS_ID               
              | PRIMARY | 257     | experiment_catalog.t2.TASK_ID               
               |      1 | Distinct                                     |
|    1 | SIMPLE      | t0    | eq_ref | PRIMARY,EXPERIMENT_ID                   
              | PRIMARY | 257     | experiment_catalog.t1.PARENT_PROCESS_ID     
               |      1 | Using where; Distinct                        |
+------+-------------+-------+--------+-------------------------------------------------------+---------+---------+------------------------------------------------------------+--------+----------------------------------------------+
{noformat}

So it's basically sorting the entire JOB table in a temporary table, instead of 
using the indexes; it's not using the EXPERIMENT_ID to filter the rows in JOB.

After some trial and error I discovered if I change the WHERE clause from using 
a {{LIKE}} to using {{=}}, the query returns immediately and properly uses 
indexes.


> BUG: Saved experiment cannot be edited and launch
> -------------------------------------------------
>
>                 Key: AIRAVATA-3372
>                 URL: https://issues.apache.org/jira/browse/AIRAVATA-3372
>             Project: Airavata
>          Issue Type: Sub-task
>          Components: Airavata API, Django Portal
>    Affects Versions: 0.19
>         Environment: https://seagrid.org/
>            Reporter: Eroma
>            Assignee: Marcus Christie
>            Priority: Major
>             Fix For: 0.19
>
>
> # Create an experiment and save for later launch
>  # Select the above experiment edit, and change the wall time and launch.
>  # Error message in Django - *"An error occurred while processing your 
> request. The gateway administrator has been notified of this error."*
>  # In the mailed error report the folowing
> timeout
> The read operation timed out
> Django Version:       1.11.29
> Exception Type:       timeout
> Exception Value:      
> The read operation timed out
> Exception Location:   /usr/lib64/python3.6/ssl.py in read, line 589
> Python Executable:    /var/www/portals/django-seagrid/venv/bin/python
> Python Version:       3.6.8
> Python Path:  
> ['/usr/lib64/python36.zip',
>  '/usr/lib64/python3.6',
>  '/usr/lib64/python3.6/lib-dynload',
>  '/var/www/portals/django-seagrid/venv/lib64/python3.6/site-packages',
>  '/var/www/portals/django-seagrid/airavata-django-portal',
>  '/var/www/portals/django-seagrid/venv/lib/python3.6/site-packages',
>  
> '/var/www/portals/django-seagrid/venv/lib64/python3.6/site-packages/IPython/extensions']



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to