[ 
https://issues.apache.org/jira/browse/OOZIE-2626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15508644#comment-15508644
 ] 

Fady K Sedrak commented on OOZIE-2626:
--------------------------------------

I tested different indexes on many of slow queries in my repro environment. I 
concluded that these 2 queries would improve the performance significantly. 

Query1:
 SELECT TOP 1 t0.id, t0.app_name, t0.status, t0.run, t0.user_name, 
t0.group_name, t0.created_time, t0.start_time, t0.last_modified_time, 
t0.end_time, t0.external_id, t0.parent_id FROM WF_JOBS t0 ORDER BY 
t0.created_time DESC
Recommended Index:
 CREATE NONCLUSTERED INDEX [nci_wi_WF_JOBS_3A82B3F6CFB2A06DB9FD] ON 
[dbo].[WF_JOBS]
 ([created_time] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, 
ALLOW_PAGE_LOCKS = ON)
Statistics:
 Before:
 Table 'WF_JOBS'. Scan count 1, logical reads 31803, physical reads 256, 
read-ahead reads 31589, lob logical reads 0, lob physical reads 0, lob 
read-ahead reads 0.
 SQL Server Execution Times:
 CPU time = 281 ms, elapsed time = 216350 ms.
 After:
 Table 'WF_JOBS'. Scan count 1, logical reads 8, physical reads 0, read-ahead 
reads 24, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
 SQL Server Execution Times:
 CPU time = 0 ms, elapsed time = 0 ms.

Query 2:
 (@P0 bigint,@P1 varchar(8000),@P2 datetime2(7))SELECT t0.id FROM WF_ACTIONS t0 
WHERE (t0.pending = @P0 AND t0.status = @P1 AND t0.last_check_time < @P2)
Recommended Index:
 CREATE NONCLUSTERED INDEX [nci_wi_WF_ACTIONS_D23BAEF4FD5EEA2EBE76] ON 
[dbo].[WF_ACTIONS]
 (
[pending] ASC,
[status] ASC,
[last_check_time] ASC
 )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, 
DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

Statistics:
 Before:
 Table 'WF_ACTIONS'. Scan count 1, logical reads 81894, physical reads 1330, 
read-ahead reads 79172, lob logical reads 0, lob physical reads 0, lob 
read-ahead reads 0.
 SQL Server Execution Times:
 CPU time = 875 ms, elapsed time = 652591 ms.
 After:
 Table 'WF_ACTIONS'. Scan count 1, logical reads 83, physical reads 0, 
read-ahead reads 1, lob logical reads 0, lob physical reads 0, lob read-ahead 
reads 0.
 SQL Server Execution Times:
 CPU time = 0 ms, elapsed time = 167 ms.

Unfortunately openjpa doesn't support composite indexes. So I had to work 
around the problem by create 2 simple indexes indexes on last_check_time & 
pending. 

Query 2:
 (@P0 bigint,@P1 varchar(8000),@P2 datetime2(7))SELECT t0.id FROM WF_ACTIONS t0 
WHERE (t0.pending = @P0 AND t0.status = @P1 AND t0.last_check_time < @P2)
 Recommended Index:
 CREATE NONCLUSTERED INDEX [I_WF_JOBS_PENDING] ON [dbo].[WF_ACTIONS]
 (
[pending] ASC
 )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, 
DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
 GO
 CREATE NONCLUSTERED INDEX [I_WF_JOBS_LAST_CHECK_TIME] ON [dbo].[WF_ACTIONS]
 (
[pending] ASC
 )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, 
DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
 GO
Statistics:
 Before:
 Table 'WF_ACTIONS'. Scan count 1, logical reads 82034, physical reads 89, 
read-ahead reads 36446, lob logical reads 0, lob physical reads 0, lob 
read-ahead reads 0.
 SQL Server Execution Times:
 CPU time = 94 ms, elapsed time = 641094 ms.
 After:
 Table 'WF_ACTIONS'. Scan count 3, logical reads 10831, physical reads 0, 
read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead 
reads 0.
 Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead 
reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
 Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead 
reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
 SQL Server Execution Times:
 CPU time = 78 ms, elapsed time = 5085 ms.


> Oozie workflow processing becomes slow after the increase of rows in WF_JOBS 
> and WF_ACTIONS tables when the metastore is running on SQL Server
> ----------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: OOZIE-2626
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2626
>             Project: Oozie
>          Issue Type: Bug
>          Components: coordinator, workflow
>            Reporter: Fady K Sedrak
>
> Oozie workflow processing becomes slow after the increase of rows in WF_JOBS 
> and WF_ACTIONS tables when running against SQL Server. A secondary index need 
> to be created against WF_JOBS (created_time), WF_ACTIONS (last_check_time & 
> pending) to avoid going for a full table scan when querying against these 
> fields.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to