[
https://issues.apache.org/jira/browse/SQOOP-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14705296#comment-14705296
]
Dian Fu commented on SQOOP-2513:
--------------------------------
Hi [~jarcec],
Thanks for review. :)
{quote}
Could you advise how the patch actually fixes the deadlock
{quote}
Yes, of course. There is an UpdateThread in JobManager which will execute
{{updateSubmission}} periodically. If one job finished successfully, it will
call {{invokeDestroyerOnJobSuccess}} which will call
{{CommonRepositoryHandler#updateJob}}. Method
{{CommonRepositoryHandler#updateJob}} firstly update table {{SQ_JOB_INPUT}}
which of course will hold the exclude lock of table {{SQ_JOB_INPUT}} and then
update table {{SQ_JOB}} when it still hold the exclude lock of table
{{SQ_JOB_INPUT}}.
Client side will also call {{JobRequestHandler#getJobStatus}} to fetch the
latest status of job. {{JobRequestHandler#getJobStatus}} calls
{{JobManager#updateSubmission}} to update the status of the job. If one job
finished successfully during this call, it will call
{{JobManager#invokeDestroyerOnJobSuccess}} which will call
{{JobManager#getJob}}. {{JobManager#getJob}} finally calls
{{CommonRepositoryHandler#findJob}}. {{CommonRepositoryHandler#loadJobs}}
firstly execute select statement on table {{SQ_JOB}} and will of course hold
the shared lock of table {{SQ_JOB}}, then it execute select statement on table
{{SQ_JOB_INPUT}} which will try to get the shared lock of table
{{SQ_JOB_INPUT}}.
We can see that deadlock occurs.
{quote}
I'm a bit concerned that we're having each query in it's own try-catch block as
now we can execute the second query without correctly finishing the first which
doesn't seem right thing to do.
{quote}
If the first try-finally block throw exceptions, it will still be caught by the
outer catch statement and so the second try-finally block won't execute at this
time.
{quote}
Shouldn't we perhaps restart the transaction if we got DB deadlock?
{quote}
I'm not quite sure how to do this, any suggestions?
> Sqoop2: Deadlock occurs between getJobStatus and
> JobManager.UpdateThread#updateSubmission
> -----------------------------------------------------------------------------------------
>
> Key: SQOOP-2513
> URL: https://issues.apache.org/jira/browse/SQOOP-2513
> Project: Sqoop
> Issue Type: Bug
> Reporter: Dian Fu
> Assignee: Dian Fu
> Priority: Critical
> Fix For: 1.99.7
>
> Attachments: SQOOP-2513.001.patch, SQOOP-2513.002.patch
>
>
> I encountered the following test failure log in the test of SQOOP-2439:
> {noformat}
> java.sql.SQLTransactionRollbackException: A lock could not be obtained due to
> a deadlock, cycle of locks and waiters is:
> Lock : ROW, SQ_JOB_INPUT, (1,187)
> Waiting XID : {15543, S} , SA, SELECT "SQI_ID", "SQI_NAME", "SQI_CONFIG",
> "SQI_INDEX", "SQI_TYPE", "SQI_STRMASK", "SQI_STRLENGTH", "SQI_EDITABLE",
> "SQI_ENUMVALS", "SQBI_VALUE" FROM "SQOOP"."SQ_INPUT" LEFT OUTER JOIN
> "SQOOP"."SQ_JOB_INPUT" ON "SQBI_INPUT" = "SQI_ID" AND "SQBI_JOB" = ? WHERE
> "SQI_CONFIG" = ? ORDER BY "SQI_INDEX" Granted XID : {15545, X}
> Lock : ROW, SQ_JOB, (1,17)
> Waiting XID : {15545, X} , SA, UPDATE "SQOOP"."SQ_JOB" SET "SQB_NAME" = ?,
> "SQB_UPDATE_USER" = ?, "SQB_UPDATE_DATE" = ? WHERE "SQB_ID" = ? Granted XID
> : {15543, S} . The selected victim is XID : 15543. at
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown
> Source) at
> org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
> at
> org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown
> Source) at
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown
> Source) at
> org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
> at
> org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
> at
> org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
> at
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown
> Source) at
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeQuery(Unknown
> Source) at
> org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
> at
> org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
> at
> org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
> at
> org.apache.sqoop.repository.common.CommonRepositoryHandler.loadDriverConfigs(CommonRepositoryHandler.java:2265)
> at
> org.apache.sqoop.repository.common.CommonRepositoryHandler.loadJobs(CommonRepositoryHandler.java:1876)
> at
> org.apache.sqoop.repository.common.CommonRepositoryHandler.findJob(CommonRepositoryHandler.java:885)
> at
> org.apache.sqoop.repository.JdbcRepository$20.doIt(JdbcRepository.java:500)
> at
> org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:92)
> at
> org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:63)
> at
> org.apache.sqoop.repository.JdbcRepository.findJob(JdbcRepository.java:497)
> at
> org.apache.sqoop.driver.JobManager.getJob(JobManager.java:438) at
> org.apache.sqoop.driver.JobManager.invokeDestroyerOnJobSuccess(JobManager.java:517)
> at
> org.apache.sqoop.driver.JobManager.updateSubmission(JobManager.java:655)
> at
> org.apache.sqoop.driver.JobManager.status(JobManager.java:636) at
> org.apache.sqoop.handler.JobRequestHandler.getJobStatus(JobRequestHandler.java:410)
> at
> org.apache.sqoop.handler.JobRequestHandler.handleEvent(JobRequestHandler.java:102)
> at
> org.apache.sqoop.server.v1.JobServlet.handleGetRequest(JobServlet.java:86)
> at
> org.apache.sqoop.server.SqoopProtocolServlet.doGet(SqoopProtocolServlet.java:48)
> at
> javax.servlet.http.HttpServlet.service(HttpServlet.java:617) at
> javax.servlet.http.HttpServlet.service(HttpServlet.java:723) at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:572)
> at
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:269)
> at
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:542)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
> at
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:615)
> at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
> at
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
> at
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
> {noformat}
> From the above log, we can see that transaction {{15543}} has the shared lock
> of table {{SQ_JOB}} and is requesting the shared lock of table
> {{SQ_JOB_INPUT}}, while transaction {{15545}} has the exclusive lock of table
> {{SQ_JOB_INPUT}} and is requesting the shared lock of table {{SQ_JOB}}.
> Deadlock occurs between these two transactions.
> PS: transaction {{15543}} is a thread calling {{JobManager#getJobStatus}}
> and transaction {{15545}} is the updatethread in {{JobManager}} which is
> calling {{updateSubmission}}.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)