[
https://issues.apache.org/jira/browse/SQOOP-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14706242#comment-14706242
]
Dian Fu commented on SQOOP-2513:
--------------------------------
Silly me. Just thought of that we can just make the two threads access the
tables in a consistent order. Uploaded the v3 patch. [~jarcec], could you help
to take a look at the latest patch?
{quote}
I'm still not clear how the dual try-catch will however resolve the deadlock.
As the method body is still executed in the same database transaction, it won't
really help right?
{quote}
You're right. The lock won't be released until the transaction is committed. So
the v2 patch can't resolve the deadlock.
{quote}
One of other solutions that I'm thinking about is to catch the SQL exception
and if it's deadlock, we can rerun the operation (e.g. call the updateJob()
again or something like that). What do you think?
{quote}
This is a good solution. But before adopting this solution, I think we should
try to find a good solution to avoid the deadlock which appears here. If
avoiding deadlock is impossible or difficult to implement, we can take the
solution to rerun the operation.
In the other hand, we don't know if there are deadlocks in other places. Rerun
the operation is a good precautions. What about limiting the scope of this JIRA
to elimate the described deadlock and filing another JIRA to add rerun
mechanism to all the client side APIs as an enhancement?
> Sqoop2: Deadlock occurs between getJobStatus and
> JobManager.UpdateThread#updateSubmission
> -----------------------------------------------------------------------------------------
>
> Key: SQOOP-2513
> URL: https://issues.apache.org/jira/browse/SQOOP-2513
> Project: Sqoop
> Issue Type: Bug
> Reporter: Dian Fu
> Assignee: Dian Fu
> Priority: Critical
> Fix For: 1.99.7
>
> Attachments: SQOOP-2513.001.patch, SQOOP-2513.002.patch,
> SQOOP-2513.003.patch
>
>
> I encountered the following test failure log in the test of SQOOP-2439:
> {noformat}
> java.sql.SQLTransactionRollbackException: A lock could not be obtained due to
> a deadlock, cycle of locks and waiters is:
> Lock : ROW, SQ_JOB_INPUT, (1,187)
> Waiting XID : {15543, S} , SA, SELECT "SQI_ID", "SQI_NAME", "SQI_CONFIG",
> "SQI_INDEX", "SQI_TYPE", "SQI_STRMASK", "SQI_STRLENGTH", "SQI_EDITABLE",
> "SQI_ENUMVALS", "SQBI_VALUE" FROM "SQOOP"."SQ_INPUT" LEFT OUTER JOIN
> "SQOOP"."SQ_JOB_INPUT" ON "SQBI_INPUT" = "SQI_ID" AND "SQBI_JOB" = ? WHERE
> "SQI_CONFIG" = ? ORDER BY "SQI_INDEX" Granted XID : {15545, X}
> Lock : ROW, SQ_JOB, (1,17)
> Waiting XID : {15545, X} , SA, UPDATE "SQOOP"."SQ_JOB" SET "SQB_NAME" = ?,
> "SQB_UPDATE_USER" = ?, "SQB_UPDATE_DATE" = ? WHERE "SQB_ID" = ? Granted XID
> : {15543, S} . The selected victim is XID : 15543. at
> org.apache.derby.impl.jdbc.SQLExceptionFactory40.getSQLException(Unknown
> Source) at
> org.apache.derby.impl.jdbc.Util.generateCsSQLException(Unknown Source)
> at
> org.apache.derby.impl.jdbc.TransactionResourceImpl.wrapInSQLException(Unknown
> Source) at
> org.apache.derby.impl.jdbc.TransactionResourceImpl.handleException(Unknown
> Source) at
> org.apache.derby.impl.jdbc.EmbedConnection.handleException(Unknown Source)
> at
> org.apache.derby.impl.jdbc.ConnectionChild.handleException(Unknown Source)
> at
> org.apache.derby.impl.jdbc.EmbedStatement.executeStatement(Unknown Source)
> at
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeStatement(Unknown
> Source) at
> org.apache.derby.impl.jdbc.EmbedPreparedStatement.executeQuery(Unknown
> Source) at
> org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
> at
> org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
> at
> org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(DelegatingPreparedStatement.java:96)
> at
> org.apache.sqoop.repository.common.CommonRepositoryHandler.loadDriverConfigs(CommonRepositoryHandler.java:2265)
> at
> org.apache.sqoop.repository.common.CommonRepositoryHandler.loadJobs(CommonRepositoryHandler.java:1876)
> at
> org.apache.sqoop.repository.common.CommonRepositoryHandler.findJob(CommonRepositoryHandler.java:885)
> at
> org.apache.sqoop.repository.JdbcRepository$20.doIt(JdbcRepository.java:500)
> at
> org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:92)
> at
> org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:63)
> at
> org.apache.sqoop.repository.JdbcRepository.findJob(JdbcRepository.java:497)
> at
> org.apache.sqoop.driver.JobManager.getJob(JobManager.java:438) at
> org.apache.sqoop.driver.JobManager.invokeDestroyerOnJobSuccess(JobManager.java:517)
> at
> org.apache.sqoop.driver.JobManager.updateSubmission(JobManager.java:655)
> at
> org.apache.sqoop.driver.JobManager.status(JobManager.java:636) at
> org.apache.sqoop.handler.JobRequestHandler.getJobStatus(JobRequestHandler.java:410)
> at
> org.apache.sqoop.handler.JobRequestHandler.handleEvent(JobRequestHandler.java:102)
> at
> org.apache.sqoop.server.v1.JobServlet.handleGetRequest(JobServlet.java:86)
> at
> org.apache.sqoop.server.SqoopProtocolServlet.doGet(SqoopProtocolServlet.java:48)
> at
> javax.servlet.http.HttpServlet.service(HttpServlet.java:617) at
> javax.servlet.http.HttpServlet.service(HttpServlet.java:723) at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:572)
> at
> org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:269)
> at
> org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:542)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
> at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
> at
> org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:615)
> at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
> at
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
> at
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
> {noformat}
> From the above log, we can see that transaction {{15543}} has the shared lock
> of table {{SQ_JOB}} and is requesting the shared lock of table
> {{SQ_JOB_INPUT}}, while transaction {{15545}} has the exclusive lock of table
> {{SQ_JOB_INPUT}} and is requesting the shared lock of table {{SQ_JOB}}.
> Deadlock occurs between these two transactions.
> PS: transaction {{15543}} is a thread calling {{JobManager#getJobStatus}}
> and transaction {{15545}} is the updatethread in {{JobManager}} which is
> calling {{updateSubmission}}.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)