Bhuvan Arumugam created AURORA-640:
--------------------------------------
Summary: aurora create fail due to lock held by different job
Key: AURORA-640
URL: https://issues.apache.org/jira/browse/AURORA-640
Project: Aurora
Issue Type: Bug
Components: Scheduler
Affects Versions: 0.5.0
Reporter: Bhuvan Arumugam
With recent HEAD, I am unable to create or killall a job. The client always
fails with the following error:
{code}
aurora create cp0/bhuvan/staging10/hello hello_world.aurora
[stories/apps-in-docker] 15:35:57
INFO] Creating job hello
INFO] Starting new HTTP connection (1): a005832.vp.iso.apple.com
INFO] Starting new HTTP connection (1): a005832.vp.iso.apple.com
INFO] Response from scheduler: LOCK_ERROR (message: Unable to perform
operation for: bhuvan/staging10/hello. Use override/cancel option.)
INFO]
Note: if the scheduler detects that a job update is in progress (or was not
properly completed) it will reject subsequent updates. This is because your
job is likely in a partially-updated state. You should only begin another
update if you are confident that nobody is updating this job, and that
the job is in a state suitable for an update.
After checking on the above, you may release the update lock on the job by
invoking cancel_update.
{code}
The scheduler log, when run at the FINE log level, shows that one lock is held.
That lock is held by a completely different task, which I confirmed by querying
the {{/locks}} endpoint. This is the commit in which lockMapper was changed to
use {{LEFT OUTER JOIN}}:
https://github.com/apache/incubator-aurora/commit/5cf760bf31315c220c0f17cc233ad3a1dcfb6d86
{code}
D0806 22:37:34.903 THREAD1754
org.apache.ibatis.logging.jdbc.BaseJdbcLogger.debug: ==> Preparing: SELECT *
FROM locks LEFT OUTER JOIN job_keys AS key ON key.role = ? AND key.environment
= ? AND key.name = ? AND key.id = job_key_id
D0806 22:37:34.903 THREAD1754
org.apache.ibatis.logging.jdbc.BaseJdbcLogger.debug: ==> Parameters:
bhuvan(String), staging10(String), hello(String)
D0806 22:37:34.904 THREAD1754
org.apache.ibatis.logging.jdbc.BaseJdbcLogger.debug: <== Total: 1
{code}
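One plausible reading of the log above is that the {{LEFT OUTER JOIN}} returns every row in {{locks}}, whether or not the job key matches, because an outer join keeps left-hand rows that have no match and fills the joined columns with NULL. The sketch below replays the logged query against an in-memory SQLite database to illustrate that behaviour. The table layout, the {{token}} column, the job name {{other_job}}, and the helper {{count_locks}} are hypothetical simplifications and not the actual Aurora schema or code; the alias {{key}} is also shortened to {{k}} to avoid clashing with an SQLite keyword.

```python
import sqlite3


def count_locks(join_type, role, environment, name):
    """Return how many rows the logged query yields for the given job key.

    The schema here is a hypothetical, simplified stand-in for the
    scheduler's tables; only the join condition is copied from the log.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE job_keys (
            id INTEGER PRIMARY KEY,
            role TEXT, environment TEXT, name TEXT
        );
        CREATE TABLE locks (
            id INTEGER PRIMARY KEY,
            job_key_id INTEGER REFERENCES job_keys(id),
            token TEXT
        );
        -- A single lock, held on a job other than the one being created.
        INSERT INTO job_keys VALUES (1, 'bhuvan', 'staging10', 'other_job');
        INSERT INTO locks VALUES (1, 1, 'token-1');
        """
    )
    query = f"""
        SELECT * FROM locks {join_type} job_keys AS k
        ON k.role = ? AND k.environment = ? AND k.name = ?
           AND k.id = job_key_id
    """
    rows = conn.execute(query, (role, environment, name)).fetchall()
    conn.close()
    return len(rows)


# The outer join keeps the unrelated lock row (joined columns are NULL).
left_total = count_locks("LEFT OUTER JOIN", "bhuvan", "staging10", "hello")
# For comparison, an inner join drops rows whose job key does not match.
inner_total = count_locks("INNER JOIN", "bhuvan", "staging10", "hello")
print(left_total, inner_total)
```

Under these assumptions the outer join reports one row for {{bhuvan/staging10/hello}} even though the only lock belongs to another job, which is consistent with the {{Total: 1}} line in the scheduler log. The inner join is shown only as a contrast and is not necessarily the right fix for lockMapper.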
--
This message was sent by Atlassian JIRA
(v6.2#6252)