[jira] [Updated] (AURORA-1779) BatchWorker fails with PersistenceException

Stephan Erb (JIRA) Wed, 21 Sep 2016 14:41:59 -0700

     [ 
https://issues.apache.org/jira/browse/AURORA-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Stephan Erb updated AURORA-1779:
--------------------------------
    Description: 
Steps to reproduce (if you are lucky):
{code}
vagrant destroy
vagrant up
vagrant ssh
cd /vagrant
./build-support/release/verify-release-candidate 0.16.0-rc1
{code}

For me, this resulted in the following initial stack trace (and a couple 
different follow ups due to a corrupted DB state):

{code}
I0921 19:45:38.730 [TaskEventBatchWorker, JobUpdateControllerImpl:359] 
Forwarding task change for vagrant/test/http_example/0 
I0921 19:45:38.744 [qtp1742582311-39, Slf4jRequestLog:60] 127.0.0.1 - - 
[21/Sep/2016:19:45:38 +0000] "POST //aurora.local/api HTTP/1.1" 200 108  
I0921 19:45:38.771 [TaskEventBatchWorker, JobUpdateControllerImpl:587] 
IJobUpdateKey{job=IJobKey{role=vagrant, environment=test, name=http_example}, 
id=97ff31cc-c17f-45ef-a1d9-c16cb07388c7} evaluation result: 
EvaluationResult{status=WORKING, 
sideEffects={0=SideEffect{action=Optional.of(ADD_TASK), statusChanges=[]}}} 
I0921 19:45:38.786 [TaskEventBatchWorker, JobUpdateControllerImpl:654] 
Executing side-effects for update of IJobUpdateKey{job=IJobKey{role=vagrant, 
environment=test, name=http_example}, id=97ff31cc-c17f-45ef-a1d9-c16cb07388c7}: 
{0=SideEffect{action=Optional.of(ADD_TASK), statusChanges=[]}} 
I0921 19:45:38.789 [TaskEventBatchWorker, InstanceActionHandler$AddTask:95] 
Adding instance IInstanceKey{jobKey=IJobKey{role=vagrant, environment=test, 
name=http_example}, instanceId=0} while ROLLING_FORWARD 
I0921 19:45:38.805 [TaskEventBatchWorker, StateMachine$Builder:389] 
vagrant-test-http_example-0-100b76ee-ed70-4e15-8e52-046f992d0b4a state machine 
transition INIT -> PENDING 
I0921 19:45:38.806 [TaskEventBatchWorker, TaskStateMachine:474] Adding work 
command SAVE_STATE for 
vagrant-test-http_example-0-100b76ee-ed70-4e15-8e52-046f992d0b4a 
I0921 19:45:38.808 [ShutdownHook, SchedulerMain:101] Stopping scheduler 
services. 
I0921 19:45:38.822 [TimeSeriesRepositoryImpl STOPPING, 
TimeSeriesRepositoryImpl:168] Variable sampler shut down 
I0921 19:45:38.839 [TearDownShutdownRegistry STOPPING, 
ShutdownRegistry$ShutdownRegistryImpl:77] Executing 4 shutdown commands. 
E0921 19:45:38.845 [TaskEventBatchWorker, BatchWorker:217] 
TaskEventBatchWorker: Failed to process batch item. Error: {} 
org.apache.ibatis.exceptions.PersistenceException: 
### Error querying database.  Cause: org.h2.jdbc.JdbcSQLException: Database is 
already closed (to disable automatic closing at VM shutdown, add 
";DB_CLOSE_ON_EXIT=FALSE" to the db URL) [90121-190]
### The error may exist in org/apache/aurora/scheduler/storage/db/TaskMapper.xml
### The error may involve 
org.apache.aurora.scheduler.storage.db.TaskMapper.selectById
### The error occurred while executing a query
### SQL: SELECT       t.id AS row_id,       t.task_config_row_id AS 
task_config_row_id,       t.task_id AS task_id,       t.instance_id AS 
instance_id,       t.status AS status,       t.failure_count AS failure_count,  
     t.ancestor_task_id AS ancestor_id,       j.role AS c_j_role,       
j.environment AS c_j_environment,       j.name AS c_j_name,       h.slave_id AS 
slave_id,       h.host AS slave_host,       tp.id as tp_id,       tp.name as 
tp_name,       tp.port as tp_port,       te.id as te_id,       te.timestamp_ms 
as te_timestamp,       te.status as te_status,       te.message as te_message,  
     te.scheduler_host as te_scheduler     FROM tasks AS t     INNER JOIN 
task_configs as c ON c.id = t.task_config_row_id     INNER JOIN job_keys AS j 
ON j.id = c.job_key_id     LEFT OUTER JOIN task_ports as tp ON tp.task_row_id = 
t.id     LEFT OUTER JOIN task_events as te ON te.task_row_id = t.id     LEFT 
OUTER JOIN host_attributes AS h ON h.id = t.slave_row_id         WHERE       
t.task_id = ?
{code}


  was:
Steps to reproduce (if you are luck):
{code}
vagrant destroy
vagrant up
vagrant ssh
cd /vagrant
./build-support/release/verify-release-candidate 0.16.0-rc1
{code}

For me, this resulted in the following initial stack trace (and a couple 
different follow ups due to a corrupted DB state):

{code}
I0921 19:45:38.730 [TaskEventBatchWorker, JobUpdateControllerImpl:359] 
Forwarding task change for vagrant/test/http_example/0 
I0921 19:45:38.744 [qtp1742582311-39, Slf4jRequestLog:60] 127.0.0.1 - - 
[21/Sep/2016:19:45:38 +0000] "POST //aurora.local/api HTTP/1.1" 200 108  
I0921 19:45:38.771 [TaskEventBatchWorker, JobUpdateControllerImpl:587] 
IJobUpdateKey{job=IJobKey{role=vagrant, environment=test, name=http_example}, 
id=97ff31cc-c17f-45ef-a1d9-c16cb07388c7} evaluation result: 
EvaluationResult{status=WORKING, 
sideEffects={0=SideEffect{action=Optional.of(ADD_TASK), statusChanges=[]}}} 
I0921 19:45:38.786 [TaskEventBatchWorker, JobUpdateControllerImpl:654] 
Executing side-effects for update of IJobUpdateKey{job=IJobKey{role=vagrant, 
environment=test, name=http_example}, id=97ff31cc-c17f-45ef-a1d9-c16cb07388c7}: 
{0=SideEffect{action=Optional.of(ADD_TASK), statusChanges=[]}} 
I0921 19:45:38.789 [TaskEventBatchWorker, InstanceActionHandler$AddTask:95] 
Adding instance IInstanceKey{jobKey=IJobKey{role=vagrant, environment=test, 
name=http_example}, instanceId=0} while ROLLING_FORWARD 
I0921 19:45:38.805 [TaskEventBatchWorker, StateMachine$Builder:389] 
vagrant-test-http_example-0-100b76ee-ed70-4e15-8e52-046f992d0b4a state machine 
transition INIT -> PENDING 
I0921 19:45:38.806 [TaskEventBatchWorker, TaskStateMachine:474] Adding work 
command SAVE_STATE for 
vagrant-test-http_example-0-100b76ee-ed70-4e15-8e52-046f992d0b4a 
I0921 19:45:38.808 [ShutdownHook, SchedulerMain:101] Stopping scheduler 
services. 
I0921 19:45:38.822 [TimeSeriesRepositoryImpl STOPPING, 
TimeSeriesRepositoryImpl:168] Variable sampler shut down 
I0921 19:45:38.839 [TearDownShutdownRegistry STOPPING, 
ShutdownRegistry$ShutdownRegistryImpl:77] Executing 4 shutdown commands. 
E0921 19:45:38.845 [TaskEventBatchWorker, BatchWorker:217] 
TaskEventBatchWorker: Failed to process batch item. Error: {} 
org.apache.ibatis.exceptions.PersistenceException: 
### Error querying database.  Cause: org.h2.jdbc.JdbcSQLException: Database is 
already closed (to disable automatic closing at VM shutdown, add 
";DB_CLOSE_ON_EXIT=FALSE" to the db URL) [90121-190]
### The error may exist in org/apache/aurora/scheduler/storage/db/TaskMapper.xml
### The error may involve 
org.apache.aurora.scheduler.storage.db.TaskMapper.selectById
### The error occurred while executing a query
### SQL: SELECT       t.id AS row_id,       t.task_config_row_id AS 
task_config_row_id,       t.task_id AS task_id,       t.instance_id AS 
instance_id,       t.status AS status,       t.failure_count AS failure_count,  
     t.ancestor_task_id AS ancestor_id,       j.role AS c_j_role,       
j.environment AS c_j_environment,       j.name AS c_j_name,       h.slave_id AS 
slave_id,       h.host AS slave_host,       tp.id as tp_id,       tp.name as 
tp_name,       tp.port as tp_port,       te.id as te_id,       te.timestamp_ms 
as te_timestamp,       te.status as te_status,       te.message as te_message,  
     te.scheduler_host as te_scheduler     FROM tasks AS t     INNER JOIN 
task_configs as c ON c.id = t.task_config_row_id     INNER JOIN job_keys AS j 
ON j.id = c.job_key_id     LEFT OUTER JOIN task_ports as tp ON tp.task_row_id = 
t.id     LEFT OUTER JOIN task_events as te ON te.task_row_id = t.id     LEFT 
OUTER JOIN host_attributes AS h ON h.id = t.slave_row_id         WHERE       
t.task_id = ?
{code}



> BatchWorker fails with PersistenceException
> -------------------------------------------
>
>                 Key: AURORA-1779
>                 URL: https://issues.apache.org/jira/browse/AURORA-1779
>             Project: Aurora
>          Issue Type: Bug
>            Reporter: Stephan Erb
>            Priority: Blocker
>         Attachments: aurora-scheduler.log
>
>
> Steps to reproduce (if you are lucky):
> {code}
> vagrant destroy
> vagrant up
> vagrant ssh
> cd /vagrant
> ./build-support/release/verify-release-candidate 0.16.0-rc1
> {code}
> For me, this resulted in the following initial stack trace (and a couple 
> different follow ups due to a corrupted DB state):
> {code}
> I0921 19:45:38.730 [TaskEventBatchWorker, JobUpdateControllerImpl:359] 
> Forwarding task change for vagrant/test/http_example/0 
> I0921 19:45:38.744 [qtp1742582311-39, Slf4jRequestLog:60] 127.0.0.1 - - 
> [21/Sep/2016:19:45:38 +0000] "POST //aurora.local/api HTTP/1.1" 200 108  
> I0921 19:45:38.771 [TaskEventBatchWorker, JobUpdateControllerImpl:587] 
> IJobUpdateKey{job=IJobKey{role=vagrant, environment=test, name=http_example}, 
> id=97ff31cc-c17f-45ef-a1d9-c16cb07388c7} evaluation result: 
> EvaluationResult{status=WORKING, 
> sideEffects={0=SideEffect{action=Optional.of(ADD_TASK), statusChanges=[]}}} 
> I0921 19:45:38.786 [TaskEventBatchWorker, JobUpdateControllerImpl:654] 
> Executing side-effects for update of IJobUpdateKey{job=IJobKey{role=vagrant, 
> environment=test, name=http_example}, 
> id=97ff31cc-c17f-45ef-a1d9-c16cb07388c7}: 
> {0=SideEffect{action=Optional.of(ADD_TASK), statusChanges=[]}} 
> I0921 19:45:38.789 [TaskEventBatchWorker, InstanceActionHandler$AddTask:95] 
> Adding instance IInstanceKey{jobKey=IJobKey{role=vagrant, environment=test, 
> name=http_example}, instanceId=0} while ROLLING_FORWARD 
> I0921 19:45:38.805 [TaskEventBatchWorker, StateMachine$Builder:389] 
> vagrant-test-http_example-0-100b76ee-ed70-4e15-8e52-046f992d0b4a state 
> machine transition INIT -> PENDING 
> I0921 19:45:38.806 [TaskEventBatchWorker, TaskStateMachine:474] Adding work 
> command SAVE_STATE for 
> vagrant-test-http_example-0-100b76ee-ed70-4e15-8e52-046f992d0b4a 
> I0921 19:45:38.808 [ShutdownHook, SchedulerMain:101] Stopping scheduler 
> services. 
> I0921 19:45:38.822 [TimeSeriesRepositoryImpl STOPPING, 
> TimeSeriesRepositoryImpl:168] Variable sampler shut down 
> I0921 19:45:38.839 [TearDownShutdownRegistry STOPPING, 
> ShutdownRegistry$ShutdownRegistryImpl:77] Executing 4 shutdown commands. 
> E0921 19:45:38.845 [TaskEventBatchWorker, BatchWorker:217] 
> TaskEventBatchWorker: Failed to process batch item. Error: {} 
> org.apache.ibatis.exceptions.PersistenceException: 
> ### Error querying database.  Cause: org.h2.jdbc.JdbcSQLException: Database 
> is already closed (to disable automatic closing at VM shutdown, add 
> ";DB_CLOSE_ON_EXIT=FALSE" to the db URL) [90121-190]
> ### The error may exist in 
> org/apache/aurora/scheduler/storage/db/TaskMapper.xml
> ### The error may involve 
> org.apache.aurora.scheduler.storage.db.TaskMapper.selectById
> ### The error occurred while executing a query
> ### SQL: SELECT       t.id AS row_id,       t.task_config_row_id AS 
> task_config_row_id,       t.task_id AS task_id,       t.instance_id AS 
> instance_id,       t.status AS status,       t.failure_count AS 
> failure_count,       t.ancestor_task_id AS ancestor_id,       j.role AS 
> c_j_role,       j.environment AS c_j_environment,       j.name AS c_j_name,   
>     h.slave_id AS slave_id,       h.host AS slave_host,       tp.id as tp_id, 
>       tp.name as tp_name,       tp.port as tp_port,       te.id as te_id,     
>   te.timestamp_ms as te_timestamp,       te.status as te_status,       
> te.message as te_message,       te.scheduler_host as te_scheduler     FROM 
> tasks AS t     INNER JOIN task_configs as c ON c.id = t.task_config_row_id    
>  INNER JOIN job_keys AS j ON j.id = c.job_key_id     LEFT OUTER JOIN 
> task_ports as tp ON tp.task_row_id = t.id     LEFT OUTER JOIN task_events as 
> te ON te.task_row_id = t.id     LEFT OUTER JOIN host_attributes AS h ON h.id 
> = t.slave_row_id         WHERE       t.task_id = ?
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (AURORA-1779) BatchWorker fails with PersistenceException

Reply via email to