Maxim Khutornenko created AURORA-1494:
-----------------------------------------
Summary: Scheduler fails to start due to thrift/SQL schema data
type mismatch
Key: AURORA-1494
URL: https://issues.apache.org/jira/browse/AURORA-1494
Project: Aurora
Issue Type: Bug
Components: Scheduler
Reporter: Maxim Khutornenko
Assignee: Maxim Khutornenko
Priority: Blocker
After https://reviews.apache.org/r/38288 we are unable to upgrade the scheduler in
one of our clusters due to the following failure on restart:
{noformat}
### Cause: org.h2.jdbc.JdbcSQLException: Numeric value out of range:
"3174400000031744"; SQL statement:
INSERT INTO task_configs (
job_key_id,
creator_user,
service,
num_cpus,
ram_mb,
disk_mb,
priority,
max_task_failures,
production,
contact_email,
executor_name,
executor_data,
tier
) VALUES (
(
SELECT ID
FROM job_keys
WHERE role = ?
AND environment = ?
AND name = ?
),
?,
?,
?,
?,
?,
?,
?,
?,
?,
{noformat}
This appears to be due to a type mismatch between TaskConfig.diskMb (i64) and
task_configs.disk_mb (INT).
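To make the mismatch concrete, here is a minimal range check, assuming the INT column is a 32-bit signed integer (as it is in H2) and the thrift i64 field is a 64-bit signed integer. The helper names are hypothetical and are not part of the Aurora code base; this is only a sketch of why the value is accepted on the thrift side but rejected by the table.

```python
# Signed ranges of the two types named in the report.
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1   # SQL INT column (32-bit signed)
INT64_MIN, INT64_MAX = -(2**63), 2**63 - 1   # thrift i64 field


def fits_sql_int(value: int) -> bool:
    """Return True if value can be stored in a 32-bit signed INT column."""
    return INT32_MIN <= value <= INT32_MAX


def fits_thrift_i64(value: int) -> bool:
    """Return True if value is representable as a thrift i64."""
    return INT64_MIN <= value <= INT64_MAX


rejected = 3174400000031744  # value quoted in the exception above

print(fits_thrift_i64(rejected))  # True: valid in the thrift struct
print(fits_sql_int(rejected))     # False: out of range for the INT column
```

Any value above 2147483647 passes the first check and fails the second, which is the window in which a TaskConfig is valid in the log but cannot be inserted into the table.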
A possible real-life scenario:
- A user creates a job with an oversized resource requirement, and the job fails
to schedule.
- The user realizes the mistake and attempts to correct it by running
{{aurora update start}}.
- The scheduler creates a JobUpdate instance with the oversized TaskConfig
as its initial state and persists it in the log.
- The scheduler restarts into a new version (with the patch above) and attempts to
reload job updates from the log. Instead of storing TaskConfigs as
binary blobs, it now attempts to insert them into the task_configs table, where
the resource columns have a narrower type.