I don't know what is going on then. If you look at the SQL, all sorts of interesting characters appear in there. Your nodelist looked very interesting as well, and so did the job name.

Perhaps you should look into why those names are coming through. A ' in a job name is already handled. I don't know how the partition name could get mangled, though.

Danny

On 02/02/12 10:05, Tibor Pausz wrote:
Hello Danny,
we don't have any ' in the cluster name, in any partition name, or anywhere else!
I don't know whether some users have submitted jobs where ' is part of the job
name, but we have not used any special character in our config.

Best regards,
Tibor

On Thu, Feb 02, 2012 at 08:51:20AM -0800, Danny Auble wrote:
Tibor, the problem comes from the ' in your partition name.  Up to
this time I don't think anyone has ever done that.  I am not sure
what other problems might arise from that name either.  But if this
is the first time you have seen it this might be one of the only
problems it causes.

To get around it in the code you can call
slurm_add_slash_to_quotes(), as done elsewhere in the code, on the
partition name in src/plugins/mysql/as_mysql_job.c, pretty much
wherever you see the partition name.  It might need to be done
elsewhere as well in the mysql plugin.  Let us know how it goes.
Don't forget to xfree the new partition name after its use.

Danny
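
A minimal sketch of the escaping step described above, written in plain C so that it compiles outside the Slurm tree. Both escape_quotes() and format_partition_clause() are hypothetical stand-ins, not Slurm functions; inside the plugin you would call slurm_add_slash_to_quotes() and release its result with xfree() instead of free(). The set of characters escaped here (', " and \) is an assumption about what the real helper covers.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for slurm_add_slash_to_quotes(): returns a newly
 * allocated copy of str with a backslash in front of every ', " and \.
 * The caller owns the result and must free it. */
char *escape_quotes(const char *str)
{
    if (!str)
        return NULL;

    /* Worst case: every character needs a backslash in front of it. */
    char *out = malloc(2 * strlen(str) + 1);
    if (!out)
        return NULL;

    char *p = out;
    for (; *str; str++) {
        if (*str == '\\' || *str == '\'' || *str == '"')
            *p++ = '\\';
        *p++ = *str;
    }
    *p = '\0';
    return out;
}

/* Hypothetical helper: writes partition='<escaped name>' into buf.
 * Returns 0 on success, -1 on allocation failure or truncation. */
int format_partition_clause(char *buf, size_t size, const char *partition)
{
    assert(buf != NULL && size > 0);

    char *escaped = escape_quotes(partition);
    if (!escaped)
        return -1;

    int n = snprintf(buf, size, "partition='%s'", escaped);
    free(escaped); /* in Slurm this would be xfree(escaped) */

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The escaped copy is freed right after the query text is built, which mirrors the reminder to xfree the new partition name. Escaping only guards against quote characters; it does not explain where non-printable bytes in a name come from.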

On 02/02/12 01:06, Tibor Pausz wrote:
Hello!

we have trouble with the slurmdbd (version 2.3.0-2) in combination with
MySQL accounting. We get several entries per hour in the slurmdbd.log
with the same kind of error (see below). After some time the slurmdbd
hangs.

error: mysql_query failed: 1064 You have an error in your SQL syntax;
check the manual that corresponds to your MySQL server version for the
right syntax to use near '<A4><AC>*', '744') on duplicate key update
job_db_inx=LAST_INSERT_ID(job_db_inx), id_w' at line 1
insert into "loewe_job_table" (id_job, id_assoc, id_qos, id_wckey,
id_user, id_group, nodelist, id_resv, timelimit, time_eligible,
time_submit, time_start, job_name, track_steps, state, priority,
cpus_req, cpus_alloc, nodes_alloc, account, partition, node_inx) values
(2290165, 463, 3, 0, 536, 525, '<A0>+^F<A4><AC>*', 0, 1440, 1327054146,
1327054146, 1327054146, '0^N<A4><AC>*', 0, 5, 100006, 1, 1, 1,
'tomograpp','<D0>'<A4><AC>*', '744') on duplicate key update
job_db_inx=LAST_INSERT_ID(job_db_inx), id_wckey=0, id_user=536,
id_group=525, nodelist='<A0>+^F<A4><AC>*', id_resv=0, timelimit=1440,
time_submit=1327054146, time_start=1327054146, job_name='0^N<A4><AC>*',
track_steps=0, id_qos=3, state=greatest(state, 5), priority=100006,
cpus_req=1, cpus_alloc=1, nodes_alloc=1, account='tomograpp', partition='
<D0>'<A4><AC>*', node_inx='744'


The slurmctld.log contains
error: slurmdbd: agent queue filling, RESTART SLURMDBD NOW
…
