subject:"\[Bacula\-users\] VirtualFull mysql query blocks other jobs for a long time"

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-14 Thread ebollengier


Hello,


Graham Keeling wrote:
 
 Hello,
 
 I now believe that the 'taking hours' problem that I was having was
 down to having additional indexes on my File table, as Eric suggested.
 
 I am using mysql-5.0.45.
 
 I had these indexes:
 JobId
 JobId, PathId, FilenameId
 PathId
 FilenameId
 
 Now I have these indexes:
 JobId
 JobId, PathId, FilenameId
 
 The queries on my 'real' database now take about a second, rather than
 half a
 day.
 
 

Nice to see that my first advise was the good one.



 
 A suggestion - perhaps the following comment in
 src/cats/make_mysql_tables.in
 could be changed to include a warning:
 
 #
 # Possibly add one or more of the following indexes
 #  to the above File table if your Verifies are
 #  too slow.
 #
 #  INDEX (PathId),
 #  INDEX (FilenameId),
 #  INDEX (FilenameId, PathId)
 #  INDEX (JobId),
 #
 
 
 
 However, I also tested the 3.0.3 and 5.0.1 queries using Eric's test
 script and
 the much larger database that it generates.
 I found that there is a definite slowdown.
 
 Results from do_bench(10,13, 220). In this case, the slowdown is
 about
 15%.
 
 new|220|220|312
 old|220|220|268
 graham|220|220|158
 
 Result 'graham' is the time it takes to do a query that I came up with
 that
 looks similar to the postgresql query, but uses the mysql group by trick
 that
 is frowned upon:
 
 SELECT MAX(JobTDate) AS JobTDate, JobId, FileId, FileIndex, PathId,
 FilenameId, LStat, MD5 FROM
  (SELECT JobTDate, JobId, FileId, FileIndex, PathId, FilenameId, LStat,
 MD5
FROM
(SELECT FileId, JobId, PathId, FilenameId, FileIndex, LStat, MD5
   FROM File WHERE JobId IN ($jobid)
  UNION ALL
 SELECT File.FileId, File.JobId, PathId, FilenameId,
File.FileIndex, LStat, MD5
   FROM BaseFiles JOIN File USING (FileId)
  WHERE BaseFiles.JobId IN ($jobid)
 ) AS T JOIN Job USING (JobId)
ORDER BY FilenameId, PathId, JobTDate DESC ) AS U
 GROUP BY PathId, FilenameId
 
 

I don't think it will work, the documentation saids that if two records have
the same FilenameId, PathId, the value for LStat, MD5, JobId and FileIndex
will be a random one (which is not acceptable in our case).

Thanks for your feeback and your help.

Bye

-- 
View this message in context: 
http://old.nabble.com/VirtualFull-mysql-query-blocks-other-jobs-for-a-long-time-tp28149748p28247276.html
Sent from the Bacula - Users mailing list archive at Nabble.com.


--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-13 Thread Graham Keeling

Hello,

I now believe that the 'taking hours' problem that I was having was
down to having additional indexes on my File table, as Eric suggested.

I am using mysql-5.0.45.

I had these indexes:
JobId
JobId, PathId, FilenameId
PathId
FilenameId

Now I have these indexes:
JobId
JobId, PathId, FilenameId

The queries on my 'real' database now take about a second, rather than half a
day.

A suggestion - perhaps the following comment in src/cats/make_mysql_tables.in
could be changed to include a warning:

#
# Possibly add one or more of the following indexes
#  to the above File table if your Verifies are
#  too slow.
#
#  INDEX (PathId),
#  INDEX (FilenameId),
#  INDEX (FilenameId, PathId)
#  INDEX (JobId),
#



However, I also tested the 3.0.3 and 5.0.1 queries using Eric's test script and
the much larger database that it generates.
I found that there is a definite slowdown.

Results from do_bench(10,13, 220). In this case, the slowdown is about
15%.

new|220|220|312
old|220|220|268
graham|220|220|158

Result 'graham' is the time it takes to do a query that I came up with that
looks similar to the postgresql query, but uses the mysql group by trick that
is frowned upon:

SELECT MAX(JobTDate) AS JobTDate, JobId, FileId, FileIndex, PathId, FilenameId, 
LStat, MD5 FROM
 (SELECT JobTDate, JobId, FileId, FileIndex, PathId, FilenameId, LStat, MD5
   FROM
   (SELECT FileId, JobId, PathId, FilenameId, FileIndex, LStat, MD5
  FROM File WHERE JobId IN ($jobid)
 UNION ALL
SELECT File.FileId, File.JobId, PathId, FilenameId,
   File.FileIndex, LStat, MD5
  FROM BaseFiles JOIN File USING (FileId)
 WHERE BaseFiles.JobId IN ($jobid)
) AS T JOIN Job USING (JobId)
   ORDER BY FilenameId, PathId, JobTDate DESC ) AS U
GROUP BY PathId, FilenameId


--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-09 Thread Graham Keeling

On Thu, Apr 08, 2010 at 12:29:05PM -0700, ebollengier wrote:
 Graham Keeling wrote:
  
  On Thu, Apr 08, 2010 at 07:44:14AM -0700, ebollengier wrote:
  
  Hello,
  
  
  Graham Keeling wrote:
   
   Hello,
   
   I'm still waiting for my test database to fill up with Eric's data
   (actually,
   it's full now, but generating the right indexes is taking lots of
  time).
   
   
   But, I have another proposed solution, better than the last one I made.
   
   My previous solution was still taking a very very long time for a
  backup
   of a
   particular client that I had. Removing mention of BaseFiles did not
  help
   for
   this client.
   
   However, the following did, and it doesn't break Base jobs.
   
   Eric, I would appreciate it if you could give this a go on your test
   machine.
   
   It removes nasty the join on JobTDate by replacing it with a join on
  JobId
   (which makes more sense, and is also an index on Job and File).
   
  
  It's not possible, for example when you bscan a volume, or you run a
  copy,
  old
  records with old JobTDate have new FileIds or JobIds. You can't trust
  this
  field. It
  will work in many situations, but also will fail in an horrible way on
  many
  others...
 
  The first version (3.0.3) was using FileId, and it was quite good, but
  it could result wrong results on some cases, and for restore, you don't
  have
  choice, you
  need the exact one.
  
  I don't understand this at all.
  If you cannot trust the JobIds or FileIds in the File table, then the
  postgres
  query is also broken. The postgres query doesn't even mention JobTDate.
  In fact, the postgres query is using StartTime to do the ordering.
  
 
 And JobTDate is equivalent to StartTime (can be changed in PostgreSQL or in
 MySQL)

I still don't understand your original point.
If you cannot trust the JobIds or FileIds in the File table, then the postgres
query is also broken.

To clarify, can you tell me which fields in the File table you can trust?

   This also means it can get rid of the outer WHERE 'BaseJobId' OR
  'JobId'
   that I
   was complaining about before.
   The correct JobId is chosen with MAX(JobTDate) by ordering by JobTDate
   DESC on
   the innermost select.
   
   
  
  I would love that, it's what DISTINCT ON() on PostgreSQL does. But,
  unfortunately
  in SQL, you can only get fields that have a group function (like MAX,
  MIN,
  AVG) and
  fields present in GROUP BY. (we can also take FileIndex, MD5 and LStat at
  the same time)
  
  from http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-columns.html
   When using this feature, all rows in each group should have the same
   values for the columns that are
   ommitted from the GROUP BY part. The server is free to return any value
   from the group, so the results 
   are indeterminate unless all values are the same.
  
  This is not only a MySQL gotcha, it's an SQL property. Unless i'm
  mistaken
  this paragraph, I'm sure
  that we can't use it :-(
  
  OK, I understand what you are saying here.
  
  I appreciate your help, and I hope that we will find a solution for those
  that have
  this performance problem (that I can't reproduce myself).
  
  I am probably out of ideas now, other than:
  a) reverting to 3.0.3,
  b) reverting to the 3.0.3 queries, or
  c) switching to postgres (with all the horrible migration problems that
  will
  cause).
  
 
 Even if we found a workaround for MySQL, Postgres will stay far more faster
 (For the 2M file query, postgres was about 12s, MySQL 3.0.3 at 60s and MySQL
 5.0.1 at 90s)
 
 Can you confirm that the BaseJob doesn't change your timing ?

I think that I must have made some sort of mistake when I was testing
that. I do not believe that it changes my timing enough to help with the
problem that I have.

I also think that there must be some kind of mistake with your test program
because the slowness of the query between the 3.0.3 query and the 5.0.1 query
is very, very massive on my real database.

I will attempt to prove this soon (I still haven't set up your test database
properly).


--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-09 Thread ebollengier

Graham Keeling wrote:

I don't understand this at all.
If you cannot trust the JobIds or FileIds in the File table, then the
postgres
query is also broken. The postgres query doesn't even mention JobTDate.
In fact, the postgres query is using StartTime to do the ordering.

And JobTDate is equivalent to StartTime (can be changed in PostgreSQL or
in
MySQL)

I still don't understand your original point.
If you cannot trust the JobIds or FileIds in the File table, then the
postgres
query is also broken.

No, the PostgreSQL query is using StartTime to find the latest record for a
given
(FilenameId, PathId) tuple, it's not broken. see sql_cmd.c. The DISTINCT
ON() permits
to use other fields of this record directly (which seems to be the normal
way to do things)

To clarify, can you tell me which fields in the File table you can trust?

You don't have direct File table fields that gives you info about the backup
time. (only somewhere
in the LStat field, but it will be hard to extract, performance wouldn't be
there, and you have to
join the File table again to get MD5 and the rest of the LStat field).

The main difference between the 3.0.3 query and the 5.0.x one, is that
instead of using FileId to
join the table a second time to get all info, it uses JobId, FilenameId and
PathId. We already have an
index on those 3 criteria, and it should be a bit slower, but nothing to do
with hours or days of work.

What is possible, is that MySQL is trying to use an other index (that you
added), and the result is
worst than the original plan with the composed index.

Even if we found a workaround for MySQL, Postgres will stay far more
faster
(For the 2M file query, postgres was about 12s, MySQL 3.0.3 at 60s and
MySQL
5.0.1 at 90s)

Can you confirm that the BaseJob doesn't change your timing ?

I think that I must have made some sort of mistake when I was testing
that. I do not believe that it changes my timing enough to help with the
problem that I have.

I also think that there must be some kind of mistake with your test
program
because the slowness of the query between the 3.0.3 query and the 5.0.1
query
is very, very massive on my real database.

I'm really thinking that the problem is on the MySQL side (bad version
perhaps), or on your
modifications (my tests shows that with a FilenameId, PathId index, results
are 10 times slower than
with the default indexes)

What version of MySQL are you using ? (and on which OS)

I will attempt to prove this soon (I still haven't set up your test
database
properly).

Here, it takes less than 1 hour (maybe 20 mins), to get the job done...
(It's a Ubuntu with MySQL 5.0.51a, and i'm also using MySQL 5.1.45)

Bye
--
View this message in context:
http://old.nabble.com/VirtualFull-mysql-query-blocks-other-jobs-for-a-long-time-tp28149748p28190830.html
Sent from the Bacula - Users mailing list archive at Nabble.com.

--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-09 Thread Graham Keeling

On Fri, Apr 09, 2010 at 05:27:44AM -0700, ebollengier wrote:
 I'm really thinking that the problem is on the MySQL side (bad version
 perhaps), or on your
 modifications (my tests shows that with a FilenameId, PathId index, results
 are 10 times slower than
 with the default indexes)
 
 What version of MySQL are you using ? (and on which OS)

I've been testing this with the default bacula indexes.

I have:
Kernel 2.4.33.3, mysql 5.0.45.

The reporter of bug 1472 (the same problem), mnalis, has:
Kernel 2.6.32+23, mysql 5.1.43-1~bpo50+1.

What are you using?


--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-08 Thread Graham Keeling

On Wed, Apr 07, 2010 at 09:58:51AM -0700, ebollengier wrote:

ebollengier wrote:

Graham Keeling wrote:

On Wed, Apr 07, 2010 at 08:22:09AM -0700, ebollengier wrote:
I tweaked my test to compare both queries, and it shows no difference
with
and without base job part... If you want to test queries on your side
with
your data, you can download my tool (accurate-test.pl) on
http://bacula.git.sourceforge.net/git/gitweb.cgi?p=bacula/docs;a=tree;f=docs/techlogs;hb=HEAD

If you can tweak my script to reproduce your problem, i would be able to
fix
things.

http://old.nabble.com/file/p28166612/diff-with-without-basejob.png

I'm currently running your script to generate the test database. I think
that is going to take a long time, so I'll leave it overnight.

This is your first problem, on my server (just a workstation), it takes
less
than 10s to add 200,000 records...

In fact, it's 4seconds, here...

JobId=15 files=2
Insert takes 4secs for 20 records
JobId=16 files=5
Insert takes 11secs for 50 records
JobId=20 files=20
Insert takes 43secs for 200 records

I was running these three simultaneously, as described in the script.
I assume that fill_table.pl is the same as docs_techlogs_accurate-test.pl.

# filename=1 ./fill_table.pl
# path=1 ./fill_table.pl
# file=1 ./fill_table.pl

However, I was also not running it on the machine that I did my original tests
on because it didn't have pwgen and the perl DBD-mysql thing installed.
But I have installed them now, so I will switch back to that machine.
Running one instance looks comparable with yours:

tserv tmp # file=1 docs_techlogs_accurate-test.pl
JobId=1 files=2
Insert takes 9secs for 20 records
JobId=2 files=2
Insert takes 6secs for 20 records
JobId=3 files=2
Insert takes 8secs for 20 records
JobId=4 files=2

(but I think it is still going to take some time)

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-08 Thread Graham Keeling

Hello,

I'm still waiting for my test database to fill up with Eric's data (actually,
it's full now, but generating the right indexes is taking lots of time).


But, I have another proposed solution, better than the last one I made.

My previous solution was still taking a very very long time for a backup of a
particular client that I had. Removing mention of BaseFiles did not help for
this client.

However, the following did, and it doesn't break Base jobs.

Eric, I would appreciate it if you could give this a go on your test machine.

It removes nasty the join on JobTDate by replacing it with a join on JobId
(which makes more sense, and is also an index on Job and File).
This also means it can get rid of the outer WHERE 'BaseJobId' OR 'JobId' that I
was complaining about before.
The correct JobId is chosen with MAX(JobTDate) by ordering by JobTDate DESC on
the innermost select.



SELECT Path.Path, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5
FROM
   ( SELECT FileId, Job.JobId AS JobId, FileIndex, File.PathId AS PathId,
File.FilenameId AS FilenameId, LStat, MD5
 FROM Job, File,
( SELECT JobId, MAX(JobTDate) AS JobTDate, PathId, FilenameId
  FROM
  ( SELECT JobId, JobTDate, PathId, FilenameId
FROM File
JOIN Job USING (JobId)
WHERE File.JobId IN ($jobids)
UNION ALL
SELECT BaseJobId, JobTDate, PathId, FilenameId
FROM BaseFiles
JOIN File USING (FileId)
JOIN Job  ON(BaseJobId = Job.JobId)
WHERE BaseFiles.JobId IN ($jobids)
ORDER BY JobTDate DESC
  )
  AS tmp GROUP BY PathId, FilenameId
) AS T1
 WHERE
 T1.JobId = Job.JobId
 AND Job.JobId = File.JobId
 AND T1.PathId = File.PathId
 AND T1.FilenameId = File.FilenameId
   ) AS Temp
JOIN Filename ON (Filename.FilenameId = Temp.FilenameId)
JOIN Path ON (Path.PathId = Temp.PathId)
WHERE FileIndex  0
ORDER BY Path.Path, Filename.Name, Temp.JobId ASC;



--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-08 Thread ebollengier

Hello,

Graham Keeling wrote:

Hello,

I'm still waiting for my test database to fill up with Eric's data
(actually,
it's full now, but generating the right indexes is taking lots of time).

But, I have another proposed solution, better than the last one I made.

My previous solution was still taking a very very long time for a backup
of a
particular client that I had. Removing mention of BaseFiles did not help
for
this client.

However, the following did, and it doesn't break Base jobs.

Eric, I would appreciate it if you could give this a go on your test
machine.

It removes nasty the join on JobTDate by replacing it with a join on JobId
(which makes more sense, and is also an index on Job and File).

It's not possible, for example when you bscan a volume, or you run a copy,
old
records with old JobTDate have new FileIds or JobIds. You can't trust this
field. It
will work in many situations, but also will fail in an horrible way on many
others...

The first version (3.0.3) was using FileId, and it was quite good, but
it could result wrong results on some cases, and for restore, you don't have
choice, you
need the exact one.

This also means it can get rid of the outer WHERE 'BaseJobId' OR 'JobId'
that I
was complaining about before.
The correct JobId is chosen with MAX(JobTDate) by ordering by JobTDate
DESC on
the innermost select.

I would love that, it's what DISTINCT ON() on PostgreSQL does. But,
unfortunately
in SQL, you can only get fields that have a group function (like MAX, MIN,
AVG) and
fields present in GROUP BY. (we can also take FileIndex, MD5 and LStat at
the same time)

from http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-columns.html
When using this feature, all rows in each group should have the same
values for the columns that are
ommitted from the GROUP BY part. The server is free to return any value
from the group, so the results
are indeterminate unless all values are the same.

This is not only a MySQL gotcha, it's an SQL property. Unless i'm mistaken
this paragraph, I'm sure
that we can't use it :-(

I appreciate your help, and I hope that we will find a solution for those
that have
this performance problem (that I can't reproduce myself).

Bye

--
View this message in context:
http://old.nabble.com/VirtualFull-mysql-query-blocks-other-jobs-for-a-long-time-tp28149748p28179634.html
Sent from the Bacula - Users mailing list archive at Nabble.com.

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-08 Thread Graham Keeling

On Thu, Apr 08, 2010 at 07:44:14AM -0700, ebollengier wrote:
 
 Hello,
 
 
 Graham Keeling wrote:
  
  Hello,
  
  I'm still waiting for my test database to fill up with Eric's data
  (actually,
  it's full now, but generating the right indexes is taking lots of time).
  
  
  But, I have another proposed solution, better than the last one I made.
  
  My previous solution was still taking a very very long time for a backup
  of a
  particular client that I had. Removing mention of BaseFiles did not help
  for
  this client.
  
  However, the following did, and it doesn't break Base jobs.
  
  Eric, I would appreciate it if you could give this a go on your test
  machine.
  
  It removes nasty the join on JobTDate by replacing it with a join on JobId
  (which makes more sense, and is also an index on Job and File).
  
 
 It's not possible, for example when you bscan a volume, or you run a copy,
 old
 records with old JobTDate have new FileIds or JobIds. You can't trust this
 field. It
 will work in many situations, but also will fail in an horrible way on many
 others...

 The first version (3.0.3) was using FileId, and it was quite good, but
 it could result wrong results on some cases, and for restore, you don't have
 choice, you
 need the exact one.

I don't understand this at all.
If you cannot trust the JobIds or FileIds in the File table, then the postgres
query is also broken. The postgres query doesn't even mention JobTDate.
In fact, the postgres query is using StartTime to do the ordering.

  This also means it can get rid of the outer WHERE 'BaseJobId' OR 'JobId'
  that I
  was complaining about before.
  The correct JobId is chosen with MAX(JobTDate) by ordering by JobTDate
  DESC on
  the innermost select.
  
  
 
 I would love that, it's what DISTINCT ON() on PostgreSQL does. But,
 unfortunately
 in SQL, you can only get fields that have a group function (like MAX, MIN,
 AVG) and
 fields present in GROUP BY. (we can also take FileIndex, MD5 and LStat at
 the same time)
 
 from http://dev.mysql.com/doc/refman/5.0/en/group-by-hidden-columns.html
  When using this feature, all rows in each group should have the same
  values for the columns that are
  ommitted from the GROUP BY part. The server is free to return any value
  from the group, so the results 
  are indeterminate unless all values are the same.
 
 This is not only a MySQL gotcha, it's an SQL property. Unless i'm mistaken
 this paragraph, I'm sure
 that we can't use it :-(

OK, I understand what you are saying here.

 I appreciate your help, and I hope that we will find a solution for those
 that have
 this performance problem (that I can't reproduce myself).

I am probably out of ideas now, other than:
a) reverting to 3.0.3,
b) reverting to the 3.0.3 queries, or
c) switching to postgres (with all the horrible migration problems that will
cause).


--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-08 Thread ebollengier

Graham Keeling wrote:

On Thu, Apr 08, 2010 at 07:44:14AM -0700, ebollengier wrote:

Hello,

Graham Keeling wrote:

Hello,

I'm still waiting for my test database to fill up with Eric's data
(actually,
it's full now, but generating the right indexes is taking lots of
time).

But, I have another proposed solution, better than the last one I made.

My previous solution was still taking a very very long time for a
backup
of a
particular client that I had. Removing mention of BaseFiles did not
help
for
this client.

However, the following did, and it doesn't break Base jobs.

Eric, I would appreciate it if you could give this a go on your test
machine.

It removes nasty the join on JobTDate by replacing it with a join on
JobId
(which makes more sense, and is also an index on Job and File).

It's not possible, for example when you bscan a volume, or you run a
copy,
old
records with old JobTDate have new FileIds or JobIds. You can't trust
this
field. It
will work in many situations, but also will fail in an horrible way on
many
others...

The first version (3.0.3) was using FileId, and it was quite good, but
it could result wrong results on some cases, and for restore, you don't
have
choice, you
need the exact one.

And JobTDate is equivalent to StartTime (can be changed in PostgreSQL or in
MySQL)

This also means it can get rid of the outer WHERE 'BaseJobId' OR
'JobId'
that I
was complaining about before.
The correct JobId is chosen with MAX(JobTDate) by ordering by JobTDate
DESC on
the innermost select.

I would love that, it's what DISTINCT ON() on PostgreSQL does. But,
unfortunately
in SQL, you can only get fields that have a group function (like MAX,
MIN,
AVG) and
fields present in GROUP BY. (we can also take FileIndex, MD5 and LStat at
the same time)

This is not only a MySQL gotcha, it's an SQL property. Unless i'm
mistaken
this paragraph, I'm sure
that we can't use it :-(

OK, I understand what you are saying here.

I appreciate your help, and I hope that we will find a solution for those
that have
this performance problem (that I can't reproduce myself).

I am probably out of ideas now, other than:
a) reverting to 3.0.3,
b) reverting to the 3.0.3 queries, or
c) switching to postgres (with all the horrible migration problems that
will
cause).

Even if we found a workaround for MySQL, Postgres will stay far more faster
(For the 2M file query, postgres was about 12s, MySQL 3.0.3 at 60s and MySQL
5.0.1 at 90s)

Can you confirm that the BaseJob doesn't change your timing ?

Bye
--
View this message in context:
http://old.nabble.com/VirtualFull-mysql-query-blocks-other-jobs-for-a-long-time-tp28149748p28183478.html
Sent from the Bacula - Users mailing list archive at Nabble.com.

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-07 Thread Graham Keeling

On Tue, Apr 06, 2010 at 09:01:13AM -0700, ebollengier wrote:
 Hello Graham,

Hello, thanks for your reply.

 Graham Keeling wrote:
  
  Hello,
  I'm using bacula-5.0.1.
  I have a 2.33GHz CPU with 2G of RAM.
  I am using MySQL.
  I had a VirtualFull scheduled for my client.
  
  My log says the following:
  
  Apr  4 18:56:02  Start Virtual Backup JobId 56,
  Job=Linux:cvs.2010-04-04_18.56.00_03
  Apr  4 18:56:02  This Job is not an Accurate backup so is not equivalent
  to a Full backup.
  Apr  6 03:56:12  Bootstrap records written to
  /var/lib/bacula/backup.dev.equiinet.com-.restore.1.bsr
  Apr  6 10:23:20  Ready to read from volume backup-0002 on device Tower
  2.0 (/write/mnt/Tower 2).
  Apr  6 10:23:20  Labeled new Volume backup-0046 on device Tower 2.1
  (/write/mnt/Tower 2).
  ...and now, backup-0046 is being written to.
  
  At about Apr 6 10:00, I was logged on to the director, and attempting to
  run
  a 'status, director' command on bconsole.
  bconsole printed a few bits of information, then paused. Presumably
  because JobId 56 had locked the database. After 10:23:20, the command
  printed
  the rest of its information and gave me the prompt back.
  
  At about Apr 6 10:00, I logged into mysql, and ran 'show full
  processlist;',
  which gave the following.
  
  | Id   | User | Host  | db | Command | Time  | State | Info
  | 3032 | root | localhost | bacula | Query   | 22464 | Locked| UPDATE
  Job SE
  T JobStatus='R',Level='F',StartTime='2010-04-06
  03:56:14',ClientId=2,JobTDate=12
  70522574,PoolId=7,FileSetId=2 WHERE JobId=56
  
  | 3033 | root | localhost | bacula | Query   | 22464 | executing | SELECT
  Path.P
  ath, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 FROM ( SELECT
  FileId,
   Job.JobId AS JobId, FileIndex, File.PathId AS PathId, File.FilenameId AS
  Filena
  meId, LStat, MD5 FROM Job, File, ( SELECT MAX(JobTDate) AS JobTDate,
  PathId, Fil
  enameId FROM ( SELECT JobTDate, PathId, FilenameId FROM File JOIN Job
  USING (Job
  Id) WHERE File.JobId IN (22,23,31,34,42,48,52) UNION ALL SELECT JobTDate,
  PathId
  , FilenameId FROM BaseFiles JOIN File USING (FileId) JOIN Job  ON   
  (BaseJobId 
  = Job.JobId) WHERE BaseFiles.JobId IN (22,23,31,34,42,48,52) ) AS tmp
  GROUP BY P
  athId, FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT BaseJobId
  FROM B
  aseFiles WHERE JobId IN (22,23,31,34,42,48,52)) OR Job.JobId IN
  (22,23,31,34,42,
  48,52)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = File.JobId AND
  T1.PathId 
  = File.PathId AND T1.FilenameId = File.FilenameId ) AS Temp JOIN Filename
  ON (Fi
  lename.FilenameId = Temp.FilenameId) JOIN Path ON (Path.PathId =
  Temp.PathId) WH
  ERE FileIndex  0 ORDER BY Temp.JobId, FileIndex ASC | 
  
  'Time 22464' is equivalent to about 6 hours, so I presume that after
  'Bootstrap
  records written', bacula has been waiting for this sql command, during
  which
  time it could not do anything else. Backups of other clients were failing
  because I had Max Wait Time set to something less than 6 hours.
  
  The following is a listing of the number of files that each previous
  backup
  said it had backed up, which doesn't seem extraordinarily large to my
  mind.
  
  Full: 810,400
  Incr:  52,487
  Incr:  52,485
  Incr:  52,485
  Incr: 332,649
  Incr:  52,492
  Incr:  52,593
  Incr: 332,677
  Incr:  52,497
  Incr:  52,497
  Incr:  52,497
  
  
  So, my questions are:
  
  Does it really need to take so long?
  Is it expected that this Job blocks everything else for 6 hours?
  
 
 I think that this part of the code doesn't use a dedicated connection, it
 would be nice to
 improve this area. (other part of the code are using a specific connection
 to do this work)
 
 
 Graham Keeling wrote:
  
  Is there anything I can do that would speed it up?
  Perhaps even more importantly, what was it doing for the 33 hour period
  between Apr 4 18:56:02 and Apr 6 03:56:12? Was it just writing the
  bootstrap
  records?
  
  On the Path table, I have these:
 PRIMARY KEY(PathId),
 INDEX (Path(255))
  
  On the File table, I have these:
 PRIMARY KEY(FileId),
 INDEX (JobId, PathId, FilenameId),
 INDEX (PathId),
 INDEX (FilenameId)
  
  On the Job table, I have these:
 PRIMARY KEY(JobId),
 INDEX (Name(128)),
 INDEX (ClientId)
  
 
 Did you run tests with the original schema? It looks like you removed
 important index such as the one
 on JobId and you added others, and i'm not sure that MySQL can play properly
 with composed indexes (it doesn't looks like) in this treatment.

In future, to avoid generating this sort of complaint, I will not fiddle with
the default bacula indexes.
However, I think that the problem is not to do with this.
I shall explain below...

 I've made tests with 400,000,000 files and 40,000,000 filenames over 600
 jobs, and it doesn't run fast as PostgreSQL (factor 6 between both engines),
 but it should handle the job rather

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-07 Thread ebollengier




Graham Keeling wrote:
 
 On Tue, Apr 06, 2010 at 09:01:13AM -0700, ebollengier wrote:
 Hello Graham,
 
 Hello, thanks for your reply.
 
 Graham Keeling wrote:
  
  Hello,
  I'm using bacula-5.0.1.
  I have a 2.33GHz CPU with 2G of RAM.
  I am using MySQL.
  I had a VirtualFull scheduled for my client.
  
  My log says the following:
  
  Apr  4 18:56:02  Start Virtual Backup JobId 56,
  Job=Linux:cvs.2010-04-04_18.56.00_03
  Apr  4 18:56:02  This Job is not an Accurate backup so is not
 equivalent
  to a Full backup.
  Apr  6 03:56:12  Bootstrap records written to
  /var/lib/bacula/backup.dev.equiinet.com-.restore.1.bsr
  Apr  6 10:23:20  Ready to read from volume backup-0002 on device
 Tower
  2.0 (/write/mnt/Tower 2).
  Apr  6 10:23:20  Labeled new Volume backup-0046 on device Tower 2.1
  (/write/mnt/Tower 2).
  ...and now, backup-0046 is being written to.
  
  At about Apr 6 10:00, I was logged on to the director, and attempting
 to
  run
  a 'status, director' command on bconsole.
  bconsole printed a few bits of information, then paused. Presumably
  because JobId 56 had locked the database. After 10:23:20, the command
  printed
  the rest of its information and gave me the prompt back.
  
  At about Apr 6 10:00, I logged into mysql, and ran 'show full
  processlist;',
  which gave the following.
  
  | Id   | User | Host  | db | Command | Time  | State | Info
  | 3032 | root | localhost | bacula | Query   | 22464 | Locked|
 UPDATE
  Job SE
  T JobStatus='R',Level='F',StartTime='2010-04-06
  03:56:14',ClientId=2,JobTDate=12
  70522574,PoolId=7,FileSetId=2 WHERE JobId=56   
   
  | 3033 | root | localhost | bacula | Query   | 22464 | executing |
 SELECT
  Path.P
  ath, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 FROM (
 SELECT
  FileId,
   Job.JobId AS JobId, FileIndex, File.PathId AS PathId, File.FilenameId
 AS
  Filena
  meId, LStat, MD5 FROM Job, File, ( SELECT MAX(JobTDate) AS JobTDate,
  PathId, Fil
  enameId FROM ( SELECT JobTDate, PathId, FilenameId FROM File JOIN Job
  USING (Job
  Id) WHERE File.JobId IN (22,23,31,34,42,48,52) UNION ALL SELECT
 JobTDate,
  PathId
  , FilenameId FROM BaseFiles JOIN File USING (FileId) JOIN Job  ON   
  (BaseJobId 
  = Job.JobId) WHERE BaseFiles.JobId IN (22,23,31,34,42,48,52) ) AS tmp
  GROUP BY P
  athId, FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT
 BaseJobId
  FROM B
  aseFiles WHERE JobId IN (22,23,31,34,42,48,52)) OR Job.JobId IN
  (22,23,31,34,42,
  48,52)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = File.JobId AND
  T1.PathId 
  = File.PathId AND T1.FilenameId = File.FilenameId ) AS Temp JOIN
 Filename
  ON (Fi
  lename.FilenameId = Temp.FilenameId) JOIN Path ON (Path.PathId =
  Temp.PathId) WH
  ERE FileIndex  0 ORDER BY Temp.JobId, FileIndex ASC | 
  
  'Time 22464' is equivalent to about 6 hours, so I presume that after
  'Bootstrap
  records written', bacula has been waiting for this sql command, during
  which
  time it could not do anything else. Backups of other clients were
 failing
  because I had Max Wait Time set to something less than 6 hours.
  
  The following is a listing of the number of files that each previous
  backup
  said it had backed up, which doesn't seem extraordinarily large to my
  mind.
  
  Full: 810,400
  Incr:  52,487
  Incr:  52,485
  Incr:  52,485
  Incr: 332,649
  Incr:  52,492
  Incr:  52,593
  Incr: 332,677
  Incr:  52,497
  Incr:  52,497
  Incr:  52,497
  
  
  So, my questions are:
  
  Does it really need to take so long?
  Is it expected that this Job blocks everything else for 6 hours?
  
 
 I think that this part of the code doesn't use a dedicated connection, it
 would be nice to
 improve this area. (other part of the code are using a specific
 connection
 to do this work)
 
 
 Graham Keeling wrote:
  
  Is there anything I can do that would speed it up?
  Perhaps even more importantly, what was it doing for the 33 hour period
  between Apr 4 18:56:02 and Apr 6 03:56:12? Was it just writing the
  bootstrap
  records?
  
  On the Path table, I have these:
 PRIMARY KEY(PathId),
 INDEX (Path(255))
  
  On the File table, I have these:
 PRIMARY KEY(FileId),
 INDEX (JobId, PathId, FilenameId),
 INDEX (PathId),
 INDEX (FilenameId)
  
  On the Job table, I have these:
 PRIMARY KEY(JobId),
 INDEX (Name(128)),
 INDEX (ClientId)
  
 
 Did you run tests with the original schema? It looks like you removed
 important index such as the one
 on JobId and you added others, and i'm not sure that MySQL can play
 properly
 with composed indexes (it doesn't looks like) in this treatment.
 
 In future, to avoid generating this sort of complaint, I will not fiddle
 with
 the default bacula indexes.
 However, I think that the problem is not to do with this.
 I shall explain below...
 
 I've made tests with 400,000,000 files and 40,000,000 filenames over 600
 jobs, and it doesn't run fast as PostgreSQL (factor 6 between

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-07 Thread Martin Simmons

 On Wed, 7 Apr 2010 12:40:24 +0100, Graham Keeling said:
 
 On Tue, Apr 06, 2010 at 09:01:13AM -0700, ebollengier wrote:
  Hello Graham,
 
 Hello, thanks for your reply.
 
  Graham Keeling wrote:
   
   Hello,
   I'm using bacula-5.0.1.
   I have a 2.33GHz CPU with 2G of RAM.
   I am using MySQL.
   I had a VirtualFull scheduled for my client.
   
   My log says the following:
   
   Apr  4 18:56:02  Start Virtual Backup JobId 56,
   Job=Linux:cvs.2010-04-04_18.56.00_03
   Apr  4 18:56:02  This Job is not an Accurate backup so is not equivalent
   to a Full backup.
   Apr  6 03:56:12  Bootstrap records written to
   /var/lib/bacula/backup.dev.equiinet.com-.restore.1.bsr
   Apr  6 10:23:20  Ready to read from volume backup-0002 on device Tower
   2.0 (/write/mnt/Tower 2).
   Apr  6 10:23:20  Labeled new Volume backup-0046 on device Tower 2.1
   (/write/mnt/Tower 2).
   ...and now, backup-0046 is being written to.
   
   At about Apr 6 10:00, I was logged on to the director, and attempting to
   run
   a 'status, director' command on bconsole.
   bconsole printed a few bits of information, then paused. Presumably
   because JobId 56 had locked the database. After 10:23:20, the command
   printed
   the rest of its information and gave me the prompt back.
   
   At about Apr 6 10:00, I logged into mysql, and ran 'show full
   processlist;',
   which gave the following.
   
   | Id   | User | Host  | db | Command | Time  | State | Info
   | 3032 | root | localhost | bacula | Query   | 22464 | Locked| UPDATE
   Job SE
   T JobStatus='R',Level='F',StartTime='2010-04-06
   03:56:14',ClientId=2,JobTDate=12
   70522574,PoolId=7,FileSetId=2 WHERE JobId=56  
 
   | 3033 | root | localhost | bacula | Query   | 22464 | executing | SELECT
   Path.P
   ath, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 FROM ( SELECT
   FileId,
Job.JobId AS JobId, FileIndex, File.PathId AS PathId, File.FilenameId AS
   Filena
   meId, LStat, MD5 FROM Job, File, ( SELECT MAX(JobTDate) AS JobTDate,
   PathId, Fil
   enameId FROM ( SELECT JobTDate, PathId, FilenameId FROM File JOIN Job
   USING (Job
   Id) WHERE File.JobId IN (22,23,31,34,42,48,52) UNION ALL SELECT JobTDate,
   PathId
   , FilenameId FROM BaseFiles JOIN File USING (FileId) JOIN Job  ON   
   (BaseJobId 
   = Job.JobId) WHERE BaseFiles.JobId IN (22,23,31,34,42,48,52) ) AS tmp
   GROUP BY P
   athId, FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT BaseJobId
   FROM B
   aseFiles WHERE JobId IN (22,23,31,34,42,48,52)) OR Job.JobId IN
   (22,23,31,34,42,
   48,52)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = File.JobId AND
   T1.PathId 
   = File.PathId AND T1.FilenameId = File.FilenameId ) AS Temp JOIN Filename
   ON (Fi
   lename.FilenameId = Temp.FilenameId) JOIN Path ON (Path.PathId =
   Temp.PathId) WH
   ERE FileIndex  0 ORDER BY Temp.JobId, FileIndex ASC | 
   
   'Time 22464' is equivalent to about 6 hours, so I presume that after
   'Bootstrap
   records written', bacula has been waiting for this sql command, during
   which
   time it could not do anything else. Backups of other clients were failing
   because I had Max Wait Time set to something less than 6 hours.
   
   The following is a listing of the number of files that each previous
   backup
   said it had backed up, which doesn't seem extraordinarily large to my
   mind.
   
   Full: 810,400
   Incr:  52,487
   Incr:  52,485
   Incr:  52,485
   Incr: 332,649
   Incr:  52,492
   Incr:  52,593
   Incr: 332,677
   Incr:  52,497
   Incr:  52,497
   Incr:  52,497
   
   
   So, my questions are:
   
   Does it really need to take so long?
   Is it expected that this Job blocks everything else for 6 hours?
   
  
  I think that this part of the code doesn't use a dedicated connection, it
  would be nice to
  improve this area. (other part of the code are using a specific connection
  to do this work)
  
  
  Graham Keeling wrote:
   
   Is there anything I can do that would speed it up?
   Perhaps even more importantly, what was it doing for the 33 hour period
   between Apr 4 18:56:02 and Apr 6 03:56:12? Was it just writing the
   bootstrap
   records?
   
   On the Path table, I have these:
  PRIMARY KEY(PathId),
  INDEX (Path(255))
   
   On the File table, I have these:
  PRIMARY KEY(FileId),
  INDEX (JobId, PathId, FilenameId),
  INDEX (PathId),
  INDEX (FilenameId)
   
   On the Job table, I have these:
  PRIMARY KEY(JobId),
  INDEX (Name(128)),
  INDEX (ClientId)
   
  
  Did you run tests with the original schema? It looks like you removed
  important index such as the one
  on JobId and you added others, and i'm not sure that MySQL can play properly
  with composed indexes (it doesn't looks like) in this treatment.
 
 In future, to avoid generating this sort of complaint, I will not fiddle with
 the default bacula indexes.
 However, I think that the problem is not to do with this.
 I shall

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-07 Thread Graham Keeling

On Wed, Apr 07, 2010 at 02:51:42PM +0100, Martin Simmons wrote:
 Does it still run quickly if keep that Job.JobId IN clause but use the numbers
 returned by
 
 SELECT DISTINCT BaseJobId
  FROM BaseFiles
  WHERE JobId IN (22,23,31,34,42,48,52)
 
 in place of the the nested select?

In my case, there are no numbers returned.
But, I assume that you are suggesting something like the command below.
For me, it completes in under a second.

SELECT Path.Path, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5
FROM
   ( SELECT FileId, Job.JobId AS JobId, FileIndex, File.PathId AS PathId,
File.FilenameId AS FilenameId, LStat, MD5
 FROM Job, File,
( SELECT MAX(JobTDate) AS JobTDate, PathId, FilenameId
  FROM
  ( SELECT JobTDate, PathId, FilenameId
FROM File
JOIN Job USING (JobId)
WHERE File.JobId IN (22,23,31,34,42,48,52)
UNION ALL
SELECT JobTDate, PathId, FilenameId
FROM BaseFiles
JOIN File USING (FileId)
JOIN Job  ON(BaseJobId = Job.JobId)
WHERE BaseFiles.JobId IN (22,23,31,34,42,48,52)
  )
  AS tmp GROUP BY PathId, FilenameId
) AS T1
 WHERE
( Job.JobId IN (22,23,31,34,42,48,52)
 OR   Job.JobId IN (22,23,31,34,42,48,52) )
 AND T1.JobTDate = Job.JobTDate
 AND Job.JobId = File.JobId
 AND T1.PathId = File.PathId
 AND T1.FilenameId = File.FilenameId
   ) AS Temp
JOIN Filename ON (Filename.FilenameId = Temp.FilenameId)
JOIN Path ON (Path.PathId = Temp.PathId)
WHERE FileIndex  0
ORDER BY Temp.JobId, FileIndex ASC



--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-07 Thread Graham Keeling

Since:

a) I am not using Base jobs
b) I am currently stuck with using MySQL
c) There is not a 'proper' fix yet

I am going to use the attached patch as a temporary solution to the problem.
Index: src/cats/sql_cmds.c
===
RCS file: /cvs/netpilot/GPL/bacula-5.0.1/WORK/src/cats/sql_cmds.c,v
retrieving revision 1.2
diff -u -r1.2 sql_cmds.c
--- src/cats/sql_cmds.c	26 Mar 2010 10:36:03 -	1.2
+++ src/cats/sql_cmds.c	7 Apr 2010 13:56:10 -
@@ -481,9 +481,7 @@
  WHERE BaseFiles.JobId IN (%s) /* Use Max(JobTDate) to find */
) AS tmp GROUP BY PathId, FilenameId/* the latest file version */
 ) AS T1 
-WHERE (Job.JobId IN (   /* Security, we force JobId to be valid */
-SELECT DISTINCT BaseJobId FROM BaseFiles WHERE JobId IN (%s)) 
-OR Job.JobId IN (%s)) 
+WHERE Job.JobId IN (%s) 
   AND T1.JobTDate = Job.JobTDate  /* Join on JobTDate to get the orginal */
   AND Job.JobId = File.JobId  /* Job/File record */
   AND T1.PathId = File.PathId 
Index: src/cats/sql_get.c
===
RCS file: /cvs/netpilot/GPL/bacula-5.0.1/WORK/src/cats/sql_get.c,v
retrieving revision 1.1
diff -u -r1.1 sql_get.c
--- src/cats/sql_get.c	26 Mar 2010 10:20:41 -	1.1
+++ src/cats/sql_get.c	7 Apr 2010 13:56:10 -
@@ -,7 +,7 @@
 #ifdef new_db_get_file_list
POOL_MEM buf2(PM_MESSAGE);
Mmsg(buf2, select_recent_version_with_basejob[db_type], 
-jobids, jobids, jobids, jobids);
+jobids, jobids, jobids);
Mmsg(buf,
 SELECT Path.Path, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 
  FROM ( %s ) AS Temp 
--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-07 Thread ebollengier



ebollengier wrote:
 
 
 
 Graham Keeling wrote:
 
 On Tue, Apr 06, 2010 at 09:01:13AM -0700, ebollengier wrote:
 Hello Graham,
 
 Hello, thanks for your reply.
 
 Graham Keeling wrote:
  
  Is there anything I can do that would speed it up?
  Perhaps even more importantly, what was it doing for the 33 hour
 period
  between Apr 4 18:56:02 and Apr 6 03:56:12? Was it just writing the
  bootstrap
  records?
  
  On the Path table, I have these:
 PRIMARY KEY(PathId),
 INDEX (Path(255))
  
  On the File table, I have these:
 PRIMARY KEY(FileId),
 INDEX (JobId, PathId, FilenameId),
 INDEX (PathId),
 INDEX (FilenameId)
  
  On the Job table, I have these:
 PRIMARY KEY(JobId),
 INDEX (Name(128)),
 INDEX (ClientId)
  
 
 Did you run tests with the original schema? It looks like you removed
 important index such as the one
 on JobId and you added others, and i'm not sure that MySQL can play
 properly
 with composed indexes (it doesn't looks like) in this treatment.
 
 In future, to avoid generating this sort of complaint, I will not fiddle
 with
 the default bacula indexes.
 However, I think that the problem is not to do with this.
 I shall explain below...
 
 I've made tests with 400,000,000 files and 40,000,000 filenames over 600
 jobs, and it doesn't run fast as PostgreSQL (factor 6 between both
 engines),
 but it should handle the job rather quickly. (on my pc, for 2M files,
 it's
 between 1min and 1min30 (15seconds with PostgreSQL))
 
 
 Here is the command that mysql takes hours over, and hence the one that I
 am
 using to test. In all of my tests, the result is the same whether I am
 using
 the default bacula indexes or my additional ones.
 
 
 SELECT Path.Path, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5
 FROM
( SELECT FileId, Job.JobId AS JobId, FileIndex, File.PathId AS PathId,
 File.FilenameId AS FilenameId, LStat, MD5
  FROM Job, File,
 ( SELECT MAX(JobTDate) AS JobTDate, PathId, FilenameId
   FROM
   ( SELECT JobTDate, PathId, FilenameId
 FROM File
 JOIN Job USING (JobId)
 WHERE File.JobId IN (22,23,31,34,42,48,52)
 UNION ALL
 SELECT JobTDate, PathId, FilenameId
 FROM BaseFiles
 JOIN File USING (FileId)
 JOIN Job  ON(BaseJobId = Job.JobId)
 WHERE BaseFiles.JobId IN (22,23,31,34,42,48,52)
   )
   AS tmp GROUP BY PathId, FilenameId
 ) AS T1
  WHERE (Job.JobId IN
 ( SELECT DISTINCT BaseJobId
 FROM BaseFiles
 WHERE JobId IN (22,23,31,34,42,48,52)
 )
)
  OR Job.JobId IN (22,23,31,34,42,48,52)
  AND T1.JobTDate = Job.JobTDate
  AND Job.JobId = File.JobId
  AND T1.PathId = File.PathId
  AND T1.FilenameId = File.FilenameId
) AS Temp
 JOIN Filename ON (Filename.FilenameId = Temp.FilenameId)
 JOIN Path ON (Path.PathId = Temp.PathId)
 WHERE FileIndex  0
 ORDER BY Temp.JobId, FileIndex ASC;
 
 
 
 Since I am not using base jobs, I did the experiment of removing the
 lines
 from the query to do with base jobs. When I did that, the query changed
 from
 taking hours, to taking under a second.
 
 I then tracked it down to this specific part (i.e, remove this, and get a
 truly magnificent speed-up):
 
(Job.JobId IN
 ( SELECT DISTINCT BaseJobId
 FROM BaseFiles
 WHERE JobId IN (22,23,31,34,42,48,52)
 )
)
  OR
 
 I think that mysql is repeatedly calling this section for each row that
 the SELECT returns.
 
 Now the question is what to do about it?
 
 
 It's very interesting, in my tests (with and without basejobs), it changes
 nothing... But i will
 check it again. (MySQL have a query cache, and if you run two times the
 same thing, the result
 is very fast)
 
 As we are using JobTDate to find the last record of each file, we use it
 to get JobId back from the
 Job table (yes, ugly, but MySQL doesn't have DISTINCT ON()), and if two
 jobs have the same JobTDate, we are in trouble. So, i limit the selection
 to this JobId list.
 
 Hope that you found the problem, it's rather easy to compute this list
 once, and
 use it after.
 
 Bye
 

I tweaked my test to compare both queries, and it shows no difference with
and without base job part... If you want to test queries on your side with
your data, you can download my tool (accurate-test.pl) on
http://bacula.git.sourceforge.net/git/gitweb.cgi?p=bacula/docs;a=tree;f=docs/techlogs;hb=HEAD

If you can tweak my script to reproduce your problem, i would be able to fix
things.

http://old.nabble.com/file/p28166612/diff-with-without-basejob.png 

Bye

-- 
View this message in context:

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-07 Thread Graham Keeling

On Wed, Apr 07, 2010 at 08:22:09AM -0700, ebollengier wrote:
I tweaked my test to compare both queries, and it shows no difference with
and without base job part... If you want to test queries on your side with
your data, you can download my tool (accurate-test.pl) on
http://bacula.git.sourceforge.net/git/gitweb.cgi?p=bacula/docs;a=tree;f=docs/techlogs;hb=HEAD

If you can tweak my script to reproduce your problem, i would be able to fix
things.

http://old.nabble.com/file/p28166612/diff-with-without-basejob.png

I'm currently running your script to generate the test database. I think
that is going to take a long time, so I'll leave it overnight.

Meanwhile (unless I misunderstand), I think that your graph is labelled
strangely.

When I'm generating the file records, I get this:

...
JobId=10 files=20
Insert takes 445secs for 200 records
...
JobId=13 files=2
Insert takes 54secs for 20 records
...

So, JobIds 10 + 13:
22 files
220 records

Your script says:
do_bench(10,13, 220);
This is obviously the records number.

Your graph uses this bigger number and labels it 'Files'.

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-07 Thread ebollengier

Graham Keeling wrote:

If you can tweak my script to reproduce your problem, i would be able to
fix
things.

http://old.nabble.com/file/p28166612/diff-with-without-basejob.png

I'm currently running your script to generate the test database. I think
that is going to take a long time, so I'll leave it overnight.

This is your first problem, on my server (just a workstation), it takes less
than 10s to add 200,000 records...

Meanwhile (unless I misunderstand), I think that your graph is labelled
strangely.

When I'm generating the file records, I get this:

...
JobId=10 files=20
Insert takes 445secs for 200 records
...
JobId=13 files=2
Insert takes 54secs for 20 records
...

So, JobIds 10 + 13:
22 files
220 records

Your script says:
do_bench(10,13, 220);
This is obviously the records number.

Your graph uses this bigger number and labels it 'Files'.

The accurate query on jobid=10,13 will work on 2,000,000 + 200,000 files
(this is
the argument), as each filenames are different, you can expect to retrieve
2,2M files.

It's quite possible that it miss a *10 mutiplication somewhere in debug
messages (I insert files 10 by 10)

Bye
--
View this message in context:
http://old.nabble.com/VirtualFull-mysql-query-blocks-other-jobs-for-a-long-time-tp28149748p28167751.html
Sent from the Bacula - Users mailing list archive at Nabble.com.

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-07 Thread ebollengier

ebollengier wrote:

Graham Keeling wrote:

If you can tweak my script to reproduce your problem, i would be able to
fix
things.

http://old.nabble.com/file/p28166612/diff-with-without-basejob.png

I'm currently running your script to generate the test database. I think
that is going to take a long time, so I'll leave it overnight.

This is your first problem, on my server (just a workstation), it takes
less
than 10s to add 200,000 records...

In fact, it's 4seconds, here...

JobId=15 files=2
Insert takes 4secs for 20 records
JobId=16 files=5
Insert takes 11secs for 50 records
JobId=20 files=20
Insert takes 43secs for 200 records
--
View this message in context:
http://old.nabble.com/VirtualFull-mysql-query-blocks-other-jobs-for-a-long-time-tp28149748p28167844.html
Sent from the Bacula - Users mailing list archive at Nabble.com.

[Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-06 Thread Graham Keeling

Hello,
I'm using bacula-5.0.1.
I have a 2.33GHz CPU with 2G of RAM.
I am using MySQL.
I had a VirtualFull scheduled for my client.

My log says the following:

Apr  4 18:56:02  Start Virtual Backup JobId 56, 
Job=Linux:cvs.2010-04-04_18.56.00_03
Apr  4 18:56:02  This Job is not an Accurate backup so is not equivalent to a 
Full backup.
Apr  6 03:56:12  Bootstrap records written to 
/var/lib/bacula/backup.dev.equiinet.com-.restore.1.bsr
Apr  6 10:23:20  Ready to read from volume backup-0002 on device Tower 2.0 
(/write/mnt/Tower 2).
Apr  6 10:23:20  Labeled new Volume backup-0046 on device Tower 2.1 
(/write/mnt/Tower 2).
...and now, backup-0046 is being written to.

At about Apr 6 10:00, I was logged on to the director, and attempting to run
a 'status, director' command on bconsole.
bconsole printed a few bits of information, then paused. Presumably
because JobId 56 had locked the database. After 10:23:20, the command printed
the rest of its information and gave me the prompt back.

At about Apr 6 10:00, I logged into mysql, and ran 'show full processlist;',
which gave the following.

| Id   | User | Host  | db | Command | Time  | State | Info
| 3032 | root | localhost | bacula | Query   | 22464 | Locked| UPDATE Job SE
T JobStatus='R',Level='F',StartTime='2010-04-06 03:56:14',ClientId=2,JobTDate=12
70522574,PoolId=7,FileSetId=2 WHERE JobId=56
| 3033 | root | localhost | bacula | Query   | 22464 | executing | SELECT Path.P
ath, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 FROM ( SELECT FileId,
 Job.JobId AS JobId, FileIndex, File.PathId AS PathId, File.FilenameId AS Filena
meId, LStat, MD5 FROM Job, File, ( SELECT MAX(JobTDate) AS JobTDate, PathId, Fil
enameId FROM ( SELECT JobTDate, PathId, FilenameId FROM File JOIN Job USING (Job
Id) WHERE File.JobId IN (22,23,31,34,42,48,52) UNION ALL SELECT JobTDate, PathId
, FilenameId FROM BaseFiles JOIN File USING (FileId) JOIN Job  ON(BaseJobId 
= Job.JobId) WHERE BaseFiles.JobId IN (22,23,31,34,42,48,52) ) AS tmp GROUP BY P
athId, FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT BaseJobId FROM B
aseFiles WHERE JobId IN (22,23,31,34,42,48,52)) OR Job.JobId IN (22,23,31,34,42,
48,52)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = File.JobId AND T1.PathId 
= File.PathId AND T1.FilenameId = File.FilenameId ) AS Temp JOIN Filename ON (Fi
lename.FilenameId = Temp.FilenameId) JOIN Path ON (Path.PathId = Temp.PathId) WH
ERE FileIndex  0 ORDER BY Temp.JobId, FileIndex ASC | 

'Time 22464' is equivalent to about 6 hours, so I presume that after 'Bootstrap
records written', bacula has been waiting for this sql command, during which
time it could not do anything else. Backups of other clients were failing
because I had Max Wait Time set to something less than 6 hours.

The following is a listing of the number of files that each previous backup
said it had backed up, which doesn't seem extraordinarily large to my mind.

Full: 810,400
Incr:  52,487
Incr:  52,485
Incr:  52,485
Incr: 332,649
Incr:  52,492
Incr:  52,593
Incr: 332,677
Incr:  52,497
Incr:  52,497
Incr:  52,497


So, my questions are:

Does it really need to take so long?
Is it expected that this Job blocks everything else for 6 hours?
Is there anything I can do that would speed it up?
Perhaps even more importantly, what was it doing for the 33 hour period
between Apr 4 18:56:02 and Apr 6 03:56:12? Was it just writing the bootstrap
records?

On the Path table, I have these:
   PRIMARY KEY(PathId),
   INDEX (Path(255))

On the File table, I have these:
   PRIMARY KEY(FileId),
   INDEX (JobId, PathId, FilenameId),
   INDEX (PathId),
   INDEX (FilenameId)

On the Job table, I have these:
   PRIMARY KEY(JobId),
   INDEX (Name(128)),
   INDEX (ClientId)


--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-06 Thread Graham Keeling

On Tue, Apr 06, 2010 at 12:59:49PM +0200, Matija Nalis wrote:
 On Tue, Apr 06, 2010 at 11:13:44AM +0100, Graham Keeling wrote:
  I'm using bacula-5.0.1.
  
  At about Apr 6 10:00, I logged into mysql, and ran 'show full processlist;',
  which gave the following.
  
  | 3033 | root | localhost | bacula | Query   | 22464 | executing | SELECT 
  Path.P
  ath, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 FROM ( SELECT 
  FileId,
   Job.JobId AS JobId, FileIndex, File.PathId AS PathId, File.FilenameId AS 
  Filena
  meId, LStat, MD5 FROM Job, File, ( SELECT MAX(JobTDate) AS JobTDate, 
  PathId, Fil
  enameId FROM ( SELECT JobTDate, PathId, FilenameId FROM File JOIN Job USING 
  (Job
  Id) WHERE File.JobId IN (22,23,31,34,42,48,52) UNION ALL SELECT JobTDate, 
  PathId
  , FilenameId FROM BaseFiles JOIN File USING (FileId) JOIN Job  ON
  (BaseJobId 
  = Job.JobId) WHERE BaseFiles.JobId IN (22,23,31,34,42,48,52) ) AS tmp GROUP 
  BY P
  athId, FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT BaseJobId 
  FROM B
  aseFiles WHERE JobId IN (22,23,31,34,42,48,52)) OR Job.JobId IN 
  (22,23,31,34,42,
  48,52)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = File.JobId AND 
  T1.PathId 
  = File.PathId AND T1.FilenameId = File.FilenameId ) AS Temp JOIN Filename 
  ON (Fi
  lename.FilenameId = Temp.FilenameId) JOIN Path ON (Path.PathId = 
  Temp.PathId) WH
  ERE FileIndex  0 ORDER BY Temp.JobId, FileIndex ASC | 
  
  'Time 22464' is equivalent to about 6 hours, so I presume that after 
  'Bootstrap
  records written', bacula has been waiting for this sql command, during which
  time it could not do anything else. Backups of other clients were failing
  because I had Max Wait Time set to something less than 6 hours.
 
 Yes, it happens, especially with bigger File tables - what does 
 Select count(*) from File returns for you ?

3239328

 This slowdown was not present in 3.0.3 but is new to 5.0.x series and
 has been documented on http://bugs.bacula.org/view.php?id=1472
 which is in process of being debugged currently

Thanks for that.
A colleague of mine did exactly start several jobs; very soon the query
gets run and everything else stops this morning.


--
Download Intel#174; Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

2010-04-06 Thread ebollengier


Hello Graham,


Graham Keeling wrote:
 
 Hello,
 I'm using bacula-5.0.1.
 I have a 2.33GHz CPU with 2G of RAM.
 I am using MySQL.
 I had a VirtualFull scheduled for my client.
 
 My log says the following:
 
 Apr  4 18:56:02  Start Virtual Backup JobId 56,
 Job=Linux:cvs.2010-04-04_18.56.00_03
 Apr  4 18:56:02  This Job is not an Accurate backup so is not equivalent
 to a Full backup.
 Apr  6 03:56:12  Bootstrap records written to
 /var/lib/bacula/backup.dev.equiinet.com-.restore.1.bsr
 Apr  6 10:23:20  Ready to read from volume backup-0002 on device Tower
 2.0 (/write/mnt/Tower 2).
 Apr  6 10:23:20  Labeled new Volume backup-0046 on device Tower 2.1
 (/write/mnt/Tower 2).
 ...and now, backup-0046 is being written to.
 
 At about Apr 6 10:00, I was logged on to the director, and attempting to
 run
 a 'status, director' command on bconsole.
 bconsole printed a few bits of information, then paused. Presumably
 because JobId 56 had locked the database. After 10:23:20, the command
 printed
 the rest of its information and gave me the prompt back.
 
 At about Apr 6 10:00, I logged into mysql, and ran 'show full
 processlist;',
 which gave the following.
 
 | Id   | User | Host  | db | Command | Time  | State | Info
 | 3032 | root | localhost | bacula | Query   | 22464 | Locked| UPDATE
 Job SE
 T JobStatus='R',Level='F',StartTime='2010-04-06
 03:56:14',ClientId=2,JobTDate=12
 70522574,PoolId=7,FileSetId=2 WHERE JobId=56  
   
 | 3033 | root | localhost | bacula | Query   | 22464 | executing | SELECT
 Path.P
 ath, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 FROM ( SELECT
 FileId,
  Job.JobId AS JobId, FileIndex, File.PathId AS PathId, File.FilenameId AS
 Filena
 meId, LStat, MD5 FROM Job, File, ( SELECT MAX(JobTDate) AS JobTDate,
 PathId, Fil
 enameId FROM ( SELECT JobTDate, PathId, FilenameId FROM File JOIN Job
 USING (Job
 Id) WHERE File.JobId IN (22,23,31,34,42,48,52) UNION ALL SELECT JobTDate,
 PathId
 , FilenameId FROM BaseFiles JOIN File USING (FileId) JOIN Job  ON   
 (BaseJobId 
 = Job.JobId) WHERE BaseFiles.JobId IN (22,23,31,34,42,48,52) ) AS tmp
 GROUP BY P
 athId, FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT BaseJobId
 FROM B
 aseFiles WHERE JobId IN (22,23,31,34,42,48,52)) OR Job.JobId IN
 (22,23,31,34,42,
 48,52)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = File.JobId AND
 T1.PathId 
 = File.PathId AND T1.FilenameId = File.FilenameId ) AS Temp JOIN Filename
 ON (Fi
 lename.FilenameId = Temp.FilenameId) JOIN Path ON (Path.PathId =
 Temp.PathId) WH
 ERE FileIndex  0 ORDER BY Temp.JobId, FileIndex ASC | 
 
 'Time 22464' is equivalent to about 6 hours, so I presume that after
 'Bootstrap
 records written', bacula has been waiting for this sql command, during
 which
 time it could not do anything else. Backups of other clients were failing
 because I had Max Wait Time set to something less than 6 hours.
 
 The following is a listing of the number of files that each previous
 backup
 said it had backed up, which doesn't seem extraordinarily large to my
 mind.
 
 Full: 810,400
 Incr:  52,487
 Incr:  52,485
 Incr:  52,485
 Incr: 332,649
 Incr:  52,492
 Incr:  52,593
 Incr: 332,677
 Incr:  52,497
 Incr:  52,497
 Incr:  52,497
 
 
 So, my questions are:
 
 Does it really need to take so long?
 Is it expected that this Job blocks everything else for 6 hours?
 

I think that this part of the code doesn't use a dedicated connection, it
would be nice to
improve this area. (other part of the code are using a specific connection
to do this work)


Graham Keeling wrote:
 
 Is there anything I can do that would speed it up?
 Perhaps even more importantly, what was it doing for the 33 hour period
 between Apr 4 18:56:02 and Apr 6 03:56:12? Was it just writing the
 bootstrap
 records?
 
 On the Path table, I have these:
PRIMARY KEY(PathId),
INDEX (Path(255))
 
 On the File table, I have these:
PRIMARY KEY(FileId),
INDEX (JobId, PathId, FilenameId),
INDEX (PathId),
INDEX (FilenameId)
 
 On the Job table, I have these:
PRIMARY KEY(JobId),
INDEX (Name(128)),
INDEX (ClientId)
 

Did you run tests with the original schema? It looks like you removed
important index such as the one
on JobId and you added others, and i'm not sure that MySQL can play properly
with composed indexes (it doesn't looks like) in this treatment.

I've made tests with 400,000,000 files and 40,000,000 filenames over 600
jobs, and it doesn't run fast as PostgreSQL (factor 6 between both engines),
but it should handle the job rather quickly. (on my pc, for 2M files, it's
between 1min and 1min30 (15seconds with PostgreSQL))

Bye

-- 
View this message in context: 
http://old.nabble.com/VirtualFull-mysql-query-blocks-other-jobs-for-a-long-time-tp28149748p28153906.html
Sent from the Bacula - Users mailing list archive at Nabble.com.


--
Download Intel#174; Parallel Studio

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

[Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

Re: [Bacula-users] VirtualFull mysql query blocks other jobs for a long time

22 matches

Site Navigation

Mail list logo

Footer information