On Jul 23, 2011, at 3:50 PM, Eric Bollengier wrote:

> Hello Dan,
> 
> On 23/07/2011 04:14, Dan Langille wrote:
>> On Jul 21, 2011, at 12:51 AM, Eric Bollengier wrote:
>>> Yes I have an objection, it will slow down all backups
>> 
>> As mentioned elsewhere, some people consider restores more important than backups.
> 
> So they can add this index. I don't advise it, because I think that we can 
> get the same improvement for restores with a code patch, without touching 
> the backup part.

Such a code patch would be very welcome, but I don't see anyone able to produce 
it yet.  I'm willing, but have no time.

>> Slow down by how much?  Are we talking a huge performance hit here?
> 
> No real idea; this part is very tricky. I can imagine that the Filename 
> distribution is very unusual, so we could have surprises. The current 
> indexes are well sorted (always by JobId on this table), so updating them is 
> very fast.

Well, please don't reject an idea until we know what the results are.  I'm 
assuming you're willing to look at test results.

> As I said before, the code can be improved; there is no need to add an 
> index. If an index is really needed, I want to see performance tests with 
> "large" and concurrent jobs.
> 
>>> to speed up very special restore case.
>> 
>> What aspect of this restore do you consider special?
> 
> For me, a normal restore is option 3 or 5. I have never had to use the 
> option involved in this problem. That's why I say that.

Normal varies.  Widely.  We are special cases.  The wider user base decides 
what is normal.

> 
>>> I think that the problem is more on the database tuning or on the query
>>> itself. I have the same kind of query in Bweb and it runs instantly
>>> (that displays all versions of a file for a client) on a very large catalog.
>> 
>> 
>>> When you add new indexes on the File table it leads to support problems
>>> where people are complaining about backup speed...
>>> 
>>>>> bacula=# \d file
>>>>>                             Table "public.file"
>>>>>   Column   |  Type   |                       Modifiers
>>>>> ------------+---------+-------------------------------------------------------
>>>>> fileid     | bigint  | not null default nextval('file_fileid_seq'::regclass)
>>>>> fileindex  | integer | not null default 0
>>>>> jobid      | integer | not null
>>>>> pathid     | integer | not null
>>>>> markid     | integer | not null default 0
>>>>> lstat      | text    | not null
>>>>> md5        | text    | not null
>>>>> filenameid | integer | not null
>>>>> Indexes:
>>>>>    "file_pkey" PRIMARY KEY, btree (fileid)
>>>>>    "file_filenameid_idx" btree (filenameid)
>>>>>    "file_jobid_idx" btree (jobid)
>>>>>    "file_jpfid_idx" btree (jobid, pathid, filenameid)
>>>>>    "file_pathid" btree (pathid)
>>>>>    "file_pathid_idx" btree (pathid)
>>>>>    "testing" btree (fileid)
>>> 
>>> Interesting to have two indexes on fileid, and two indexes on pathid :-)
>> 
>> Interesting indeed.  Testing is clearly for... testing.  :)
>> 
>> I don't know about file_pathid.  However, this database has been around 
>> since before the PostgreSQL module was added.
> 
> You can remove them :-)

I'm sure I can, but no harm is being observed, and other things have priority 
at present.
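
For anyone who wants to check their own catalog, here is a minimal sketch, in 
Python rather than anything shipped with Bacula, of how the redundant indexes 
in the listing quoted above could be spotted. It assumes the index lines are 
pasted in the form psql's \d prints them; the names INDEX_LISTING and 
find_duplicate_indexes are made up for this example.

```python
import re
from collections import defaultdict

# Index lines copied from the "\d file" output quoted earlier in this thread.
INDEX_LISTING = '''
"file_pkey" PRIMARY KEY, btree (fileid)
"file_filenameid_idx" btree (filenameid)
"file_jobid_idx" btree (jobid)
"file_jpfid_idx" btree (jobid, pathid, filenameid)
"file_pathid" btree (pathid)
"file_pathid_idx" btree (pathid)
"testing" btree (fileid)
'''

# "name" [PRIMARY KEY,] method (col, col, ...)
INDEX_LINE = re.compile(
    r'"(?P<name>[^"]+)".*?\b(?P<method>\w+) \((?P<cols>[^)]*)\)'
)


def find_duplicate_indexes(listing):
    """Group index names by (access method, ordered column list).

    Returns only the groups holding more than one index, i.e. indexes
    that cover exactly the same columns in the same order.
    """
    groups = defaultdict(list)
    for line in listing.splitlines():
        match = INDEX_LINE.search(line)
        if match is None:
            continue  # blank line or something that is not an index entry
        cols = tuple(c.strip() for c in match.group("cols").split(","))
        groups[(match.group("method"), cols)].append(match.group("name"))
    return {key: names for key, names in groups.items() if len(names) > 1}


if __name__ == "__main__":
    for (method, cols), names in find_duplicate_indexes(INDEX_LISTING).items():
        print(method, ", ".join(cols), "->", ", ".join(names))
```

On the listing above this groups file_pkey with testing, and file_pathid with 
file_pathid_idx. Since file_pkey backs the primary key, testing would be the 
one to drop in the first pair. The sketch only compares exact column lists; it 
says nothing about whether an index is worth its cost during backups.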

> 
> Bye
> 
> 

-- 
Dan Langille - http://langille.org


_______________________________________________
Bacula-devel mailing list
Bacula-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-devel