On Jul 23, 2011, at 3:50 PM, Eric Bollengier wrote:
> Hello Dan,
>
> On 23/07/2011 04:14, Dan Langille wrote:
>> On Jul 21, 2011, at 12:51 AM, Eric Bollengier wrote:
>>> Yes I have an objection, it will slow down all backups
>>
>> As mentioned elsewhere, some people consider restores more important than backups.
>
> So, he can add this index. I don't advise it, because I think that we can
> get the same improvement for restores with a code patch and without touching
> the backup part.
Such a code patch would be very welcome, but I don't see anyone able to produce
it yet. I'm willing, but I have no time.
>> Slow down by how much? Are we talking a huge performance hit here?
>
> No real idea; this part is very tricky. I can imagine that the
> Filename distribution is very unusual, so we could have surprises. The current
> indexes are well sorted (always by JobId on this table), so updating them is
> very fast.
Well, please don't reject an idea until we know what the results are. I'm
assuming you're willing to look at test results.
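
To make "slow down by how much?" concrete, here is a rough sketch of the kind
of measurement I have in mind. It is only an illustration, under loud
assumptions: it uses an in-memory SQLite table as a stand-in for PostgreSQL, a
cut-down File table, a single writer, and a hypothetical extra index on
filenameid with randomly distributed values. Numbers from it say nothing
definitive about a real catalog with large, concurrent jobs.

```python
import random
import sqlite3
import time


def time_inserts(rows, extra_index):
    """Time bulk inserts into a cut-down File table.

    SQLite is only a stand-in for PostgreSQL here (assumption), and the
    extra index on filenameid is a hypothetical example of an index whose
    keys arrive in random order.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE file ("
        " fileid INTEGER PRIMARY KEY,"
        " jobid INTEGER NOT NULL,"
        " pathid INTEGER NOT NULL,"
        " filenameid INTEGER NOT NULL)"
    )
    # Keys for this index arrive already sorted, as in a running backup.
    conn.execute("CREATE INDEX file_jobid_idx ON file (jobid)")
    if extra_index:
        # Keys for this index arrive in random order.
        conn.execute("CREATE INDEX extra_filenameid_idx ON file (filenameid)")

    rng = random.Random(42)  # fixed seed so both runs insert the same data
    data = [
        (i // 1000, rng.randrange(10_000), rng.randrange(1_000_000))
        for i in range(rows)
    ]

    start = time.perf_counter()
    with conn:  # one transaction, committed on exit
        conn.executemany(
            "INSERT INTO file (jobid, pathid, filenameid) VALUES (?, ?, ?)",
            data,
        )
    elapsed = time.perf_counter() - start
    conn.close()
    return elapsed
```

Comparing time_inserts(200000, False) with time_inserts(200000, True) gives a
first impression of the overhead. A meaningful test would still need to run
against PostgreSQL with several jobs writing at once.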
> As I said before, the code can be improved, no need to add an index. If it's
> really needed, I want to see performance tests with "large" and concurrent
> jobs.
>
>>> to speed up a very special restore case.
>>
>> What aspect of this restore do you consider special?
>
> For me, a normal restore is option 3 or 5. I have never had to use the option
> involved in this problem. That's why I'm saying that.
Normal varies. Widely. We are special cases. The wider user base decides what
is normal.
>
>>> I think that the problem lies more in the database tuning or in the query
>>> itself. I have the same kind of query in Bweb (it displays all versions of
>>> a file for a client), and it runs instantly on a very large catalog.
>>
>>
>>> When you add new indexes on the File table it leads to support problems
>>> where people are complaining about backup speed...
>>>
>>>>> bacula=# \d file
>>>>>                           Table "public.file"
>>>>>    Column   |  Type   |                       Modifiers
>>>>> ------------+---------+-------------------------------------------------------
>>>>>  fileid     | bigint  | not null default nextval('file_fileid_seq'::regclass)
>>>>>  fileindex  | integer | not null default 0
>>>>>  jobid      | integer | not null
>>>>>  pathid     | integer | not null
>>>>>  markid     | integer | not null default 0
>>>>>  lstat      | text    | not null
>>>>>  md5        | text    | not null
>>>>>  filenameid | integer | not null
>>>>> Indexes:
>>>>>     "file_pkey" PRIMARY KEY, btree (fileid)
>>>>>     "file_filenameid_idx" btree (filenameid)
>>>>>     "file_jobid_idx" btree (jobid)
>>>>>     "file_jpfid_idx" btree (jobid, pathid, filenameid)
>>>>>     "file_pathid" btree (pathid)
>>>>>     "file_pathid_idx" btree (pathid)
>>>>>     "testing" btree (fileid)
>>>
>>> Interesting to have two indexes on fileid, and two indexes on pathid :-)
>>
>> Interesting indeed. Testing is clearly for... testing. :)
>>
>> I don't know about file_pathid. However, this database has been around
>> since before the PostgreSQL module was added.
>
> You can remove them :-)
I'm sure I can. But no harm is being observed at present, and other things
have priority.
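
For anyone who wants to check their own catalog for the same thing, here is a
minimal sketch of how exact duplicate indexes can be spotted. The index list is
copied by hand from the \d file output quoted above; the function name and the
dictionary layout are my own illustration, not anything Bacula or PostgreSQL
provides.

```python
from collections import defaultdict

# Index name -> indexed columns, transcribed from the quoted \d file output.
# All of them are btree indexes, so only the column lists are compared.
FILE_INDEXES = {
    "file_pkey": ("fileid",),
    "file_filenameid_idx": ("filenameid",),
    "file_jobid_idx": ("jobid",),
    "file_jpfid_idx": ("jobid", "pathid", "filenameid"),
    "file_pathid": ("pathid",),
    "file_pathid_idx": ("pathid",),
    "testing": ("fileid",),
}


def duplicate_indexes(indexes):
    """Return {columns: [index names]} for column lists indexed more than once."""
    groups = defaultdict(list)
    for name, columns in indexes.items():
        groups[columns].append(name)
    # Keep only column lists that are covered by two or more indexes.
    return {
        columns: sorted(names)
        for columns, names in groups.items()
        if len(names) > 1
    }


if __name__ == "__main__":
    for columns, names in duplicate_indexes(FILE_INDEXES).items():
        print(", ".join(columns), "->", ", ".join(names))
```

This reports only exact duplicates: the two indexes on fileid and the two on
pathid. It does not flag file_jobid_idx, even though jobid is also the leading
column of file_jpfid_idx. Of the fileid pair, file_pkey backs the primary key,
so "testing" is the one that could go.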
>
> Bye
>
>
--
Dan Langille - http://langille.org
_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel