Aha!  I do have indexes on my actual app, but the delete is done with a 
sub-query, for which it appears the index isn’t being used e.g. “delete from 
table where id in (select id from Y)”.
Replacing this with multiple single deletes has solved the performance issue.

With the example below, I had no index on id, but as there were only 7 rows in 
the table I’d assumed it wouldn’t make a difference.  I’m slightly surprised 
that in such a scenario the size of the blob affects the sequential scan of the 
table so much, but now that I’m aware of it, it’s not an issue.  Adding an 
index to id on the table reduced all the delete times to around 15ms.

Many thanks!

                Andy

From: Geoff hendrey [mailto:[email protected]]
Sent: 20 April 2009 15:38
To: Derby Discussion
Subject: Re: Blobs & Delete speed

what indexes do you have? I would definitely put an index on id. I have blob 
tables with large image files, and I'm measuring end-to-end delete (from web 
browser application to server and back) at around 100 ms.

-geoff
“XML? Too much like HTML. It'll never work on the Web!”
-anonymous


________________________________
From: Andrew Lawrenson <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Monday, April 20, 2009 2:53:14 AM
Subject: Blobs & Delete speed
Hi,

I’ve been experiencing a performance problem with deleting blobs in derby, and 
was wondering if anyone could offer any advice.

This is primarily with 10.4.2.0 under windows and solaris, although I’ve also 
tested with the new 10.5.1.1 release candidate (as it has many lob changes), 
but this makes no significant difference.

The problem is that with a table containing many large blobs, deleting a single 
row can take a long time (often over a minute).

I’ve reproduced this with a small test that creates a table, inserts a few rows 
with blobs of differing sizes, then deletes them.

The table schema is simple, just:

create table blobtest( id integer generated BY DEFAULT as identity, b blob )

and I’ve then created 7 rows with the following blob sizes : 1024 bytes, 1Mb, 
10Mb, 25Mb, 50Mb, 75Mb, 100Mb.
I’ve read the blobs back, to check they have been created properly and are the 
correct size.

They have then been deleted using the sql statement ( “delete from blobtest 
where id = X” ).

If I delete the rows in the order I created them, average timings to delete a 
single row are:

1024 bytes: 19.5 seconds
1Mb: 16 seconds
10Mb: 18 seconds
25Mb: 15 seconds
50Mb: 17 seconds
75Mb: 10 seconds
100Mb: 1.5 seconds

If I delete them in reverse order, the average timings to delete a single row 
are:
100Mb: 20 seconds
75Mb: 10 seconds
50Mb: 4 seconds
25Mb: 0.3 seconds
10Mb: 0.25 seconds
1Mb: 0.02 seconds
1024 bytes: 0.005 seconds

If I create seven small blobs, delete times are all instantaneous.
It thus appears that the delete time seems to be related to the overall size of 
the rows in the table more than the size of the blob being removed.
I’ve run the tests a few times, and the results seem reproducible.

So, does anyone have any explanation for the performance, and any suggestions 
on how to work around it or fix it?  It does make using large blobs quite 
problematic in a production environment…

Many thanks in advance,

   Andy

Reply via email to