Thanks for the detail.  My guess is that the issue is I/O to the log device, which in your
case is also competing with the database I/O.  The delete is likely also killing your cache,
since it needs to read in all of the old data pages in order to delete them.

You can probably make even the new system more scalable if you
can build in something that groups more than one insert per commit.
I would not be surprised if you could get at least 10 times more inserts
per second by grouping 10 per commit.  This will help even when you
change to the partitioned structure, as you will be able to handle far
more inserts as you add even more devices.
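
Something like the following is what I mean.  It is only a sketch with made-up table and
column names (and a placeholder Measurement type), but it shows grouping roughly 10
inserts into each commit with a JDBC batch instead of committing every row:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLException;
    import java.util.List;

    // Sketch only: MEASUREMENT, its columns, and the Measurement type are placeholders.
    void insertBatched(Connection conn, List<Measurement> batch) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(
                "INSERT INTO MEASUREMENT (DEVICE_ID, TAKEN_AT, VALUE) VALUES (?, ?, ?)")) {
            int pending = 0;
            for (Measurement m : batch) {
                ps.setString(1, m.getDeviceId());
                ps.setTimestamp(2, m.getTakenAt());
                ps.setDouble(3, m.getValue());
                ps.addBatch();
                if (++pending == 10) {       // one commit per ~10 rows instead of per row
                    ps.executeBatch();
                    conn.commit();
                    pending = 0;
                }
            }
            if (pending > 0) {               // commit whatever is left over
                ps.executeBatch();
                conn.commit();
            }
        }
    }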

I think you could really optimize the insert speed if you could funnel
all the inserts to one thread and have it batch the inserts and commit
them something like once a second.  Since you are not CPU bound at all,
I don't think the one thread would be a bottleneck.
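
Roughly like this (again just a sketch with placeholder names): the decoder threads put
measurements on a queue, and one writer thread owns the connection, drains the queue,
and commits about once a second:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    // Decoder threads call submit(); this single thread does all of the JDBC work.
    // Start it with: new Thread(new MeasurementWriter(conn)).start();
    class MeasurementWriter implements Runnable {
        private final BlockingQueue<Measurement> queue = new LinkedBlockingQueue<>();
        private final Connection conn;   // dedicated connection for the writer thread

        MeasurementWriter(Connection conn) { this.conn = conn; }

        void submit(Measurement m) throws InterruptedException { queue.put(m); }

        public void run() {
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO MEASUREMENT (DEVICE_ID, TAKEN_AT, VALUE) VALUES (?, ?, ?)")) {
                conn.setAutoCommit(false);
                while (!Thread.currentThread().isInterrupted()) {
                    long deadline = System.currentTimeMillis() + 1000;  // ~1 commit per second
                    boolean haveRows = false;
                    Measurement m;
                    while ((m = queue.poll(deadline - System.currentTimeMillis(),
                                           TimeUnit.MILLISECONDS)) != null) {
                        ps.setString(1, m.getDeviceId());
                        ps.setTimestamp(2, m.getTakenAt());
                        ps.setDouble(3, m.getValue());
                        ps.addBatch();
                        haveRows = true;
                        if (System.currentTimeMillis() >= deadline) break;
                    }
                    if (haveRows) {
                        ps.executeBatch();
                        conn.commit();
                    }
                }
            } catch (Exception e) {
                // log and arrange for the writer to be restarted
            }
        }
    }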

I do agree, though, that the partitioned approach will likely be the most
scalable, as the delete step has almost no overhead in that scheme and is
guaranteed to return all of the space to the OS.
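
Derby has no built-in partitioning, so by "partitioned" I mean something along the lines
of one table per day (with a view or UNION query over them if you need to read across
days).  Purging a day's worth of data then becomes a single DROP TABLE rather than a
row-by-row delete.  A rough sketch, with made-up names:

    import java.sql.Connection;
    import java.sql.SQLException;
    import java.sql.Statement;

    // Sketch only: assumes one table per day named MEASUREMENT_YYYYMMDD and a
    // connection left in its default auto-commit mode.
    void rollDailyPartition(Connection conn, String newDay, String expiredDay)
            throws SQLException {
        try (Statement s = conn.createStatement()) {
            // create the next day's table ahead of time
            s.executeUpdate("CREATE TABLE MEASUREMENT_" + newDay +
                    " (DEVICE_ID VARCHAR(32), TAKEN_AT TIMESTAMP, VALUE DOUBLE)");
            // dropping the expired table removes its files, so the space goes back to the OS
            s.executeUpdate("DROP TABLE MEASUREMENT_" + expiredDay);
        }
    }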

Bergquist, Brett wrote:
The data is actually coming from test measurements from devices in a network.  
For testing purposes, I was able to change the firmware in a device to generate 
more tests per second to up the rate.  So there are more things going on than 
just database work:

        - a thread receiving test measurements from many devices and decoding the data
        - passing the data off via a Java Resource Connector (JCA Resource Adapter) to an endpoint (Message Driven Bean)
        - inserting the data

So yes, the inserts are being committed one per measurement.  In the real
world we are seeing about 8 inserts/second continuously, every second of every
day.  This is going up, however, as more devices are added to the network.  We
have gone from about 100K measurements to almost 800K measurements in the last
6 months, so growth is important.

The purging of the data is kicked off by a Quartz job that purges measurement results
older than 30 days (configurable).  This is done in chunks of 1000 results per transaction
commit to keep the transaction log size down while still getting reasonably good
performance.  Right now this takes about 2 hours, with the inserts continuing to come in.
One problem is that the deletion is affecting the insertion rate.  The deletion does not
really have to be done "as fast as possible", only before the next day.  So I have made
some changes that let the deletion run without affecting the insertion rate enough to
become a problem.  This will hold us over for a few months.
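
In outline the chunked purge looks roughly like the following (a simplified sketch, not the
actual code; the table, key column, and cutoff handling are placeholders, and it relies on
Derby's default holdable cursors so the result set survives the intermediate commits):

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Timestamp;

    // Sketch only: MEASUREMENT, RESULT_ID and TAKEN_AT stand in for the real schema.
    void purgeOldResults(Connection conn, Timestamp cutoff) throws SQLException {
        conn.setAutoCommit(false);
        try (PreparedStatement select = conn.prepareStatement(
                 "SELECT RESULT_ID FROM MEASUREMENT WHERE TAKEN_AT < ?");
             PreparedStatement delete = conn.prepareStatement(
                 "DELETE FROM MEASUREMENT WHERE RESULT_ID = ?")) {
            select.setTimestamp(1, cutoff);
            try (ResultSet rs = select.executeQuery()) {
                int inChunk = 0;
                while (rs.next()) {
                    delete.setLong(1, rs.getLong(1));
                    delete.executeUpdate();
                    if (++inChunk == 1000) {   // keep each transaction, and the log, small
                        conn.commit();
                        inChunk = 0;
                    }
                }
            }
            conn.commit();                     // commit the final partial chunk
        }
    }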

In tuning this, I noticed the issue that the insertion rate stays low long after
the deletion is over and done with.  I can tell by the CPU usage and the disk
I/O usage, as I know the pattern of normal insertion.

While this has worked okay for the last few years, my model is flawed, and a
better approach will be to partition the data.

The application is running at a customer site and the system is an old Sun V480
running Solaris 10, with 4 CPUs and 8 GB of memory.  The system is not CPU bound by
any means, as it is usually about 70% idle.  It might start to be I/O bound
during the deletion phase, as the disks are not that large or current.  I know I
should move the transaction log onto a second disk, but up until just
recently the complexity of the setup outweighed the increase in performance.
This is pretty much a headless system that is never looked at as long as it
works correctly.  No sysadmin, no DBA looking at the system.  So up until the
recent dramatic increase in the number of devices in the network, Derby has
worked very well.
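
When we do get to it, Derby's logDevice connection attribute is the mechanism for putting
the transaction log on a second disk.  It only takes effect when the database is created,
so relocating the log of the existing database will take some extra steps.  Something
like this, with made-up paths:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;

    // logDevice is honored only at database creation time; both paths are placeholders.
    Connection openWithSeparateLog() throws SQLException {
        String url = "jdbc:derby:/data1/measurements"
                   + ";create=true"
                   + ";logDevice=/data2/measurements-log";  // transaction log on the second disk
        return DriverManager.getConnection(url);
    }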

Again, I appreciate the feedback.  Next week I will look at writing a standalone
Java application that simulates this setup and see what I find.

Brett

-----Original Message-----
From: Mike Matrigali [mailto:[email protected]]
Sent: Friday, August 06, 2010 1:57 PM
To: [email protected]
Subject: Re: Question/observations on purged rows and insert rates

Bergquist, Brett wrote:

Also, after testing, we are seeing the following.  With a database with no deleted rows, my test application inserts about 150 records/second into a table.  I let this run for about 2 million records and the insert rate stays consistent.  I then purge 1.5 million records and run the test again; the insert rate is now about 25 records/second.  Running the compress above with the PURGE_ROWS option and rerunning the test, it is still about 25 records/second.  Running the full SYSCS_UTIL.SYSCS_COMPRESS_TABLE and rerunning the test, the insert rate is back to 150 records/second.
Both of these insert rates seem really slow.  I am going to guess
that you do one commit per insert, is that right?
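
For reference, the two procedures mentioned above are invoked like this (the schema and
table names are placeholders, and conn is an open java.sql.Connection):

    import java.sql.CallableStatement;

    // In-place compress with the purge option (makes deleted space reusable within the files):
    try (CallableStatement cs = conn.prepareCall(
            "CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE(?, ?, ?, ?, ?)")) {
        cs.setString(1, "APP");          // schema (placeholder)
        cs.setString(2, "MEASUREMENT");  // table (placeholder)
        cs.setShort(3, (short) 1);       // purge rows
        cs.setShort(4, (short) 0);       // defragment rows
        cs.setShort(5, (short) 0);       // truncate end
        cs.execute();
    }

    // Full compress (rebuilds the table and its indexes and returns the space to the OS):
    try (CallableStatement cs = conn.prepareCall(
            "CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE(?, ?, ?)")) {
        cs.setString(1, "APP");
        cs.setString(2, "MEASUREMENT");
        cs.setShort(3, (short) 1);       // sequential
        cs.execute();
    }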



