Thanks for the detail. My guess is that the bottleneck is log I/O to the log
device, which in your case is also competing with the database I/O.
When you do the delete, it is likely also killing your cache, since it needs
to read in all the old data pages to do the delete.
You can probably make even the new system more scalable if you
can build in something that groups the inserts into more than one
per commit. I would not be surprised if you could get at least 10
times more inserts per second by grouping 10 per commit. This
will help even after you change to the partitioned structure, as you will
be able to handle far more inserts as you add even more devices.
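The grouping could look something like this, where MEASUREMENT, Measurement,
and incoming are just placeholders for whatever your schema and code look
like, and conn is an open java.sql.Connection:

    conn.setAutoCommit(false);
    PreparedStatement ps = conn.prepareStatement(
        "INSERT INTO MEASUREMENT (ID, MEASURED_AT, MEASURED_VALUE)"
        + " VALUES (?, ?, ?)");
    int pending = 0;
    for (Measurement m : incoming) {
        ps.setLong(1, m.getId());
        ps.setTimestamp(2, m.getTimestamp());
        ps.setDouble(3, m.getValue());
        ps.addBatch();
        if (++pending == 10) {      // 10 rows share a single commit/log flush
            ps.executeBatch();
            conn.commit();
            pending = 0;
        }
    }
    if (pending > 0) {              // flush whatever is left over
        ps.executeBatch();
        conn.commit();
    }
    ps.close();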
I think you could really optimize the insert speed if you could funnel
all the inserts to one thread and have it batch the inserts and commit
them something like once a second. Since you are not CPU bound at all,
I don't think the one thread would be a bottleneck.
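Roughly something like the sketch below, where the receiving threads just
queue the measurements instead of inserting them themselves (Measurement, the
table name, and the database URL are made up for the example):

    final BlockingQueue<Measurement> queue = new LinkedBlockingQueue<Measurement>();
    // receiving threads call queue.put(measurement) instead of doing the insert

    Thread writer = new Thread(new Runnable() {
        public void run() {
            try {
                Connection conn = DriverManager.getConnection("jdbc:derby:yourDb");
                conn.setAutoCommit(false);
                PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO MEASUREMENT (ID, MEASURED_AT, MEASURED_VALUE)"
                    + " VALUES (?, ?, ?)");
                List<Measurement> batch = new ArrayList<Measurement>();
                while (!Thread.currentThread().isInterrupted()) {
                    Measurement first = queue.poll(1, TimeUnit.SECONDS);
                    if (first == null) continue;   // nothing arrived this second
                    batch.add(first);
                    queue.drainTo(batch);          // grab everything queued so far
                    for (Measurement m : batch) {
                        ps.setLong(1, m.getId());
                        ps.setTimestamp(2, m.getTimestamp());
                        ps.setDouble(3, m.getValue());
                        ps.addBatch();
                    }
                    ps.executeBatch();
                    conn.commit();                 // one log sync per ~second
                    batch.clear();
                }
            } catch (Exception e) {
                e.printStackTrace();               // real code needs real handling
            }
        }
    });
    writer.setDaemon(true);
    writer.start();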
I do agree, though, that the partitioned approach will likely be the most
scalable, as the delete part has almost no overhead in that scheme and is
guaranteed to return all the space to the OS.
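Since Derby has no built-in partitioning, that scheme usually ends up being
one table per day (or per week) plus dropping the oldest one; very roughly,
with made-up table names:

    Statement st = conn.createStatement();
    // create tomorrow's "partition" ahead of time
    st.executeUpdate(
        "CREATE TABLE MEASUREMENT_20100807 ("
        + " ID BIGINT NOT NULL PRIMARY KEY,"
        + " MEASURED_AT TIMESTAMP NOT NULL,"
        + " MEASURED_VALUE DOUBLE NOT NULL)");
    // purging a whole day is then a metadata operation rather than a
    // row-by-row delete, and the dropped table's files go back to the OS
    st.executeUpdate("DROP TABLE MEASUREMENT_20100708");
    st.close();
    conn.commit();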
Bergquist, Brett wrote:
The data is actually coming from test measurements from devices in a network.
For testing purposes, I was able to change the firmware in a device to generate
more tests per second to up the rate. So there are more things going on than
just database work:
- thread receiving test measurements from many devices and decoding data
- passing the data off through a JCA Resource Adapter
to an endpoint (Message-Driven Bean)
- inserting the data
So yes, the inserts are being committed one per measurement (sketched below). In the real
world, we are seeing about 8 inserts/second, continuously, every second of every
day. This is going up, however, as more devices are added to the network. We
have gone from about 100K measurements to almost 800K measurements in the last
6 months, so growth is important.
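To be concrete, each delivery currently looks more or less like the sketch
below (class, resource, field, and table names here are simplified
placeholders, not our actual code), which is why there is one commit per
measurement:

    @MessageDriven
    public class MeasurementMDB implements MessageListener {

        @Resource(name = "jdbc/measurements")   // made-up resource name
        private DataSource ds;

        public void onMessage(Message message) {
            try {
                // assuming a MapMessage with made-up field names
                MapMessage mm = (MapMessage) message;
                Connection conn = ds.getConnection();
                PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO MEASUREMENT (ID, MEASURED_AT, MEASURED_VALUE)"
                    + " VALUES (?, ?, ?)");
                ps.setLong(1, mm.getLong("id"));
                ps.setTimestamp(2, new Timestamp(mm.getLong("measuredAt")));
                ps.setDouble(3, mm.getDouble("value"));
                ps.executeUpdate();
                ps.close();
                conn.close();
                // the container-managed transaction commits here:
                // one log flush per measurement
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }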
The purging of the data is kicked off by a Quartz job that purges measurement results
that are older than 30 days (configurable). This is done in chunks of 1000 results per
transaction commit (sketched below) to keep the transaction log size down while still
getting reasonable performance. Right now this takes about 2 hours with the inserts
continuing to come in. One problem is that the deletion is affecting the insertion rate.
The deletion does not really have to be done "as fast as possible", only that it be done
before the next day. So I have made some changes to allow the deletion to occur
without affecting the insertion rate enough for it to become a problem. This will
hold us over for a few months.
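The shape of the purge loop is roughly the following (table and column names
are made up here, not our actual schema; cutoff is the 30-day boundary; FETCH
FIRST needs Derby 10.5 or later):

    conn.setAutoCommit(false);
    PreparedStatement sel = conn.prepareStatement(
        "SELECT ID FROM MEASUREMENT WHERE MEASURED_AT < ?"
        + " FETCH FIRST 1000 ROWS ONLY");
    PreparedStatement del = conn.prepareStatement(
        "DELETE FROM MEASUREMENT WHERE ID = ?");
    int deleted;
    do {
        deleted = 0;
        sel.setTimestamp(1, cutoff);     // rows older than 30 days
        ResultSet rs = sel.executeQuery();
        while (rs.next()) {
            del.setLong(1, rs.getLong(1));
            del.addBatch();
            deleted++;
        }
        rs.close();
        if (deleted > 0) {
            del.executeBatch();
            conn.commit();               // at most 1000 rows per transaction
        }
    } while (deleted == 1000);           // stop once a partial chunk comes back
    sel.close();
    del.close();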
In tuning this, I noticed that the insertion rate stays low long after
the deletion is over and done with. I can tell by the CPU usage and the disk
I/O usage, as I know the pattern of normal insertion.
While this has worked okay for the last few years, my model is flawed and a
better approach will be to partition the data.
The application is running at a customer site and the system is an old Sun V480
running Solaris 10, with 4 CPUs and 8 GB of memory. The system is not CPU bound by
any means as it is usually about 70% idle. It might start to be I/O bound
during the deletion phase as the disks are not that large or current. I know I
should move the transaction log onto a second disk, but up until just recently
the complexity of the setup outweighed the performance gain.
This is pretty much a headless system that is never looked at as long as it
works correctly. No sysadmin, no DBA looking at the system. So up until the
recent dramatic increase in the number of devices in the network, Derby has
worked very well.
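(For when we do get to it: I believe Derby can place the log on a separate
disk with the logDevice connection attribute, though only when the database is
created; the paths below are made up.)

    // logDevice is honored at creation time; relocating the log of an
    // existing database is a separate, offline step
    Connection conn = DriverManager.getConnection(
        "jdbc:derby:/disk1/measurementsDb;create=true;logDevice=/disk2/measurementsLog");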
Again, I appreciate the feedback. Next week I will look at writing a standalone
Java application that simulates this setup and see what I find.
Brett
-----Original Message-----
From: Mike Matrigali [mailto:[email protected]]
Sent: Friday, August 06, 2010 1:57 PM
To: [email protected]
Subject: Re: Question/observations on purged rows and insert rates
Bergquist, Brett wrote:
Also after testing we are seeing the following. With a database with no
deleted rows, my test application is inserting about 150 records/second
into a table. I let this run for about 2 million records and the insert
rate is consistent. Now I purge out 1.5 million records and run the
test again. The insert rate is now about 25 records/second. I ran the above
compress with the PURGE_ROWS option, reran the test, and still got about 25
records/second. I then ran a full SYSCS_UTIL.SYSCS_COMPRESS_TABLE, reran the
test, and the insert rate was back to 150 records/second.
Both of these insert rates seem really slow. I am going to guess
that you do one commit per insert, is that right?
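(For reference, the two compress calls mentioned above look like this from
JDBC; the schema and table names are placeholders:)

    CallableStatement cs = conn.prepareCall(
        "CALL SYSCS_UTIL.SYSCS_INPLACE_COMPRESS_TABLE(?, ?, ?, ?, ?)");
    cs.setString(1, "APP");            // schema
    cs.setString(2, "MEASUREMENT");    // table
    cs.setShort(3, (short) 1);         // PURGE_ROWS
    cs.setShort(4, (short) 0);         // DEFRAGMENT_ROWS
    cs.setShort(5, (short) 0);         // TRUNCATE_END
    cs.execute();
    cs.close();

    cs = conn.prepareCall("CALL SYSCS_UTIL.SYSCS_COMPRESS_TABLE(?, ?, ?)");
    cs.setString(1, "APP");
    cs.setString(2, "MEASUREMENT");
    cs.setShort(3, (short) 1);         // SEQUENTIAL
    cs.execute();
    cs.close();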