We are running “truss -p <pid> 2>&1 | grep open”, but what I am seeing right 
now is that one complete pass was made through the logs, and I thought it was 
about done.  Now it has gone into a pattern where it processes something like

        logxxxx359.dat
        logxxxx358.dat
        …
        logxxxx242.dat

It is updating some “seg0” files (either re-doing or undoing), then sometimes 
creates “logxxxx360.dat” (i.e., the next number), and then starts the process 
over again.   It has been doing recovery for over 5 hours now.  Ouch indeed.
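For anyone else stuck watching a long recovery, the truss output above can at 
least be distilled into which log file recovery is currently on. A minimal 
Python sketch of that idea (the regex, paths, and piping arrangement are my 
own assumptions, not anything Derby provides):

```python
import re
import sys

# Pull the Derby transaction log file name out of a truss/strace "open"
# line. Intended usage (hypothetical): truss -p <pid> 2>&1 | python watch.py
LOG_NAME = re.compile(r'(log\d+\.dat)')

def current_log(line):
    """Return the log file named in one trace line, or None if absent."""
    m = LOG_NAME.search(line)
    return m.group(1) if m else None

if __name__ == "__main__":
    last = None
    for line in sys.stdin:
        name = current_log(line)
        if name and name != last:  # report each log file once as it changes
            print(name)
            last = name
```

Watching how quickly the printed names advance gives a rough sense of how 
long a full pass over the ~1100 log files will take.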

Restoring from a backup is also about a 5-hour process, as the database is 
about 550 GB.   Just un-tarring the backup file takes a substantial amount of 
time.

The problem occurred when an external system suddenly issued 128 concurrent 
requests to our server, where each request removes tens of thousands of 
records in a single transaction.  Together these caused lock gridlock, lock 
timeouts, a flood of cleanup messages output to derby-?.log (rolling log 
support), etc.   The system administrator forcefully shut Derby down and 
restarted it.

I know that deleting tens of thousands of records in one transaction is bad 
design.  It did not start out at this scale (a few hundred records at first) 
but grew into this problem.   The item being deleted represents a piece of 
network equipment, and we can’t have a partial piece of equipment lying around 
in the database.   Shortly this will be redesigned to mark records as deleted 
and then perform the actual deletion in the background, in small chunks at a 
time, but of course the issue arose before that solution was complete.
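The planned redesign can be sketched roughly as below. This is a minimal 
illustration using Python’s sqlite3 as a stand-in for Derby; the table and 
column names (“records”, “equipment_id”, “deleted”) are made up for the 
example, and a real Derby version would need an equivalent row-limited 
DELETE via JDBC:

```python
import sqlite3

def mark_deleted(conn, equipment_id):
    # Fast, small transaction: just flip a flag on the equipment's rows,
    # so the foreground request never holds locks on thousands of rows.
    conn.execute("UPDATE records SET deleted = 1 WHERE equipment_id = ?",
                 (equipment_id,))
    conn.commit()

def purge_deleted(conn, chunk_size=1000):
    # Background job: physically delete flagged rows one chunk at a time,
    # committing between chunks so no single transaction grows huge.
    while True:
        cur = conn.execute(
            "DELETE FROM records WHERE rowid IN "
            "(SELECT rowid FROM records WHERE deleted = 1 LIMIT ?)",
            (chunk_size,))
        conn.commit()
        if cur.rowcount == 0:
            break
```

The point of the split is that the visible operation stays cheap while the 
expensive deletion is amortized, which should avoid both the lock gridlock 
and the enormous transaction log that recovery is now replaying.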

So what would be useful would be something like:

Performing database recovery
Starting analysis pass
215 transactions detected to be processed
Starting redo pass
… anything that could give some feedback
Starting undo pass
… anything that could give some feedback


> On Apr 20, 2016, at 12:10 AM, Bryan Pendleton <[email protected]> 
> wrote:
>
>> Another issue with about 1100 log files needing to be
>> processed after a restart of the database network server.
>
> Ouch. :(
>
>> is any logging that can tell when pass the database recovery is on
>
> Perhaps use an operating system level monitoring tool (Process Monitor
> on Windows, strace on Linux, etc.) to see if you can watch the store
> opening and accessing each log file.
>
> I'm supposing that the recovery processing essentially reads those
> logs sequentially, so if you watch it for a while you can see how
> long it takes to finish one log and move on to the next?
>
> bryan
>


Canoga Perkins
20600 Prairie Street
Chatsworth, CA 91311
(818) 718-6300

This e-mail and any attached document(s) is confidential and is intended only 
for the review of the party to whom it is addressed. If you have received this 
transmission in error, please notify the sender immediately and discard the 
original message and any attachment(s).
