https://issues.apache.org/jira/browse/JENA-1222
A mechanism to cause the journal to flush early if the size goes beyond
TransactionManager.JournalThresholdSize
Default is -1 (=> the mechanism is off)
Also JENA-1224: a limit on how long, in commits, the journal can get
when readers lock out flushing.
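
As a rough illustration of how a size threshold sits alongside the commit-count batching, here is a toy model. It is not Jena's implementation: the class, fields and methods are hypothetical stand-ins named after the two settings, and the exact comparison operators are an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the flush decision; not Jena's implementation. */
final class JournalFlushSketch {
    // Hypothetical stand-ins for QueueBatchSize and JournalThresholdSize.
    private final int queueBatchSize;        // flush once this many commits are queued
    private final long journalThresholdSize; // flush once journal bytes exceed this; -1 = off

    private int queuedCommits = 0;
    private long journalBytes = 0;
    private final List<String> flushes = new ArrayList<>();

    JournalFlushSketch(int queueBatchSize, long journalThresholdSize) {
        this.queueBatchSize = queueBatchSize;
        this.journalThresholdSize = journalThresholdSize;
    }

    /** Append one commit of the given size to the journal, flushing if a limit is hit. */
    void commit(long commitBytes) {
        queuedCommits++;
        journalBytes += commitBytes;
        boolean countReached = queuedCommits >= queueBatchSize;
        boolean sizeReached = journalThresholdSize >= 0 && journalBytes > journalThresholdSize;
        if (countReached || sizeReached) {
            flush(sizeReached ? "size" : "count");
        }
    }

    /** Record why the flush happened, then empty the modelled journal. */
    private void flush(String reason) {
        flushes.add(reason + ": " + queuedCommits + " commits, " + journalBytes + " bytes");
        queuedCommits = 0;
        journalBytes = 0;
    }

    int flushCount() { return flushes.size(); }
    int pendingCommits() { return queuedCommits; }
    List<String> flushes() { return flushes; }

    public static void main(String[] args) {
        // Batch size 10, size threshold 1000 bytes: large commits flush early.
        JournalFlushSketch journal = new JournalFlushSketch(10, 1000);
        for (int i = 0; i < 5; i++) journal.commit(400);
        journal.flushes().forEach(System.out::println);
        System.out.println("pending commits: " + journal.pendingCommits());
    }
}
```

With the threshold at -1 the model falls back to counting commits only, which matches the default described above.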
Andy
On 23/08/16 13:44, Joint wrote:
Hi. Sorry for the delay. That worked, commits in batches of 10k take ~2 seconds
now. This increases to ~5 seconds when the TDB is almost 1TB but it's
predictable...
I'm looking at tdb2 in a test env...
Thank you.
Dick
-------- Original message --------
From: Andy Seaborne <[email protected]>
Date: 09/08/2016 13:34 (GMT+00:00)
To: [email protected]
Subject: Re: Stall when committing a write transaction.
On 08/08/16 12:33, Dick Murray wrote:
Hello.
Looking for ideas and if anyone else has come across this...
I have a bulk load (same as the previous OOME question) which auto commits
after 25k quads have been added then begins a new write transaction. All of
the commits average 2 seconds but one takes 42 seconds. ~500K quads are
added with ~500MB on disk storage. I've changed the underlying storage from
HDD to SSD, to USB MS and I still get the same symptoms.
Different files give different stalls, some have multiple stalls, typically
around 40 seconds but some are 2 minutes. iotop is not showing anything
"odd" and the GC isn't stressing. I can repeat this with a new TDB and a
25M quad TDB.
Is the TDB having to copy write new "blocks" to balance its storage at
some point? It always stalls eventually, but not always at the same
point.
Jena 3.1, Ubuntu 16.04, 8 cores 16GB RAM, JVM Xmx 4GB G1GC.
Log below shows consistent ~2 second commits bar one.
TIA Dick.
Hi there,
The burstiness might be due to the commit batching, though interactions
with the OS file system are also possible.
Try setting
TransactionManager.QueueBatchSize
to 0, 2, and a few other small integers (the default is 10).
If you could try that, it would be more data as to what is happening.
This setting amalgamates small commits. It would be better to factor in
the size of the commits, but it doesn't (the size of the journal is easy
to determine, so a simple threshold there could work).
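
To compare runs with different batch sizes, per-commit wall-clock times can be collected with something like the sketch below. The CommitTimer class and its methods are hypothetical helpers, not part of Jena, and the "stall" rule (a multiple of the median) is just one assumption about what counts as an outlier.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper for recording commit durations; not part of Jena. */
final class CommitTimer {
    private final List<Long> millis = new ArrayList<>();

    /** Run one commit action and record its wall-clock duration in milliseconds. */
    void time(Runnable commitAction) {
        long start = System.nanoTime();
        commitAction.run();
        record((System.nanoTime() - start) / 1_000_000);
    }

    /** Record a duration measured elsewhere (for example, taken from a log). */
    void record(long durationMillis) {
        millis.add(durationMillis);
    }

    /** Middle value of the sorted durations (upper middle when the count is even). */
    long median() {
        if (millis.isEmpty()) return 0;
        List<Long> sorted = new ArrayList<>(millis);
        Collections.sort(sorted);
        return sorted.get(sorted.size() / 2);
    }

    /** Indices of commits that took longer than factor times the median. */
    List<Integer> stalls(double factor) {
        List<Integer> result = new ArrayList<>();
        double limit = factor * median();
        for (int i = 0; i < millis.size(); i++) {
            if (millis.get(i) > limit) result.add(i);
        }
        return result;
    }

    public static void main(String[] args) {
        CommitTimer timer = new CommitTimer();
        // Durations shaped like the report: steady ~2 s commits and one long stall.
        long[] samples = {2000, 2100, 1900, 42000, 2000};
        for (long ms : samples) timer.record(ms);
        System.out.println("median ms: " + timer.median());
        System.out.println("stalled commit indices: " + timer.stalls(5.0));
    }
}
```

In a real load, the commit call would be wrapped in time(...), and the run repeated for each QueueBatchSize value being tried.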
Have you had a moment to try TDB2? It will behave differently here -
the updates to the database happen as the transaction proceeds so they
happen once and have OS-level write buffering going on, rather than
happening exactly when told to. And they only write once, not once to
the journal and once in a random access pattern to the main DB which is
also potentially nasty.
The only issue with TDB2 at the moment is that the database grows. It
has all generations of the database available for all time. It needs a
process to reclaim old space (a GC problem) although access to temporal
versions could be considered an advantage as well.
Andy