Hi,
Thank you for the reply. It is a relatively complex test method to
reproduce, unfortunately, because it requires JDK 10 with the
jdk.incubator.httpclient. The client also connects to the app with
HTTP/2. I suspect that any asynchronous client, however, could
produce a
similar result.
Here are some details:
1. TrellisLDP has configuration options to write to an in memory or
TDB
filesystem dataset. Both of these options work with asynchronous
writes,
and TrellisLDP depends on jena-osgi and jena-tdb2 version 3.6.0.
2. Fuseki is deployed in a Tomcat 9 container and trellis-app
connects to
it over HTTPS using RDFConnection.connect("https:/
/fuseki:8443/fuseki/trellis"). (I do not suspect that Tomcat or the
protocol are consequential to the problem).
3. The tests always start from a clean dataset.
4. The client test does repeated PUTs of a 5k size n-triples
dataset (400
iterations).
5. The client test is run in JUnit5 in a JDK 10 JVM on localhost.
6. The trellis-app and Fuseki deployment is dockerized. The Fuseki
dataset
directory is bound to the localhost filesystem. The container
environments
are initialized clean for each run, which facilitates reproducible
test
output.
I have committed server and client access logs from a 3.4.0 deployment
and
a 3.6.0 deployment here:
https://github.com/pan-dora/ldp-client/tree/async/fuseki-issue
The 3.4.0 test starts at 12:19:05.807 (Central European Summer Time).
The
access log records 400 PUT requests all with a 201 response in 25
seconds)
The 3.6.0 test starts at 12:27:18.161 (Central European Summer Time).
The
access log records 266 PUT requests with a 201 response and 134 with a
500
response in 34 seconds)
The first exception occurs 23 seconds into the test (line 12369 in
fuseki-3.6.0-async.txt):
[2018-03-31 10:27:41] BindingTDB ERROR get1(?object)
org.apache.jena.tdb.base.file.FileException: In the middle of an
alloc-write
Thank you again for your assistance, and I am glad to do further
diagnostics. I have considered applying incremental commit patches
to a
3.4.0 build to isolate the changeset(s) as the next course of action.
Christopher Johnson
Scientific Associate
Universitätsbibliothek Leipzig
On 31 March 2018 at 09:26, Andy Seaborne <[email protected]> wrote:
Hi there,
The log files would be useful to know what the server was doing at
the
time and to see the stacktraces. The ideal is a complete, minimal
example
which we can use to investigate. Without a runnable setup, its a bit
harder to investigate.
Some questions to help me understand what is happening:
What does the the code in pan-dora/ldp-client is do? If it is
concurrent
operations, what else is the client doing at the time?
Does it work if you go back to 3.4.0 after using 3.5.0 where it went
wrong?
What's the Fuseki server configuration?
What's the history of the TDB database? How long ago was it
created and
has it been used outside of Fuseki or in different Fuseki
configurations?
JENA-1482 (which isn't in a released version of Jena) is a few
diagnostics
to catch error situations earlier.
Andy
On 30/03/18 07:06, Christopher Johnson wrote:
Hi list,
I apologize, but on further evaluation, I have determined that it
definitely works in 3.4.0 but not in 3.5.0. So, the scope of
changes
(since 2017 July) is a bit broader.
Also, for reference, here is a link to the client method
<https://github.com/pan-dora/ldp-client/blob/master/src/main
/java/cool/pandora/ldpclient/LdpClientImpl.java#L875-L879>.
If you need detailed server logs, please let me know.
Christopher Johnson
Scientific Associate
Universitätsbibliothek Leipzig
On 30 March 2018 at 07:08, Christopher Johnson
<[email protected]>
wrote:
Hi list,
I am developing a client that writes asynchronously to Fuseki via
TrellisLDP <https://github.com/trellis-ldp/trellis>. There
seems to
be
an issue since 3.6.0 that throws a BindingTBD error.
The exception is this:
fuseki | [2018-03-27 17:02:44] BindingTDB ERROR get1(?parent)
fuseki | java.nio.BufferOverflowException
fuseki | at java.nio.HeapByteBuffer.put(He
apByteBuffer.java:214)
fuseki | at sun.nio.ch.IOUtil.read(IOUtil.java:200)
I looked at the code in BindingTBD and reverted the change in
JENA-1482:
tests for null NodeIds at the TDB level" to see if a different
result
would
occur.
Now it throws this exception:
fuseki | [2018-03-30 03:39:34] BindingTDB ERROR get1(?object)
fuseki | org.apache.jena.tdb.base.file.FileException: In the
middle
of an alloc-write
fuseki | at org.apache.jena.tdb.base.objec
tfile.ObjectFileStorage.
read(ObjectFileStorage.java:311)
This looks like a concurrency issue. I do not know why it works in
3.5.0. Any ideas?
Thank you for your consideration of this.
Christopher Johnson
Scientific Associate
Universitätsbibliothek Leipzig