Hi Andy!

Okay, I finally found the time to package these for you. I will send you URL links in a separate message.

I can't really blame you if you decide not to investigate further. We were surprised by this problem and hadn't made proper preparations, so we don't have the full history of the database, unfortunately. But the basic idea is very simple: the data set consists of graphs, each graph contains one SKOS vocabulary (downloaded from the web), and we have all of those in files that we uploaded using s-put - some may have been PUT multiple times. Some of the graphs (STW and EuroVoc) were extended (using s-post) with mappings to other vocabularies too. No other updates to the database have been performed apart from those s-put and s-post operations, until the final tdb2.tdbcompact which failed with the traceback I gave.

We have now started building a new dataset from scratch and are trying to be more diligent in logging all the operations, so that if this happens again, we have a better idea of what led to it.
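
As a rough illustration of the kind of logging meant here (a sketch only, not the script actually in use): a small Python wrapper could record every s-put/s-post call together with a checksum of the uploaded file, so the exact sequence of operations can be replayed later. The function name logged_upload, the log file name upload-log.jsonl and the runner parameter are made up for this example. It assumes s-put and s-post are on the PATH and are called as "s-put SERVICE_URL GRAPH FILE".

```python
import hashlib
import json
import subprocess
import time
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def logged_upload(tool, service_url, graph, data_file,
                  log_path="upload-log.jsonl", runner=subprocess.run):
    """Run s-put or s-post and append one JSON line describing the call.

    `runner` is a hypothetical hook (defaults to subprocess.run) so the
    function can be exercised without a running Fuseki server.
    """
    if tool not in ("s-put", "s-post"):
        raise ValueError("tool must be 's-put' or 's-post'")

    # Describe the input file before touching the database.
    entry = {
        "time": time.strftime("%Y-%m-%dT%H:%M:%S%z"),
        "tool": tool,
        "service": service_url,
        "graph": graph,
        "file": str(data_file),
        "size": Path(data_file).stat().st_size,
        "sha256": sha256_of(data_file),
    }

    # Assumed call convention: TOOL SERVICE_URL GRAPH FILE
    result = runner([tool, service_url, graph, str(data_file)])
    entry["returncode"] = result.returncode

    # One JSON object per line keeps the log append-only and easy to replay.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry
```

A call would then look like logged_upload("s-put", "http://localhost:3030/ds/data", "http://example.org/graph", "vocab.ttl"), where the service URL, graph name and file name are placeholders. Logging the checksum makes it possible to tell afterwards whether a graph that was PUT several times received identical data each time.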

vg0-root is not shared in any way. For all I know the VMware environment just provides a normal block device to the VM kernel.

-Osma

Andy Seaborne wrote on 17.04.2018 at 19:34:
If you could wrap up a database, and the data plus the sequence of s-puts, that would be great.  I don't have a VMware environment to try it on but I can try to replicate it.  I don't know what else to try.

I don't see why a VM would make a difference, but elsewhere they seem to, maybe because some file process runs on the real hardware (Docker), or maybe file locking can be interfered with.

Is vg0-root shared in any way?

     Andy

On 16/04/18 15:21, Osma Suominen wrote:
Hi Andy!

Forgot to answer the VM part - yes, this is a VM running on VMware. This is what mount shows:

/dev/mapper/vg0-root on / type ext4 (rw,relatime,errors=remount-ro,data=ordered)

I.e. a normal ext4 filesystem on LVM.

We have plenty of VMs pretty much exactly like this and haven't experienced any filesystem related problems that I know of.

-Osma

PS. I've been trying to reproduce this with smaller data, but so far to no avail. But by digging through the logs I noticed that the same problem also appeared in another TDB2 database that's configured within the same Fuseki instance on the same server. In that case, too, the errors appeared soon after loading some new triples into specific graphs, overwriting their previous content.

One other question - is any of this Docker or a VM? If so, what is the filesystem setup?

No, this is not Docker. Ubuntu 16.04 LTS amd64 with Java 9:

openjdk version "9-internal"
OpenJDK Runtime Environment (build 9-internal+0-2016-04-14-195246.buildd.src)
OpenJDK 64-Bit Server VM (build 9-internal+0-2016-04-14-195246.buildd.src, mixed mode)


Just tell me what you really need (considering the size of the files) and I will send them to you.

-Osma





--
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.suomi...@helsinki.fi
http://www.nationallibrary.fi
