On 10/12/2018 04:24, Lee, Seokju | Daniel | TPDD wrote:
Hi Andy,
Thanks for the reply.
The in-memory dataset described above is fully transactional
Interesting. I didn't know it is different from using TDB; I have only used it for test purposes because I thought it was the same as TDB.
TDB in-memory is TDB with a location of "--mem--" (assembler) or
TDBFactory.createDataset().
The in-memory dataset is DatasetFactory.createTxnMem().
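For illustration, a minimal sketch of the two in code (Jena 3.x class names; the variable names are just for the example):

    import org.apache.jena.query.Dataset;
    import org.apache.jena.query.DatasetFactory;
    import org.apache.jena.tdb.TDBFactory;

    public class InMemoryDatasets {
        public static void main(String[] args) {
            // TDB1 "in-memory": the full TDB storage machinery backed by RAM,
            // the programmatic equivalent of an assembler location of "--mem--".
            Dataset tdbMem = TDBFactory.createDataset();

            // The plain transactional in-memory dataset, stored on the heap.
            Dataset txnMem = DatasetFactory.createTxnMem();
        }
    }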
I have another question: how can you keep this persistent? In-memory means
that if the application crashes, for whatever reason, the data would be lost.
Am I right?
Yes.
Personally, I reload the data for each test or test suite, or run Fuseki
(in-process or separately).
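A minimal sketch of that reload-per-test pattern, assuming the data lives in a file called "data.ttl" (a placeholder name):

    import org.apache.jena.query.Dataset;
    import org.apache.jena.query.DatasetFactory;
    import org.apache.jena.query.ReadWrite;
    import org.apache.jena.riot.RDFDataMgr;

    public class ReloadPerTest {
        public static void main(String[] args) {
            Dataset dataset = DatasetFactory.createTxnMem();
            // Load (or reload) the test data inside a write transaction.
            dataset.begin(ReadWrite.WRITE);
            try {
                RDFDataMgr.read(dataset, "data.ttl");
                dataset.commit();
            } finally {
                dataset.end();
            }
        }
    }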
(Just for your understanding: the ramdisk is only for the dev and staging
environments, for functional testing, not for production. For production we will use SSDs.)
For staging? I'd use the same setup as prod.
Andy
Thanks
Daniel
-----Original Message-----
From: Andy Seaborne <[email protected]>
Sent: Friday, December 7, 2018 8:09 PM
To: [email protected]
Subject: Re: Is there any way to keep same size between real data and TDB
On 07/12/2018 01:03, Lee, Seokju | Daniel | TPDD wrote:
Greetings,
I am using Apache Jena 3.7.0 and have encountered the following issue, so I would
like to know how to solve it.
Background:
* We created our own SPARQL endpoint using Apache Jena.
* Sometimes we need to clear the data store and restore it from a new TTL file.
* For performance, we are using a RAM disk for TDB instead of an SSD, for our
own reasons.
* We thought we had enough memory for TDB.
Issue
* Our application went down because the RAM disk filled up.
* Leading up to that, we had repeatedly restored from new TTL files.
* The TTL data is about 1.5 million triples and the file size is around 250 MB.
* The RAM disk size is 4 GB (the first time we restored, the RAM disk usage was
under 1 GB).
Have you considered using an in-memory Jena graph?
<#dataset> rdf:type ja:MemoryDataset ;
    ja:data "data.trig" ;
    .
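For completeness, such a description can be assembled from a file in code; a sketch, assuming the description is in a file "dataset.ttl" (a placeholder name) that also declares the rdf: and ja: prefixes:

    import org.apache.jena.query.Dataset;
    import org.apache.jena.query.DatasetFactory;

    public class AssembleDataset {
        public static void main(String[] args) {
            // Build the dataset described by the assembler file.
            Dataset dataset = DatasetFactory.assemble("dataset.ttl");
        }
    }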
Investigating
* I think nodes.dat holds the real data, and it looks like SPO.dat, POS.dat,
and OSP.dat did not drop the old data that I removed in my application.
Question
* Is there any way to keep the TDB size in line with the size of the real data?
ja:MemoryDataset
* We are doing removals with "Model.removeAll()" and "TDB.sync()"
TDB.sync() is not necessary when using transactions, and not using
transactions is not a good idea for a SPARQL endpoint. TDB.sync() is legacy,
for older single-threaded applications.
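As a sketch of the transactional style that replaces the removeAll()+sync() pattern (the dataset parameter stands for whatever dataset you are using):

    import org.apache.jena.query.Dataset;
    import org.apache.jena.query.ReadWrite;

    public class TransactionalRemove {
        // Clear the default graph inside a write transaction;
        // commit() makes the change visible, so no TDB.sync() is needed.
        static void clearDefaultGraph(Dataset dataset) {
            dataset.begin(ReadWrite.WRITE);
            try {
                dataset.getDefaultModel().removeAll();
                dataset.commit();
            } finally {
                dataset.end();
            }
        }
    }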
The in-memory dataset described above is fully transactional (serializable
isolation) and uses the heap for storage, so it only uses what is needed, and
deleted data gets garbage collected.
(TDB2 has a compaction operation, but it does mean there are times when there are
two copies of the database.)
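For reference, a sketch of that compaction call in recent Jena versions (the database location is a placeholder):

    import org.apache.jena.query.Dataset;
    import org.apache.jena.tdb2.DatabaseMgr;
    import org.apache.jena.tdb2.TDB2Factory;

    public class CompactTDB2 {
        public static void main(String[] args) {
            Dataset dataset = TDB2Factory.connectDataset("/ramdisk/tdb2");
            // Compaction reclaims space from deleted data; while it runs,
            // old and new copies of the database exist side by side.
            DatabaseMgr.compact(dataset.asDatasetGraph());
        }
    }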
Andy
Thanks
Daniel