Hi Andy,

Thanks for the reply.

>The in-memory dataset described above is fully transactional
Interesting, I didn't know that it differs from TDB. I had only used the 
in-memory dataset for testing, because I assumed it behaved the same as TDB.

I have another question: how can this be kept persistent? In-memory means that 
if the application crashes, for whatever reason, the data would be lost.
Am I right?

(Just for your understanding: the ramdisk is only for the dev and staging 
environments, for functional testing, not production. For production we will use an SSD.)

Thanks
Daniel




-----Original Message-----
From: Andy Seaborne <[email protected]> 
Sent: Friday, December 7, 2018 8:09 PM
To: [email protected]
Subject: Re: Is there any way to keep same size between real data and TDB



On 07/12/2018 01:03, Lee, Seokju | Daniel | TPDD wrote:
> Greetings,
> 
> I am using Apache Jena 3.7.0 and have encountered the following issue; I would 
> like to know how to solve it.
> 
> Background:
> 
>    *   We created our own SPARQL endpoint using Apache Jena.
>    *   Sometimes we need to clear the data store and restore it from a new ttl file.
>    *   For performance, we are using a ramdisk for TDB instead of an SSD, for 
> reasons of our own.
>    *   We thought we had enough memory for TDB.
> 
> Issue
> 
>    *   Our application went down because the ramdisk was full.
>    *   By then, we had repeatedly restored from new ttl files.
>    *   The data is about 1.5 million triples and the file size is around 250 MB.
>    *   The ramdisk size is 4 GB (the first time we restored, the ramdisk used 
> under 1 GB).

Have you considered using an in-memory Jena graph?

<#dataset> rdf:type ja:MemoryDataset;
    ja:data "data.trig";
.
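The same dataset can also be built programmatically; a minimal sketch, where 
"data.trig" is a placeholder file name:

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.riot.RDFDataMgr;

public class InMemoryDatasetExample {
    public static void main(String[] args) {
        // Transactional in-memory dataset, the programmatic equivalent
        // of ja:MemoryDataset in an assembler file.
        Dataset dataset = DatasetFactory.createTxnMem();

        // Load the data inside a write transaction.
        dataset.begin(ReadWrite.WRITE);
        try {
            RDFDataMgr.read(dataset, "data.trig");  // placeholder file name
            dataset.commit();
        } finally {
            dataset.end();
        }
    }
}
```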

> 
> Investigating
> 
>    *   I think nodes.dat holds the real data, and it looks like SPO.dat, 
> POS.dat, and OSP.dat did not remove the old data that I deleted in my application.
> 
> Question
> 
>    *   Is there any way to keep the TDB size in line with the size of the real data?

ja:MemoryDataset

>    *   We are using for removal with "Model.removeAll()" and "TDB.sync()"

TDB.sync() is not necessary when using transactions, and not using 
transactions is not a good idea for a SPARQL endpoint. TDB.sync is legacy, 
intended for older single-threaded applications.

The in-memory dataset described above is fully transactional (serializable 
isolation). It uses the heap for storage, so it only uses what is needed, and 
deleted data gets garbage collected.
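With such a dataset, the Model.removeAll() + TDB.sync() pattern becomes a plain 
write transaction; a minimal sketch (the method and variable names here are 
illustrative, not from your code):

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.riot.RDFDataMgr;

public class ReloadExample {
    // Clear the default graph and reload it from a TTL file in a single
    // write transaction. No TDB.sync() is needed; readers see either the
    // old state or the new one, never a half-loaded mixture.
    static void clearAndReload(Dataset dataset, String ttlFile) {
        dataset.begin(ReadWrite.WRITE);
        try {
            dataset.getDefaultModel().removeAll();
            RDFDataMgr.read(dataset.getDefaultModel(), ttlFile);
            dataset.commit();
        } finally {
            dataset.end();  // aborts the transaction if commit() was not reached
        }
    }
}
```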

(TDB2 has a compaction operation, but it does mean there are times when two 
copies of the database exist on disk.)
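If you do move to TDB2 later, compaction is exposed through DatabaseMgr; a 
sketch, where "DB2" is a placeholder path to a TDB2 database directory:

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.tdb2.DatabaseMgr;
import org.apache.jena.tdb2.TDB2Factory;

public class CompactExample {
    public static void main(String[] args) {
        // "DB2" is a placeholder path to an existing TDB2 database directory.
        Dataset dataset = TDB2Factory.connectDataset("DB2");

        // Compaction writes a fresh copy of the live data, so while it runs
        // the old and new generations both exist on disk.
        DatabaseMgr.compact(dataset.asDatasetGraph());
    }
}
```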

     Andy

> 
> Thanks
> Daniel
> 
