Hello,

Unfortunately, the in-memory dataset also turned out to be problematic in our use case.

I was running the attached update script continuously for 5 hours against an in-memory dataset (pxmeta_hub_fed).

After 5 hours I got this:


OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000007b9c00000, 102760448, 0) failed; error='Not enough space' (errno=12)
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (mmap) failed to map 102760448 bytes for committing reserved memory.
# An error report file with more information is saved as:
# /jena-fuseki/hs_err_pid1.log
     miettinj@sinivalas1:~> docker ps

jena-fuseki was run in a Docker container from the image miettinj/pxpro-jena-fuseki:fuseki3.17.0, which is the same image as blankdots/jena-fuseki:fuseki3.17.0 except that the former sets JVM_ARGS=-Xmx2g.

So it seems that memory consumption also increases when using in-memory databases.

Would you have any suggestions for a fix?

Br, Jaana


Andy Seaborne kirjoitti 10.3.2021 17:04:
On 10/03/2021 02:33, [email protected] wrote:
Hi, thanks for your quick answer, and please see my answers below!

How many triples?
And is it new data to replace the old data, or in addition to the existing data?

476955 triples; most of them will be just the same as the old data, only some triples may change, and some new triples may be added.

This is a TDB1 database?

The jena-fuseki UI does not mention TDB1, but the dataset is persistent and not TDB2.

But in our use case memory-based datasets might also work; as far as I have tested on my PC, they seem to work even better than the persistent ones. What do you think?

In-memory should be fine. Obviously, it's lost when the server exits,
but it sounds like the data isn't the primary copy, and loading 476955
triples at start-up is not big.

The heap size by default is quite small in the scripts. It might be an
idea to increase it a bit to give queries working space, but 0.5 million
triples is really not very big.

    Andy
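As a minimal sketch of the suggestion above: the stock fuseki-server wrapper script (and the Docker images mentioned in this thread) read the JVM_ARGS environment variable, so the heap can be raised at start-up. The dataset name /ds and the 4g figure here are illustrative, not taken from the thread:

```shell
# Raise the JVM heap before starting Fuseki; JVM_ARGS is picked up
# by the fuseki-server start script.
JVM_ARGS="-Xmx4g" ./fuseki-server --mem /ds
```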


Br Jaana



Andy Seaborne kirjoitti 9.3.2021 19:58:
Hi Jaana,

On 09/03/2021 11:40, [email protected] wrote:
hello,

I've run into the following problem with jena-fuseki (should I create a bug ticket?):

We need to update a jena-fuseki dataset every 5 minutes with a 50 MB TTL file.

How many triples?
And is it new data to replace the old data, or in addition to the existing data?

This causes the memory consumption on the machine where jena-fuseki is running to grow by gigabytes.

This was first detected with jena-fuseki 3.8 and later with jena-fuseki 3.17.

To be exact, I ran blankdots/jena-fuseki:fuseki3.17.0 in a Docker container, continuously posting that TTL file into the same dataset (pxmeta_hub_fed_prod).
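For context, a hedged sketch of such a periodic replace using the SPARQL Graph Store protocol that Fuseki exposes; the host name and the data.ttl file name are placeholders, not the actual script from the attachment:

```shell
# Replace the default graph of the dataset with the contents of a TTL
# file (PUT replaces; POST would add). Host and file are placeholders.
curl -X PUT -H 'Content-Type: text/turtle' \
     --data-binary @data.ttl \
     'http://localhost:3030/pxmeta_hub_fed_prod/data?default'
```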

This is a TDB1 database?

TDB2 is better at this - the database still grows but there is a way
to compact the database live.

JENA-1987 exposes the compaction in Fuseki.
https://jena.apache.org/documentation/tdb2/tdb2_admin.html
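Per the TDB2 admin documentation linked above, compaction can be triggered over HTTP on a live server; a sketch, with host and port as placeholders:

```shell
# Ask Fuseki to compact the TDB2 database behind the named dataset;
# the endpoint is documented on the tdb2_admin page linked above.
curl -X POST 'http://localhost:3030/$/compact/pxmeta_hub_fed_prod'
```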

The database grows for two reasons: it allocates space in sparse files
in 8M chunks, but that space does not count in du until actually used;
and the space for deleted data is not fully recycled across transactions
because it may be in use by a concurrent operation. (Block ref-counting
would be very difficult to do in TDB1; in TDB2 the solution is
compaction.)
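The sparse-file effect described above is easy to demonstrate outside of TDB; a small sketch (the 8M size mirrors the chunk size mentioned, the file is a throwaway temp file):

```shell
# A sparse file reports a large apparent size but occupies almost no
# disk blocks until data is actually written -- which is why du lags
# behind the space TDB has allocated.
f=$(mktemp)
truncate -s 8M "$f"      # allocate an 8M "chunk" sparsely
ls -l "$f"               # apparent size: 8388608 bytes
du -k "$f"               # actual usage: typically 0
rm "$f"
```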

    Andy


See the output of the command "du -h | sort -hr | head -30" below. Attached is the shell script that I was executing during that time period.

root@3d53dc3fdf8d:/#alias du3="du -h | sort -hr|head -30"
root@3d53dc3fdf8d:/# du3
9.0G    .
8.5G    ./data/fuseki/databases/pxmeta_hub_fed_prod
8.5G    ./data/fuseki/databases
8.5G    ./data/fuseki
8.5G    ./data

root@3d53dc3fdf8d:/# date
Tue Mar  9 06:02:46 UTC 2021
root@3d53dc3fdf8d:/#


3.5G    .
3.0G    ./data/fuseki/databases/pxmeta_hub_fed_prod
3.0G    ./data/fuseki/databases
3.0G    ./data/fuseki
3.0G    ./data
root@3d53dc3fdf8d:/# date
Tue Mar  9 05:28:09 UTC 2021
root@3d53dc3fdf8d:/#

Br, Jaana
drop all;

#
# A SPARQL update to combine the named graphs of the server
# <http://ramen.<myorg>:3030/ds/query> into one named graph.
# This script should always be executed against the target dataset of the
# target jena-fuseki server.
#
insert{
  graph ?g {
    ?s ?p ?o
  }
}
where{
  service <http://ramen.<myorg>:3030/ds/query> {   # the original RDF dataset
    {
      graph ?g { }.
    }
    graph ?g {
      ?s ?p ?o
    }
  }
};

# Transformation: Variable as object
delete{
  graph ?g {
    ?s ?p ?o_before.
  }
}
insert{
  graph ?g {
    ?s ?p ?o_after.
  }
}
where{
  {
    graph ?g { }.
  }
  graph ?g {
    ?o_before a <http://www.<myorg>/rdf/ontologies/gsimsf/Variable>.
    ?s ?p ?o_before.
    bind(iri(concat("http://www.<myorg>/rdf/data/gsimsf/Variable/",
                    strafter(str(?g), "http://www.<myorg>/tilasto/"), "/",
                    strafter(str(?o_before), "http://www.<myorg>/rdf/data/gsimsf/Variable/"))) as ?o_after)
  }
};

# Transformation: Variable as subject
delete{
  graph ?g {
        ?s_before ?p ?o.
  }
}
insert{
  graph ?g {
        ?s_after ?p ?o.
  }
}
where{
  {
    graph ?g { }.
  }
  graph ?g {
    ?s_before a <http://www.<myorg>/rdf/ontologies/gsimsf/Variable>.
    ?s_before ?p ?o.
    bind(iri(concat("http://www.<myorg>/rdf/data/gsimsf/Variable/",
                    strafter(str(?g), "http://www.<myorg>/tilasto/"), "/",
                    strafter(str(?s_before), "http://www.<myorg>/rdf/data/gsimsf/Variable/"))) as ?s_after)
  }
};


# Transformation: EnumeratedValueDomain as object
delete{
  graph ?g {
    ?s ?p ?o_before.
  }
}
insert{
  graph ?g {
    ?s ?p ?o_after.
  }
}
where{
  {
    graph ?g { }.
  }
  graph ?g {
    ?o_before a <http://www.<myorg>/rdf/ontologies/gsimsf/EnumeratedValueDomain>.
    ?s ?p ?o_before.
    bind(iri(concat("http://www.<myorg>/rdf/data/gsimsf/EnumeratedValueDomain/",
                    strafter(str(?g), "http://www.<myorg>/tilasto/"), "/",
                    strafter(str(?o_before), "http://www.<myorg>/rdf/data/gsimsf/EnumeratedValueDomain/"))) as ?o_after)
  }
};

# Transformation: EnumeratedValueDomain as subject
delete{
  graph ?g {
        ?s_before ?p ?o.
  }
}
insert{
  graph ?g {
        ?s_after ?p ?o.
  }
}
where{
  {
    graph ?g { }.
  }
  graph ?g {
    ?s_before a <http://www.<myorg>/rdf/ontologies/gsimsf/EnumeratedValueDomain>.
    ?s_before ?p ?o.
    bind(iri(concat("http://www.<myorg>/rdf/data/gsimsf/EnumeratedValueDomain/",
                    strafter(str(?g), "http://www.<myorg>/tilasto/"), "/",
                    strafter(str(?s_before), "http://www.<myorg>/rdf/data/gsimsf/EnumeratedValueDomain/"))) as ?s_after)
  }
};

# Transformation: DescribedValueDomain as object
delete{
  graph ?g {
    ?s ?p ?o_before.
  }
}
insert{
  graph ?g {
    ?s ?p ?o_after.
  }
}
where{
  {
    graph ?g { }.
  }
  graph ?g {
    ?o_before a <http://www.<myorg>/rdf/ontologies/gsimsf/DescribedValueDomain>.
    ?s ?p ?o_before.
    bind(iri(concat("http://www.<myorg>/rdf/data/gsimsf/DescribedValueDomain/",
                    strafter(str(?g), "http://www.<myorg>/tilasto/"), "/",
                    strafter(str(?o_before), "http://www.<myorg>/rdf/data/gsimsf/DescribedValueDomain/"))) as ?o_after)
  }
};

# Transformation: DescribedValueDomain as subject
delete{
  graph ?g {
        ?s_before ?p ?o.
  }
}
insert{
  graph ?g {
        ?s_after ?p ?o.
  }
}
where{
  {
    graph ?g { }.
  }
  graph ?g {
    ?s_before a <http://www.<myorg>/rdf/ontologies/gsimsf/DescribedValueDomain>.
    ?s_before ?p ?o.
    bind(iri(concat("http://www.<myorg>/rdf/data/gsimsf/DescribedValueDomain/",
                    strafter(str(?g), "http://www.<myorg>/tilasto/"), "/",
                    strafter(str(?s_before), "http://www.<myorg>/rdf/data/gsimsf/DescribedValueDomain/"))) as ?s_after)
  }
};


# Add ?CodedVariable <http://www.<myorg>/rdf/data/gsimsf/isPresentationOfRepresentedVariable> ?Variable
insert{
  graph ?g {
        ?CodedVariable <http://www.<myorg>/rdf/data/gsimsf/isPresentationOfRepresentedVariable> ?Variable
  }
}
where{
  {
    graph ?g { }.
  }
  graph ?g {
    ?Variable a <http://www.<myorg>/rdf/ontologies/gsimsf/Variable>;
      <http://www.<myorg>/rdf/data/gsimsf/hasEnumeratedValueDomain> ?EnumeratedValueDomain.
    ?CodedVariable a <http://www.<myorg>/rdf/ontologies/pxt/PxCodedVariable> ;
      <http://www.<myorg>/rdf/data/pxt/isPresentationOfEnumeratedValueDomain> ?EnumeratedValueDomain.
  }
};

# Add ?NumericalVariable <http://www.<myorg>/rdf/data/gsimsf/isPresentationOfRepresentedVariable> ?Variable
insert{
  graph ?g {
        ?NumericalVariable <http://www.<myorg>/rdf/data/gsimsf/isPresentationOfRepresentedVariable> ?Variable
  }
}
where{
  {
    graph ?g { }.
  }
  graph ?g {
    ?Variable a <http://www.<myorg>/rdf/ontologies/gsimsf/Variable>;
      <http://www.<myorg>/rdf/data/gsimsf/hasDescribedValueDomain> ?DescribedValueDomain.
    ?NumericalVariable a <http://www.<myorg>/rdf/ontologies/pxt/PxNumericalVariable> ;
      <http://www.<myorg>/rdf/data/pxt/isPresentationOfDescribedValueDomain> ?DescribedValueDomain.
  }
};

# Convert refersToOutputFile to an IRI
delete{
  graph ?g {
        ?CubeMeta <http://www.<myorg>/rdf/data/cubemeta/refersToOutputFile> ?outputFileRef.
  }
}
insert{
  graph ?g {
        ?CubeMeta <http://www.<myorg>/rdf/data/cubemeta/refersToOutputFile> ?OutputFile.
  }
}
where{
  {
    graph ?g { }.
  }
  graph ?g {
    ?CubeMeta a <http://www.<myorg>/rdf/ontologies/cubemeta/CubeMeta> ;
      <http://www.<myorg>/rdf/data/cubemeta/refersToOutputFile> ?outputFileRef.
    bind(iri(concat("http://www.<myorg>/rdf/data/outchannel/OutputFile/",
                    substr(?outputFileRef, 1,4), "/", substr(?outputFileRef, 7,6))) as ?OutputFile)
  }
};

# Merge the named graphs
insert{
  ?s ?p ?o.
}
where{
  {
    graph ?g { }.
  }
  graph ?g {
    ?s ?p ?o.
  }
};

# Keep only the default graph
drop named;

# Convert refersToOutputChannel to an IRI
delete{
  ?OutputFile <http://www.<myorg>/rdf/data/outchannel/refersToOutputChannel> ?outputChannelRef.
}
insert{
  ?OutputFile <http://www.<myorg>/rdf/data/outchannel/refersToOutputChannel> ?OutputChannel.
}
where{
  ?OutputFile a <http://www.<myorg>/rdf/ontologies/outchannel/OutputFile> ;
    <http://www.<myorg>/rdf/data/outchannel/refersToOutputChannel> ?outputChannelRef.
  ?OutputChannel a <http://www.<myorg>/rdf/ontologies/outchannel/OutputChannel> ;
    <http://www.<myorg>/rdf/data/outchannel/hasOutputChannelId> ?outputChannelRef.
};
