On 04/04/17 08:45, Dimov, Stefan wrote:
Thanks Dave,

Yes, I’m using TDB.

Using memory would be faster, I guess, but would the machine be able to handle millions of triples? Is Jena optimized for that?

Fundamentally, the Jena reasoner isn't that scalable, so whether it can handle "millions of triples" depends on the specific rules and on whether that means ~2 million or ~200 million.

As a rule of thumb, for just storage and management I would allow 1 KB per triple (depending on literal sizes), so a 2-million-triple dataset would need ~2 GB. For simple inference you might "only" need a few times that, but the sky's the limit. The thing to do is give it a try with, say, 10 GB and see where you get.
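
The arithmetic behind that rule of thumb can be sketched as below. This is only a rough estimate: the class and method names are hypothetical, the 1 KB per triple figure is the rule of thumb above, and the inference multiplier of 3 is an assumed stand-in for "a few times" that would need tuning by experiment.

```java
public class HeapEstimate {
    // Rule of thumb from the message above: roughly 1 KB per triple for storage.
    static final long BYTES_PER_TRIPLE = 1_000L;

    /** Estimated bytes needed just to hold the triples. */
    static long storageBytes(long triples) {
        return triples * BYTES_PER_TRIPLE;
    }

    /** Estimated bytes with inference; the multiplier is a guess, not a measured value. */
    static long inferenceBytes(long triples, int inferenceFactor) {
        return storageBytes(triples) * inferenceFactor;
    }

    /** True if the estimate fits in the maximum heap this JVM was started with. */
    static boolean fitsInHeap(long requiredBytes) {
        return requiredBytes <= Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        long triples = 2_000_000L;
        long needed = inferenceBytes(triples, 3); // 3 is a hypothetical "few times"
        System.out.printf("Estimated need: %.1f GB, max heap: %.1f GB, fits: %b%n",
                needed / 1e9, Runtime.getRuntime().maxMemory() / 1e9, fitsInHeap(needed));
    }
}
```

The maximum heap is set with the standard JVM option, for example `java -Xmx10g HeapEstimate` for the 10 GB trial suggested above.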

Your other option, depending again on the specifics of what you are trying to do, is to not use the reasoner at all but perform equivalent processing using SPARQL updates which can then run directly over TDB.
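
As an illustration of the SPARQL update route, here is a minimal sketch that materialises one RDFS-style rule (instances of a subclass are also instances of its superclasses) with a single INSERT. The rule, the class name and the example.org data are assumptions for the demo, not the rules from this thread. It uses an in-memory transactional dataset so it runs standalone; with TDB the dataset would come from `TDBFactory.createDataset(...)` instead.

```java
import org.apache.jena.query.Dataset;
import org.apache.jena.query.DatasetFactory;
import org.apache.jena.query.ReadWrite;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.rdf.model.ResourceFactory;
import org.apache.jena.update.UpdateAction;
import org.apache.jena.vocabulary.RDF;
import org.apache.jena.vocabulary.RDFS;

public class SparqlUpdateInference {
    static final String NS = "http://example.org/"; // hypothetical namespace

    // The property path (subClassOf+) follows the whole class hierarchy in one pass.
    static final String SUBCLASS_TYPING =
        "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> " +
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> " +
        "INSERT { ?x rdf:type ?super } " +
        "WHERE { ?x rdf:type ?sub . ?sub rdfs:subClassOf+ ?super }";

    /** Runs the update inside a write transaction on the given dataset. */
    static void materialise(Dataset ds) {
        ds.begin(ReadWrite.WRITE);
        try {
            UpdateAction.parseExecute(SUBCLASS_TYPING, ds);
            ds.commit();
        } finally {
            ds.end();
        }
    }

    /** Loads three sample triples, materialises, and checks for the inferred type. */
    static boolean demo() {
        Resource dog = ResourceFactory.createResource(NS + "Dog");
        Resource mammal = ResourceFactory.createResource(NS + "Mammal");
        Resource animal = ResourceFactory.createResource(NS + "Animal");
        Resource rex = ResourceFactory.createResource(NS + "rex");

        Dataset ds = DatasetFactory.createTxnMem(); // stand-in for a TDB dataset
        ds.begin(ReadWrite.WRITE);
        try {
            Model m = ds.getDefaultModel();
            m.add(dog, RDFS.subClassOf, mammal);
            m.add(mammal, RDFS.subClassOf, animal);
            m.add(rex, RDF.type, dog);
            ds.commit();
        } finally {
            ds.end();
        }

        materialise(ds);

        ds.begin(ReadWrite.READ);
        try {
            return ds.getDefaultModel().contains(rex, RDF.type, animal);
        } finally {
            ds.end();
        }
    }

    public static void main(String[] args) {
        System.out.println("rex inferred to be an Animal: " + demo());
    }
}
```

Rules that feed each other may need the updates re-run until no new triples appear; whether this is practical depends on the specific rules.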

Dave


S.

On 4/4/17, 12:34 AM, "Dave Reynolds" <[email protected]> wrote:

    The reasoners store all their information in memory.

    Your mention of transactions suggests that you are storing into a TDB or
    other database-backed store. That will not enable the reasoner to scale
    and will just slow things down. You'll get better performance by loading
    the data into memory and then applying the reasoner to that.

    You will, of course, need to allocate enough memory to this.
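
A minimal sketch of that approach, assuming a single illustrative rule and made-up example.org data (the class name, rule and data are hypothetical, not taken from this thread): copy the stored triples into a plain in-memory model, then bind the rule reasoner to the copy.

```java
import java.util.List;

import org.apache.jena.rdf.model.InfModel;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.reasoner.Reasoner;
import org.apache.jena.reasoner.rulesys.GenericRuleReasoner;
import org.apache.jena.reasoner.rulesys.Rule;
import org.apache.jena.vocabulary.RDF;
import org.apache.jena.vocabulary.RDFS;

public class InMemoryReasoning {
    static final String NS = "http://example.org/"; // hypothetical namespace

    // One illustrative rule; the rdf: and rdfs: prefixes are known to the rule parser.
    static final String RULES =
        "[subclassTyping: (?x rdfs:subClassOf ?y), (?a rdf:type ?x) -> (?a rdf:type ?y)]";

    /** Copies the source model into memory and wraps the copy with a rule reasoner. */
    static InfModel reasonInMemory(Model source) {
        Model inMemory = ModelFactory.createDefaultModel();
        inMemory.add(source); // for a TDB-backed source, do this inside a read transaction
        List<Rule> rules = Rule.parseRules(RULES);
        Reasoner reasoner = new GenericRuleReasoner(rules);
        return ModelFactory.createInfModel(reasoner, inMemory);
    }

    /** Builds a tiny sample model and checks that the rule fires. */
    static boolean demo() {
        Model source = ModelFactory.createDefaultModel(); // stand-in for the stored data
        Resource dog = source.createResource(NS + "Dog");
        Resource mammal = source.createResource(NS + "Mammal");
        Resource rex = source.createResource(NS + "rex");
        source.add(dog, RDFS.subClassOf, mammal);
        source.add(rex, RDF.type, dog);

        InfModel inf = reasonInMemory(source);
        return inf.contains(rex, RDF.type, mammal);
    }

    public static void main(String[] args) {
        System.out.println("rex inferred to be a Mammal: " + demo());
    }
}
```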

    What performance is like and how much memory is needed will depend on
    the details of your rules. After all, core RDFS is fewer than ten rules!

    Dave

    On 04/04/17 04:07, Dimov, Stefan wrote:
    > … and after an hour or so, eventually it failed with:
    >
    > Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    >     at org.apache.jena.reasoner.rulesys.impl.BindingVectorMultiSet.getPartialEnv(BindingVectorMultiSet.java:119)
    >     at org.apache.jena.reasoner.rulesys.impl.BindingVectorMultiSet.put(BindingVectorMultiSet.java:159)
    >     at org.apache.jena.reasoner.rulesys.impl.BindingVectorMultiSet.add(BindingVectorMultiSet.java:91)
    >     at org.apache.jena.reasoner.rulesys.impl.RETEQueue.fire(RETEQueue.java:105)
    >     at org.apache.jena.reasoner.rulesys.impl.RETEClauseFilter.fire(RETEClauseFilter.java:227)
    >     at org.apache.jena.reasoner.rulesys.impl.RETEEngine.inject(RETEEngine.java:492)
    >     at org.apache.jena.reasoner.rulesys.impl.RETEEngine.runAll(RETEEngine.java:474)
    >     at org.apache.jena.reasoner.rulesys.impl.RETEEngine.fastInit(RETEEngine.java:163)
    >     at org.apache.jena.reasoner.rulesys.FBRuleInfGraph.prepare(FBRuleInfGraph.java:471)
    >     at org.apache.jena.reasoner.BaseInfGraph.requirePrepared(BaseInfGraph.java:530)
    >     at org.apache.jena.reasoner.rulesys.FBRuleInfGraph.findWithContinuation(FBRuleInfGraph.java:557)
    >     at org.apache.jena.reasoner.rulesys.FBRuleInfGraph.graphBaseFind(FBRuleInfGraph.java:587)
    >     at org.apache.jena.reasoner.BaseInfGraph.graphBaseFind(BaseInfGraph.java:359)
    >     at org.apache.jena.graph.impl.GraphBase.find(GraphBase.java:241)
    >     at org.apache.jena.graph.GraphUtil.findAll(GraphUtil.java:99)
    >     at org.apache.jena.graph.GraphUtil.addInto(GraphUtil.java:151)
    >     at org.apache.jena.rdf.model.impl.ModelCom.add(ModelCom.java:225)
    >
    > S.
    >
    > From: Stefan Dimov <[email protected]>
    > Date: Monday, April 3, 2017 at 4:37 PM
    > To: "[email protected]" <[email protected]>
    > Subject: Long time to load the reasoner ...
    >
    > Hi all,
    >
    > I’m loading my Jena with a few million triples in chunks (every chunk in a separate transaction). It takes a few minutes.
    >
    > Then I’m loading the reasoner (in a separate transaction), which contains less than ten rules and it takes a loooong time.
    >
    > Why is this? Am I doing something wrong or that’s to be expected? Should I change some settings? Increase the memory?
    >
    > S.
    >

