[ 
https://issues.apache.org/jira/browse/JENA-901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14372470#comment-14372470
 ] 

ASF GitHub Bot commented on JENA-901:
-------------------------------------

GitHub user stain opened a pull request:

    https://github.com/apache/jena/pull/47

    JENA-901 LPBRuleEngine cache Guava from jena-shadowed-ext

    This patch builds on the discussion in pull request #45: if #45 goes
    through, this is how to use Guava within Jena, by importing
    `org.apache.jena.ext.com.google.common.cache.Cache`.
    
    The considerations of #45 apply here. Because `jena-core` gets a new
    dependency, `jena-shadowed-ext` (even though that is our own brand-new
    JAR), this should not go out as a patch-level upgrade; it forces a new
    minor version on pretty much everything in jena-core and downstream.
    (We've been told off for adding dependencies in a patch release before.)
    
    See also a second pull request #48.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/stain/jena JENA-901-guava

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/jena/pull/47.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #47
    
----
commit 50511efc0e3cdeaee95ee7254483b1ea9c1f9fea
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-19T02:52:08Z

    JENA-901
    
    A simple BoundedMap - based on LinkedHashMap

commit 06abde7d170bdc76ac3f9a7ef4159bb625028cfe
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-19T02:52:30Z

    JENA-901 Use BoundedMap for tabledGoals cache
    
    configurable with system property
    jena.rulesys.lp.max_cached_tabled_goals

commit 0b2e72b7d604940c5b084f026c437484856e929a
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-19T02:55:06Z

    JENA-901 Test saturation of tabledGoals

commit 83d7e8cfdd50f3de849731bf56cadf1d1daec15c
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-19T03:53:02Z

    measure size of engine

commit 938db892729d7f1849c0a6f71b27ebd4d1be3ff9
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-19T04:16:02Z

    more reset.. still uses lots of memory

commit 3d01fffa0d9ab6ee34b7f0e8ac540a53f3bf22c3
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-20T16:23:36Z

    JENA-901 use a LinkedList instead of ArrayList
    
    for activeInterpreters. They could be closed in arbitrary order,
    and also this avoids a large ArrayList remaining after a big burst
    of concurrent queries.
    
    Simplify test. Check activeInterpreters is empty - this didn't
    happen as I had forgotten to call it.hasNext() (which any use of
    an iterator would normally do).
    
    Test now with MAX=1024 and 128M find() calls.

commit 4bc846cd10f07cdce923e1c0597cac1273103cd9
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-20T16:31:26Z

    JENA-901 BoundedMap -> BoundedLRUMap
    
    .. to point out that it's the least recently accessed entry
    that is pruned first

commit a293eca312f031c421e04cc4c7571f1321548a22
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-20T22:16:51Z

    Not using java-sizeof for testing anymore

commit 623543a6c31a255d83be0bdb525c09a2f7fd0fac
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-20T23:13:49Z

    jena-shadowed-ext containing guava 18.0

commit e8b53617dbb23eedb335a4f2ef3677e9607d1f00
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-20T23:24:22Z

    Ignore dependency-reduced-pom.xml
    
    Setting <dependencyReducedPomLocation>target/dependency-reduced-pom.xml
    might also work, but see
    
    https://maven.apache.org/plugins/maven-shade-plugin/shade-mojo.html#dependencyReducedPomLocation

commit a0a75f46dbef59cf389cb1a53c5ed345ca63bd21
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-20T23:25:27Z

    Bump minor versions as jena-core gains a dependency

commit 07b65e0f1e6024a0e9735e5ab497aae349d3f70b
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-20T23:27:41Z

    typo

commit 0af6d343549e73fe79a71759774e4188d51e1b7d
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-20T23:42:50Z

    manage Guava version ranges
    
    I did not use this shaded jar within Elephas as it does not itself use
    Guava - Elephas does however have conflicting version dependencies on
    Guava 11.0.2 (from provided hadoop-common 2.6.0), 16.0.1 (from
    curator-client) and 18.0 (from airline dependency in
    jena-elephas-stats). I therefore changed jena-elephas outdated
    dependencyManagement on 11.0.2 to [18.0,) -- although this could be
    reduced to 16 as jena-elephas-stats is a demo-app and it can upgrade to
    18 locally in its own pom (ideally airline etc. should themselves use
    ranges)

commit 4544becf9cc48cc13e41df2dddafdfb09e4560ec
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-20T23:45:17Z

    jena-shadowed-ext dependency
    
    .. to be used by JENA-901

commit 01fd71cee73a9a04f9d936aaea9448cd5547aa0e
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-20T23:51:09Z

    jena-core dependency on jena-shadowed-ext
    
    .. in preparation for JENA-901

commit abad826df28caee70b49705615043015fab19db3
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-20T23:52:18Z

    Merge branch 'JENA-901-LPDRuleEngine-bound-cache' into JENA-901-guava

commit 5a9cd0ed41105652fd456c4814f8c87b64692727
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-21T00:08:38Z

    JENA-901 Replace BoundedLRUMap with Guava's Cache
    
    FIXME: those Callables need some de-duplication

commit 83808a96eb1ab197822951b05250ebc4967341e2
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-21T00:31:05Z

    JENA-901 Javadoc warning about cache hit ignoring clauses parameter
    
    (Q: does this really matter?)

commit 4f512a5bdca68e86a091777a3a8a448cadef88b9
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-21T00:38:34Z

    Test for cache hit

commit a3ba98fe524f0088b920fd428104a7603e5cd5af
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-21T00:39:07Z

    JENA-901 Test for cache hits with Guava

commit c2eb92acf1460322dac4f57998ac20058a996bea
Author: Stian Soiland-Reyes <[email protected]>
Date:   2015-03-21T00:45:42Z

    getEngineForGraph method

----
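
For readers following the commit log above, here is a minimal sketch of the intermediate approach it describes: a size-bounded map built on `LinkedHashMap` that evicts the least recently accessed entry, with the limit read from the system property `jena.rulesys.lp.max_cached_tabled_goals`. This is not the code from the pull request (which later replaces the map with Guava's `Cache`). The class body, the `fromSystemProperty` helper and the default limit are assumptions made for illustration; only the class name and the property name come from the commit messages.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; not the BoundedLRUMap class from the pull request. */
class BoundedLRUMap<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;

    /** Property name as given in the commit message. */
    static final String MAX_PROPERTY = "jena.rulesys.lp.max_cached_tabled_goals";

    /** Assumed default; the default used by the actual patch is not stated here. */
    static final int DEFAULT_MAX_ENTRIES = 512 * 1024;

    private final int maxEntries;

    BoundedLRUMap(int maxEntries) {
        // accessOrder = true: entries are ordered from least to most recently accessed
        super(16, 0.75f, true);
        if (maxEntries < 1) {
            throw new IllegalArgumentException("maxEntries must be >= 1");
        }
        this.maxEntries = maxEntries;
    }

    /** Hypothetical helper: reads the limit from the system property, else uses the default. */
    static <K, V> BoundedLRUMap<K, V> fromSystemProperty() {
        return new BoundedLRUMap<>(Integer.getInteger(MAX_PROPERTY, DEFAULT_MAX_ENTRIES));
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called by LinkedHashMap after each insertion; returning true drops the eldest entry
        return size() > maxEntries;
    }
}
```

A map created with `new BoundedLRUMap<>(2)` never holds more than two entries: after putting `a` and `b`, reading `a`, and then putting `c`, the entry for `b` is the one evicted. Note that `LinkedHashMap` is not thread-safe, and in access order even `get` modifies the map, so shared use would need external synchronization.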


> Make the cache of LPBRuleEngine bounded to avoid out-of-memory
> --------------------------------------------------------------
>
>                 Key: JENA-901
>                 URL: https://issues.apache.org/jira/browse/JENA-901
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: Reasoners
>    Affects Versions: Jena 2.12.1
>            Reporter: Jan De Beer
>
> The class "com.hp.hpl.jena.reasoner.rulesys.impl.LPBRuleEngine" uses an 
> in-memory cache named "tabledGoals", which has no limit as to the size/number 
> of entries stored.
> {noformat}
>     /** Table mapping tabled goals to generators for those goals.
>      *  This is here so that partial goal state can be shared across multiple queries. */
>     protected HashMap<TriplePattern, Generator> tabledGoals = new HashMap<>();
> {noformat}
> We have experienced out-of-memory issues because the cache filled up 
> with millions of entries in just a few days under normal query usage 
> conditions, with the heap size set to 3 GB.
> In our setup, we have a dataset containing multiple graphs. Some of them are 
> actual data graphs (backed by TDB), and two are ontology 
> models using a "TransitiveReasoner" and an "OWLMicroFBRuleReasoner", 
> respectively. A typical query may run over all the graphs in the dataset, 
> including the ontology ones (see below for a query template). Even though the 
> ontology graphs would not yield any additional results for data queries 
> (which is fine), the above-mentioned cache would still fill up with new 
> entries.
> {noformat}
> SELECT ?p ?o
> WHERE {
>   GRAPH ?g {
>     <some resource of interest> ?p ?o .
>   }
> }
> {noformat}
> As there is no upper bound on the cache, sooner or later all available heap 
> memory will be consumed by the cache, giving rise to a critical 
> out-of-memory error.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)