[ 
https://issues.apache.org/jira/browse/JENA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16227076#comment-16227076
 ] 

Michał Woźniak commented on JENA-1414:
--------------------------------------

The proposed solution seems as a workaround to the issue that the operation has 
now O(n+m) complexity. It used to be O(m) in Jena 3.1.

I think I see a way to adapt GraphUtil.deleteFrom(Graph,Graph) [see 
graph_util_improve.patch]. Instead of computing targetGraph.size() by iterating 
over all triples it is enough to stop the iteration once the counter exceeds 
some threshold. See the method below:

{code:java}
    static int compareSizeTo(Graph g, int size) {
        int counter = 0;
        ExtendedIterator<Triple> it = g.find(Triple.ANY);
        try {
          while (it.hasNext()) {
            it.next();
            counter += 1;
            if (counter > size) {
              return 1;
            }
          }
          return counter == size ? 0 : -1;
        } finally {
          it.close();
        }
    }
{code}

Similar trick can be used to compare sizes of two graphs - get two iterators 
and start consuming them until the smaller ends. This way O(min(n,m)) 
complexity can be achieved.

{code:java}
  static int compareSizes(Graph g1, Graph g2) {
      ExtendedIterator<Triple> it1 = g1.find(Triple.ANY);
      try {
        ExtendedIterator<Triple> it2 = g2.find(Triple.ANY);
        try {
          while (it1.hasNext() && it2.hasNext()) {
            it1.next();
            it2.next();
          }
          if (it1.hasNext()) {
            return 1;
          } else if (it2.hasNext()) {
            return -1;
          }
          return 0;
        } finally {
          it2.close();
        }
      } finally {
          it1.close();
      }
  }

{code}



> Performance regression in Model.remove(Model m) method
> ------------------------------------------------------
>
>                 Key: JENA-1414
>                 URL: https://issues.apache.org/jira/browse/JENA-1414
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: Core
>    Affects Versions: Jena 3.3.0, Jena 3.4.0
>            Reporter: Michał Woźniak
>         Attachments: graph_util_improve.patch
>
>
> The Model.remove(Model) works very slow on large models, as it propagates to 
> GraphUtil.deleteFrom(Graph, Graph), which computes size of the target graph 
> by iterating over all triples. This computation takes nearly 100% of the time 
> of the Model.remove(Model) operation.
> It seems this commit introduced the issue: 
> https://github.com/apache/jena/commit/781895ce64e062c7f2268a78189a777c39b92844#diff-fbb4d11dc804464f94c27e33e11b18e8
> Due to this bug deletion of a concept scheme on a large ontology may take 
> several minutes. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to