[ https://issues.apache.org/jira/browse/JENA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259704#comment-16259704 ]
ASF GitHub Bot commented on JENA-1414:
--------------------------------------
Github user ajs6f commented on a diff in the pull request:
https://github.com/apache/jena/pull/306#discussion_r152085996
--- Diff: jena-core/src/main/java/org/apache/jena/graph/GraphUtil.java ---
@@ -246,43 +282,214 @@ private static void deleteIteratorWorkerDirect(Graph graph, Iterator<Triple> it)
}
}
- private static final int sliceSize = 1000 ;
- /** A safe and cautious remove() function that converts the remove to
- * a number of {@link Graph#delete(Triple)} operations.
+ private static int MIN_SRC_SIZE = 1000 ;
+ // If source and destination are large, limit the search for the best way round to "deleteFrom"
+ private static int MAX_SRC_SIZE = 1000*1000 ;
+ private static int DST_SRC_RATIO = 2 ;
+
+ /**
+ * Delete triples in the destination (arg 1) as given in the source (arg 2).
+ *
+ * @implNote
+ * This is designed for the case of {@code dstGraph} being comparable or much larger than
+ * {@code srcGraph} or {@code srcGraph} having a lot of triples to actually be
+ * deleted from {@code dstGraph}. This includes large, persistent {@code dstGraph}.
+ * <p>
+ * It is not designed for a large {@code srcGraph} and large {@code dstGraph}
+ * with only a few triples in common delete from {@code dstGraph}. It is better to
+ * calculate the difference in someway, and copy into a small graph to use as the {@srcGraph}.
--- End diff --
typo: some way
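
To make the trade-off described in the Javadoc above concrete, here is a minimal sketch of choosing which side to loop over when deleting. It is not the code from the pull request: it uses plain java.util sets as stand-ins for graphs, the class and method names are hypothetical, and the way the two thresholds are combined is an assumption (only the constant names and the value 1000 come from the diff).

```java
import java.util.Set;

// Illustrative only: sets of arbitrary items stand in for graphs of triples.
final class DeleteDirectionSketch {
    // Names borrowed from the diff; how they are used below is an assumption.
    static final int MIN_SRC_SIZE = 1000;
    static final int DST_SRC_RATIO = 2;

    // Decide whether to iterate over the source (true) or the destination (false).
    static boolean loopOnSource(int srcSize, int dstSize) {
        // A small source is always cheap to loop over.
        if (srcSize <= MIN_SRC_SIZE) {
            return true;
        }
        // Otherwise loop over the source unless it is much larger than the destination.
        return srcSize <= (long) dstSize * DST_SRC_RATIO;
    }

    // Remove from dst every item that also occurs in src.
    static <T> void deleteFrom(Set<T> dst, Set<T> src) {
        if (dst == src) {
            // Deleting a set from itself: avoid modifying it while iterating.
            dst.clear();
            return;
        }
        if (loopOnSource(src.size(), dst.size())) {
            // Cost is proportional to the size of src.
            for (T item : src) {
                dst.remove(item);
            }
        } else {
            // Cost is proportional to the size of dst.
            dst.removeIf(src::contains);
        }
    }
}
```

Note that size() is cheap for in-memory sets, which is not necessarily true for a persistent graph; that difference is what the size limits in the diff are meant to guard against.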
> Performance regression in Model.remove(Model m) method
> ------------------------------------------------------
>
> Key: JENA-1414
> URL: https://issues.apache.org/jira/browse/JENA-1414
> Project: Apache Jena
> Issue Type: Improvement
> Components: Core
> Affects Versions: Jena 3.3.0, Jena 3.4.0
> Reporter: Michał Woźniak
> Assignee: Andy Seaborne
> Attachments: graph_util_improve.patch
>
>
> Model.remove(Model) is very slow on large models because it delegates to
> GraphUtil.deleteFrom(Graph, Graph), which computes the size of the target graph
> by iterating over all triples. This computation takes nearly 100% of the time
> of the Model.remove(Model) operation.
> This commit appears to have introduced the issue:
> https://github.com/apache/jena/commit/781895ce64e062c7f2268a78189a777c39b92844#diff-fbb4d11dc804464f94c27e33e11b18e8
> Because of this bug, deleting a concept scheme from a large ontology may take
> several minutes.
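
The cost described in the report comes from counting every element just to learn whether a collection is "large". One general way to avoid that, sketched here under assumptions, is to stop counting as soon as a limit is passed. The class and method names are hypothetical, and this is not taken from the attached patch or from Jena's API; it works on any java.lang.Iterable.

```java
import java.util.Iterator;

// Illustrative only: bounded counting over an arbitrary Iterable.
final class BoundedSizeSketch {
    // Count elements, but never consume more than 'limit' of them.
    static int countUpTo(Iterator<?> it, int limit) {
        int seen = 0;
        while (seen < limit && it.hasNext()) {
            it.next();
            seen++;
        }
        return seen;
    }

    // True if 'items' has more than 'limit' elements; reads at most limit + 1 of them.
    static boolean hasMoreThan(Iterable<?> items, int limit) {
        int seen = 0;
        Iterator<?> it = items.iterator();
        while (it.hasNext()) {
            it.next();
            seen++;
            if (seen > limit) {
                return true;
            }
        }
        return false;
    }
}
```

With a check of this kind, deciding how to run a deletion costs at most a fixed number of iterator steps, regardless of how large the target is.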