[
https://issues.apache.org/jira/browse/JENA-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15848208#comment-15848208
]
ASF GitHub Bot commented on JENA-1284:
--------------------------------------
Github user rvesse commented on the issue:
https://github.com/apache/jena/pull/212
I presume this was motivated by a real-world performance concern? It would
be interesting to know how much difference it makes.
My only concern is what happens when there is only a slight difference
between the sizes of the affected graphs. It looks like in that case you
potentially do twice the work, because on one code path you do both a
`find()` and a `delete()`/`add()` for every triple.
Would it be worth making the behaviour depend on a configurable percentage
difference, e.g. if the graphs are within 10% of each other's size, don't use
the new path?
Also, is materialising the list of triples a potential memory issue when the
iterator is over a large amount of data?
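To make the suggestion concrete, here is a rough sketch of what a size-tolerance check might look like. It is not the code in the pull request and does not use the Jena API: a plain `Set<String>` stands in for a graph, and `SizeAwareDelete`, `similarSize`, `TOLERANCE` and the returned path names are all hypothetical. "Within 10%" is read here as relative to the larger graph, which is only one possible interpretation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch only: Set<String> stands in for a graph; this is not Jena's GraphUtil. */
final class SizeAwareDelete {

    /** Hypothetical tunable: relative size difference below which the simple path is used. */
    static final double TOLERANCE = 0.10;

    /** True if the two sizes differ by no more than `tolerance` of the larger one. */
    static boolean similarSize(long a, long b, double tolerance) {
        long larger = Math.max(a, b);
        if (larger == 0) {
            return true; // both empty
        }
        return (double) Math.abs(a - b) / larger <= tolerance;
    }

    /**
     * Removes from dst every element that is also in src and returns the name
     * of the path taken. Assumes dst and src are distinct objects.
     */
    static String deleteFrom(Set<String> dst, Set<String> src) {
        if (similarSize(dst.size(), src.size(), TOLERANCE) || src.size() < dst.size()) {
            // Simple path: one delete per element of src, no lookup first.
            for (String t : src) {
                dst.remove(t);
            }
            return "loop-src";
        }
        // dst is clearly smaller: loop over dst and probe src (the find() step).
        // The matches are materialised so dst is not modified while being
        // iterated; the list holds at most dst.size() entries.
        List<String> toDelete = new ArrayList<>();
        for (String t : dst) {
            if (src.contains(t)) {
                toDelete.add(t);
            }
        }
        for (String t : toDelete) {
            dst.remove(t);
        }
        return "loop-dst";
    }
}
```

In this sketch the materialised list is bounded by the size of the smaller collection, which would limit the memory cost; whether the same holds for the actual change depends on which iterator is being copied.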
> Improve GraphUtil operations by considering relative graph sizes.
> -----------------------------------------------------------------
>
> Key: JENA-1284
> URL: https://issues.apache.org/jira/browse/JENA-1284
> Project: Apache Jena
> Issue Type: Improvement
> Reporter: Andy Seaborne
> Assignee: Andy Seaborne
>
> Some of the bulk `GraphUtil` operations, `addInto` and `deleteFrom`, could be
> improved to loop on the smaller graph and have different algorithms depending
> on whether source or destination graph is smaller.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)