[ 
https://issues.apache.org/jira/browse/JENA-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15848208#comment-15848208
 ] 

ASF GitHub Bot commented on JENA-1284:
--------------------------------------

Github user rvesse commented on the issue:

    https://github.com/apache/jena/pull/212
  
    I presume this was motivated by a real-world performance concern? It would 
be interesting to know how much difference it makes.
    
    My only concern is what happens when there is only a slight difference 
between the sizes of the affected graphs. It looks like you could then end up 
doing twice the work, because on one code path you do both a `find()` and a 
`delete()`/`add()` for every triple.
    
    Would it be worth basing the behaviour on a configurable percentage 
difference? For example, if the graphs are within 10% of each other's size, 
don't use the new path.
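    
    To make the suggestion concrete, here is a rough sketch of the kind of 
threshold check I mean. It is not the code in the pull request: it uses plain 
`java.util.Set<String>` as a stand-in for graphs, and the class name 
`SizeAwareDelete`, the method `deleteFrom` and the `tolerance` parameter are 
all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class SizeAwareDelete {
    /**
     * Removes every element of src from dst.
     * Sets of strings stand in for graphs of triples (an assumption of this sketch).
     * Returns true if the "loop over dst" path was taken.
     */
    static boolean deleteFrom(Set<String> dst, Set<String> src, double tolerance) {
        // Only take the lookup-then-delete path when dst is smaller than src
        // by more than the tolerance (e.g. 0.10 for 10%).
        boolean loopOnDst = dst.size() < src.size() * (1.0 - tolerance);
        if (loopOnDst) {
            // Two operations per matching element: a lookup in src, then a delete.
            // The matches are collected first so dst is not modified mid-iteration.
            List<String> toDelete = new ArrayList<>();
            for (String t : dst) {
                if (src.contains(t)) {
                    toDelete.add(t);
                }
            }
            for (String t : toDelete) {
                dst.remove(t);
            }
        } else {
            // One operation per element of src: a plain delete.
            for (String t : src) {
                dst.remove(t);
            }
        }
        return loopOnDst;
    }
}
```

    With `tolerance` set to 0.10, two graphs of 19 and 20 triples would stay on 
the plain delete path, while a destination of 2 triples against a source of 4 
would switch to looping over the destination.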
    
    Also, is materialising the list of triples a potential memory issue when 
the iterator is over a large amount of data?


> Improve GraphUtil operations by considering relative graph sizes.
> -----------------------------------------------------------------
>
>                 Key: JENA-1284
>                 URL: https://issues.apache.org/jira/browse/JENA-1284
>             Project: Apache Jena
>          Issue Type: Improvement
>            Reporter: Andy Seaborne
>            Assignee: Andy Seaborne
>
> Some of the bulk `GraphUtil` operations, `addInto` and `deleteFrom`, could be 
> improved to loop on the smaller graph and have different algorithms depending 
> on whether source or destination graph is smaller.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
