[GitHub] jena issue #212: JENA-1284: Improvements for bulk graph operations in GraphU...

rvesse Wed, 01 Feb 2017 02:29:14 -0800

Github user rvesse commented on the issue:

    https://github.com/apache/jena/pull/212
  
    I presume this was motivated by a real-world performance concern? It would 
be interesting to know how much difference it makes.
    
    My only concern is what happens when there is only a slight difference 
between the sizes of the affected graphs. It looks like when there is a slight 
difference you potentially do twice the work, because on one code path you 
do both a `find()` and a `delete()`/`add()` for every triple.
    
    Would it be worth making the behaviour based on a configurable percentage 
difference, e.g. if the graphs are within 10% of each other's size, don't use 
the new path?
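
    A minimal sketch of what such a threshold check could look like follows. 
The class and method names are hypothetical and do not come from the PR; it 
assumes both graph sizes are known, non-negative counts, and it measures the 
difference relative to the larger graph, which is only one possible reading of 
"within 10%".

```java
// Hypothetical helper, not part of Jena or of this PR.
public final class BulkPathHeuristic {
    private BulkPathHeuristic() {}

    // True when the two sizes differ by at most `threshold` (e.g. 0.10 for 10%),
    // measured relative to the larger of the two sizes.
    static boolean sizesAreClose(long sizeA, long sizeB, double threshold) {
        long larger = Math.max(sizeA, sizeB);
        if (larger == 0) {
            return true; // both graphs are empty, so there is no difference
        }
        long diff = Math.abs(sizeA - sizeB);
        return (double) diff / larger <= threshold;
    }

    // Take the new find()-based path only when the sizes are clearly different;
    // otherwise fall back to the existing behaviour.
    static boolean useNewPath(long srcSize, long dstSize, double threshold) {
        return !sizesAreClose(srcSize, dstSize, threshold);
    }

    public static void main(String[] args) {
        System.out.println(useNewPath(1_000, 950, 0.10)); // 5% apart
        System.out.println(useNewPath(1_000, 100, 0.10)); // 90% apart
    }
}
```

    With a 10% threshold, graphs of 1,000 and 950 triples would stay on the 
existing path, while graphs of 1,000 and 100 triples would take the new one.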
    
    Also, is materialising the list of triples a potential memory issue when the 
iterator is over a large amount of data?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
