[jira] [Commented] (JENA-1769) Dataset#listNames slow for large TDB2 datasets

Andy Seaborne (Jira) Wed, 16 Oct 2019 07:50:15 -0700


    [ 
https://issues.apache.org/jira/browse/JENA-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952903#comment-16952903
 ]


Andy Seaborne commented on JENA-1769:
-------------------------------------

Ugh - that's bad. It is definitely a bug/regression.

TDB2 can get back to its previous performance by overriding 
{{DatasetGraph.listGraphNodes}}.
It still iterates over the store, but it can do so much, much faster than the 
general implementation in {{DatasetGraphStorage}}, which creates a lot of 
unnecessary "small object" churn.
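
To illustrate the kind of difference meant here, below is a toy sketch, not 
Jena code. It assumes a store that keeps quads as rows of integer ids with a 
separate id-to-term table. The class name, method names and row layout are all 
hypothetical and only stand in for the general route versus a storage-aware 
override.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these names are Jena classes or methods.
final class GraphNameScan {

    // Hypothetical boxed quad, standing in for a fully materialized quad object.
    static final class ToyQuad {
        final String graph, subject, predicate, object;

        ToyQuad(String graph, String subject, String predicate, String object) {
            this.graph = graph;
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    // General route: build one quad object per row, then keep the distinct graph names.
    // Each row is {g, s, p, o} as indexes into the terms table (an assumed layout).
    static List<String> listGraphNamesGeneric(int[][] rows, String[] terms) {
        Set<String> names = new LinkedHashSet<>();
        for (int[] row : rows) {
            ToyQuad q = new ToyQuad(terms[row[0]], terms[row[1]], terms[row[2]], terms[row[3]]);
            names.add(q.graph);
        }
        return new ArrayList<>(names);
    }

    // Storage-aware route: still scans every row, but reads only the graph id column,
    // dedupes on ids, and decodes each distinct id once. No per-row object is created.
    static List<String> listGraphNamesDirect(int[][] rows, String[] terms) {
        BitSet seen = new BitSet(terms.length);
        List<String> names = new ArrayList<>();
        for (int[] row : rows) {
            int g = row[0];
            if (!seen.get(g)) {
                seen.set(g);
                names.add(terms[g]);
            }
        }
        return names;
    }
}
```

Both methods visit every row and return the graph names in first-seen order. 
The second one simply skips the per-row allocation, which is the effect a 
TDB2-specific override would be after.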

And I very much appreciate the reports and their accuracy.




> Dataset#listNames slow for large TDB2 datasets
> ----------------------------------------------
>
>                 Key: JENA-1769
>                 URL: https://issues.apache.org/jira/browse/JENA-1769
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB2
>    Affects Versions: Jena 3.13.0
>            Reporter: Damien Obrist
>            Assignee: Andy Seaborne
>            Priority: Major
>              Labels: performance
>
> With Jena 3.13.0, the running time of {{Dataset#listNames}} has increased 
> significantly for TDB2 datasets.
> I have compared the running times for a sample TDB2 dataset containing 
> *1'000'000 triples*. I have observed a running time of *~270ms* with Jena 
> 3.12.0 and *~13.5s* with Jena 3.13.0.
> We're using a dataset with many millions of triples and for our use case, the 
> running time has increased from seconds to minutes.



