[jira] [Commented] (JENA-1769) Dataset#listNames slow for large TDB2 datasets

Damien Obrist (Jira) Thu, 17 Oct 2019 08:45:18 -0700


    [ 
https://issues.apache.org/jira/browse/JENA-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16953861#comment-16953861
 ]


Damien Obrist commented on JENA-1769:
-------------------------------------

[~andy] thanks for the information and for looking into this!

I tested using a sample dataset that I had created for a previous issue, 
JENA-1619. The dataset is in the archive attached to that issue 
({{jena-transaction-exception-master.zip}}), inside the {{sample-data}} folder. 
It consists of 1'000'000 dummy quads of the form:
{noformat}
<http://www.test.com/826449> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.test.com/number> <http://www.test.com/numbers>
{noformat}
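
For readers without access to the attachment, data of the same shape could be regenerated along the following lines. This is only a sketch under assumptions, not the generator used for the attached sample: it assumes the subject number is a plain running index, that every quad sits in the single named graph {{<http://www.test.com/numbers>}}, and that the output is written as N-Quads (which requires the terminating {{.}} on each line). The class name {{SampleQuads}} and the default file name {{sample.nq}} are hypothetical.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: writes dummy quads shaped like the example above.
class SampleQuads {
    static final String RDF_TYPE =
            "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    // Builds one N-Quads line: subject, predicate, object, graph name, dot.
    // Assumption: the subject number is simply the running index i.
    static String quad(int i) {
        return "<http://www.test.com/" + i + "> <" + RDF_TYPE + "> "
                + "<http://www.test.com/number> <http://www.test.com/numbers> .";
    }

    // Writes `count` quads to `target`, one per line.
    static void write(Path target, int count) throws IOException {
        try (BufferedWriter out =
                     Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (int i = 0; i < count; i++) {
                out.write(quad(i));
                out.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // "sample.nq" is a placeholder default, not a name from the attachment.
        Path target = Paths.get(args.length > 0 ? args[0] : "sample.nq");
        write(target, 1_000_000);
    }
}
```

The resulting file would still have to be loaded into a TDB2 database before {{Dataset#listNames}} can be timed against it; that step is not shown here.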

> Dataset#listNames slow for large TDB2 datasets
> ----------------------------------------------
>
>                 Key: JENA-1769
>                 URL: https://issues.apache.org/jira/browse/JENA-1769
>             Project: Apache Jena
>          Issue Type: Bug
>          Components: TDB2
>    Affects Versions: Jena 3.13.0
>            Reporter: Damien Obrist
>            Assignee: Andy Seaborne
>            Priority: Major
>              Labels: performance
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> With Jena 3.13.0, the running time of {{Dataset#listNames}} has increased 
> significantly for TDB2 datasets.
> I have compared the running times for a sample TDB2 dataset containing 
> *1'000'000 triples*. I have observed a running time of *~270ms* with Jena 
> 3.12.0 and *~13.5s* with Jena 3.13.0.
> We're using a dataset with many millions of triples and for our use case, the 
> running time has increased from seconds to minutes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
