[ 
https://issues.apache.org/jira/browse/JENA-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453634#comment-17453634
 ] 

Andy Seaborne commented on JENA-2204:
-------------------------------------

> Is there any way we can reach near single digits of storage usage ?

50K LUBM triples is 8.4 Mbytes of N-Triples data. There are 810 Kbytes of unique 
RDF terms.

Loaded into a default graph on a 32 bit setup, it is 7.5M.
It is 18M for a named graph.
TDB2 favours more indexing - it has complete indexes for all lookup orders 
- and uses large B+tree blocks because that is faster.
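
As a rough sanity check on these figures, the sizes can be turned into bytes per triple. This is only a sketch: the class and method names are made up for illustration, and it assumes "M" means 10^6 bytes and "1 GB" means 10^9 bytes, which the figures above do not state.

```java
// Hypothetical helper, not part of Jena: converts the sizes quoted in this
// thread into bytes per triple so they can be compared on one scale.
// Assumption: "M" is read as 10^6 bytes and "1 GB" as 10^9 bytes.
final class StorageEstimate {

    /** Average storage per triple, in bytes. */
    static double bytesPerTriple(long totalBytes, long triples) {
        if (triples <= 0) {
            throw new IllegalArgumentException("triples must be positive");
        }
        return (double) totalBytes / triples;
    }

    public static void main(String[] args) {
        long lubmTriples = 50_000L;
        System.out.printf("N-Triples file:      %.0f bytes/triple%n",
                bytesPerTriple(8_400_000L, lubmTriples));
        System.out.printf("TDB2 default graph:  %.0f bytes/triple%n",
                bytesPerTriple(7_500_000L, lubmTriples));
        System.out.printf("TDB2 named graph:    %.0f bytes/triple%n",
                bytesPerTriple(18_000_000L, lubmTriples));
        // Figures from the issue description (100k statements, about 1 GB).
        System.out.printf("Reported TDB2 size:  %.0f bytes/triple%n",
                bytesPerTriple(1_000_000_000L, 100_000L));
    }
}
```

Under those assumptions the 50K-triple load works out to 168 bytes per triple as N-Triples, 150 in a default graph and 360 in a named graph, while the size in the issue description is around 10,000 bytes per triple.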

If you have 50K triples loaded into memory (disk space zero), it is going to 
be a few Mbytes. In a warm JVM it takes 1-2 seconds to load. Jena parses 
N-Triples at faster than 200K triples/s.

It looks like the 330M wasn't a fresh or empty database. Compacted, it isn't 
going to be 190M.
Please show the directory listing with sizes.
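
One way to produce such a listing, if a shell `ls -l` or `du` is not convenient, is a few lines of plain Java. This is a minimal sketch using only the JDK; the class name `DirSizes` and its methods are hypothetical, and the database directory is passed as the first argument.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper, not part of Jena: lists every regular file under a
// directory with its length in bytes, plus a total.
final class DirSizes {

    /** Maps each file's path (relative to root) to its length in bytes. */
    static Map<String, Long> fileSizes(Path root) throws IOException {
        Map<String, Long> sizes = new TreeMap<>();
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(Files::isRegularFile).forEach(p -> {
                try {
                    // Files.size reports the file length, which may exceed
                    // the disk blocks actually allocated for sparse files.
                    sizes.put(root.relativize(p).toString(), Files.size(p));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return sizes;
    }

    /** Sum of all file lengths in the listing. */
    static long totalBytes(Map<String, Long> sizes) {
        return sizes.values().stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        Map<String, Long> sizes = fileSizes(root);
        sizes.forEach((name, bytes) -> System.out.printf("%12d  %s%n", bytes, name));
        System.out.printf("%12d  total%n", totalBytes(sizes));
    }
}
```

Running it against the database directory prints one line per file, sorted by path, which makes it easy to see which files account for the space.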


> Storage required by TDB2 is much higher than TDB1, How to Fix ?
> ---------------------------------------------------------------
>
>                 Key: JENA-2204
>                 URL: https://issues.apache.org/jira/browse/JENA-2204
>             Project: Apache Jena
>          Issue Type: Question
>            Reporter: Hemant Tiwari
>            Priority: Minor
>
> The storage required by TDB2 is much higher than TDB1
> For 100k statements - TDB1 takes about 90 MB, while TDB2 is taking ~ close to 
> 1 GB.
> Why is there such a difference and is there any solution available to reduce 
> the storage size?



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
