[
https://issues.apache.org/jira/browse/JENA-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453634#comment-17453634
]
Andy Seaborne commented on JENA-2204:
-------------------------------------
> Is there any way we can reach near single digits of storage usage ?
50K LUBM triples is 8.4 Mbytes of N-Triples data. There are 810 Kbytes of unique
RDF terms.
Loaded into a default graph on a 32-bit setup, the database is 7.5 Mbytes.
It is 18 Mbytes for a named graph.
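To put these figures on a per-triple basis, here is a rough arithmetic sketch. It assumes "M" means 10**6 bytes and "GB" means 10**9 bytes, and it takes the reporter's numbers from the quoted issue text as approximate; the function name is illustrative only.

```python
# Sizes quoted in this thread. Units are assumed to be decimal (M = 10**6).
MEASURED = {
    # label: (size in bytes, number of triples)
    "N-Triples file, 50K LUBM": (8.4e6, 50_000),
    "database, default graph": (7.5e6, 50_000),
    "database, named graph": (18e6, 50_000),
    # Reporter's approximate figures from the issue description.
    "reported TDB1, 100k statements": (90e6, 100_000),
    "reported TDB2, 100k statements": (1e9, 100_000),
}


def bytes_per_triple(total_bytes, triples):
    """Average storage cost of one triple, in bytes."""
    return total_bytes / triples


for label, (size, count) in MEASURED.items():
    print(f"{label}: {bytes_per_triple(size, count):,.0f} bytes per triple")
```

On these assumptions the measured databases come out at a few hundred bytes per triple, while the reported TDB2 figure is around 10,000 bytes per triple, which is the gap the rest of this comment is questioning.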
TDB2 favours more indexing - it has complete indexes for all lookup orders
- and uses large B+tree blocks because that is faster.
If you have 50K triples loaded into memory - disk space zero - it is going to
be a few Mbytes. In a warm JVM it takes 1-2 seconds to load. Jena parses
N-Triples at more than 200K triples/s.
It looks like the 330M figure wasn't from a fresh or empty database. A compacted
database isn't going to be 190M.
Show the directory listing with sizes.
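If a platform-neutral way to produce that listing is useful, the following is a minimal sketch using only the Python standard library. The function names and the database path in the usage comment are hypothetical; on a Unix-like system `ls -l` or `du -a` in the database directory gives the same information.

```python
from pathlib import Path


def list_sizes(root):
    """Return (relative path, size in bytes) for every file under root,
    largest first."""
    root = Path(root)
    entries = [
        (str(p.relative_to(root)), p.stat().st_size)
        for p in root.rglob("*")
        if p.is_file()
    ]
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def print_listing(root):
    """Print one line per file, then the total of all file sizes."""
    entries = list_sizes(root)
    for name, size in entries:
        print(f"{size:>14,d}  {name}")
    print(f"{sum(size for _, size in entries):>14,d}  total")


# Usage (hypothetical path to the database directory):
# print_listing("path/to/database")
```

Note that `st_size` is the apparent file size; on filesystems that support sparse files the space actually allocated on disk can be smaller.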
> Storage required by TDB2 is much higher than TDB1, How to Fix ?
> ---------------------------------------------------------------
>
> Key: JENA-2204
> URL: https://issues.apache.org/jira/browse/JENA-2204
> Project: Apache Jena
> Issue Type: Question
> Reporter: Hemant Tiwari
> Priority: Minor
>
> The storage required by TDB2 is much higher than TDB1
> For 100k statements - TDB1 takes about 90 MB, while TDB2 is taking ~ close to
> 1 GB.
> Why is there such a difference and is there any solution available to reduce
> the storage size?