[ https://issues.apache.org/jira/browse/JENA-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453634#comment-17453634 ]
Andy Seaborne commented on JENA-2204:
-------------------------------------

> Is there any way we can reach near single digits of storage usage ?

50K LUBM triples is 8.4M bytes of N-Triples data. There are 810 Kbytes of unique RDF terms.

Loaded into a default graph on a 32-bit setup, it is 7.5M.
It is 18M for a named graph.

TDB2 favours more indexing (it has complete indexes for all lookup orders) and large B+tree blocks, because that is faster.

If you have 50k triples loaded into memory (disk space zero), it is going to be a few Mbytes. In a warm JVM it takes 1-2 seconds to load. Jena parses N-Triples at faster than 200K triples/s.

It looks like the 330M wasn't a fresh or empty database. A compacted database isn't going to be 190M.

Show the directory listing with sizes.

> Storage required by TDB2 is much higher than TDB1, How to Fix ?
> ---------------------------------------------------------------
>
>                 Key: JENA-2204
>                 URL: https://issues.apache.org/jira/browse/JENA-2204
>             Project: Apache Jena
>          Issue Type: Question
>            Reporter: Hemant Tiwari
>            Priority: Minor
>
> The storage required by TDB2 is much higher than TDB1
> For 100k statements - TDB1 takes about 90 MB, while TDB2 is taking ~ close to 1 GB.
> Why is there such a difference and is there any solution available to reduce the storage size?



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
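
The comment asks for a directory listing with sizes. One possible way to produce it is the minimal Python sketch below. It is not part of Jena. The function names `list_sizes` and `format_listing` and the default path `DB2` are hypothetical, chosen only for illustration. The script reports apparent file sizes via `os.path.getsize`, which can differ from the blocks actually allocated on disk if the filesystem stores a file sparsely.

```python
import os
import sys


def list_sizes(root):
    """Return (relative_path, size_in_bytes) for every file under root, largest first."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Apparent size in bytes, not allocated disk blocks.
            entries.append((os.path.relpath(path, root), os.path.getsize(path)))
    return sorted(entries, key=lambda e: e[1], reverse=True)


def format_listing(entries):
    """Render one line per file, followed by a TOTAL line."""
    lines = [f"{size:>14,d}  {path}" for path, size in entries]
    total = sum(size for _path, size in entries)
    lines.append(f"{total:>14,d}  TOTAL")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical default; pass the real database directory as the first argument.
    root = sys.argv[1] if len(sys.argv) > 1 else "DB2"
    print(format_listing(list_sizes(root)))
```

Run it against the database directory and paste the output into the issue. Because the walk is recursive, files in any subdirectories are included in the listing and in the total.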