I have specific questions in relation to what ajs6f said:
I have a TDB store where 1/3 of the triples have very small literals (3-5 chars),
where the same sequence is often repeated. Would I get a smaller store and
better performance if these were URIs for the character sequence (stored
once for each repeated case)? Any guess how much I could improve?
Does the size of the URI play a role in the amount of storage used? I
observe that for 33 M triples I have a TDB size (files) of 13 GB, which
means about 400 bytes per triple. The literals are all short (very seldom
more than 10 chars, mostly 5 - words from English text). It is a named
graph, if this makes a difference.
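
For reference, the per-triple figure can be checked with a few lines of Python. This is only the arithmetic on the two numbers quoted in this message (13 GB of files, 33 M triples), assuming decimal gigabytes; the function name is made up for illustration.

```python
def bytes_per_triple(store_bytes: int, triple_count: int) -> float:
    """Average on-disk bytes per triple for a store of the given size."""
    return store_bytes / triple_count

# Figures quoted in this message; 1 GB is assumed to mean 10**9 bytes.
store_bytes = 13 * 10**9
triple_count = 33 * 10**6

print(round(bytes_per_triple(store_bytes, triple_count)))  # 394
```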
thank you!
andrew
On 11/25/2017 06:42 AM, ajs6f wrote:
Andy may be able to be more precise, but I can tell you right away that it's not a
straightforward function. How many literals are there "per triple"? How big are
the literals, on average? How many unique bnodes and URIs? All of these things will
change the eventual size of the database.
ajs6f
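
The quantities asked about above (literals per triple, average literal size, number of unique bnodes and URIs) can be collected from an N-Triples or N-Quads file before loading. The following is a rough sketch, not a Jena tool: it uses a naive regular expression rather than a full parser, assumes one statement per line, ignores datatype IRIs inside literals, and does not by itself predict the size of the store. The names `TERM`, `LITERAL` and `term_stats` are made up for illustration.

```python
import re

# Naive term matcher: IRI, blank node, or literal with optional datatype/language tag.
TERM = re.compile(
    r'<[^>]*>'
    r'|_:[^\s]+'
    r'|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?'
)
# Captures the lexical form of a literal (escape sequences are counted as written).
LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"')

def term_stats(lines):
    """Count triples, literals, and unique IRIs/bnodes in N-Triples-like lines."""
    triples = 0
    literal_count = 0
    literal_chars = 0
    iris, bnodes = set(), set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and comments
        terms = TERM.findall(line)
        if len(terms) < 3:
            continue  # not a statement this naive matcher understands
        triples += 1
        # Only subject, predicate, object; a fourth term (graph name) is ignored.
        for term in terms[:3]:
            if term.startswith('<'):
                iris.add(term)
            elif term.startswith('_:'):
                bnodes.add(term)
            else:
                literal_count += 1
                literal_chars += len(LITERAL.match(term).group(1))
    return {
        'triples': triples,
        'literals': literal_count,
        'avg_literal_chars': literal_chars / literal_count if literal_count else 0.0,
        'unique_iris': len(iris),
        'unique_bnodes': len(bnodes),
    }

# Usage with a small in-memory sample; for a real file, pass the open file object.
sample = [
    '<http://example.org/s1> <http://example.org/p> "word" .',
    '<http://example.org/s1> <http://example.org/p> "the"@en .',
    '_:b0 <http://example.org/q> <http://example.org/s1> .',
    '# a comment line',
]
print(term_stats(sample))
```

Because the function only iterates over lines, it can be fed a file handle directly and keeps just the sets of distinct IRIs and bnodes in memory.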
On Nov 25, 2017, at 6:40 AM, Laura Morales <laure...@mail.com> wrote:
Is it possible to estimate the size of a TDB2 store from one of nt/turtle/xml
input file, without actually creating the store? Is there maybe a tool for this?
--
em.o.Univ.Prof. Dr. sc.techn. Dr. h.c. Andrew U. Frank
+43 1 58801 12710 direct
Geoinformation, TU Wien +43 1 58801 12700 office
Gusshausstr. 27-29 +43 1 55801 12799 fax
1040 Wien Austria +43 676 419 25 72 mobil