Re: Questions around TDBLoader

Andy Seaborne Wed, 16 Jan 2013 01:26:21 -0800

(you probably want the users list [email protected])


On 16/01/13 05:32, Charles Li wrote:

Greetings to everyone! I am new to the Semantic Web and am trying to use
TDBLoader to load a 4 GB RDF/XML file.

[OS] Windows 7 - 64-bit
[Jena Version] 2.7.3
[Java Version]  1.6.0_24-b07, 64-bit
[TDBLoader Command] tdbloader.bat -loc bigRDF myRDF.xml
[TDBLoader Command Output]
................................
10:22:56 WARN  riot                 :: {W108} Not an XML Name:
'8961d5a3-2964-4373-b53d-02c9f2e764f8'
10:22:56 WARN  riot                 :: {W108} Not an XML Name:
'7ff4d865-1693-43ed-8a6e-368360006b05'
................................
10:23:02 WARN  riot                 :: {W108} Not an XML Name:
'1ef2dfba-30cc-4efa-b20b-05b45e979649'
10:23:02 INFO  loader               :: -- Finish triples data phase
10:23:02 INFO  loader               :: 58,064,426 triples loaded in
1,523.02 seconds [Rate: 38,124.51 per second]
10:23:02 INFO  loader               :: -- Start triples index phase
10:23:02 INFO  loader               :: Index SPO->POS: 100,000 slots
(Batch: 203,665 slots/s / Avg: 203,665 slots/s)
................................
10:41:48 INFO  loader               :: ** Index SPO->OSP: 58,064,426 slots
indexed in 837.14 seconds [Rate: 69,360.14 per second]
10:41:48 INFO  loader               :: -- Finish triples index phase
10:41:48 INFO  loader               :: ** 58,064,426 triples indexed in
1,126.68 seconds [Rate: 51,535.68 per second]
10:41:48 INFO  loader               :: -- Finish triples load
10:41:48 INFO  loader               :: ** Completed: 58,064,426 triples
loaded in 2,649.71 seconds [Rate: 21,913.51 per second]

Questions:

- Should I worry about the warnings? How can I get rid of the warnings?


Ideally, yes.

They are only warnings; the system will work with them as best it can. It may be that the data itself is not formatted correctly.
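
One likely cause, judging only from the values shown in the log: each reported identifier starts with a digit, and an XML Name must start with a letter or underscore. The sketch below is a rough, ASCII-only approximation of that rule (the real XML definition also allows many non-ASCII characters). The class name, the method name and the "id-" prefix are made up for illustration; none of this is part of Jena, and whether prefixing is the right fix depends on how the data is generated.

```java
import java.util.regex.Pattern;

final class XmlNameCheck {
    // ASCII-only approximation of an XML NCName: the first character is a
    // letter or underscore; the rest are letters, digits, '.', '-' or '_'.
    // The real XML rule also admits many non-ASCII characters.
    private static final Pattern ASCII_NCNAME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9._-]*");

    static boolean looksLikeXmlName(String s) {
        return s != null && ASCII_NCNAME.matcher(s).matches();
    }

    public static void main(String[] args) {
        // Identifier taken from the warning in the log above.
        String id = "8961d5a3-2964-4373-b53d-02c9f2e764f8";
        System.out.println(looksLikeXmlName(id));         // false: starts with a digit
        System.out.println(looksLikeXmlName("id-" + id)); // true: hypothetical prefix
    }
}
```

If the identifiers come from generated UUIDs, giving them a fixed alphabetic prefix when the RDF/XML is written would presumably make the warnings go away.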

- The following files are created under the "bigRDF" folder:
GOSP.dat  GPOS.dat  GSPO.dat  OSP.dat  OSPG.dat  POS.dat  POSG.dat  SPO.dat
  SPOG.dat  journal.jrnl  node2id.idn  prefix2id.dat  prefixIdx.dat
  prefixes.dat
GOSP.idn  GPOS.idn  GSPO.idn  OSP.idn  OSPG.idn  POS.idn  POSG.idn  SPO.idn
  SPOG.idn  node2id.dat   nodes.dat    prefix2id.idn  prefixIdx.idn
  stats.opt
   and they are all binary. How would I query them? And where can I find
documentation about them?


tdbquery will execute SPARQL queries on the database.

tdbquery --help
http://jena.apache.org/documentation/tdb/commands.html

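Or use the Jena API:

import com.hp.hpl.jena.query.Dataset ;
import com.hp.hpl.jena.rdf.model.Model ;
import com.hp.hpl.jena.tdb.TDBFactory ;

Dataset ds = TDBFactory.createDataset("bigRDF") ;
Model model = ds.getDefaultModel() ;
...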


        Andy
