> I have been giving some thought to using Lucene as an RDF database.
> I'm specifically thinking about the RDF model and not the RDF syntax.

Excellent idea! In fact, I am going to try it out right now...


> Can anyone see any problems here?  This database will eventually grow
> to around 2TB in the next month so performance issues are non-trivial.

We have something like 20 million statements per gigabyte (RDF/XML).
Even if what you store into the index is more compact (let's say 2
million statements per gigabyte), with terrabytes of data you might
eventually run out of document identifiers (positive integers).


I have previously done some tests with storing RDF into a relational
database. Following is a list of fields I used; I believe the same
should work for a Lucene index.

  model_ns
  model
  statement_ns
  statement
  subject_ns
  subject
  predicate_ns
  predicate
  resource_ns
  resource
  literal

model_ns/model is used to create logical groups of statements. Either
resource_ns/resource or literal is used to store the object of the
statement. Latter can only be queried properly when using a database
that supports large text fields such as MySQL or PostreSQL; obviously
not an issue for Lucene.


--
Eric Jain


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to