Great. Do you think it would be possible to have a default
configuration for a small index of the top 10000 entities as measured
by popularity?

I am also thinking of building maven artifacts to embed the opennlp
models in version 1.5 without checking them in the Stanbol svn repo. I
could help you bundle a set of small entity indexes.

Also, could you write a howto for building indexes? I think such a
howto would be better written as a text file in the Stanbol source
tree or, better still, as a new documentation page for the Stanbol
website (using the markdown syntax), rather than as a new wiki page on
the IKS wiki.

As soon as you have such a howto ready, I would be glad to write a
bunch of Pig scripts to build indexes for topics (rather than
entities) so as to be able to perform document-level topic assignment
rather than occurrence-based entity lookups.

-- 
Olivier
