A couple of links here for (my own) reference:

List of wordnets: http://www.globalwordnet.org/gwa/wordnet_table.htm
A lot of them come with restrictive licenses.

RDF/OWL representation of WordNet:
http://www.w3.org/2001/sw/BestPractices/WNET/wn-conversion.html
I don't know if it would be useful.
> > > > [...]
> > >
> > > Again, what I think is most important here is to allow users to use
> > > different words that mean the same thing, plus some fuzziness. If I
> > > misspell something, the system should correct me internally.
> >
> > IMHO this would be very difficult with the current runner
> > infrastructure, unless more and more data is inserted into Nepomuk and
> > the query library gets the fuzziness, detection of errors, etc. I think
> > that the work on the query library may be part of what I would need for
> > the natural language interface. So yes, you give a good insight: more
> > data in Nepomuk + more functionality in the query = more powerful
> > features.
> >
> > So, for instance, I mentioned the services runner above. First, create
> > a Strigi plugin (is Strigi in charge of this?) to parse the service
> > files and put the info into Nepomuk. Later, change KServiceTypeTrader
> > and friends to use Nepomuk instead of whatever they are using now.
>
> I would rather not use Strigi for that. Its API is way too restricted:
> you need to be able to create a whole graph of information, while Strigi
> can only add fields to one resource (a file, actually).
> Thus, you would be better off coding a Nepomuk service for that.

The more I think about it, the more the following strategy makes sense to
me:

KICS: Keep It Connected, Stupid

In the desktop world, that means: all the data goes to Nepomuk. And I mean
KDE reading any file that is not pure data (music, document data) via
Nepomuk; for instance, .desktop file data through Nepomuk (a rough sketch
of what I mean follows in the PS below).

On this point, performance worries me. If I understood it correctly (and
surely I did not; I don't know how many methods one has to traverse to
understand it all. Things like the real implementation of
Model::listStatements being found in Client::ClientConnection are far from
obvious. Docs needed!), almost everything ends up calling NepomukMainModel,
which calls Soprano Model methods, which in turn become SPARQL or a
binary-encoded command (very clever, this one) sent over D-Bus or a local
socket (why not always use the local socket?). The server deserializes the
command, runs it against the backend, serializes the results, and
everything travels back (see the second sketch below). So the real
performance depends on the Soprano backends, which may not have
optimization as their main priority. I guess caching can help a lot here.

> I hope you are still on board. :)

Absolutely. I knew nothing about the semantic web a couple of weeks ago. I
have been reading up on the basic concepts and reading some of the Nepomuk
code, so by now I have upgraded myself to newbie status.

--
Jordi Polo Carres
NLP laboratory - NAIST
http://www.bahasara.org
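PS: A rough sketch of the ".desktop data through Nepomuk" idea, going the
way Sebastian suggests (a Nepomuk service rather than a Strigi plugin,
since a service can write a whole graph, not just fields on one file).
This assumes the KDE4 Nepomuk/Soprano API as I currently understand it,
and the application class URI is only a placeholder; a real service would
pick a class from a proper ontology:

    #include <Nepomuk/ResourceManager>
    #include <Soprano/Model>
    #include <Soprano/LiteralValue>
    #include <Soprano/Vocabulary/RDF>
    #include <Soprano/Vocabulary/NAO>
    #include <KDesktopFile>
    #include <QtCore/QUrl>

    // Push a small graph describing one .desktop file into the main
    // model. A Strigi analyzer could not express this: it only attaches
    // fields to the file resource itself.
    void indexDesktopFile(const QString &path)
    {
        KDesktopFile desktopFile(path);

        // One resource for the application, plus a placeholder class URI.
        const QUrl app("nepomuk:/apps/" + desktopFile.readName());
        const QUrl appClass("http://example.org/ontology#Application");

        Soprano::Model *model =
            Nepomuk::ResourceManager::instance()->mainModel();

        model->addStatement(app, Soprano::Vocabulary::RDF::type(), appClass);
        model->addStatement(app, Soprano::Vocabulary::NAO::prefLabel(),
                            Soprano::LiteralValue(desktopFile.readName()));
        model->addStatement(app, Soprano::Vocabulary::NAO::description(),
                            Soprano::LiteralValue(desktopFile.readComment()));
    }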
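PPS: And a minimal sketch of the read path I worry about above. Every call
here goes from the client-side model over D-Bus or the local socket to the
server, which runs it against the backend and serializes the bindings
back. The query reuses the placeholder class URI from the previous sketch:

    #include <Nepomuk/ResourceManager>
    #include <Soprano/Model>
    #include <Soprano/QueryResultIterator>
    #include <QtCore/QDebug>

    // List every application resource written by the sketch above.
    void listApplications()
    {
        Soprano::Model *model =
            Nepomuk::ResourceManager::instance()->mainModel();

        const QString query =
            "select ?app ?label where { "
            "?app a <http://example.org/ontology#Application> . "
            "?app <http://www.semanticdesktop.org/ontologies/2007/08/15/nao#prefLabel> ?label . }";

        // One round trip to start the query; each next() then pulls the
        // following binding over the same connection, so iterating over a
        // large result set is not free either.
        Soprano::QueryResultIterator it =
            model->executeQuery(query, Soprano::Query::QueryLanguageSparql);
        while (it.next())
            qDebug() << it.binding("app").uri()
                     << it.binding("label").toString();
    }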