Some more info on the (current) abstract extraction process: you will have to install a locally modified MediaWiki and load the Wikipedia dumps into it (after you clean them with the script). The detailed process is described here: http://dbpedia.hg.sourceforge.net/hgweb/dbpedia/dbpedia/file/945c24bdc54c/abstractExtraction
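For reference, here is a rough Python sketch of what a dump-cleaning step can look like before the dump is loaded into the local MediaWiki. This is NOT the actual script from the repository -- the export namespace version and the "keep only main-namespace pages" rule are just assumptions for the sake of the example:

#!/usr/bin/env python3
"""Hypothetical sketch of a dump-cleaning step: keep only main-namespace
pages from a pages-articles XML dump before importing it into the local
MediaWiki. The real DBpedia cleaning script may do more (or something
different) than this."""

import sys
import xml.etree.ElementTree as ET

# MediaWiki export XML is namespaced; version 0.10 is assumed here.
MW_NS = "http://www.mediawiki.org/xml/export-0.10/"

def clean_dump(in_path, out_path):
    kept = dropped = 0
    with open(out_path, "w", encoding="utf-8") as out:
        out.write('<mediawiki xmlns="%s">\n' % MW_NS)
        # Stream the dump instead of building the whole tree in memory.
        for _, elem in ET.iterparse(in_path, events=("end",)):
            if elem.tag == "{%s}page" % MW_NS:
                ns = elem.findtext("{%s}ns" % MW_NS)
                if ns == "0":  # main (article) namespace only
                    out.write(ET.tostring(elem, encoding="unicode"))
                    kept += 1
                else:
                    dropped += 1
                elem.clear()   # drop the page subtree we just handled
        out.write("</mediawiki>\n")
    print("kept %d pages, dropped %d" % (kept, dropped))

if __name__ == "__main__":
    clean_dump(sys.argv[1], sys.argv[2])

You would then feed the cleaned file to MediaWiki's importDump.php maintenance script (or whatever the documented process on the page above uses).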
> I will also dare to give another idea. The guys behind Sweble
> (http://sweble.org/) claim it is very thorough, and there seems to be a lot
> of activity behind it.

This could be a new approach for the framework, not only for abstracts but also as a replacement for the SimpleWikiParser. I think the current parser is LL, and maybe we could change to an LR parser to handle recursive syntax better. I haven't looked at Sweble yet, but we could look into it.

Cheers,
Dimitris
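P.S. To make the "recursive syntax" point a bit more concrete, here is a toy recursive-descent (i.e. LL-style) parser for nested template markup. It is not taken from SimpleWikiParser or Sweble, just an illustration of the kind of nesting the parser has to handle:

# Toy sketch, nothing to do with the real SimpleWikiParser or Sweble code:
# a hand-written recursive-descent parser for nested template markup such
# as "{{outer|{{inner|x}}|y}}".

def parse(text):
    """Return a node list: plain strings and ('template', name_nodes, arg_node_lists)."""
    nodes, _ = _parse_nodes(text, 0, stop=False)
    return nodes

def _parse_nodes(text, i, stop):
    """Collect nodes until end of input, or until '|' / '}}' when stop is True."""
    nodes, buf = [], []
    while i < len(text):
        if stop and (text.startswith("}}", i) or text[i] == "|"):
            break
        if text.startswith("{{", i):
            if buf:
                nodes.append("".join(buf))
                buf = []
            template, i = _parse_template(text, i)
            nodes.append(template)
        else:
            buf.append(text[i])
            i += 1
    if buf:
        nodes.append("".join(buf))
    return nodes, i

def _parse_template(text, i):
    i += 2                         # skip the opening "{{"
    parts = []
    while True:
        part, i = _parse_nodes(text, i, stop=True)   # recursion handles nesting
        parts.append(part)
        if text.startswith("}}", i):
            return ("template", parts[0], parts[1:]), i + 2
        if i < len(text) and text[i] == "|":
            i += 1                 # next template argument
        else:
            raise ValueError("unclosed template at offset %d" % i)

print(parse("abc {{outer|{{inner|x}}|y}} def"))

The nesting falls out of the mutual recursion between _parse_nodes and _parse_template; whether we keep an LL-style hand-written parser, move to an LR/parser-generator approach, or adopt Sweble is exactly the design choice I mean above.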
