Some more info on the (current) abstract extraction process: you will have to install a locally modified MediaWiki and load the Wikipedia dumps into it (after you clean them with the script). The detailed process is described here: http://dbpedia.hg.sourceforge.net/hgweb/dbpedia/dbpedia/file/945c24bdc54c/abstractExtraction
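For reference, here is a rough Python sketch of what a dump-cleaning step can look like before the dump is loaded into the local MediaWiki. This is NOT the actual script from the repository -- the export namespace version and the "keep only main-namespace pages" rule are just assumptions for the sake of the example:

#!/usr/bin/env python3
"""Hypothetical sketch of a dump-cleaning step: keep only main-namespace
pages from a pages-articles XML dump before importing it into the local
MediaWiki. The real DBpedia cleaning script may do more (or something
different) than this."""

import sys
import xml.etree.ElementTree as ET

# MediaWiki export XML is namespaced; version 0.10 is assumed here.
MW_NS = "http://www.mediawiki.org/xml/export-0.10/"

def clean_dump(in_path, out_path):
    kept = dropped = 0
    with open(out_path, "w", encoding="utf-8") as out:
        out.write('<mediawiki xmlns="%s">\n' % MW_NS)
        # Stream the dump instead of building the whole tree in memory.
        for _, elem in ET.iterparse(in_path, events=("end",)):
            if elem.tag == "{%s}page" % MW_NS:
                ns = elem.findtext("{%s}ns" % MW_NS)
                if ns == "0":  # main (article) namespace only
                    out.write(ET.tostring(elem, encoding="unicode"))
                    kept += 1
                else:
                    dropped += 1
                elem.clear()   # drop the page subtree we just handled
        out.write("</mediawiki>\n")
    print("kept %d pages, dropped %d" % (kept, dropped))

if __name__ == "__main__":
    clean_dump(sys.argv[1], sys.argv[2])

You would then feed the cleaned file to MediaWiki's importDump.php maintenance script (or whatever the documented process on the page above uses).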
> I will also dare to give another idea. The guys behind Sweble
> (http://sweble.org/) claim it is very thorough, and there seems to be a lot
> of activity behind it.

This could be a new approach for the framework, not only for abstracts but also as a replacement for the SimpleWikiParser. I think the current parser is LL, and maybe we could change to an LR parser to handle recursive syntax better. I haven't looked at Sweble yet, but we could look into it.

Cheers,
Dimitris
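P.S. To make the "recursive syntax" point a bit more concrete, here is a toy recursive-descent (i.e. LL-style) parser for nested template markup. It is not taken from SimpleWikiParser or Sweble, just an illustration of the kind of nesting the parser has to handle:

# Toy sketch, nothing to do with the real SimpleWikiParser or Sweble code:
# a hand-written recursive-descent parser for nested template markup such
# as "{{outer|{{inner|x}}|y}}".

def parse(text):
    """Return a node list: plain strings and ('template', name_nodes, arg_node_lists)."""
    nodes, _ = _parse_nodes(text, 0, stop=False)
    return nodes

def _parse_nodes(text, i, stop):
    """Collect nodes until end of input, or until '|' / '}}' when stop is True."""
    nodes, buf = [], []
    while i < len(text):
        if stop and (text.startswith("}}", i) or text[i] == "|"):
            break
        if text.startswith("{{", i):
            if buf:
                nodes.append("".join(buf))
                buf = []
            template, i = _parse_template(text, i)
            nodes.append(template)
        else:
            buf.append(text[i])
            i += 1
    if buf:
        nodes.append("".join(buf))
    return nodes, i

def _parse_template(text, i):
    i += 2                         # skip the opening "{{"
    parts = []
    while True:
        part, i = _parse_nodes(text, i, stop=True)   # recursion handles nesting
        parts.append(part)
        if text.startswith("}}", i):
            return ("template", parts[0], parts[1:]), i + 2
        if i < len(text) and text[i] == "|":
            i += 1                 # next template argument
        else:
            raise ValueError("unclosed template at offset %d" % i)

print(parse("abc {{outer|{{inner|x}}|y}} def"))

The nesting falls out of the mutual recursion between _parse_nodes and _parse_template; whether we keep an LL-style hand-written parser, move to an LR/parser-generator approach, or adopt Sweble is exactly the design choice I mean above.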
