Hi Olivier,

Can you please update the defaultdata bundle and publish it to the nuxeo maven server?

Please note that the URL for the SolrIndex in the commit is wrong. The correct file can be found at

http://www.salzburgresearch.at/~rwesten/stanbol/dbpedia_43k.solrindex.zip

You need to copy this file into the folder

src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/index

It would be nice if you could also include the OpenNLP POS tagger and the chunker for the English language:

http://opennlp.sourceforge.net/models-1.5/en-pos-perceptron.bin
http://opennlp.sourceforge.net/models-1.5/en-chunker.bin

I will need them (at least the POS tagger) for the upcoming taxonomy linking engine. These files need to be present in the folder

src/main/resources/org/apache/stanbol/defaultdata/opennlp

thx
Rupert

On Wed, Jun 29, 2011 at 3:37 PM, <[email protected]> wrote:
>
> * added small (43k entities) solr index
> * added configurations for the dbpedia (referenced site, cache, solr index,
> SolrYard, EntityTaggingEngine)
> * added the path of the dbpedia solr index to the Data-Files header
> * added the Install-Path header with the path for the dbpedia configuration
> to trigger the installation of this files by the sling installer framework
> * updated the version to "0.0.3"
>
> NOTE:
>
> * the addition of the configuration files will allow to remove the same files
> form the different launchers. This makes it easier to maintain such
> configurations
> * the solr index (15MByte) is not contained in the SVN repository and need to
> be manually downloaded before building this bundle.
> Currently it is available form
> http://www.salzburgresearch.at/~rwesten/stanbol/dbpedia_150k_en_de_fr_it/dbpedia.solrindex.zip)
>
> As soon as this bundle is available I will commit the according changes in
> the full launcher and the stable launcher (STANBOL-241)
>
> Added:
>     incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/
>     incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/
>     incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/
>     incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/dbpedia_43k.solrindex.ref
>     incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine-dbpedia.config
>     incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.core.site.CacheImpl-dbpedia.config
>     incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.site.referencedSite-dbpedia.config
>     incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dbpedia.config
>     incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/index/
> Modified:
>     incubator/stanbol/trunk/defaultdata/pom.xml
>
> Modified: incubator/stanbol/trunk/defaultdata/pom.xml
> URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/defaultdata/pom.xml?rev=1141099&r1=1141098&r2=1141099&view=diff
> ==============================================================================
> --- incubator/stanbol/trunk/defaultdata/pom.xml (original)
> +++ incubator/stanbol/trunk/defaultdata/pom.xml Wed Jun 29 13:37:57 2011
> @@ -14,7 +14,7 @@
>      See also:
>      https://issues.apache.org/jira/browse/OPENNLP-68
>    -->
> -  <version>0.0.2</version>
> +  <version>0.0.3</version>
>    <packaging>bundle</packaging>
>
>    <name>Apache Stanbol Default Data</name>
> @@ -61,7 +61,7 @@
>        DataFileProvider
>      -->
>      <Data-Files>
> -      org/apache/stanbol/defaultdata/opennlp
> +      org/apache/stanbol/defaultdata/opennlp,org/apache/stanbol/defaultdata/site/dbpedia/index
>      </Data-Files>
>      <!--
>        Use a priority lower than 0 to allow providers without a
> @@ -70,6 +70,9 @@
>      <Data-Files-Priority>
>        -100
>      </Data-Files-Priority>
> +    <Install-Path>
> +      org/apache/stanbol/defaultdata/site/dbpedia
> +    </Install-Path>
>    </instructions>
>  </configuration>
>  </plugin>
>
> Added: incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/dbpedia_43k.solrindex.ref
> URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/dbpedia_43k.solrindex.ref?rev=1141099&view=auto
> ==============================================================================
> --- incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/dbpedia_43k.solrindex.ref (added)
> +++ incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/dbpedia_43k.solrindex.ref Wed Jun 29 13:37:57 2011
> @@ -0,0 +1,3 @@
> +Name=SolrIndex for dbpedia
> +Description=DBpedia.org
> +Index-Archive=dbpedia_43k.solrindex.zip
>
> Added: incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine-dbpedia.config
> URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine-dbpedia.config?rev=1141099&view=auto
> ==============================================================================
> --- incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine-dbpedia.config (added)
> +++ incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.enhancer.engines.entitytagging.impl.NamedEntityTaggingEngine-dbpedia.config Wed Jun 29 13:37:57 2011
> @@ -0,0 +1,8 @@
> +org.apache.stanbol.enhancer.engines.entitytagging.nameField="rdfs:label"
> +org.apache.stanbol.enhancer.engines.entitytagging.personType="dbp-ont:Person"
> +org.apache.stanbol.enhancer.engines.entitytagging.personState=B"true"
> +org.apache.stanbol.enhancer.engines.entitytagging.referencedSiteId="dbpedia"
> +org.apache.stanbol.enhancer.engines.entitytagging.placeState=B"true"
> +org.apache.stanbol.enhancer.engines.entitytagging.organisationState=B"true"
> +org.apache.stanbol.enhancer.engines.entitytagging.organisationType="dbp-ont:Organisation"
> +org.apache.stanbol.enhancer.engines.entitytagging.placeType="dbp-ont:Place"
>
> Added: incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.core.site.CacheImpl-dbpedia.config
> URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.core.site.CacheImpl-dbpedia.config?rev=1141099&view=auto
> ==============================================================================
> --- incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.core.site.CacheImpl-dbpedia.config (added)
> +++ incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.core.site.CacheImpl-dbpedia.config Wed Jun 29 13:37:57 2011
> @@ -0,0 +1,4 @@
> +org.apache.stanbol.entityhub.yard.name="dbpedia\ Cache"
> +org.apache.stanbol.entityhub.yard.cacheYardId="dbpediaDefaultdataIndex"
> +org.apache.stanbol.entityhub.yard.id="dbpediaDefaultdataIndex"
> +org.apache.stanbol.entityhub.yard.description="Cache\ for\ the\ dbpedia\ Referenced\ Site\ using\ the\ dbpediaIndex."
>
> Added: incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.site.referencedSite-dbpedia.config
> URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.site.referencedSite-dbpedia.config?rev=1141099&view=auto
> ==============================================================================
> --- incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.site.referencedSite-dbpedia.config (added)
> +++ incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.site.referencedSite-dbpedia.config Wed Jun 29 13:37:57 2011
> @@ -0,0 +1,18 @@
> +org.apache.stanbol.entityhub.site.attributionUrl="http://wiki.dbpedia.org/About"
> +org.apache.stanbol.entityhub.site.cacheId="dbpediaDefaultdataIndex"
> +org.apache.stanbol.entityhub.site.name="dbpedia"
> +org.apache.stanbol.entityhub.site.dereferencerType="org.apache.stanbol.entityhub.dereferencer.SparqlDereferencer"
> +org.apache.stanbol.entityhub.site.defaultMappedEntityState="proposed"
> +org.apache.stanbol.entityhub.site.fieldMappings=("#\ ---\ Define\ the\ Languages\ for\ all\ fields\ ---","\ |\ @\=null;en;de;fr;it","#\ ---\ RDF,\ RDFS\ and\ OWL\ Mappings\ ---","rdfs:label\ |\ d\=entityhub:text","rdfs:comment\ |\ d\=entityhub:text","rdf:type\ |\ d\=entityhub:ref","rdfs:seeAlso\ |\ d\=entityhub:ref","#\ map\ also\ wikiPageRedirects\ to\ rdfs:seeAlso","dbp-ont:wikiPageRedirects\ |\ d\=entityhub:ref\ >\ rdfs:seeAlso","","#\ used\ by\ LOD\ to\ link\ to\ URIs\ used\ to\ identify\ the\ same\ Entity","owl:sameAs\ |\ d\=entityhub:ref","","#\ ---\ Dublin\ Core\ (dc\ terms\ and\ dc\ elements)\ ---","#\ dc:*","#\ DC\ terms\ subject\ is\ used\ by\ some\ DBpedia\ entities\ to\ store\ the\ Categories","#\ Others\ use\ skos:subject.\ Therefore\ ensure\ that\ dc:subjects\ are\ references","dc:subject\ |\ d\=entityhub:ref","#\ and\ copy\ the\ values\ to\ skos:subject","#\ dc:modified\ |\ d\=xsd:dateTime","#\ dc:created\ |\ d\=xsd:dateTime","","#\ all\ DC\ Elements\ (one\ could\ also\ define\ the\ mappings\ to\ the\ DC\ Terms\ counterparts\ here","#\ do\ not\ store\ dc-elements\ properties,\ but\ map\ them\ to\ the\ dc\ terms\ counterpart","#dc-elements:*","#\ dc-elements:contributor\ >\ dc:contributor","#\ dc-elements:coverage\ >\ dc:coverage","#\ dc-elements:creator\ >\ dc:creator","#\ dc-elements:date\ |\ d\=xsd:dateTime\ >\ dc:date","#\ dc-elements:description\ >\ dc:description","#\ dc-elements:format\ >\ dc:format","#\ dc-elements:identifier\ >\ dc:identifier","#\ dc-elements:language\ >\ dc:language","#\ dc-elements:publisher\ >\ dc:publisher","#\ dc-elements:relation\ >\ dc:relation","#\ dc-elements:rights\ >\ dc:rights","#\ dc-elements:source\ >\ dc:source","dc-elements:subject\ |\ d\=entityhub:ref\ >\ dc:subject","#\ copy\ subjects\ also\ to\ skos:subject","#\ dc-elements:subject\ |\ d\=entityhub:ref\ >\ skos:subject","#\ dc-elements:title\ >\ dc:title","#\ dc-elements:type\ >\ dc:type","","#\ ---\ Spatial\ Things\ ---","geo:lat\ |\ d\=xsd:double","geo:long\ |\ d\=xsd:double","geo:alt\ |\ d\=xsd:int","#\ one\ can\ also\ copy\ the\ valued\ from\ the\ DBpedia\ properties","#\ use\ the\ elevation\ if\ present","dbp-ont:elevation\ |\ d\=xsd:int\ >\ geo:alt","","#\ ---\ Thesaurus\ (via\ SKOS)\ ---","#SKOS\ can\ be\ used\ to\ define\ hierarchical\ terminologies","skos:*","skos:broader\ |\ d\=entityhub:ref","skos:narrower\ |\ d\=entityhub:ref","skos:related\ |\ d\=entityhub:ref","skos:member\ |\ d\=entityhub:ref","skos:subject\ |\ d\=entityhub:ref\ >\ dc:subject","skos:inScheme\ |\ d\=entityhub:ref","skos:hasTopConcept\ |\ d\=entityhub:ref","skos:topConceptOf\ |\ d\=entityhub:ref","","#\ ---\ Social\ Networks\ (via\ foaf)\ ---","#The\ Friend\ of\ a\ Friend\ schema\ often\ used\ to\ describe\ social\ relations\ between\ people","#\ foaf:*\ ","#\ foaf:knows\ |\ d\=entityhub:ref","#\ foaf:made\ |\ d\=entityhub:ref","#\ foaf:maker\ |\ d\=entityhub:ref","#\ foaf:member\ |\ d\=entityhub:ref","foaf:homepage\ |\ d\=xsd:anyURI","foaf:depiction\ |\ d\=xsd:anyURI","#\ also\ use\ the\ DBpedia\ thumbnail\ as\ oaf:depiction","dbp-ont:thumbnail\ |\ d\=xsd:anyURI\ >\ foaf:depiction","foaf:img\ |\ d\=xsd:anyURI","foaf:logo\ |\ d\=xsd:anyURI","#\ Documents\ about\ the\ entity","foaf:page\ |\ d\=xsd:anyURI","","#\ ---\ dbpedia\ specific","#\ the\ \"dbp-ont\"\ defines\ knowledge\ mapped\ to\ the\ DBPedia\ ontology","#\ dbp-ont:*","dbp-ont:birthDate\ |\ d\=xsd:dateTime","dbp-ont:deathDate\ |\ d\=xsd:dateTime","dbp-ont:populationTotal\ |\ d\=xsd:long","dbp-ont:wikiPageExternalLink\ |\ d\=xsd:anyURI","dbpedia-owl:areaTotal\ |\ d\=xsd:double","","#\ the\ \"DBpedia\ properties\ are\ all\ key\ values\ pairs\ extracted\ from\ the\ info\ boxes","#\ on\ the\ right\ hand\ side\ of\ Wikipedia\ pages.","#\ Data\ Quality\ is\ to\ low\ to\ use\ them\ efficiently","#\ dbp-prop:*","","#\ Copy\ only\ population\ for\ now\ (one\ could\ add\ additional\ if\ necessary)!","#\ use\ dbp-ont:populationTotal\ instead","#\ dbp-prop:population\ |\ d\=xsd:long","","#Deactivated\ mappings\ based\ on\ dbprops","#dbp-prop:latitude\ |\ d\=xsd:double\ >\ geo:lat","#dbp-prop:longitude\ |\ d\=xsd:double\ >\ geo:long","#dbp-prop:elevation\ |\ d\=xsd:int;xsd:float\ >\ geo:alt","#dbp-prop:website\ |\ d\=xsd:anyURI\ >\ foaf:homepage")
> +org.apache.stanbol.entityhub.site.licenseName=["Creative\ Commons\ Attribution-ShareAlike\ 3.0","GNU\ Free\ Documentation\ License"]
> +org.apache.stanbol.entityhub.site.defaultSymbolState="proposed"
> +org.apache.stanbol.entityhub.site.searcherType="org.apache.stanbol.entityhub.searcher.VirtuosoSearcher"
> +org.apache.stanbol.entityhub.site.defaultExpireDuration=I"0"
> +org.apache.stanbol.entityhub.site.cacheStrategy="all"
> +org.apache.stanbol.entityhub.site.attribution="DBpedia.org"
> +org.apache.stanbol.entityhub.site.accessUri="http://dbpedia.org/sparql/"
> +org.apache.stanbol.entityhub.site.id="dbpedia"
> +org.apache.stanbol.entityhub.site.entityPrefix=["http://dbpedia.org/resource/","http://dbpedia.org/ontology/"]
> +org.apache.stanbol.entityhub.site.licenseUrl=["http://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License","http://en.wikipedia.org/wiki/Wikipedia:Text_of_the_GNU_Free_Documentation_License"]
> +org.apache.stanbol.entityhub.site.queryUri="http://dbpedia.org/sparql"
> +org.apache.stanbol.entityhub.site.description="DBpedia.org\ "
>
> Added: incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dbpedia.config
> URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dbpedia.config?rev=1141099&view=auto
> ==============================================================================
> --- incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dbpedia.config (added)
> +++ incubator/stanbol/trunk/defaultdata/src/main/resources/org/apache/stanbol/defaultdata/site/dbpedia/config/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dbpedia.config Wed Jun 29 13:37:57 2011
> @@ -0,0 +1,7 @@
> +org.apache.stanbol.entityhub.yard.solr.solrUri="dbpedia_43k"
> +org.apache.stanbol.entityhub.yard.name="dbpedia\ default\ data\ index"
> +org.apache.stanbol.entityhub.yard.solr.multiYardIndexLayout=B"false"
> +org.apache.stanbol.entityhub.yard.solr.useDefaultConfig=B"false"
> +org.apache.stanbol.entityhub.yard.solr.documentBoost="http://www.iks-project.eu/ontology/rick/model/entityRank"
> +org.apache.stanbol.entityhub.yard.id="dbpediaDefaultdataIndex"
> +org.apache.stanbol.entityhub.yard.description="Small\ local\ index\ with\ 43000\ entities\ for\ the\ Referenced\ Site\ \"dbpedia\"."
>
>

--
| Rupert Westenthaler [email protected]
| Bodenlehenstraße 11 ++43-699-11108907
| A-5500 Bischofshofen
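
The manual download step described in this message (the SolrIndex archive plus the two OpenNLP models, each copied into its resource folder before building the bundle) could be scripted roughly as follows. This is only a sketch, assuming Python 3 and that the three URLs quoted in the message are still reachable, which is not guaranteed. The helper names `plan_downloads` and `fetch_all` and the `bundle_root` argument are hypothetical and not part of Stanbol.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Resource folder of the defaultdata bundle, as named in the message.
RESOURCES = Path("src/main/resources/org/apache/stanbol/defaultdata")

# URL -> target folder, both taken from the message. Whether the hosts
# still serve these files is an assumption.
DOWNLOADS = {
    "http://www.salzburgresearch.at/~rwesten/stanbol/dbpedia_43k.solrindex.zip":
        RESOURCES / "site/dbpedia/index",
    "http://opennlp.sourceforge.net/models-1.5/en-pos-perceptron.bin":
        RESOURCES / "opennlp",
    "http://opennlp.sourceforge.net/models-1.5/en-chunker.bin":
        RESOURCES / "opennlp",
}


def plan_downloads(bundle_root):
    """Return (url, destination file) pairs below the bundle checkout."""
    root = Path(bundle_root)
    return [
        (url, root / folder / url.rsplit("/", 1)[1])
        for url, folder in DOWNLOADS.items()
    ]


def fetch_all(bundle_root):
    """Download every file that is not already present (hypothetical helper)."""
    for url, target in plan_downloads(bundle_root):
        target.parent.mkdir(parents=True, exist_ok=True)
        if not target.exists():
            urlretrieve(url, str(target))
```

Calling `fetch_all("defaultdata")` from the directory that contains the `defaultdata` checkout would place the files where the `Data-Files` header expects them; `plan_downloads` can be used alone to inspect the target paths without touching the network.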
