Hi Harish

On Thu, May 16, 2013 at 8:06 PM, harish suvarna <[email protected]> wrote:
> Rupert,
> 1. dbpedia ontology has dbpedia-owl properties also. In order to index
> these, we need to uncomment the #dbp-prop* line in mappings.txt. Is my
> understanding right?

No, you need to specify 'dbp-ont:*'. Enabling 'dbp-prop:*' would index
all 'http://dbpedia.org/property/' properties.

> Looks like dbpedia-owl:birthDate IS dbp-ont:birthDate in mappings.txt.

Stanbol uses the dbp-ont prefix for the dbpedia ontology namespace.
But 'dbo' should also work as it is the one defined by prefix.cc.

>
> dbpedia-owl:birthDate <http://dbpedia.org/ontology/birthDate>
>
>    - 1961-08-04 (xsd:date)
>
> dbpedia-owl:birthPlace <http://dbpedia.org/ontology/birthPlace>
>
>    - dbpedia:Honolulu <http://dbpedia.org/resource/Honolulu>
>    - dbpedia:United_States <http://dbpedia.org/resource/United_States>
>    - dbpedia:Hawaii <http://dbpedia.org/resource/Hawaii>
>
>
>
>
> 2. mappings.txt already has the dbp-ont* line uncommented. It then indexes
> birthDate etc. It also specifies that birthDate is a dateTime object and
> populationTotal is a long integer. Is it required to specify the data
> types? If we don't specify them, what happens?
>

True, it looks like the default config has this enabled, but I have never
used it to build an actual index. So if you use the default config you
should get the dbpedia-owl information (provided you also imported the
corresponding RDF dump files).

> # --- dbpedia specific
> # the "dbp-ont" defines knowledge mapped to the DBPedia ontology
> dbp-ont:*
> dbp-ont:birthDate | d=xsd:dateTime
> dbp-ont:populationTotal | d=xsd:long
>
>
> 3. what is the difference between dbp-ont* and dbp-prop* in mappings.txt?
>

As stated above, these are two different namespaces. "dbp-ont" is used
for Infobox properties that are semantically mapped. "dbp-prop" is
used for extracted key/value pairs that are not aligned with the
ontology. You should find more information on the DBpedia webpage.

> 4. How do I tell it to index some dbpedia-owl properties but not others?
> For example, dbpedia-owl:abstract has a lot of text, and I may not want
> to index it.

You can exclude properties by adding a '!' at the first position (e.g.
!dbp-ont:abstract). The documentation of the mapping language can be
found at [1] (I should definitely move this over to the Stanbol
Webpage).
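
As a rough sketch (untested, and assuming that an exclusion line can simply
sit next to the wildcard line; see [1] for the exact rules), the dbpedia
section of mappings.txt could then look like this:

    # index all properties of the DBpedia ontology namespace
    dbp-ont:*
    # but leave out the long abstract text
    !dbp-ont:abstract
    # data type hints as found in the default config
    dbp-ont:birthDate | d=xsd:dateTime
    dbp-ont:populationTotal | d=xsd:long

Every line in this fragment is taken from the default config or from the
exclusion syntax described above; only the way they are combined is my
assumption.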

best
Rupert


[1] http://wiki.iks-project.eu/index.php/RepresentationMapping

>
> On Wed, May 15, 2013 at 5:53 AM, Rupert Westenthaler <
> [email protected]> wrote:
>
>> On Wed, May 15, 2013 at 12:20 PM, Manish Aggarwal <[email protected]>
>> wrote:
>> > Hi Rupert,
>> >
>> > Thanks for your response. I will look into this.
>> >
>> > Meanwhile I was trying the FieldQuery option available with stanbol ...
>> >
>> > I have created the following FieldQuery
>> >
>> > {
>> >     "selected": [
>> >         "http:\/\/dbpedia.org\/ontology\/foundingYear",
>> >         "http:\/\/dbpedia.org\/ontology\/foundingDate",
>> >         "http:\/\/dbpedia.org\/ontology\/keyPerson",
>> >         "http:\/\/dbpedia.org\/ontology\/industry"],
>> >     "offset": "0",
>> >     "limit": "10",
>> >     "constraints": [{
>> >         "type": "reference",
>> >         "field": "http:\/\/www.w3.org\/1999\/02\/22-rdf-syntax-ns#type",
>> >         "value": "http:\/\/dbpedia.org\/ontology\/Organisation",
>> >         "value": "http:\/\/dbpedia.org\/ontology\/Organisation"
>> > }
>> >
>> > and running the command:
>> >
>> > curl -X POST -H "Content-Type:application/json" --data "@fieldQuery.json"
>> > http://localhost:8080/entityhub/site/dbpedia/query
>> >
>> > I am using the dbpedia.solrindex (version 3.8)  (from link
>> > http://dev.iks-project.eu/downloads/stanbol-indices/dbpedia-3.8/)
>> >
>> >
>> > But I am not getting any of the attributes like foundingYear, industry
>> > etc. in the result. Is this because the dbpedia.solrindex doesn't
>> > contain these attributes?
>>
>> The reason is that those attributes are not included in the index.
>> Adding all dbpedia-owl namespace properties to the index would
>> probably increase the index size by more than 20 GByte (the last time
>> I did this was for dbpedia 3.6). If you need them in your index
>> you will need to build your own DBpedia index with an adapted
>> configuration (you will need to add those properties or the whole
>> namespace to the 'indexing/config/mappings.txt' file).
>>
>> > I think this is also going to follow the same workflow that you
>> > described in the last mail through sample code?
>>
>> Yes, but my code assumed that you already know the IDs of the Entities.
>> The query above searches for Entities that are of type
>> dbpedia-owl:Organisation. The corresponding Java code would look as
>> follows:
>>
>>     Site site; //the site (as obtained by the SiteManager)
>>     FieldQuery query = site.getQueryFactory().createFieldQuery();
>>     //only select Entities with the rdf:type dbpedia-owl:Organisation
>>     query.setConstraint(RDF_TYPE,
>>         new ReferenceConstraint(DBPEDIA_ORGANIZATION));
>>     query.addSelectedField(RDF_TYPE);
>>     query.addSelectedField(DBPEDIA_INDUSTRY);
>>     query.addSelectedField(DBPEDIA_KEY_PERSON);
>>     //and all the others
>>     query.setLimit(100);
>>     //execute the query on the site
>>     QueryResultList<Representation> results = site.find(query);
>>     for(Representation result : results){
>>         //process the results
>>     }
>>
>> best
>> Rupert
>>
>> > Regards,
>> > Manish
>> >
>> >
>> >
>> >
>> > On Wed, May 15, 2013 at 12:46 PM, Rupert Westenthaler <
>> > [email protected]> wrote:
>> >
>> >> Hi Manish,
>> >>
>> >> Yes, typically this is done by using the Stanbol Entityhub in
>> >> combination with a referenced site for dbpedia.
>> >>
>> >> (1) Configuring the dbpedia ReferencedSite
>> >> ------
>> >>
>> >> But NOTE that all pre-built dbpedia indexes for the Entityhub do not
>> >> include the dbpedia-owl:industry property. Meaning that you will need
>> >> to create your own DBpedia index that does include this property by
>> >> using the Entityhub Indexing Tool for dbpedia.
>> >>
>> >> As an alternative you could also configure a Referenced Site for dbpedia
>> >> that directly accesses the dbpedia SPARQL endpoint and stores
>> >> retrieved entities in a local cache. For that you can install the
>> >> following bundle:
>> >>
>> >>     <groupId>org.apache.stanbol</groupId>
>> >>     <artifactId>org.apache.stanbol.data.sites.dbpedia.cached</artifactId>
>> >>     <version>1.2.0-SNAPSHOT</version>
>> >>
>> >> NOTE: with revision http://svn.apache.org/r1482702 I changed the name
>> >> of the ReferencedSite configured by this bundle from 'dbpedia' to
>> >> 'dbpedia-cached' so that it does not conflict with the default
>> >> 'dbpedia' ReferencedSite that does use a full local index.
>> >>
>> >> (2) Access the information of the dbpedia ReferencedSite
>> >> ---------
>> >>
>> >> To get the required information you will need to use the Entityhub
>> >> API. See the code samples below.
>> >>
>> >>     import org.apache.stanbol.entityhub.servicesapi.site.SiteManager;
>> >>
>> >>     //inject a reference to the Entityhub SiteManager
>> >>     @Reference
>> >>     SiteManager siteManager;
>> >>
>> >>     //siteName is the name of the Referenced Site (most likely
>> >>     //'dbpedia' or 'dbpedia-cached')
>> >>     private void someMethod(String siteName, String entityId){
>> >>
>> >>         Site site = siteManager.getSite(siteName);
>> >>         //check for not null (site with that name is not active)
>> >>         Entity entity = site.getEntity(entityId);
>> >>         Representation data = entity.getRepresentation();
>> >>         //get the RDF type values of the Entity
>> >>         Iterator<Reference> types = data.getReferences(RDF_TYPE);
>> >>         //iterate over the types and check for dbp-ont:Organisation
>> >>         //(the full URI)
>> >>
>> >>         Iterator<Reference> industryValues =
>> >>             data.getReferences(DBPEDIA_INDUSTRY);
>> >>         //iterate over the industry values
>> >>
>> >>     }
>> >>
>> >> If you prefer to use the Clerezza RDF API instead of the API of
>> >> Representation you can also convert the Representation to RDF
>> >>
>> >>     import org.apache.stanbol.entityhub.model.clerezza.RdfValueFactory;
>> >>     import org.apache.clerezza.rdf.core.UriRef;
>> >>     import org.apache.clerezza.rdf.utils.GraphNode;
>> >>
>> >>     private static RdfValueFactory vf = RdfValueFactory.getInstance();
>> >>
>> >>     private GraphNode convertRepresentationToRdf(Representation r){
>> >>         return new GraphNode(new UriRef(r.getId()),
>> >>             vf.toRdfRepresentation(r).getRdfGraph());
>> >>     }
>> >>
>> >> The above code would create a new MGraph for each Representation. If
>> >> you want to add multiple Representations to the same Graph you need to
>> >> use
>> >>
>> >>     import org.apache.stanbol.commons.indexedgraph.IndexedMGraph;
>> >>
>> >>     private void someMethod(...) {
>> >>
>> >>         //a fast in-memory graph implementation
>> >>         MGraph graph = new IndexedMGraph();
>> >>         RdfValueFactory vf = new RdfValueFactory(graph);
>> >>
>> >>         //now use this RdfValueFactory to convert all Representations
>> >>         GraphNode entityRdfData =
>> >>             convertRepresentationToRdf(vf, representation);
>> >>
>> >>     }
>> >>
>> >>     private GraphNode convertRepresentationToRdf(RdfValueFactory vf,
>> >>             Representation r){
>> >>         return new GraphNode(new UriRef(r.getId()),
>> >>             vf.toRdfRepresentation(r).getRdfGraph());
>> >>     }
>> >>
>> >> hope this helps
>> >>
>> >> best
>> >> Rupert
>> >>
>> >> On Tue, May 14, 2013 at 2:29 PM, Manish Aggarwal <[email protected]>
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > Is it possible to query the dbpedia database from a custom enhancement
>> >> > engine and find out more about a keyword? For example, if the keyword
>> >> > is classified as an organization (dbp-ont:Organisation), I would be
>> >> > interested to know which industry (dbpedia-owl:industry) this keyword
>> >> > belongs to. In a custom enhancement engine, how can I get the required
>> >> > information?
>> >> >
>> >> > Regards,
>> >> > Manish
>> >>
>> >>
>> >>
>> >> --
>> >> | Rupert Westenthaler             [email protected]
>> >> | Bodenlehenstraße 11                             ++43-699-11108907
>> >> | A-5500 Bischofshofen
>> >>
>>
>>
>>
>> --
>> | Rupert Westenthaler             [email protected]
>> | Bodenlehenstraße 11                             ++43-699-11108907
>> | A-5500 Bischofshofen
>>
>
>
>
> --
> Thanks
> Harish



--
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
