Hi Alessandro,

The Entityhub Indexing Tool supports two indexing modes:

Copied from the README of the genericrdf indexing tool:
> 1. Iterate over the data and lookup the scores for entities (default).
> For this mode the "entityDataIterable" and an "entityScoreProvider" MUST BE
> configured. If no entity scores are available, a default entityScoreProvider
> provides no entity scores. This mode is typically used to index all entities
> of a dataset.
> 2. Iterate over the entity IDs and Scores and lookup the data. For this Mode
> an "entityIdIterator" and an "entityDataProvider" MUST BE configured. This
> mode is typically used if only a small sub-set of a large dataset is indexed.
> This might be the case if Entity-Scores are available and users want only to
> index the e.g. 10000 most important Entities or if a dataset contains
> Entities of many different types but one wants only include entities of a
> specific type (e.g. Species in DBpedia).

The genericRDF Indexing Tool uses (1) as default. The DBpedia Indexing
Tool uses (2) as default. But you can change the default by changing
the configuration of the indexing.properties file.
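
To make the difference between the two modes concrete, here is a rough
Python sketch of the two control flows. It is not the Entityhub code; the
function names and the dict-based "dataset" are purely illustrative
assumptions.

```python
from typing import Callable, Dict, Iterable, Iterator, Optional, Tuple

Entity = Dict[str, object]  # hypothetical stand-in for an entity's data


def index_mode_1(
    entity_data: Iterable[Tuple[str, Entity]],
    score_for: Callable[[str, Entity], Optional[float]],
) -> Iterator[Tuple[str, Entity, Optional[float]]]:
    """Mode (1): iterate over the data, look up a score for each entity."""
    for entity_id, data in entity_data:
        yield entity_id, data, score_for(entity_id, data)


def index_mode_2(
    ids_and_scores: Iterable[Tuple[str, float]],
    data_for: Callable[[str], Optional[Entity]],
) -> Iterator[Tuple[str, Entity, float]]:
    """Mode (2): iterate over ids and scores, look up the data for each id."""
    for entity_id, score in ids_and_scores:
        data = data_for(entity_id)
        if data is not None:  # ids without data are skipped in this sketch
            yield entity_id, data, score


# Toy dataset with three entities.
dataset = {
    "urn:a": {"label": "A"},
    "urn:b": {"label": "B"},
    "urn:c": {"label": "C"},
}

# Mode (1) visits every entity, here without any scores.
all_entities = list(index_mode_1(dataset.items(), lambda _id, _data: None))

# Mode (2) visits only the ids listed together with their scores.
subset = list(index_mode_2([("urn:b", 10.0)], dataset.get))
```

In this sketch mode (1) always touches the whole dataset, while mode (2)
only touches the ids that come with a score, which is why it suits the
"index only a sub-set" case.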

If you want scores, you will need to configure an EntityScoreProvider
when using indexing mode (1) and an EntityIterator when using indexing
mode (2).

For mode (1) an example is the EntityFieldScoreProvider. It reads the
Float score from a property in the RDF. So if you have scores you can
just add them to imported RDF data and use this implementation. If you
do not have scores available you can use the NoEntityScoreProvider or
the StaticEntityScoreProvider that will provide the same score for all
Entities.
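
As an illustration of the three score strategies mentioned above, here is a
small Python sketch. It only mimics the idea (field-based score, static
score, no score); the function names, the entity representation and the
property URI "http://example.org/score" are assumptions, not the Stanbol
API.

```python
from typing import Dict, List, Optional

# Hypothetical representation: property URI -> list of literal values.
Entity = Dict[str, List[str]]

SCORE_FIELD = "http://example.org/score"  # assumed property holding the score


def field_score(entity: Entity, score_field: str = SCORE_FIELD) -> Optional[float]:
    """Return the first value of score_field that parses as a float."""
    for value in entity.get(score_field, []):
        try:
            return float(value)
        except ValueError:
            continue  # ignore values that are not numbers
    return None


def static_score(entity: Entity, value: float = 1.0) -> float:
    """Return the same score for every entity."""
    return value


def no_score(entity: Entity) -> Optional[float]:
    """Provide no score at all."""
    return None


scored = {"http://example.org/score": ["42.5"]}
unscored = {"http://www.w3.org/2000/01/rdf-schema#label": ["No score here"]}
```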

For mode (2) an example is the LineBasedEntityIterator. It reads
entity ids and scores from a text file (e.g. the incoming_links.txt
file for dbpedia).

There is also an EntityIneratorToScoreProviderAdapter, but note that
this one loads all data from the incoming_links.txt file into an
in-memory map.
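
The memory trade-off can be sketched in Python as follows. This is not the
Stanbol implementation; it assumes a file with one "id<TAB>score" pair per
line, which may differ from the format your iterator is configured for.

```python
import io
from typing import Dict, Iterator, TextIO, Tuple


def iter_ids_and_scores(lines: TextIO, delimiter: str = "\t") -> Iterator[Tuple[str, float]]:
    """Lazily yield (entity id, score) pairs, one per non-empty line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entity_id, raw_score = line.split(delimiter, 1)
        yield entity_id, float(raw_score)


def load_score_map(lines: TextIO) -> Dict[str, float]:
    """Adapter idea: read every pair into a dict for lookups by id.

    Lookups become cheap, but the whole file has to fit into memory.
    """
    return dict(iter_ids_and_scores(lines))


# Illustrative data standing in for a scores file.
sample = "urn:a\t3\n\nurn:b\t12\n"
pairs = list(iter_ids_and_scores(io.StringIO(sample)))
score_map = load_score_map(io.StringIO(sample))
```

Iterating keeps only one line in memory at a time, while the map holds one
entry per entity, so for very large score files the map can get expensive.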

So depending on which indexing mode you prefer for the British
National Bibliography you might want to

* generate an incoming link file to be used in mode (2) with the
LineBasedEntityIterator
* generate scores and store them in an RDF file; import that file;
use mode (1) with the EntityFieldScoreProvider
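
For the incoming link file, a minimal Python sketch of counting incoming
links from an N-Triples dump could look like the following. The
"uri<TAB>count" output format, the function names and the sample
identifiers are assumptions for illustration; please check which column
order and delimiter your LineBasedEntityIterator configuration expects.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Matches N-Triples lines whose object is a URI: <s> <p> <o> .
# Lines with literal objects do not match and are ignored.
URI_TRIPLE = re.compile(r'^\s*(?:<[^>]*>|_:\S+)\s+<[^>]*>\s+<([^>]*)>\s*\.\s*$')

BNB_PREFIX = "http://bnb.data.bl.uk/id/"


def count_incoming_links(nt_lines: Iterable[str], prefix: str = BNB_PREFIX) -> List[Tuple[str, int]]:
    """Count how often each URI under prefix appears as a triple object."""
    counts: Counter = Counter()
    for line in nt_lines:
        match = URI_TRIPLE.match(line)
        if match and match.group(1).startswith(prefix):
            counts[match.group(1)] += 1
    return counts.most_common()  # highest counts first


def format_lines(ranked: Iterable[Tuple[str, int]]) -> List[str]:
    """Render one 'uri<TAB>count' line per entity (assumed format)."""
    return [f"{uri}\t{count}" for uri, count in ranked]


# Made-up triples using the BNB URI pattern.
triples = [
    '<http://bnb.data.bl.uk/id/resource/1> <http://purl.org/dc/terms/creator> <http://bnb.data.bl.uk/id/person/A> .',
    '<http://bnb.data.bl.uk/id/resource/2> <http://purl.org/dc/terms/creator> <http://bnb.data.bl.uk/id/person/A> .',
    '<http://bnb.data.bl.uk/id/resource/2> <http://purl.org/dc/terms/title> "A title" .',
    '<http://bnb.data.bl.uk/id/resource/2> <http://www.w3.org/2000/01/rdf-schema#seeAlso> <http://example.org/x> .',
]
ranked = count_incoming_links(triples)
lines = format_lines(ranked)
```

Note that this keeps one counter entry per linked entity in memory, so for
a very large dump a sort-and-count pipeline like the one in
entityrankings.sh may scale better.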

best
Rupert


On Fri, Aug 30, 2013 at 12:57 PM, Alessandro Adamou <ada...@cs.unibo.it> wrote:
> On 29/08/2013 19:16, Rupert Westenthaler wrote:
>>
>> [...] So when using the default incoming_links file those entities will
>> not be indexed.
>> However if you build the incoming_links file by using [1] they will be
>> indexed as [1] assigns the number of incoming links for the redirected
>> page also to the entities that redirect to those. [1] was also used to
>> build the index on dev.iks-project.eu
>>
>> [1]
>> http://svn.apache.org/repos/asf/stanbol/trunk/entityhub/indexing/dbpedia/dbpedia-3.8/entityrankings.sh
>
>
> Question: does it work the same way for ranking any other site, if I want
> entities to be scored by incoming links?
>
> I mean, if I build an incoming_links.txt file how can I configure the
> genericrdf indexer so that it will use it? Which entityScoreProvider should
> I set in the indexing.properties ?
>
> The site would be the British National Bibliography. Its resources have URIs
> of the form
>
> http://bnb.data.bl.uk/id/[resource|person|series]/{identifier}
>
>
> Thanks
>
> Alessandro
>
>
>
>>
>>> maybe it is so for the default index too - could that be it?
>>>
>>> best,
>>> Alessandro
>>>
>>>
>>>
>>> On 29/08/2013 15:45, Rupert Westenthaler wrote:
>>>>
>>>> Hi Manish,
>>>>
>>>> That's correct. In the default dbpedia index dbp-ont:wikiPageRedirects
>>>> are copied over to rdfs:seeAlso. So sending
>>>>
>>>> {
>>>>       "selected": [
>>>>           "rdfs:label" ],
>>>>       "offset": "0",
>>>>       "limit": "5",
>>>>       "constraints": [{
>>>>           "type": "reference",
>>>>           "field": "rdfs:seeAlso",
>>>>           "value": "http://dbpedia.org/resource/Athletic_shoe"
>>>>        }]
>>>> }
>>>>
>>>> to http://dev.iks-project.eu:8081/entityhub/site/dbpedia/query
>>>>
>>>> e.g. by using
>>>>
>>>>       curl -X POST -H "Content-Type:application/json" \
>>>>           --data "@testQuery.json" \
>>>>           http://dev.iks-project.eu:8081/entityhub/site/dbpedia/query
>>>>
>>>> should give you also the results you are expecting
>>>>
>>>> best
>>>> Rupert
>>>>
>>>> On Thu, Aug 29, 2013 at 4:26 PM, Manish Aggarwal <mani.i...@gmail.com>
>>>> wrote:
>>>>>
>>>>> Tennis_shoe redirects to Athletic_shoe.
>>>>>
>>>>> Now, I want to find all the dbpedia resources that redirect to
>>>>> "Athletic_shoe". How could I achieve this through FieldQuery?
>>>>>
>>>>> Note that the sample field query given below doesn't return anything
>>>>>
>>>>> {
>>>>>       "selected": [
>>>>>           "rdfs:label" ],
>>>>>       "offset": "0",
>>>>>       "limit": "5",
>>>>>       "constraints": [{
>>>>>           "type": "reference",
>>>>>           "field": "http://dbpedia.org/ontology/wikiPageRedirects",
>>>>>           "value": "http://dbpedia.org/resource/Athletic_shoe"
>>>>>        }]
>>>>> }
>>>>>
>>>>>
>>>>> Thanks in advance.
>>>>>
>>>>> Regards,
>>>>> Manish
>>>>>
>>>>> P.S. I have already created dbpedia.solrindex file
>>>>> with dbpedia-owl:wikiPageRedirects mapping.
>>>>
>>>>
>>>>
>>>
>>> --
>>> Alessandro Adamou, Ph.D.
>>>
>>> Knowledge Media Institute
>>> The Open University
>>> Walton Hall, Milton Keynes MK7 6AA
>>> United Kingdom
>>>
>>>
>>> "I will give you everything, just don't demand anything."
>>> (Ettore Petrolini, 1917)
>>>
>>> Not sent from my iSnobTechDevice
>>>
>>
>>
>
>
> --
> Alessandro Adamou, Ph.D.
>
> Knowledge Media Institute
> The Open University
> Walton Hall, Milton Keynes MK7 6AA
> United Kingdom
>
>
> "I will give you everything, just don't demand anything."
> (Ettore Petrolini, 1917)
>
> Not sent from my iSnobTechDevice
>



-- 
| Rupert Westenthaler             rupert.westentha...@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
