Hi Rafa,

Could you please open an issue and attach the file there? Alternatively, you can also send it to my email address. I will look into it next week.

Best,
Suat

On Fri, Sep 14, 2012 at 4:55 PM, Rafa Haro <rh...@zaizi.com> wrote:

> Hi Suat,
>
> I'm pretty sure. I can send you the enhancement graph in RDF if you want
> to check it yourself. I was going to post it here, but it is pretty large.
>
> Regards
>
> On 14/09/12 15:27, Suat Gonul wrote:
>
>> Hi Rafa,
>>
>> Are you sure the enhancements of this text do not contain other
>> entities? The contexts (URIs) on which the LDPath program is executed are
>> obtained as follows:
>>
>> Iterator<Triple> it = sci.getMetadata().filter(null,
>>         Properties.ENHANCER_ENTITY_REFERENCE, null);
>>
>> In other words, the source of the URIs is the metadata of the
>> ContentItem. Could you please check whether the enhancement graph of your
>> ContentItem contains any other Orange-related entities?
>>
>> Best,
>> Suat
>>
>> On 09/14/2012 04:15 PM, Rafa Haro wrote:
>>
>>> Hi all,
>>>
>>> I have been playing around with the DBPedia Spotlight engines these days.
>>> With Rupert's help (thanks again) I was able to successfully install
>>> and configure it as the default engine. My next step was to create a
>>> custom index in ContentHub to extract some data about the detected
>>> entities and store it in Solr. Specifically, I want to store in Solr
>>> the labels of each entity and its types (rdf:type). For example,
>>> for the entity President Obama I would get:
>>>
>>> Labels:
>>>
>>> Presidency of Barack Obama
>>> Présidence de Barack Obama
>>> Barack Obama
>>>
>>> Types:
>>>
>>> foaf:Person
>>> dbpedia-owl:Person
>>> dbpedia-owl:OfficeHolder
>>> dbpedia-owl:Agent
>>>
>>> In order to achieve this, I have tried to extend the default ContentHub
>>> LDPath program with this line:
>>>
>>> concepts = fn:concat(rdfs:label[@en]," ", rdf:type) :: xsd:string;
>>>
>>> I know that it might give me exactly what I want, but it was just a
>>> first test. Anyway, I found some issues when I submitted a document to
>>> store it in my new index:
>>>
>>> 1. The recognized entities weren't exactly the same as those you get
>>> from the DBPedia Spotlight demo
>>> (http://dbpedia-spotlight.github.com/demo/index.html), whose results
>>> are more accurate. I think that's because of the 'No common words'
>>> feature in the demo. I have been trying to configure it in the engine,
>>> but I wasn't able to.
>>>
>>> 2. The LDPath program is also executed for entities that are not
>>> recognized by the engine. For example, using the following text:
>>>
>>> "/Orange is a tropical to semitropical, evergreen, small flowering
>>> tree growing to about 5 to 8 m tall and bears seasonal fruits that
>>> measure about 3 inches in diameter and weighs about 100-150 g. Oranges
>>> are classified into two general categories, sweet and bitter, with the
>>> former being the type most commonly consumed. Popular varieties of the
>>> sweet orange include Valencia, Navel, Persian variety, and blood
>>> orange./"
>>>
>>> The enhancer only recognized Orange (fruit) but, when I submit the
>>> text to the ContentHub, I also get results for Orange, Texas (Place).
>>> I need to store only the information of the disambiguated entity.
>>>
>>> Thanks. Regards
>>>
>>> This message should be regarded as confidential. If you have received
>>> this email in error please notify the sender and destroy it
>>> immediately. Statements of intent shall only become binding when
>>> confirmed in hard copy by an authorised signatory.
>>>
>>> Zaizi Ltd is registered in England and Wales with the registration
>>> number 6440931. The Registered Office is 222 Westbourne Studios, 242
>>> Acklam Road, London W10 5JJ, UK.
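
Regarding the second issue in the thread (results for Orange, Texas appearing next to Orange (fruit)): if the enhancement graph turns out to contain several suggested entities for the same mention, one possible workaround is to keep only the highest-confidence suggestion per mention before running the LDPath program. The following is a minimal sketch of that selection step in plain Java, not Stanbol code. It assumes each suggestion carries a confidence value and a reference to the mention it belongs to; the `Suggestion` class, the `topEntityPerMention` method, the mention identifier, and the confidence numbers are all hypothetical and only serve as illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TopSuggestionFilter {

    /** Hypothetical stand-in for one suggested entity in the enhancement graph. */
    static final class Suggestion {
        final String mention;          // identifier of the text mention (assumed)
        final String entityReference;  // URI of the suggested entity
        final double confidence;       // confidence of the suggestion (assumed)

        Suggestion(String mention, String entityReference, double confidence) {
            this.mention = mention;
            this.entityReference = entityReference;
            this.confidence = confidence;
        }
    }

    /** Returns, for each mention, the entity URI with the highest confidence. */
    static List<String> topEntityPerMention(List<Suggestion> suggestions) {
        // LinkedHashMap keeps mentions in the order they were first seen.
        Map<String, Suggestion> best = new LinkedHashMap<>();
        for (Suggestion s : suggestions) {
            Suggestion current = best.get(s.mention);
            if (current == null || s.confidence > current.confidence) {
                best.put(s.mention, s);
            }
        }
        List<String> uris = new ArrayList<>();
        for (Suggestion s : best.values()) {
            uris.add(s.entityReference);
        }
        return uris;
    }

    public static void main(String[] args) {
        // Illustrative values only; real confidences come from the engine.
        List<Suggestion> suggestions = new ArrayList<>();
        suggestions.add(new Suggestion("urn:mention:1",
                "http://dbpedia.org/resource/Orange_(fruit)", 0.9));
        suggestions.add(new Suggestion("urn:mention:1",
                "http://dbpedia.org/resource/Orange,_Texas", 0.4));
        System.out.println(topEntityPerMention(suggestions));
    }
}
```

In a real setup the input list would have to be built from the ContentItem metadata, and the resulting URIs would replace the unfiltered set of entity references as LDPath contexts. Whether that is the right place to filter depends on what the enhancement graph actually contains.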