Re: Clerezza Yard setup and SPARQL

Rajan Shah Tue, 02 Jun 2015 09:14:52 -0700

Hi Rupert,

Thanks again for the response.


So far this is only an observation: queries against the vcard data were
the ones that gave me trouble, whereas queries against custom entities,
or even foaf, returned results.

I will keep an eye on it and, if I observe it again, will submit a JIRA issue.

With best regards,
Rajan



On Tue, Jun 2, 2015 at 11:03 AM, Rupert Westenthaler <
[email protected]> wrote:

> Hi Rajan,
>
> Sorry, I do not have enough time for a detailed answer, but the
> bottom line is: EntityLinking does not work with the Clerezza Yard. Even
> if you did not encounter errors, both performance and results would
> be much worse than with a SolrYard. This is because EntityLinking
> depends on features that are exclusive to Solr (e.g. the Solr Analyzers
> doing stemming ... and the ranking of query results).
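
To illustrate why analysis matters for linking, here is a toy sketch in Python. It is not Solr's analysis chain: `analyze` is a hypothetical stand-in that only lowercases, splits on punctuation, and strips a trailing "s". The label and URI are taken from the SPARQL result quoted further down the thread.

```python
import re

# Label -> entity URI, using the values from the SPARQL result in this thread.
LABELS = {"Apple Inc.": "http://finance.intellimind.io/secmaster/djia/AAPL"}


def exact_lookup(span, labels):
    """Exact string match, roughly what a plain triple-pattern lookup gives."""
    return labels.get(span)


def analyze(text):
    """Toy analyzer (assumption, not Solr): lowercase, tokenize, naive plural stripping."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return tuple(t[:-1] if len(t) > 3 and t.endswith("s") else t for t in tokens)


def analyzed_lookup(span, labels):
    """Match on analyzed tokens instead of the raw label string."""
    index = {analyze(label): uri for label, uri in labels.items()}
    return index.get(analyze(span))


if __name__ == "__main__":
    print(exact_lookup("apple inc", LABELS))     # no exact match for this casing
    print(analyzed_lookup("apple inc", LABELS))  # matches after analysis
```

With exact matching, a mention must equal the stored label character for character; an analyzed index tolerates differences in case, punctuation, and simple inflection.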
>
> If you find failing SPARQL queries in the log, feel free to report them
> as issues in Jira. I will have a look.
>
> best
> Rupert
>
> On Sat, May 30, 2015 at 11:14 PM, Rajan Shah <[email protected]> wrote:
> > Hi,
> >
> > I can create a Clerezza Yard successfully and query the data using SPARQL.
> > However, when it comes to Named Entity Recognition, the same issue persists.
> >
> > I would appreciate it if someone could provide some insight or a potential
> > resolution.
> >
> > Thanks in advance,
> > Rajan
> >
> > These are the steps I followed:
> >
> > 1. Uploaded relevant ontology to local ontonet
> >
> > 2. Created Managed Site, uploaded triples
> >
> > 3. Verified the data exists via SPARQL query:
> >
> > <binding>
> > <result>
> > <binding name="ticker"><literal>AAPL</literal></binding>
> > <binding name="issuer"><literal>Apple Inc.</literal></binding>
> > <binding name="exchange"><literal>NASDAQ</literal></binding>
> > <binding name="currency"><literal>USD</literal></binding>
> > <binding name="instr">
> > <uri>http://finance.intellimind.io/secmaster/djia/AAPL</uri>
> > </binding>
> > </result>
> > </results></sparql>
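
A result document like the one above can be checked programmatically. The following is a minimal Python sketch, assuming a well-formed response in the W3C SPARQL Query Results XML Format; `SAMPLE` is a hypothetical complete document rebuilt from the values quoted above, and `parse_bindings` is an illustrative helper, not part of Stanbol.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SPARQL Query Results XML Format.
SPARQL_NS = {"s": "http://www.w3.org/2005/sparql-results#"}

# Hypothetical well-formed response rebuilt from the values quoted in the email.
SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="ticker"/>
    <variable name="issuer"/>
    <variable name="exchange"/>
    <variable name="currency"/>
    <variable name="instr"/>
  </head>
  <results>
    <result>
      <binding name="ticker"><literal>AAPL</literal></binding>
      <binding name="issuer"><literal>Apple Inc.</literal></binding>
      <binding name="exchange"><literal>NASDAQ</literal></binding>
      <binding name="currency"><literal>USD</literal></binding>
      <binding name="instr">
        <uri>http://finance.intellimind.io/secmaster/djia/AAPL</uri>
      </binding>
    </result>
  </results>
</sparql>"""


def parse_bindings(xml_text):
    """Return one dict per <result>, mapping each variable name to its value."""
    root = ET.fromstring(xml_text)
    rows = []
    for result in root.findall("s:results/s:result", SPARQL_NS):
        row = {}
        for binding in result.findall("s:binding", SPARQL_NS):
            # A binding wraps a single value element: <uri>, <literal> or <bnode>.
            value = binding[0]
            row[binding.get("name")] = (value.text or "").strip()
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in parse_bindings(SAMPLE):
        print(row)
```

Each row comes back as a plain dictionary, so a check such as `row["ticker"] == "AAPL"` confirms that the uploaded triples are visible to the endpoint.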
> >
> > 4. Entityhub Linking
> >
> > Assume the prefix imind stands for http://finance.intellimind.io/secmaster
> > (so that the namespace prefix can be verified).
> >
> > In the Entityhub linking setup, I am trying to map the following within
> > the type mapping:
> >
> > a. Type Mapping Setup
> > imind:ticker > rdfs:label
> > imind:exchange > rdfs:label
> > ...
> >
> > b. Select "Case Sensitivity"
> >
> >
> > 5. Chain setup
> >
> > When I include it in the chain listed below, it does not capture a single
> > entity, even though most of the time is spent in this particular chain.
> >
> >
> >
> >    - *tika* ( optional , TikaEngine)
> >    - *langdetect* ( required , LanguageDetectionEnhancementEngine)
> >    - *opennlp-sentence* ( required , OpenNlpSentenceDetectionEngine)
> >    - *opennlp-token* ( required , OpenNlpTokenizerEngine)
> >    - *opennlp-pos* ( required , OpenNlpPosTaggingEngine)
> >    - *opennlp-ner* ( required , NamedEntityExtractionEnhancementEngine)
> >    - *refdata-linking* ( required , EntityLinkingEngine)
> >
> >
> > *Sample Text:*
> >
> > The Apple Inc. CEO Tim Cook spoke at dev conference. The Apple Inc. has
> > headquarter in US. It's ticker symbol is AAPL, which trades on NASDAQ.
> >
> > On Mon, May 25, 2015 at 12:04 AM, Rajan Shah <[email protected]> wrote:
> >
> >> Hi,
> >>
> >> In order to use the Clerezza Yard setup, I tried the very simple example
> >> outlined at the end.
> >>
> >> I would really appreciate it if someone could shed some light on the
> >> following:
> >>
> >> a. Is there anything I am just completely missing here pertaining to
> >> "Named Graph" vs "Unions of Graphs" and reference? If that's the case,
> >> could you please clarify what would be relevant URI/IRI?
> >>
> >> b. What is the best way to debug such an issue? If SPARQL query fails,
> >> where should I see the logs indicating any issue as it doesn't appear in
> >> stdout logs?
> >>
> >> c. Is there any simpler alternative to this that achieves similar
> >> functionality? Is storing in Kiwi beneficial compared to this
> >> approach, or do I have to have Apache Marmotta installed in order to use
> >> Kiwi?
> >>
> >> Thanks in advance,
> >> Rajan
> >>
> >>
> >> *1. Apache Stanbol Entityhub Yard: Clerezza Yard Configuration*
> >>
> >> Set the following parameters:
> >>
> >> ID: testYard
> >> Graph URI: http://test.io/ns/friends#
> >>
> >> *2. Setup Clerezza - SCB Jena TDB Storage Provider*
> >>
> >> Jena TDB directory: /<stanbol_dir>/<tdb_store>
> >> Default Graph Name: http://test.io/ns
> >> Weight: 105
> >>
> >> *3. Save the .ttl file into /<stanbol_dir>/<tdb_store>*
> >>
> >> @prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
> >> @prefix rdfa: <http://www.w3.org/ns/rdfa#> .
> >> @prefix friends: <http://test.io/ns/friends#> .
> >>
> >> <http://test.io/ns/friends#AndrewSmith> a vcard:Individual;
> >>                     vcard:fn "Andrew Smith";
> >>                     vcard:title "Founder";
> >>                     vcard:org "ABC LLC";
> >>                     vcard:orgunit "Startup";
> >>                     vcard:hasAddress [
> >>                                         a vcard:Work;
> >>                                         vcard:country-name "USA";
> >>                                         vcard:locality "New York";
> >>                                         vcard:region "New York"
> >>                     ] .
> >>
> >> *4. Upon startup, the necessary index files are created within the
> >> /<stanbol_dir>/<tdb_store> directory.* In addition, the UI registers the
> >> following TripleCollections in the SPARQL Endpoint:
> >>
> >> http://test.io/ns/friends#
> >>
> >> *5. SPARQL Query*
> >> -- query1
> >> PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>
> >> PREFIX friends: <http://test.io/ns/friends#>
> >> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> >>
> >> SELECT ?fn ?title ?org
> >> WHERE {
> >>   ?s vcard:fn ?fn ;
> >>     vcard:title ?title ;
> >>     vcard:org ?org .
> >> }
> >>
> >> OR
> >>
> >> -- query2
> >> PREFIX hmgr: <http://test.io/ns/friends#>
> >> PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>
> >>
> >> SELECT ?individual
> >> WHERE { ?individual  vcard:title  "Founder" }
> >>
> >>
> >> *Observations:*
> >>
> >> The above queries work perfectly fine on either the command line or Jena
> >> Fuseki, as follows:
> >>
> >> a. tdbquery --loc /<stanbol_dir>/<tdb_store> --query query1
> >> b. using fuseki user interface
> >>
> >> I tried a couple of alternatives such as GRAPH, NAMED, etc.; however,
> >> nothing helps. Is there any specific syntax that needs to be used for the
> >> Stanbol SPARQL interface?
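
For reference, a version of query1 scoped to a named graph might look like the Python sketch below. It assumes the triples live in the graph named by the Yard's Graph URI (http://test.io/ns/friends#). `graph_scoped_query` is a hypothetical helper, and nothing in this thread confirms that the Stanbol endpoint accepts this form. The form encoding follows the standard SPARQL Protocol `query` parameter.

```python
from urllib.parse import urlencode

VCARD = "http://www.w3.org/2006/vcard/ns#"
GRAPH_URI = "http://test.io/ns/friends#"  # Graph URI from the Yard configuration


def graph_scoped_query(graph_uri):
    """Build query1 with its pattern wrapped in a GRAPH clause (hypothetical helper)."""
    return (
        f"PREFIX vcard: <{VCARD}>\n"
        "SELECT ?fn ?title ?org\n"
        "WHERE {\n"
        f"  GRAPH <{graph_uri}> {{\n"
        "    ?s vcard:fn ?fn ;\n"
        "       vcard:title ?title ;\n"
        "       vcard:org ?org .\n"
        "  }\n"
        "}\n"
    )


query = graph_scoped_query(GRAPH_URI)
# Body for an application/x-www-form-urlencoded POST to a SPARQL endpoint.
form_body = urlencode({"query": query})

if __name__ == "__main__":
    print(query)
```

Without the GRAPH clause the pattern is evaluated against the default graph only, which is one possible reason the same query can behave differently across stores that configure the default graph differently.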
> >>
> >>
> >>
> >>
>
>
>
> --
> | Rupert Westenthaler             [email protected]
> | Bodenlehenstraße 11                              ++43-699-11108907
> | A-5500 Bischofshofen
> | REDLINK.CO
> ..........................................................................
> | http://redlink.co/
>
