Hi Rupert,

Thanks a lot for the clarification! It makes sense.

With best regards,
Rajan

On Mon, Jun 8, 2015 at 3:50 AM, Rupert Westenthaler <rupert.westentha...@gmail.com> wrote:

> Hi,
>
> The SolrYard does not support BNodes and the VCard RDF tends to use those.
>
> If you use the Entityhub Indexing Tool for importing the data you can
> try to set the "bnode-prefix" for the rdf indexing source (see
> STANBOL-765 [1] for details)
>
> best
> Rupert
>
> [1] https://issues.apache.org/jira/browse/STANBOL-765
>
> On Tue, Jun 2, 2015 at 6:13 PM, Rajan Shah <raja...@gmail.com> wrote:
> > Hi Rupert,
> >
> > Thanks again for the response.
> >
> > At present, it's just an observation that mainly with vcard I had issues
> > with queries. At the same time, I could get results with either custom
> > entities or even foaf.
> >
> > I will keep an eye on it and if I observe it again, I will submit a JIRA issue.
> >
> > With best regards,
> > Rajan
> >
> > On Tue, Jun 2, 2015 at 11:03 AM, Rupert Westenthaler
> > <rupert.westentha...@gmail.com> wrote:
> >
> >> Hi Rajan,
> >>
> >> Sorry I do not have enough time for a detailed answer. But the
> >> baseline is: EntityLinking does not work with the Clerezza Yard. Even
> >> if you would not encounter errors both performance and results would
> >> be much worse than with a SolrYard. This is because EntityLinking
> >> depends on features that are Solr Exclusive (e.g. the Solr Analyzers
> >> doing Stemming ... and the ranking of query results).
> >>
> >> If you find failing SPARQL queries in the log feel free to report as
> >> Issues in Jira. I will have a look.
> >>
> >> best
> >> Rupert
> >>
> >> On Sat, May 30, 2015 at 11:14 PM, Rajan Shah <raja...@gmail.com> wrote:
> >> > Hi,
> >> >
> >> > I can create Clerezza Yard successfully and query the data using SPARQL.
> >> > Now, when it comes to Named Entity Recognition the same issue persists.
> >> >
> >> > I would appreciate, if someone can provide some insight or potential
> >> > resolution.
> >> >
> >> > Thanks in advance,
> >> > Rajan
> >> >
> >> > These are the steps I followed:
> >> >
> >> > 1. Uploaded relevant ontology to local ontonet
> >> >
> >> > 2. Created Managed Site, uploaded triples
> >> >
> >> > 3. Verified the data exists via SPARQL query:
> >> >
> >> > <binding>
> >> > <result>
> >> > <binding name="ticker"><literal>AAPL</literal>
> >> > </binding><binding name="issuer"><literal>Apple Inc.</literal>
> >> > </binding><binding name="exchange"><literal>NASDAQ</literal></binding>
> >> > <binding name="currency"><literal>USD</literal>
> >> > </binding><binding name="instr">
> >> > <uri>http://finance.intellimind.io/secmaster/djia/AAPL</uri>
> >> > </binding>
> >> > </result>
> >> > </results></sparql>
> >> >
> >> > 4. Entityhub Linking
> >> >
> >> > Assuming prefix imind being http://finance.intellimind.io/secmaster
> >> > (so that namespace prefix can be verified)
> >> >
> >> > In the entityhub linking setup, within type mapping I am trying to map
> >> >
> >> > a. Type Mapping Setup
> >> > imind:ticker > rdfs:label
> >> > imind:exchange > rdfs:label
> >> > ...
> >> >
> >> > b. Select "Case Sensitivity"
> >> >
> >> > 5. Chain setup
> >> >
> >> > When I included it in the chain list, it doesn't capture a single entity
> >> > whereas it spent most of the time in this particular chain.
> >> >
> >> > - *tika* ( optional , TikaEngine)
> >> > - *langdetect* ( required , LanguageDetectionEnhancementEngine)
> >> > - *opennlp-sentence* ( required , OpenNlpSentenceDetectionEngine)
> >> > - *opennlp-token* ( required , OpenNlpTokenizerEngine)
> >> > - *opennlp-pos* ( required , OpenNlpPosTaggingEngine)
> >> > - *opennlp-ner* ( required , NamedEntityExtractionEnhancementEngine)
> >> > - *refdata-linking* ( required , EntityLinkingEngine)
> >> >
> >> > *Sample Text:*
> >> >
> >> > The Apple Inc. CEO Tim Cook spoke at dev conference. The Apple Inc. has
> >> > headquarter in US.
> >> > It's ticker symbol is AAPL, which trades on NASDAQ.
> >> >
> >> > On Mon, May 25, 2015 at 12:04 AM, Rajan Shah <raja...@gmail.com> wrote:
> >> >
> >> >> Hi,
> >> >>
> >> >> In order to use Clerezza Yard setup, I tried a very simple example
> >> >> outlined at the end.
> >> >>
> >> >> I would really appreciate, if someone can shed some light on
> >> >>
> >> >> a. Is there anything I am just completely missing here pertaining to
> >> >> "Named Graph" vs "Unions of Graphs" and reference? If that's the case,
> >> >> could you please clarify what would be relevant URI/IRI?
> >> >>
> >> >> b. What is the best way to debug such an issue? If SPARQL query fails,
> >> >> where should I see the logs indicating any issue as it doesn't appear in
> >> >> stdout logs?
> >> >>
> >> >> c. Is there any other simple alternative compared to this to achieve
> >> >> similar functionality? Is storing in Kiwi beneficial compared to this
> >> >> approach or do I have to have Apache Marmotta installed in order to use
> >> >> Kiwi?
> >> >>
> >> >> Thanks in advance,
> >> >> Rajan
> >> >>
> >> >> *1. Apache Stanbol Entityhub Yard: Clerezza Yard Configuration*
> >> >>
> >> >> Set following parameters
> >> >>
> >> >> ID: testYard
> >> >> Graph URI: http://test.io/ns/friends#
> >> >>
> >> >> *2. Setup Clerezza - SCB Jena TDB Storage Provider*
> >> >>
> >> >> Jena TDB directory: /<stanbol_dir>/<tdb_store>
> >> >> Default Graph Name: http://test.io/ns
> >> >> Weight: 105
> >> >>
> >> >> *3. Save the .ttl file into /<stanbol_dir>/<tdb_store>*
> >> >>
> >> >> @prefix vcard: <http://www.w3.org/2006/vcard/ns#> .
> >> >> @prefix rdfa: <http://www.w3.org/ns/rdfa#> .
> >> >> @prefix friends: <http://test.io/ns/friends#> .
> >> >>
> >> >> <http://test.io/ns/friends#AndrewSmith> a vcard:Individual;
> >> >>     vcard:fn "Andrew Smith";
> >> >>     vcard:title "Founder";
> >> >>     vcard:org "ABC LLC";
> >> >>     vcard:orgunit "Startup";
> >> >>     vcard:hasAddress [
> >> >>         a vcard:Work;
> >> >>         vcard:country-name "USA";
> >> >>         vcard:locality "New York";
> >> >>         vcard:region "New York"
> >> >>     ] .
> >> >>
> >> >> *4. I do see that, upon startup, it creates necessary index files within*
> >> >> /<stanbol_dir>/<tdb_store>
> >> >> directory. In addition, within UI, it also registers following
> >> >> TripleCollections in SPARQL Endpoint
> >> >>
> >> >> http://test.io/ns/friends#
> >> >>
> >> >> *5. SPARQL Query*
> >> >> -- query1
> >> >> PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>
> >> >> PREFIX friends: <http://test.io/ns/friends#>
> >> >> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> >> >>
> >> >> SELECT ?fn ?title ?org
> >> >> WHERE {
> >> >>     ?s vcard:fn ?fn ;
> >> >>        vcard:title ?title ;
> >> >>        vcard:org ?org .
> >> >> }
> >> >>
> >> >> OR
> >> >>
> >> >> -- query2
> >> >> PREFIX hmgr: <http://test.io/ns/friends#>
> >> >> PREFIX vcard: <http://www.w3.org/2006/vcard/ns#>
> >> >>
> >> >> SELECT ?Individual ?title
> >> >> WHERE { ?title vcard:title "Founder" }
> >> >>
> >> >> *Observations:*
> >> >>
> >> >> The above queries work perfectly fine on either command-line or Jena
> >> >> Fuseki as follows
> >> >>
> >> >> a. tdbquery --loc /<stanbol_dir>/<tdb_store> --query query1
> >> >> b. using fuseki user interface
> >> >>
> >> >> I tried a couple of alternatives such as GRAPH, NAMED, etc... however
> >> >> nothing helps. Is there any specific syntax that needs to be used for
> >> >> the SPARQL stanbol interface?
> >> >>
> >>
> >> --
> >> | Rupert Westenthaler rupert.westentha...@gmail.com
> >> | Bodenlehenstraße 11 ++43-699-11108907
> >> | A-5500 Bischofshofen
> >> | REDLINK.CO ..........................................................................
> >> | http://redlink.co/
> >
>
> --
> | Rupert Westenthaler rupert.westentha...@gmail.com
> | Bodenlehenstraße 11 ++43-699-11108907
> | A-5500 Bischofshofen
> | REDLINK.CO ..........................................................................
> | http://redlink.co/