Thanks Daniel. Yes, I saw that. Stupid mistake on my part: I thought they were all stored. I think I found the problem with my boost. Currently the condition is expressed as a Lucene query, but it fails the query syntax when the key is an RDF UriRef: http://askagfdasd.jasd#toto:hello is not a valid field name (the ':' clashes with the field:term separator).
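
To spell out why it breaks: in the Lucene query syntax ':' separates the field from the term, and '/' is special as well in newer versions of the classic parser, so a raw URI used as the key gets misparsed. Lucene itself offers QueryParser.escape for backslash-escaping such characters. The sketch below is a hand-rolled stand-in so that it runs without Lucene on the classpath. The class QuerySyntax and both method names are made up for illustration, and I have not checked whether CRIS would accept an escaped field name.

```java
/**
 * Hypothetical helper (assumption: not part of CRIS or Lucene).
 * Mimics backslash-escaping of Lucene query-syntax characters.
 */
final class QuerySyntax {
    // Characters the classic Lucene query parser treats as syntax.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\/";

    private QuerySyntax() {}

    /** True if the key contains a character the query parser would interpret. */
    static boolean needsEscaping(String key) {
        for (int i = 0; i < key.length(); i++) {
            if (SPECIAL.indexOf(key.charAt(i)) >= 0) {
                return true;
            }
        }
        return false;
    }

    /** Puts a backslash in front of every query-syntax character. */
    static String escape(String key) {
        StringBuilder sb = new StringBuilder(key.length() + 8);
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```

The hashed key of the join virtual property (J683e9b57...) contains none of these characters, which would explain why that part of the query parses fine while the RDF.type part does not.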
I’m thinking about slug-ifying the name of the field instead of using the raw URI as the field key. What do you think?

_Stephane

On 13 Jan 2014, at 17:34, Daniel Spicar <[email protected]> wrote:

> Hi Stephane
>
> This is a prefix added to Lucene stored fields (the fields that actually
> get stored "as is" or unmodified in the document and returned by Lucene
> when asking for Documents). Lucene also creates (or can be told to do so)
> fields which are not "stored", thus one can analyze, tokenize, etc. the
> original value and create fields by which Lucene can search/sort - but
> those fields are not returned as part of the document.
>
> We add it to all stored fields before indexing (in the
> GraphIndexer.resourceToDocument method). I am not sure anymore why exactly
> this was needed. I think there was a peculiar problem with the sort order
> when this was missing but I am not sure what exactly needed this
> "workaround".
>
> Daniel
>
> 2014/1/13 Stephane Gamard <[email protected]>
>
>> Hi all,
>>
>> I am trying to implement new conditions for CRIS and I’ve come across a
>> peculiar problem. I’ve created a “BoostCondition” based on the same
>> principle as the WildCardCondition.
>> Here’s its constructor and query method:
>>
>> public BoostCondition(VirtualProperty property, String value, Float boost) {
>>     this.property = property;
>>     this.value = value;
>>     this.boost = boost;
>> }
>>
>> public BoostCondition(UriRef uriRefProperty, String value, Float boost) {
>>     this(new PropertyHolder(uriRefProperty, false), value, boost);
>> }
>>
>> @Override
>> protected Query query() {
>>     TermQuery termQuery = new TermQuery(new Term(property.getStringKey(), value));
>>     termQuery.setBoost(boost);
>>     return termQuery;
>> }
>>
>> Nothing fancy, and here is how it is used:
>>
>> conditions.add(new BoostCondition(RDF.type,
>>         "<http://www.patexpert.org/ontologies/pmo.owl#PatentPublication>", new Float(0.5)));
>>
>> final List<NonLiteral> matchingNodes = indexService.findResources(conditions, facetCollector);
>> node.addPropertyValue(ECS.contentsCount, matchingNodes.size());
>>
>> All is well EXCEPT that CRIS will look for the field ‘RDF.type’,
>> while at indexing time it is indexed as “_STORED_”+RDF.type, as per the
>> following Lucene query:
>> +J683e9b57eca321d4a268d4b24df62c9bfb7169b2:*sodium* +http://www.w3.org/1999/02/22-rdf-syntax-ns#type:<http://www.patexpert.org/ontologies/pmo.owl#PatentPublication>^0.5
>>
>> Attached is the log with and without the custom condition.
>>
>> INDEXING
>> ==========
>>
>> 13.01.2014 16:05:50.165 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer CRIS Reindex Thread[386]: cache full or writes have ceased. Indexing...
>> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing <http://fusepool.info/doc/pmc/3470790> considering 3 properties ([org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1, org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a, org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e])
>> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1 with values 1
>> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1(http://purl.org/dc/elements/1.1/subject) with value http://fusepool.info/id/caa7fc7a-f024-47d8-925d-151eb8600b6b
>> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a with values 2
>> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value http://fusepool.eu/ontologies/ecs#ContentItem
>> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value http://purl.org/ontology/bibo/Document
>> 13.01.2014 16:05:50.380 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e with values 1
>> 13.01.2014 16:05:50.380 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e(J683e9b57eca321d4a268d4b24df62c9bfb7169b2) with value Two barriers for sodium in vascular endothelium? Vascular endothelium plays a key role in blood pressure regulation. Recently, it has been shown that a 5% increase of plasma sodium concentration (sodium excess) stiffens endothelial cells by about 25%, leading to cellular dysfunction. Surface measurements demonstrated that the endothelial glycocalyx (eGC), an anionic biopolymer, deteriorates when sodium is elevated. In view of these results, a two-barrier model for sodium exiting the circulation across the endothelium is suggested. The first sodium barrier is the eGC which selectively buffers sodium ions with its negatively charged prote-oglycans.The second sodium barrier is the endothelial plasma membrane which contains sodium channels. Sodium excess, in the presence of aldosterone, leads to eGC break-down and, in parallel, to an up-regulation of plasma membrane sodium channels. The following hypothesis is postulated: Sodium excess increases vascular sodium permeability. Under such con-ditions (e.g. high-sodium diet), day-by-day ingested sodium, instead of being readily buffered by the eGC and then rapidly excreted by the kidneys, is distributed in the whole body before being finally excreted. Gradually, the sodium overload damages the organism.
>> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing <http://fusepool.info/doc/pmc/3581062> considering 3 properties ([org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1, org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a, org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e])
>> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1 with values 2
>> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1(http://purl.org/dc/elements/1.1/subject) with value http://fusepool.info/id/4cfa649e-5eca-4349-bbb5-f782b87089d4
>> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1(http://purl.org/dc/elements/1.1/subject) with value http://fusepool.info/id/f421cc4a-619c-4189-a3ef-3c2025e50ac9
>> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a with values 2
>> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value http://fusepool.eu/ontologies/ecs#ContentItem
>> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value http://purl.org/ontology/bibo/Document
>> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e with values 1
>> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]] org.apache.clerezza.rdf.cris.GraphIndexer indexing org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e(J683e9b57eca321d4a268d4b24df62c9bfb7169b2) with value Diagnosis and treatment of mitochondrial myopathies Mitochondrial disorders are a heterogeneous group of disorders resulting from primary dysfunction of the respiratory chain. Muscle tissue is highly metabolically active, and therefore myopathy is a common element of the clinical presentation of these disorders, although this may be overshadowed by central neurological features. This review is aimed at a general medical and neurologist readership and provides a clinical approach to the recognition, investigation, and treatment of mitochondrial myopathies. Emphasis is placed on practical management considerations while including some recent updates in the field.
>>
>> SEARCH WITHOUT CUSTOM CONDITION
>> ===============================
>>
>> 13.01.2014 16:07:48.343 *INFO* [627421185@qtp-612005121-38] org.apache.clerezza.rdf.cris.GraphIndexer luceneQuery: +J683e9b57eca321d4a268d4b24df62c9bfb7169b2:*sodium*
>> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.clerezza.rdf.cris.GraphIndexer _STORED_http://purl.org/dc/elements/1.1/subject : http://fusepool.info/id/caa7fc7a-f024-47d8-925d-151eb8600b6b
>> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.clerezza.rdf.cris.GraphIndexer _STORED_http://www.w3.org/1999/02/22-rdf-syntax-ns#type : http://fusepool.eu/ontologies/ecs#ContentItem
>> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.clerezza.rdf.cris.GraphIndexer _STORED_http://www.w3.org/1999/02/22-rdf-syntax-ns#type : http://purl.org/ontology/bibo/Document
>> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.clerezza.rdf.cris.GraphIndexer _STORED_J683e9b57eca321d4a268d4b24df62c9bfb7169b2 : Two barriers for sodium in vascular endothelium? Vascular endothelium plays a key role in blood pressure regulation. Recently, it has been shown that a 5% increase of plasma sodium concentration (sodium excess) stiffens endothelial cells by about 25%, leading to cellular dysfunction. Surface measurements demonstrated that the endothelial glycocalyx (eGC), an anionic biopolymer, deteriorates when sodium is elevated. In view of these results, a two-barrier model for sodium exiting the circulation across the endothelium is suggested. The first sodium barrier is the eGC which selectively buffers sodium ions with its negatively charged prote-oglycans.The second sodium barrier is the endothelial plasma membrane which contains sodium channels. Sodium excess, in the presence of aldosterone, leads to eGC break-down and, in parallel, to an up-regulation of plasma membrane sodium channels. The following hypothesis is postulated: Sodium excess increases vascular sodium permeability. Under such con-ditions (e.g. high-sodium diet), day-by-day ingested sodium, instead of being readily buffered by the eGC and then rapidly excreted by the kidneys, is distributed in the whole body before being finally excreted. Gradually, the sodium overload damages the organism.
>> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.clerezza.rdf.cris.GraphIndexer resource-uri : http://fusepool.info/doc/pmc/3470790
>> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.stanbol.entityhub.core.impl.SiteManagerImpl No Referenced Site registered for Entity http://fusepool.info/id/caa7fc7a-f024-47d8-925d-151eb8600b6b
>> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38] org.apache.stanbol.entityhub.core.impl.SiteManagerImpl No Referenced Site registered for Entity http://purl.org/ontology/bibo/Document
>>
>> SEARCH WITH CUSTOM CONDITION
>> ============================
>>
>> 13.01.2014 16:14:32.746 *INFO* [806435093@qtp-612005121-40] org.apache.clerezza.rdf.cris.GraphIndexer luceneQuery: +J683e9b57eca321d4a268d4b24df62c9bfb7169b2:*sodium* +http://www.w3.org/1999/02/22-rdf-syntax-ns#type:<http://www.patexpert.org/ontologies/pmo.owl#PatentPublication>^0.5
>>
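
PS: On the slug idea from the top of this mail, here is a rough sketch of what the key mapping could look like. Nothing in it exists in CRIS: the class FieldKeys and the methods slug and key are names I made up, and the choice of SHA-1 for the suffix is an assumption. A plain slug is lossy (two different URIs can collapse to the same string), so the sketch appends a few hash digits of the original URI to keep keys distinct.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper (assumption: neither the class nor its methods exist in CRIS). */
final class FieldKeys {
    private FieldKeys() {}

    /** Replaces every run of characters outside [A-Za-z0-9] with a single underscore. */
    static String slug(String uri) {
        StringBuilder sb = new StringBuilder(uri.length());
        boolean lastWasUnderscore = false;
        for (int i = 0; i < uri.length(); i++) {
            char c = uri.charAt(i);
            boolean safe = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
            if (safe) {
                sb.append(c);
                lastWasUnderscore = false;
            } else if (!lastWasUnderscore) {
                sb.append('_');
                lastWasUnderscore = true;
            }
        }
        return sb.toString();
    }

    /** Slug plus 8 hex digits of the URI's SHA-1, so URIs with equal slugs get different keys. */
    static String key(String uri) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1")
                    .digest(uri.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(8);
            for (int i = 0; i < 4; i++) {
                hex.append(String.format("%02x", digest[i] & 0xff));
            }
            return slug(uri) + "_" + hex;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }
}
```

The same mapping would have to be applied both when indexing and when building the condition's query, otherwise the lookup misses the field again.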
