Hi Stephane

"_STORED_" is a prefix added to Lucene stored fields, i.e. the fields that are stored "as is" (unmodified) in the document and returned by Lucene when you ask for Documents. Lucene also creates (or can be told to create) fields which are not "stored": the original value can be analyzed, tokenized, etc. to create fields by which Lucene can search and sort, but those fields are not returned as part of the document.

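To illustrate the distinction, here is a minimal plain-Java sketch. It is a toy model, not Lucene's API and not the CRIS code: the class `StoredVsIndexedSketch`, its methods, and the lower-casing that stands in for analysis are all assumptions made for this example. Only the `_STORED_` prefix is taken from the log output quoted in this thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy model of one indexed document (hypothetical; this is NOT Lucene's API). */
public class StoredVsIndexedSketch {

    /** Prefix for stored field names, as seen in the quoted log output. */
    static final String STORED_PREFIX = "_STORED_";

    /** Stored fields: the unmodified values returned with the document. */
    final Map<String, List<String>> stored = new HashMap<>();

    /** Searchable fields: processed values, never returned with the document. */
    final Map<String, List<String>> indexed = new HashMap<>();

    /** Adds a value as a stored field (prefixed name) and as a searchable field (plain name). */
    void add(String field, String value) {
        stored.computeIfAbsent(STORED_PREFIX + field, k -> new ArrayList<>()).add(value);
        // Assumption: lower-casing is only a stand-in for whatever analysis is configured.
        indexed.computeIfAbsent(field, k -> new ArrayList<>())
               .add(value.toLowerCase(Locale.ROOT));
    }

    /** Exact-term lookup against the searchable fields only. */
    boolean matches(String field, String term) {
        return indexed.getOrDefault(field, Collections.<String>emptyList()).contains(term);
    }

    public static void main(String[] args) {
        StoredVsIndexedSketch doc = new StoredVsIndexedSketch();
        String rdfType = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
        doc.add(rdfType, "http://purl.org/ontology/bibo/Document");

        // A search addresses the plain field name and the processed value.
        System.out.println(doc.matches(rdfType, "http://purl.org/ontology/bibo/document"));
        // The prefixed name only identifies the stored copy, so a search on it finds nothing.
        System.out.println(doc.matches(STORED_PREFIX + rdfType, "http://purl.org/ontology/bibo/document"));
        // The stored copy keeps the original, unmodified value.
        System.out.println(doc.stored.get(STORED_PREFIX + rdfType));
    }
}
```

In this model a query on the plain field name is the expected case, and the prefixed name is only ever used to read the stored value back.
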
We add it to all stored fields before indexing (in the GraphIndexer.resourceToDocument method). I no longer remember exactly why this was needed. I think there was a peculiar problem with the sort order when the prefix was missing, but I am not sure what exactly required this "workaround".

Daniel

2014/1/13 Stephane Gamard <[email protected]>

> Hi all,
>
> I am trying to implement new conditions for CRIS and I’ve come across a
> peculiar problem. I’ve created a “BoostCondition” based on the same
> principle as the WildCardCondition. Here are its constructors and query
> method:
>
> public BoostCondition(VirtualProperty property, String value, Float boost) {
>     this.property = property;
>     this.value = value;
>     this.boost = boost;
> }
>
> public BoostCondition(UriRef uriRefProperty, String value, Float boost) {
>     this(new PropertyHolder(uriRefProperty, false), value, boost);
> }
>
> @Override
> protected Query query() {
>     TermQuery termQuery = new TermQuery(new Term(property.getStringKey(), value));
>     termQuery.setBoost(boost);
>     return termQuery;
> }
>
> Nothing fancy and here is how it is used:
>
> conditions.add(new BoostCondition(RDF.type,
>     "<http://www.patexpert.org/ontologies/pmo.owl#PatentPublication>",
>     new Float(0.5)));
>
> final List<NonLiteral> matchingNodes =
>     indexService.findResources(conditions, facetCollector);
> node.addPropertyValue(ECS.contentsCount, matchingNodes.size());
>
> All is well EXCEPT that in CRIS it will look for the field ‘RDF.type’
> while when indexed it is indexed as: “_STORED_”+RDF.type as per the
> following lucene Query:
> +J683e9b57eca321d4a268d4b24df62c9bfb7169b2:*sodium* +
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type:<
> http://www.patexpert.org/ontologies/pmo.owl#PatentPublication>^0.5
>
>
> Attached is the log with and without the custom condition
>
> INDEXING
> ==========
>
> 13.01.2014 16:05:50.165 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer CRIS Reindex Thread[386]: cache
> full or writes have ceased. Indexing...
> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing <
> http://fusepool.info/doc/pmc/3470790> considering 3 properties
> ([org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1,
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a,
> org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e])
> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1 with values 1
> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1(
> http://purl.org/dc/elements/1.1/subject) with value
> http://fusepool.info/id/caa7fc7a-f024-47d8-925d-151eb8600b6b
> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a with values 2
> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value
> http://fusepool.eu/ontologies/ecs#ContentItem
> 13.01.2014 16:05:50.379 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value
> http://purl.org/ontology/bibo/Document
> 13.01.2014 16:05:50.380 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e with values 1
> 13.01.2014 16:05:50.380 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e(J683e9b57eca321d4a268d4b24df62c9bfb7169b2)
> with value Two barriers for sodium in vascular endothelium? Vascular
> endothelium plays a key role in blood pressure regulation. Recently, it has
> been shown that a 5% increase of plasma sodium concentration (sodium
> excess) stiffens endothelial cells by about 25%, leading to cellular
> dysfunction. Surface measurements demonstrated that the endothelial
> glycocalyx (eGC), an anionic biopolymer, deteriorates when sodium is
> elevated. In view of these results, a two-barrier model for sodium exiting
> the circulation across the endothelium is suggested. The first sodium
> barrier is the eGC which selectively buffers sodium ions with its
> negatively charged prote-oglycans.The second sodium barrier is the
> endothelial plasma membrane which contains sodium channels. Sodium excess,
> in the presence of aldosterone, leads to eGC break-down and, in parallel,
> to an up-regulation of plasma membrane sodium channels. The following
> hypothesis is postulated: Sodium excess increases vascular sodium
> permeability. Under such con-ditions (e.g. high-sodium diet), day-by-day
> ingested sodium, instead of being readily buffered by the eGC and then
> rapidly excreted by the kidneys, is distributed in the whole body before
> being finally excreted. Gradually, the sodium overload damages the organism.
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing <
> http://fusepool.info/doc/pmc/3581062> considering 3 properties
> ([org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1,
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a,
> org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e])
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1 with values 2
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1(
> http://purl.org/dc/elements/1.1/subject) with value
> http://fusepool.info/id/4cfa649e-5eca-4349-bbb5-f782b87089d4
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@b309d7c1(
> http://purl.org/dc/elements/1.1/subject) with value
> http://fusepool.info/id/f421cc4a-619c-4189-a3ef-3c2025e50ac9
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a with values 2
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value
> http://fusepool.eu/ontologies/ecs#ContentItem
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.PropertyHolder@f5e5585a(
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type) with value
> http://purl.org/ontology/bibo/Document
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e with values 1
> 13.01.2014 16:05:50.388 *INFO* [CRIS Reindex Thread[386]]
> org.apache.clerezza.rdf.cris.GraphIndexer indexing
> org.apache.clerezza.rdf.cris.JoinVirtualProperty@cf3ff15e(J683e9b57eca321d4a268d4b24df62c9bfb7169b2)
> with value Diagnosis and treatment of mitochondrial myopathies
> Mitochondrial disorders are a heterogeneous group of disorders resulting
> from primary dysfunction of the respiratory chain. Muscle tissue is highly
> metabolically active, and therefore myopathy is a common element of the
> clinical presentation of these disorders, although this may be overshadowed
> by central neurological features. This review is aimed at a general medical
> and neurologist readership and provides a clinical approach to the
> recognition, investigation, and treatment of mitochondrial myopathies.
> Emphasis is placed on practical management considerations while including
> some recent updates in the field.
>
>
>
> SEARCH WITHOUT CUSTOM CONDITION
> ===============================
>
> 13.01.2014 16:07:48.343 *INFO* [627421185@qtp-612005121-38]
> org.apache.clerezza.rdf.cris.GraphIndexer luceneQuery:
> +J683e9b57eca321d4a268d4b24df62c9bfb7169b2:*sodium*
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.clerezza.rdf.cris.GraphIndexer _STORED_
> http://purl.org/dc/elements/1.1/subject :
> http://fusepool.info/id/caa7fc7a-f024-47d8-925d-151eb8600b6b
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.clerezza.rdf.cris.GraphIndexer _STORED_
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type :
> http://fusepool.eu/ontologies/ecs#ContentItem
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.clerezza.rdf.cris.GraphIndexer _STORED_
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type :
> http://purl.org/ontology/bibo/Document
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.clerezza.rdf.cris.GraphIndexer
> _STORED_J683e9b57eca321d4a268d4b24df62c9bfb7169b2 : Two barriers for sodium
> in vascular endothelium? Vascular endothelium plays a key role in blood
> pressure regulation. Recently, it has been shown that a 5% increase of
> plasma sodium concentration (sodium excess) stiffens endothelial cells by
> about 25%, leading to cellular dysfunction. Surface measurements
> demonstrated that the endothelial glycocalyx (eGC), an anionic biopolymer,
> deteriorates when sodium is elevated. In view of these results, a
> two-barrier model for sodium exiting the circulation across the endothelium
> is suggested. The first sodium barrier is the eGC which selectively buffers
> sodium ions with its negatively charged prote-oglycans.The second sodium
> barrier is the endothelial plasma membrane which contains sodium channels.
> Sodium excess, in the presence of aldosterone, leads to eGC break-down and,
> in parallel, to an up-regulation of plasma membrane sodium channels. The
> following hypothesis is postulated: Sodium excess increases vascular sodium
> permeability. Under such con-ditions (e.g. high-sodium diet), day-by-day
> ingested sodium, instead of being readily buffered by the eGC and then
> rapidly excreted by the kidneys, is distributed in the whole body before
> being finally excreted. Gradually, the sodium overload damages the organism.
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.clerezza.rdf.cris.GraphIndexer resource-uri :
> http://fusepool.info/doc/pmc/3470790
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.stanbol.entityhub.core.impl.SiteManagerImpl No Referenced Site
> registered for Entity
> http://fusepool.info/id/caa7fc7a-f024-47d8-925d-151eb8600b6b
> 13.01.2014 16:07:48.361 *INFO* [627421185@qtp-612005121-38]
> org.apache.stanbol.entityhub.core.impl.SiteManagerImpl No Referenced Site
> registered for Entity http://purl.org/ontology/bibo/Document
>
>
> SEARCH WITH CUSTOM CONDITION
> ============================
>
> 13.01.2014 16:14:32.746 *INFO* [806435093@qtp-612005121-40]
> org.apache.clerezza.rdf.cris.GraphIndexer luceneQuery:
> +J683e9b57eca321d4a268d4b24df62c9bfb7169b2:*sodium* +
> http://www.w3.org/1999/02/22-rdf-syntax-ns#type:<
> http://www.patexpert.org/ontologies/pmo.owl#PatentPublication>^0.5
