RE: PathHierarchyTokenizerFactory and facet_count
Yes, that solved my problem. There must be an implicit facet.limit set, because I tried the same URL query with facet.limit=1 and got back records with "EARTH SCIENCE>GEOGRAPHIC REGION>ARCTIC".

Cheers!
Endre

-----Original Message-----
From: Upayavira [mailto:u...@odoko.co.uk]
Sent: 28 September 2015 14:01
To: solr-user@lucene.apache.org
Subject: Re: PathHierarchyTokenizerFactory and facet_count

There is also facet.limit, which says how many facet entries to return. Is that catching you?

The document either matches your query or it doesn't. If it does, then all values of the Parameter field should be included in your faceting. But perhaps not all facet buckets are being returned to you, so try facet.limit=100 or such.

Upayavira

On Mon, Sep 28, 2015, at 11:47 AM, Moen Endre wrote:
> How does facet_count work with a facet field that is defined as
> solr.PathHierarchyTokenizerFactory?
>
> I have multiple records that contain the field Parameter, which is of
> type PathHierarchyTokenizerFactory. E.g.
>
> "Parameter": [
>   "EARTH SCIENCE>OCEANS>OCEAN TEMPERATURE>WATER TEMPERATURE",
>   "EARTH SCIENCE>OCEANS>OCEAN PRESSURE>WATER PRESSURE",
>   "EARTH SCIENCE>OCEANS>OCEAN ACOUSTICS>ACOUSTIC VELOCITY",
>   "EARTH SCIENCE>ACOUSTIC",
>   "EARTH SCIENCE>VELOCITY",
>   "EARTH SCIENCE>ACOBAR | ACOUSTIC TECHNOLOGY FOR OBSERVING THE INTERIOR OF THE ARCTIC OCEAN",
>   "EARTH SCIENCE>GEOGRAPHIC REGION>POLAR",
>   "EARTH SCIENCE>GEOGRAPHIC REGION>ARCTIC"
> ],
>
> But when I run a query to get all facet counts for Parameter, with this
> query:
>
> http://localhost:8983/solr/nmdc/query?q=*:*&facet=true&rows=0&facet.mincount=1&facet.field=Parameter
>
> the last two entries from this record, "EARTH SCIENCE>GEOGRAPHIC REGION>POLAR"
> and "EARTH SCIENCE>GEOGRAPHIC REGION>ARCTIC", are missing from the
> facet_counts, which looks like:
>
> "facet_counts":{
>   "facet_queries":{},
>   "facet_fields":{
>     "Parameter":[
>       "EARTH SCIENCE",228,
>       "EARTH SCIENCE>OCEANS",128,
>       "EARTH SCIENCE>OCEANS>OCEAN TEMPERATURE",100,
>       "EARTH SCIENCE>OCEANS>SALINITY/DENSITY",90,
>       ...
>
> I'm running Solr 5.0.
>
> Why does the query seem to omit some of the Parameter entries from
> records?
>
> Path is configured with:
>
> class="solr.PathHierarchyTokenizerFactory" delimiter=">" />
>
> Cheers
> Endre
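A minimal sketch of the fix discussed in this thread, assuming the core name and field from the original query. Solr's facet.limit defaults to 100, and a negative value removes the cap, so passing it explicitly makes the request independent of that default. The class and method names (FacetQueryUrl, build) are made up for illustration; only the request parameters are Solr's.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds the facet request from the thread with an explicit facet.limit.
final class FacetQueryUrl {

    // facetLimit < 0 asks Solr for every bucket; omitting the parameter means 100.
    static String build(String baseUrl, String field, int facetLimit)
            throws UnsupportedEncodingException {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");
        params.put("rows", "0");
        params.put("facet", "true");
        params.put("facet.field", field);
        params.put("facet.mincount", "1");
        params.put("facet.limit", Integer.toString(facetLimit));

        // Encode each key and value on its own, then join the pairs with '&'.
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(URLEncoder.encode(e.getKey(), "UTF-8") + "="
                    + URLEncoder.encode(e.getValue(), "UTF-8"));
        }
        return baseUrl + "?" + query;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build("http://localhost:8983/solr/nmdc/query", "Parameter", -1));
    }
}
```

With facet.limit=-1 the response can get large on high-cardinality fields, so a bounded limit combined with facet.offset paging may be the better choice outside of debugging.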
PathHierarchyTokenizerFactory and facet_count
How does facet_count work with a facet field that is defined as solr.PathHierarchyTokenizerFactory?

I have multiple records that contain the field Parameter, which is of type PathHierarchyTokenizerFactory. E.g.

"Parameter": [
  "EARTH SCIENCE>OCEANS>OCEAN TEMPERATURE>WATER TEMPERATURE",
  "EARTH SCIENCE>OCEANS>OCEAN PRESSURE>WATER PRESSURE",
  "EARTH SCIENCE>OCEANS>OCEAN ACOUSTICS>ACOUSTIC VELOCITY",
  "EARTH SCIENCE>ACOUSTIC",
  "EARTH SCIENCE>VELOCITY",
  "EARTH SCIENCE>ACOBAR | ACOUSTIC TECHNOLOGY FOR OBSERVING THE INTERIOR OF THE ARCTIC OCEAN",
  "EARTH SCIENCE>GEOGRAPHIC REGION>POLAR",
  "EARTH SCIENCE>GEOGRAPHIC REGION>ARCTIC"
],

But when I run a query to get all facet counts for Parameter, with this query:

http://localhost:8983/solr/nmdc/query?q=*:*&facet=true&rows=0&facet.mincount=1&facet.field=Parameter

the last two entries from this record, "EARTH SCIENCE>GEOGRAPHIC REGION>POLAR" and "EARTH SCIENCE>GEOGRAPHIC REGION>ARCTIC", are missing from the facet_counts, which looks like:

"facet_counts":{
  "facet_queries":{},
  "facet_fields":{
    "Parameter":[
      "EARTH SCIENCE",228,
      "EARTH SCIENCE>OCEANS",128,
      "EARTH SCIENCE>OCEANS>OCEAN TEMPERATURE",100,
      "EARTH SCIENCE>OCEANS>SALINITY/DENSITY",90,
      ...

I'm running Solr 5.0.

Why does the query seem to omit some of the Parameter entries from records?

Path is configured with:

Cheers
Endre
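For intuition on the counts in this message: with delimiter ">", a path-hierarchy tokenizer indexes each value together with all of its ancestor paths, so every value adds to the bucket of each prefix. That is why "EARTH SCIENCE" has the highest count, and why the number of distinct buckets grows quickly. The sketch below mimics only that forward expansion in plain Java. It is not the Lucene implementation, it ignores tokenizer options such as replace, skip, and reverse, and the names PathPrefixes and expand are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: expands one path into the prefix tokens a path-hierarchy
// tokenizer would be expected to index for it.
final class PathPrefixes {

    static List<String> expand(String path, char delimiter) {
        List<String> tokens = new ArrayList<>();
        int from = 0;
        while (true) {
            int next = path.indexOf(delimiter, from);
            if (next < 0) {
                tokens.add(path);               // the full path is the last token
                return tokens;
            }
            tokens.add(path.substring(0, next)); // ancestor path up to this delimiter
            from = next + 1;
        }
    }

    public static void main(String[] args) {
        for (String token : expand("EARTH SCIENCE>GEOGRAPHIC REGION>ARCTIC", '>')) {
            System.out.println(token);
        }
    }
}
```

Each printed token corresponds to one facet bucket that the record contributes to.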
RE: testing with EmbeddedSolrServer
Mikhail,

The purpose of using EmbeddedSolrServer is for testing, not for running as main(). Is there a best practice for integration-testing Solr, or for validating that queries to Solr return the expected result?

E.g. I have this bit of production code:

private String getStartAndStopDateIntersectsRange(Date beginDate, Date endDate) {
    ...
    dateQuery = "( (Start_Date:[* TO " + endDate + "] AND Stop_Date:[" + beginDate + " TO *])"
        + " OR (Start_Date:[* TO " + endDate + "] AND !Stop_Date:[* TO *])"
        + " OR (!Start_Date:[* TO *] AND Stop_Date:[" + beginDate + " TO *]) )";
    ...
}

And I would like to write a test case that returns only the records that intersect a given date range.

Cheers
Endre

-----Original Message-----
From: Mikhail Khludnev [mailto:mkhlud...@griddynamics.com]
Sent: 31 August 2015 15:02
To: solr-user
Subject: Re: testing with EmbeddedSolrServer

Endre,

As I suggested before, consider avoiding the test framework; just put all the code interacting with EmbeddedSolrServer into a main() method.

On Mon, Aug 31, 2015 at 12:15 PM, Moen Endre wrote:
> Hi Mikhail,
>
> I'm trying to read 7-8 XML files that contain realistic data from our
> production server. Then I would like to read this data into
> EmbeddedSolrServer to test for edge cases in our custom date search.
> The use of EmbeddedSolrServer is purely to separate the data testing
> from any environment that might change over time.
>
> I would also like to avoid writing plumbing code to import each field
> from the XML, since I already have a working DIH.
>
> I tried adding synchronous=true, but it doesn’t look like it makes Solr
> complete the import before doing a search.
>
> Looking at the log, it doesn’t seem to process the import request:
>
> [searcherExecutor-6-thread-1-processing-{core=nmdc}] DEBUG o.apache.solr.core.SolrCore.Request - [nmdc] webapp=null path=null params={q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false&event=firstSearcher}
> ...
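One thing worth checking in the production snippet quoted in this message: unless the elided part of the method already formats the dates, concatenating a java.util.Date into the query string uses Date.toString(), which does not produce the ISO-8601 UTC form (for example 2015-08-31T00:00:00Z) that Solr date fields expect in range queries. The sketch below assumes the same Start_Date and Stop_Date fields and builds the same three-clause query from java.time.Instant values. It also adds a plain-Java predicate with the same logic, which a test could use as an oracle for which records should come back. DateRangeQuery, intersects, and matches are illustrative names, not part of the original code.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Sketch: date-range intersection query with ISO-8601 instants, plus a test oracle.
final class DateRangeQuery {

    // Same clauses as the production code, with dates rendered as e.g. 2015-08-31T00:00:00Z.
    static String intersects(Instant begin, Instant end) {
        String b = DateTimeFormatter.ISO_INSTANT.format(begin);
        String e = DateTimeFormatter.ISO_INSTANT.format(end);
        return "( (Start_Date:[* TO " + e + "] AND Stop_Date:[" + b + " TO *])"
             + " OR (Start_Date:[* TO " + e + "] AND !Stop_Date:[* TO *])"
             + " OR (!Start_Date:[* TO *] AND Stop_Date:[" + b + " TO *]) )";
    }

    // In-memory equivalent of the query; null stands for a missing field.
    // Bounds are inclusive, matching the square brackets in the query.
    static boolean matches(Instant start, Instant stop, Instant begin, Instant end) {
        boolean startOk = start != null && !start.isAfter(end);
        boolean stopOk = stop != null && !stop.isBefore(begin);
        return (startOk && stopOk)
            || (startOk && stop == null)
            || (start == null && stopOk);
    }

    public static void main(String[] args) {
        Instant begin = Instant.parse("2015-08-01T00:00:00Z");
        Instant end = Instant.parse("2015-08-31T00:00:00Z");
        System.out.println(intersects(begin, end));
    }
}
```

Note that a record with neither Start_Date nor Stop_Date matches none of the three clauses, which may or may not be the intended behaviour.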
RE: testing with EmbeddedSolrServer
Hi Mikhail,

I'm trying to read 7-8 XML files that contain realistic data from our production server. Then I would like to read this data into EmbeddedSolrServer to test for edge cases in our custom date search. The use of EmbeddedSolrServer is purely to separate the data testing from any environment that might change over time.

I would also like to avoid writing plumbing code to import each field from the XML, since I already have a working DIH.

I tried adding synchronous=true, but it doesn’t look like it makes Solr complete the import before doing a search.

Looking at the log, it doesn’t seem to process the import request:

[searcherExecutor-6-thread-1-processing-{core=nmdc}] DEBUG o.apache.solr.core.SolrCore.Request - [nmdc] webapp=null path=null params={q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false&event=firstSearcher}
...
[TEST-TestSolrEmbeddedServer.testNodeConfigConstructor-seed#[41C3C11DE20DD5CE]] INFO org.apache.solr.core.CoreContainer - registering core: nmdc
10:48:31.613 [TEST-TestSolrEmbeddedServer.testNodeConfigConstructor-seed#[41C3C11DE20DD5CE]] INFO o.apache.solr.core.SolrCore.Request - [nmdc] webapp=null path=/dataimport2 params={qt=%2Fdataimport2&command=full-import%26clean%3Dtrue%26synchronous%3Dtrue} status=0 QTime=1
{responseHeader={status=0,QTime=1},initArgs={defaults={config=dih-config.xml}},command=full-import&clean=true&synchronous=true,status=idle,importResponse=,statusMessages={}}
[TEST-TestSolrEmbeddedServer.testNodeConfigConstructor-seed#[41C3C11DE20DD5CE]] DEBUG o.apache.solr.core.SolrCore.Request - [nmdc] webapp=null path=/select params={q=*%3A*}
[TEST-TestSolrEmbeddedServer.testNodeConfigConstructor-seed#[41C3C11DE20DD5CE]] DEBUG o.a.s.h.component.QueryComponent - process: q=*:*&df=text&rows=10&echoParams=explicit
[searcherExecutor-6-thread-1-processing-{core=nmdc}] DEBUG o.a.s.h.component.QueryComponent - process: q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false&df=text&event=firstSearcher&rows=10&echoParams=explicit
[searcherExecutor-6-thread-1-processing-{core=nmdc}] DEBUG o.a.s.search.stats.LocalStatsCache - ## GET {q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false&df=text&event=firstSearcher&rows=10&echoParams=explicit}
[searcherExecutor-6-thread-1-processing-{core=nmdc}] INFO o.apache.solr.core.SolrCore.Request - [nmdc] webapp=null path=null params={q=static+firstSearcher+warming+in+solrconfig.xml&distrib=false&event=firstSearcher} hits=0 status=0 QTime=36
[searcherExecutor-6-thread-1-processing-{core=nmdc}] INFO org.apache.solr.core.SolrCore - QuerySenderListener done.
[searcherExecutor-6-thread-1-processing-{core=nmdc}] INFO org.apache.solr.core.SolrCore - [nmdc] Registered new searcher Searcher@28be2785[nmdc] main{ExitableDirectoryReader(UninvertingDirectoryReader())}
...
[TEST-TestSolrEmbeddedServer.testNodeConfigConstructor-seed#[41C3C11DE20DD5CE]] INFO org.apache.solr.update.SolrCoreState - Closing SolrCoreState
[TEST-TestSolrEmbeddedServer.testNodeConfigConstructor-seed#[41C3C11DE20DD5CE]] INFO o.a.solr.update.DefaultSolrCoreState - SolrCoreState ref count has reached 0 - closing IndexWriter
[TEST-TestSolrEmbeddedServer.testNodeConfigConstructor-seed#[41C3C11DE20DD5CE]] INFO o.a.solr.update.DefaultSolrCoreState - closing IndexWriter with IndexWriterCloser
[TEST-TestSolrEmbeddedServer.testNodeConfigConstructor-seed#[41C3C11DE20DD5CE]] DEBUG o.apache.solr.update.SolrIndexWriter - Closing Writer DirectUpdateHandler2

Cheers
Endre

-----Original Message-----
From: Mikhail Khludnev [mailto:mkhlud...@griddynamics.com]
Sent: 25 August 2015 19:43
To: solr-user
Subject: Re: testing with EmbeddedSolrServer

Hello,

I'm trying to guess what you are doing; it's not clear so far. I found
http://stackoverflow.com/questions/11951695/embedded-solr-dih

My conclusion: if you play with DIH and EmbeddedSolrServer, you'd better avoid the third beast; you don't need to bother with tests. I guess that main() is over while DIH is still running in a background thread. You need to loop on the status command until the import is over, or add the synchronous=true parameter to the full-import command, which should switch it to synchronous mode:
https://github.com/apache/lucene-solr/blob/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DataImportHandler.java#L199

Take care

On Tue, Aug 25, 2015 at 4:41 PM, Moen Endre wrote:
> Is there an example of integration testing with EmbeddedSolrServer
> that loads data from a DataImportHandler and then queries the data?
> I've tried doing this based on
> org.apache.solr.client.solrj.embedded.TestEmbeddedSolrServerConstructors,
> but no data is being imported. Here is the test class I've tried:
> https://gist.github.com/emoen/5d0a28df91c4c1127238
>
> I've also tried writing a test by extending AbstractSolrTestCase - but
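A possible reading of the log in this message, offered as a guess rather than a diagnosis: the request shows command=full-import%26clean%3Dtrue%26synchronous%3Dtrue, which suggests that clean and synchronous were sent inside the value of command instead of as separate parameters. The handler then reports status=idle with an empty importResponse. The sketch below uses only the JDK to show the difference between the two encodings, and adds a small polling helper for the "loop on the status command" suggestion. DihRequest, its methods, and the status supplier are hypothetical; how the status is actually fetched from Solr is left to the caller, and "idle" is the status value shown in the log above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;
import java.util.function.Supplier;

// Sketch: separate DIH parameters versus one fused command value, plus status polling.
final class DihRequest {

    // URL-encodes each key and value separately and joins the pairs with '&'.
    static String encode(Map<String, String> params) throws UnsupportedEncodingException {
        StringJoiner out = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            out.add(URLEncoder.encode(e.getKey(), "UTF-8") + "="
                    + URLEncoder.encode(e.getValue(), "UTF-8"));
        }
        return out.toString();
    }

    // Each option is its own parameter, so the handler sees command=full-import.
    static Map<String, String> fullImport() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("command", "full-import");
        p.put("clean", "true");
        p.put("synchronous", "true");
        return p;
    }

    // Polls a caller-supplied status source until it reports "idle" or polls run out.
    static boolean awaitIdle(Supplier<String> status, int maxPolls, long sleepMillis)
            throws InterruptedException {
        for (int i = 0; i < maxPolls; i++) {
            if ("idle".equals(status.get())) {
                return true;
            }
            Thread.sleep(sleepMillis);
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(encode(fullImport()));

        // The fused form reproduces the parameter string seen in the log.
        Map<String, String> fused = new LinkedHashMap<>();
        fused.put("command", "full-import&clean=true&synchronous=true");
        System.out.println(encode(fused));
    }
}
```

If polling is used instead of synchronous=true, the import may not have left the idle state yet when the first poll happens, so a test would probably also want to check the imported document count before asserting on query results.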
testing with EmbeddedSolrServer
Is there an example of integration testing with EmbeddedSolrServer that loads data from a DataImportHandler and then queries the data? I've tried doing this based on org.apache.solr.client.solrj.embedded.TestEmbeddedSolrServerConstructors, but no data is being imported. Here is the test class I've tried:
https://gist.github.com/emoen/5d0a28df91c4c1127238

I've also tried writing a test by extending AbstractSolrTestCase, but haven't got this working. I've documented some of the log output here:
http://stackoverflow.com/questions/32052642/solrcorestate-already-closed-with-unit-test-using-embeddedsolrserver-v-5-2-1

Should I extend AbstractSolrTestCase or SolrTestCaseJ4 when writing tests?

Cheers
Endre