Re: is replication eating up OldGen space
Some more info: after one week the servers have the following status:

Master (indexing only)
+ looks good and has a heap size of about 6g from 10g OldGen
+ has meanwhile loaded the index from scratch twice via DIH
+ has added new documents into the existing index via DIH
+ has optimized and replicated
+ no full GC within one week

Slave A (search only) Online
- looks bad and has a heap size of 9.5g from 10g OldGen
+ was replicated
- several full GCs

Slave B (search only) Backup
+ looks good, has a heap size of 4g from 10g OldGen
+ was replicated
+ no full GC within one week

Conclusion:
+ DIH, processing, indexing, replication are fine
- the search is crap and "eats up" OldGen heap which can't be cleaned up by full GC. Maybe memory leaks or whatever...

Due to this, Solr 3.1 can _NOT_ be recommended as a high-availability, high-search-load search engine because of unclear heap problems caused by the search. The search is "out of the box", so no self-produced programming errors.

Are any tools available for Java to analyze this? (like valgrind or electric fence for C++)
Is it possible to analyze a heap dump produced with jvisualvm? Which tools?

Bernd

On 30.05.2011 15:51, Bernd Fehling wrote:
> Dear list,
>
> After switching from FAST to Solr I get the first _real_ data. This includes
> search times, memory consumption, performance of Solr, ...
> What I recognized so far is that something eats up my OldGen and I assume it
> might be replication.
>
> Current data:
> one master - indexing only
> two slaves - search only
> over 28 million docs
> single instance
> single core
> index size 140g
> current heap size 16g
>
> After startup I have about 4g heap in use and about 3.5g of OldGen. After one
> week and some replications, OldGen is filled close to 100 percent. If I start
> an optimize under this condition I get an OOM of heap.
> So my assumption is that something is eating up my heap.
> Any idea how to trace this down? Maybe a memory leak somewhere?
>
> Best regards
> Bernd
Re: How to display solr search results in Json format
Thanks for the reply. But I want to know how JSON does it internally, I mean how it displays results as field:value.

-
Thanks & Regards
Romi

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-display-solr-search-results-in-Json-format-tp3004734p3004768.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to display solr search results in Json format
Hi Romi,

When querying the Solr index, use 'wt=json' as part of your query string to get the results back in JSON format.

On Tue, May 31, 2011 at 11:35 AM, Romi wrote:
> I have indexed all my database data in Solr, now I want to run search on it
> and display results in JSON. What do I need to do for it?
>
> -
> Thanks & Regards
> Romi
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/How-to-display-solr-search-results-in-Json-format-tp3004734p3004734.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Thanks and Regards,
DakshinaMurthy BM
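For reference, with wt=json each document in response.docs is a plain JSON object, which is why results come back as field:value pairs. A minimal sketch of reading such a response (the sample payload below is hypothetical, but follows the default wt=json layout):

```python
import json

# A trimmed example of what Solr returns with wt=json: each document in
# response.docs maps field names directly to their values.
raw = '''
{
  "responseHeader": {"status": 0, "QTime": 4},
  "response": {
    "numFound": 1,
    "start": 0,
    "docs": [
      {"id": "1", "name": "example doc"}
    ]
  }
}
'''

data = json.loads(raw)
for doc in data["response"]["docs"]:
    # Each doc is a dict, so fields display naturally as field:value.
    for field, value in doc.items():
        print("%s:%s" % (field, value))
```

In other words, there is no special formatting step: the JSON response writer serializes each stored document as an object, and any JSON parser gives you the field:value pairs directly.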
How to display solr search results in Json format
I have indexed all my database data in Solr, now I want to run search on it and display results in JSON. What do I need to do for it?

-
Thanks & Regards
Romi

--
View this message in context: http://lucene.472066.n3.nabble.com/How-to-display-solr-search-results-in-Json-format-tp3004734p3004734.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Indexing files Solr cell and Amazon S3
Hi,

You can use the parameter stream.file to tell Solr to read the file from local disk instead of streaming it across the network:
http://lucene.472066.n3.nabble.com/Example-of-using-quot-stream-file-quot-to-post-a-binary-file-to-solr-td781172.html

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 30 May 2011, at 22:46, Greg Georges wrote:

> Hello everyone,
>
> We have our infrastructure on Amazon cloud servers, and we use the S3 file
> system. We need to index files using Solr Cell. From what I have read, we
> need to stream files to Solr in order for it to extract the metadata into the
> index. If we stream data through a public URL there will be costs associated
> with the transfer on the Amazon cloud. We have planned to have a directory with
> the files; is it possible to tell Solr to add documents from a specific
> folder location? Or must we stream them into Solr? In SolrJ I see that the
> only option is streaming. Thank you very much.
>
> Greg
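A minimal sketch of building such a request (the host, the /update/extract handler path and the file path are assumptions — adjust to your setup). Note that stream.file must point at a path readable by the Solr server process itself:

```python
from urllib.parse import urlencode

# Hypothetical Solr host and local file path -- adjust to your deployment.
base = "http://localhost:8983/solr/update/extract"
params = {
    "stream.file": "/mnt/s3-files/report.pdf",  # path as seen by the Solr host
    "literal.id": "doc1",                        # unique key for the new document
    "commit": "true",
}
# Issuing a GET/POST to this URL makes Solr Cell read the file locally,
# so no document bytes travel over the public network.
url = base + "?" + urlencode(params)
```

This avoids the per-transfer cost of streaming the bytes through a public URL, as long as the files are mounted where Solr can read them.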
Resolved- Re: Replication Error - Index fetch failed - File Not Found & OverlappingFileLockException
Hi,

I found out the problem by myself. The reason was a bad deployment of Solr on Tomcat. Two instances of Solr were instantiated instead of one. The two instances were managing the same indexes, and therefore were trying to write at the same time.

My apologies for the noise created on the ML,
--
Renaud Delbru

On 30/05/11 21:52, Renaud Delbru wrote:
> Hi,
>
> For months, we were using Apache Solr 3.1.0 snapshots without problems.
> Recently, we have upgraded our index to Apache Solr 3.1.0, and also moved to a
> multi-core infrastructure (4 cores per node, each core having its own index).
> We found that one of the index slaves started to show failures, i.e., query
> errors. By looking at the log, we observed some errors during the latest
> snappull, due to two types of exceptions:
> - java.io.FileNotFoundException: File does not exist ...
> and
> - java.nio.channels.OverlappingFileLockException: null
> Then, after the failed pull, the index started to show some index-related
> failures:
> java.io.IOException: read past EOF
> at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:207)
> However, after manually restarting the node, everything went back to normal.
> You can find a more detailed log at [1].
> We are afraid to see this problem occurring again. Do you have some idea of
> what the cause can be? Or a solution to avoid such a problem?
>
> [1] http://pastebin.com/vbnyrUgJ
>
> Thanks in advance
Replication Error - Index fetch failed - File Not Found & OverlappingFileLockException
Hi,

For months, we were using Apache Solr 3.1.0 snapshots without problems. Recently, we have upgraded our index to Apache Solr 3.1.0, and also moved to a multi-core infrastructure (4 cores per node, each core having its own index). We found that one of the index slaves started to show failures, i.e., query errors. By looking at the log, we observed some errors during the latest snappull, due to two types of exceptions:
- java.io.FileNotFoundException: File does not exist ...
and
- java.nio.channels.OverlappingFileLockException: null

Then, after the failed pull, the index started to show some index-related failures:
java.io.IOException: read past EOF
at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:207)

However, after manually restarting the node, everything went back to normal. You can find a more detailed log at [1]. We are afraid to see this problem occurring again. Do you have some idea of what the cause can be? Or a solution to avoid such a problem?

[1] http://pastebin.com/vbnyrUgJ

Thanks in advance
--
Renaud Delbru
Indexing files Solr cell and Amazon S3
Hello everyone,

We have our infrastructure on Amazon cloud servers, and we use the S3 file system. We need to index files using Solr Cell. From what I have read, we need to stream files to Solr in order for it to extract the metadata into the index. If we stream data through a public URL there will be costs associated with the transfer on the Amazon cloud. We have planned to have a directory with the files; is it possible to tell Solr to add documents from a specific folder location? Or must we stream them into Solr? In SolrJ I see that the only option is streaming. Thank you very much.

Greg
Solr Dismax bf & bq vs. q:{boost ...}
I tried to do this:
#1. search phrases in title^3 & text^1
#2. based on result #1, add a boost for field closed:0^2
#3. based on result #2, boost based on last_modified

and I tried like this:

/solr/select
?q={!boost b=$dateboost v=$qq defType=dismax}
&dateboost=recip(ms(NOW/HOUR,modified),8640,2,1)
&qq=video
&qf=title^3+text
&pf=title^3+text
&bq=closed:0^2
&debugQuery=true

Then I tried differently by changing solrconfig like this:

  <str name="qf">title^3 text</str>
  <str name="pf">title^3 text</str>
  <str name="bf">recip(ms(NOW/HOUR,modified),8640,2,1)</str>
  <str name="bq">closed:0^2</str>

with the query:

/solr/select
?q=video
&debugQuery=true

Both seem to give wrong results. Does anyone have an idea about doing those tasks?

Thanks in advance

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Dismax-bf-bq-vs-q-boost-tp3003028p3003028.html
Sent from the Solr - User mailing list archive at Nabble.com.
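For what it's worth, the first variant is sensitive to URL escaping (the curly braces, `^` and `$` references all need encoding). A minimal sketch that assembles the same request programmatically, which rules out escaping mistakes as the cause (the host is an assumption):

```python
from urllib.parse import urlencode

# The {!boost} local-params query dereferences $dateboost and $qq,
# which are supplied as separate request parameters.
params = {
    "q": "{!boost b=$dateboost v=$qq defType=dismax}",
    "dateboost": "recip(ms(NOW/HOUR,modified),8640,2,1)",
    "qq": "video",
    "qf": "title^3 text",
    "pf": "title^3 text",
    "bq": "closed:0^2",
    "debugQuery": "true",
}
query_string = urlencode(params)  # percent-encodes {, }, ^, $ etc.
url = "http://localhost:8983/solr/select?" + query_string
```

With debugQuery=true, comparing the parsedquery output of both variants is the quickest way to see which boost actually took effect.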
Explain the difference in similarity and similarityProvider
I'm looking over the patch notes from https://issues.apache.org/jira/browse/SOLR-2338 and I do not understand the difference between the similarity and the similarityProvider param values. When would I use one over the other?

Thanks,

Brian Lamb
Re: SOLR-1155 on 3.1
I think the answers to both are negative. Vote for it!

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

- Original Message
> From: Ofer Fort
> To: solr-user@lucene.apache.org
> Sent: Mon, May 30, 2011 7:50:15 AM
> Subject: SOLR-1155 on 3.1
>
> Hey all,
> In the last comment on SOLR-1155 by Jayson Minard (
> https://issues.apache.org/jira/browse/SOLR-1155?focusedCommentId=13019955&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13019955
> )
> "I'll look at updating this for 3.1"
> was it integrated into 3.1? If not, is there a patch one can use?
> thanks
Re: Can we stream binary data with StreamingUpdateSolrServer ?
I'm not looking at the source code, but this doesn't sound right. I think it uses javabin.

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

- Original Message
> From: pravesh
> To: solr-user@lucene.apache.org
> Sent: Mon, May 30, 2011 8:40:28 AM
> Subject: Can we stream binary data with StreamingUpdateSolrServer ?
>
> Hi,
>
> I'm using StreamingUpdateSolrServer to post a batch of content to Solr 1.4.1.
> By looking at the StreamingUpdateSolrServer code, it looks like it only
> provides for the content to be streamed in XML format.
>
> Can we use it to stream data in binary format?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Can-we-stream-binary-data-with-StreamingUpdateSolrServer-tp3001813p3001813.html
> Sent from the Solr - User mailing list archive at Nabble.com.
Re: n-gram speed
Denis,

Also, what are your documents and queries like? Maybe give a few examples so we can help.

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

- Original Message
> From: Tor Henning Ueland
> To: solr-user@lucene.apache.org
> Sent: Mon, May 30, 2011 8:40:34 AM
> Subject: Re: n-gram speed
>
> 2011/5/30 Denis Kuzmenok:
> > I have a database with an n-gram field, about 5 million documents. QTime
> > is about 200-1000 ms; the database is not optimized because it must reply
> > to queries all the time and data are updated often. Is it normal?
> > Solr: 3.1, java -Xms2048M -Xmx4096M
> > Server: i7, 12Gb
>
> Start by optimizing it, it won't "stop working" due to an optimize. Some
> other vital info is the size of the index, disk type used etc. (SSD,
> SATA, IDE..)
>
> --
> Mvh
> Tor Henning Ueland
Re: DataImportHandler
I faced the same problem before, but that was because some parent classloader had loaded the DataImportHandler class instead of using the SolrResourceLoader's delegated classloader.

How are you starting your Solr? Via Eclipse? If you try starting Solr from the command line, will you encounter the same issue?

On May 30, 2011, at 9:28 PM, adpablos wrote:

> Hi,
>
> I've tried to install DataImportHandler but I have some problems when starting up Solr:
>
> GRAVE: org.apache.solr.common.SolrException: Error Instantiating Request
> Handler, org.apache.solr.handler.dataimport.DataImportHandler is not a
> org.apache.solr.request.SolrRequestHandler
>
> This is the log. I've
>
> <requestHandler class="org.apache.solr.handler.dataimport.DataImportHandler">
>   <lst name="defaults">
>     <str name="config">db-data-config.xml</str>
>   </lst>
> </requestHandler>
>
> in my solrconfig.xml.
>
> I'm working in a Java project, and in my Eclipse project I can write
> something like this without problem: SolrRequestHandler srh = new DataImportHandler();
>
> Sorry about my English and thank you in advance.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/DataImportHandler-tp3001957p3001957.html
> Sent from the Solr - User mailing list archive at Nabble.com.
Re: return unaltered complete multivalued fields with Highlighted results
Hi Alexei,

We have the same issue/behavior. The highlighting component fragments the fields to highlight and chooses the best fragments to be returned and highlighted. You can return all fragments with the maximum size for each one, but it will never return fragments with scores equal to 0, I mean without any matched words. To return the whole multivalued field, the highlighting component needs to be modified for this specific case. That is something we should do in the next weeks.

If I missed something, I would be happy to find another solution too :)

Ludovic.

-
Jouve
France.

--
View this message in context: http://lucene.472066.n3.nabble.com/return-unaltered-complete-multivalued-fields-with-Highlighted-results-tp2967146p3002357.html
Sent from the Solr - User mailing list archive at Nabble.com.
Solr 3.1 commit errors
After restart I have these errors every time I do a commit via post.jar.
Config: multicore / 5 cores, Solr 3.1

Lock obtain timed out: SimpleFSLock@/home/ava/solr/example/multicore/context/data/index/write.lock
org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SimpleFSLock@/home/ava/solr/example/multicore/context/data/index/write.lock
        at org.apache.lucene.store.Lock.obtain(Lock.java:84)
        at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1097)
        at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:83)
        at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:102)
        at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:174)
        at org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:222)
        at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:61)
        at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:147)
        at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:77)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:55)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1360)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
        at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
        at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
        at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
        at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
        at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
        at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
        at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
        at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
        at org.mortbay.j

Tried to google a little bit but without any luck..
Re: return unaltered complete multivalued fields with Highlighted results
Thank you for the reply, Erick. I can return the stored content, but I would like to show the highlighted results. With multivalued fields there seems to be some sorting of highlighted results (in order of importance?) going on.

The problem is:
1 - I could not find a way to keep the original order of my text.
2 - I could not display all of the values in my multivalued field.

So if I have a multivalued field with four values:
value1
value2 with text
value3
value4 and something

and the search is "value2 something", the highlighted result would be:
value2 with text
value4 and something

value1 and value3 will be skipped completely. When a field is not multivalued, everything works as advertised.

Any suggestions?

Regards,
Alexei

--
View this message in context: http://lucene.472066.n3.nabble.com/return-unaltered-complete-multivalued-fields-with-Highlighted-results-tp2967146p3002248.html
Sent from the Solr - User mailing list archive at Nabble.com.
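Until the highlighting component supports returning whole multivalued fields, one possible client-side workaround is to fetch the stored values alongside the highlight fragments and merge them back in original order. A minimal sketch, assuming hl.fragsize is set large enough (or to 0) so each fragment covers a whole value, and the default <em> tags are used:

```python
def merge_highlights(stored_values, fragments):
    """Return all stored values in their original order, substituting a
    value with its highlighted fragment whenever the highlighter
    produced one for it; unmatched values pass through unchanged."""
    def strip_tags(s):
        return s.replace("<em>", "").replace("</em>", "")
    # Map each fragment back to the plain value it was built from.
    by_plain = {strip_tags(f): f for f in fragments}
    return [by_plain.get(v, v) for v in stored_values]

stored = ["value1", "value2 with text", "value3", "value4 and something"]
frags = ["<em>value2</em> with text", "value4 and <em>something</em>"]
merged = merge_highlights(stored, frags)
```

This keeps value1 and value3 in place while still showing the highlighted versions of value2 and value4, at the cost of returning the stored field in the fl list.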
Spellcheck component not returned with numeric queries
Hi,

The spellcheck component's output is not written when sending queries that consist of numbers only. Clients depending on the availability of the spellcheck output need to check whether the output is actually there. This is with a very recent Solr 3.x checkout.

Is this a feature or a bug? Should I file an issue?

Cheers,
--
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350
is replication eating up OldGen space
Dear list,

After switching from FAST to Solr I get the first _real_ data. This includes search times, memory consumption, performance of Solr, ...
What I recognized so far is that something eats up my OldGen and I assume it might be replication.

Current data:
one master - indexing only
two slaves - search only
over 28 million docs
single instance
single core
index size 140g
current heap size 16g

After startup I have about 4g heap in use and about 3.5g of OldGen. After one week and some replications, OldGen is filled close to 100 percent. If I start an optimize under this condition I get an OOM of heap.
So my assumption is that something is eating up my heap.
Any idea how to trace this down? Maybe a memory leak somewhere?

Best regards
Bernd

--
*
Bernd Fehling              Universitätsbibliothek Bielefeld
Dipl.-Inform. (FH)         Universitätsstr. 25
Tel. +49 521 106-4060      33615 Bielefeld
Fax. +49 521 106-4052
bernd.fehl...@uni-bielefeld.de

BASE - Bielefeld Academic Search Engine - www.base-search.net
*
DataImportHandler
Hi,

I've tried to install DataImportHandler but I have some problems when starting up Solr:

GRAVE: org.apache.solr.common.SolrException: Error Instantiating Request Handler,
org.apache.solr.handler.dataimport.DataImportHandler is not a
org.apache.solr.request.SolrRequestHandler

This is the log. I've

<requestHandler class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>

in my solrconfig.xml.

I'm working in a Java project, and in my Eclipse project I can write something like this without problem: SolrRequestHandler srh = new DataImportHandler();

Sorry about my English and thank you in advance.

--
View this message in context: http://lucene.472066.n3.nabble.com/DataImportHandler-tp3001957p3001957.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: n-gram speed
2011/5/30 Denis Kuzmenok:
> I have a database with an n-gram field, about 5 million documents. QTime
> is about 200-1000 ms; the database is not optimized because it must reply
> to queries all the time and data are updated often. Is it normal?
> Solr: 3.1, java -Xms2048M -Xmx4096M
> Server: i7, 12Gb

Start by optimizing it, it won't "stop working" due to an optimize. Some other vital info is the size of the index, disk type used etc. (SSD, SATA, IDE..)

--
Mvh
Tor Henning Ueland
Can we stream binary data with StreamingUpdateSolrServer ?
Hi,

I'm using StreamingUpdateSolrServer to post a batch of content to Solr 1.4.1. By looking at the StreamingUpdateSolrServer code, it looks like it only provides for the content to be streamed in XML format.

Can we use it to stream data in binary format?

--
View this message in context: http://lucene.472066.n3.nabble.com/Can-we-stream-binary-data-with-StreamingUpdateSolrServer-tp3001813p3001813.html
Sent from the Solr - User mailing list archive at Nabble.com.
SOLR-1155 on 3.1
Hey all,

In the last comment on SOLR-1155 (https://issues.apache.org/jira/browse/SOLR-1155?focusedCommentId=13019955&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13019955), Jayson Minard wrote: "I'll look at updating this for 3.1".
Was it integrated into 3.1? If not, is there a patch one can use?

thanks
collapse component with pivot faceting
Hi All!

Can anyone tell me how pivot faceting works in combination with field collapsing? Please guide me in this respect.

Thanks!
Isha Garg
n-gram speed
I have a database with an n-gram field, about 5 million documents. QTime is about 200-1000 ms; the database is not optimized because it must reply to queries all the time and data are updated often. Is it normal?

Solr: 3.1, java -Xms2048M -Xmx4096M
Server: i7, 12Gb
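For illustration, character n-grams multiply the number of terms produced per word, which is one reason n-gram fields make indexes larger and queries slower than plain tokenized fields. A rough sketch of what an NGram-style tokenizer produces (the gram sizes 2-3 are assumptions, not the poster's actual settings):

```python
def ngrams(term, min_n=2, max_n=3):
    """All character n-grams of term with lengths min_n..max_n,
    in the same spirit as an NGram tokenizer."""
    out = []
    for n in range(min_n, max_n + 1):
        for i in range(len(term) - n + 1):
            out.append(term[i:i + n])
    return out

# A single 5-letter word already yields 7 terms at sizes 2-3;
# wider size ranges and longer words grow this quickly.
grams = ngrams("video")
```

So across 5 million documents the term count (and the postings to scan per query) is several times that of a non-gram field, which is consistent with QTimes in the hundreds of milliseconds.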
Re: wildcards and German umlauts
Hi,

Agree that this is annoying for foreign languages. I get the idea behind the original behaviour, but there could be more elegant ways of handling it. It would make sense to always run the CharFilters. Perhaps a mechanism where TokenFilters can be tagged for exclusion from wildcard terms would be an idea. That way we could skip stemming, synonyms and phonetics for wildcard terms, but still do lowercasing and character normalization.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 29 May 2011, at 19:24, mdz-munich wrote:

> Ah, NOW I got it. It's not a bug, it's a feature.
>
> But that would mean that every character manipulation (e.g.
> char mapping/replacement, Porter stemmer in some cases ...) would cause a
> wildcard query to fail. That's too bad.
>
> But why? What's the problem with passing the prefix through the
> analyzer/filter chain?
>
> Greetz,
>
> Sebastian
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/wildcards-and-German-umlauts-tp499972p2999237.html
> Sent from the Solr - User mailing list archive at Nabble.com.
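Until something like that exists, one workaround is to replicate the cheap, reversible parts of the analysis (lowercasing, character mappings) on the client before sending the wildcard term, since the server will use the term verbatim. A minimal sketch of just the lowercasing step:

```python
def normalize_wildcard(term):
    # Wildcard terms bypass the analyzer chain, so apply the same
    # lowercasing the index-side LowerCaseFilter would have applied.
    # The wildcard characters themselves are unaffected by lower().
    return term.lower()

q1 = normalize_wildcard("Münch*")      # umlauts lowercase correctly too
q2 = normalize_wildcard("Gruppe*")
```

Note this only covers lowercasing; if the index side also applies a MappingCharFilter (e.g. ü -> ue), that mapping would have to be replicated on the client as well for the prefixes to line up.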
Re: Problem with spellchecking, dont want multiple request to SOLR
Hi,

Define two searchComponents with different names. Then refer to both in your search request handler config.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 27 May 2011, at 10:01, roySolr wrote:

> mm ok. I configured 2 spellcheckers:
>
> <lst name="spellchecker">
>   <str name="name">spell_what</str>
>   <str name="field">spell_what</str>
>   <str name="buildOnCommit">true</str>
>   <str name="spellcheckIndexDir">spellchecker_what</str>
> </lst>
> <lst name="spellchecker">
>   <str name="name">spell_where</str>
>   <str name="field">spell_where</str>
>   <str name="buildOnCommit">true</str>
>   <str name="spellcheckIndexDir">spellchecker_where</str>
> </lst>
>
> How can I enable it in my search request handler and search both in one
> request?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Problem-with-spellchecking-dont-want-multiple-request-to-SOLR-tp2988167p2992076.html
> Sent from the Solr - User mailing list archive at Nabble.com.
Re: Bulk indexing, UpdateProcessor overwriteDupes and poor IO performances
Hello,

Sorry for re-posting this, but it seems my message got lost in the mailing list's message stream without hitting anyone's attention... =D

Shortly: has anyone already experienced dramatic indexing slowdowns during large bulk imports with overwriteDupes turned on and a fairly high duplicates rate (around 4-8x)? It seems to produce a lot of deletions, which in turn appear to make the merging of segments pretty slow, by fairly increasing the number of little read operations occurring simultaneously with the regular large write operations of the merge. Added to the poor IO performance of a commodity SATA drive, indexing takes ages.

I temporarily bypassed that limitation by disabling the overwriting of duplicates, but that changes the way I query the index, requiring me to turn on field collapsing at search time.

Is this a known limitation? Does anyone have a few hints on how to optimize the handling of index-time deduplication?

More details on my setup and the state of my understanding are in my previous message hereafter. Thank you very much in advance.

Regards,

Tanguy

On 05/25/11 15:35, Tanguy Moal wrote:

Dear list,

I'm posting here after some unsuccessful investigations. In my setup I push documents to Solr using the StreamingUpdateSolrServer. I'm sending a comfortable initial amount of documents (~250M) and wished to perform overwriting of duplicated documents at index time, during the update, taking advantage of the UpdateProcessorChain.

At the beginning of the indexing stage, everything is quite fast; documents arrive at a rate of about 1000 docs/s. The only extra processing during the import is the computation of a couple of hashes that are used to identify documents uniquely given their content, using both stock (MD5Signature) and custom (derived from Lookup3Signature) update processors. I send a commit command to the server every 500k documents sent. During a first period, the server is CPU bound.
After a short while (~10 minutes), the rate at which documents are received starts to fall dramatically, the server being IO bound. At first I thought of a normal speed decrease during the commit, while my push client is waiting for the flush to occur. That would have been a normal slowdown. The thing that caught my attention was the fact that, unexpectedly, the server was performing a lot of small reads, way more than the number of writes, which seem to be larger. The combination of the many small reads with the constant amount of bigger writes seems to create a lot of IO contention on my commodity SATA drive, and the ETA of my built index started to increase scarily =D

I then restarted the JVM with JMX enabled so I could investigate a little bit more. I then realized that the UpdateHandler was performing many reads while processing the update request.

Are there any known limitations around the UpdateProcessorChain when overwriteDupes is set to true? I turned that off, which of course breaks the intent of my built index, but for comparison purposes it's good. That did the trick; indexing is fast again, even with the periodic commits.

I therefore have two questions, an interesting first one and a boring second one:

1 / What's the workflow of the UpdateProcessorChain when one or more processors have overwriting of duplicates turned on? What happens under the hood? I tried to answer that myself by looking at DirectUpdateHandler2, and my understanding stopped at the following:
- The document is added to the Lucene IW
- The duplicates are deleted from the Lucene IW

The dark magic I couldn't understand seems to occur around the idTerm and updateTerm things, in the addDoc method. The deletions seem to be buffered somewhere; I just didn't get it :-) I might be wrong since I didn't read the code more than that, but the point might be in how Solr handles deletions, which is something still unclear to me.
In any case, a lot of reads seem to occur for that precise task and it tends to produce a lot of IO, killing indexing performance when overwriteDupes is on. I don't even understand why so many read operations occur at this stage, since my process had a comfortable amount of RAM (with Xms=Xmx=8GB), with only 4.5GB used so far. Any help, recommendation or idea is welcome :-)

2 / In case there isn't a simple fix for this, I'll have to live with duplicates in my index. I don't mind, since Solr offers a great grouping feature, which I already use in some other applications. The only thing I don't know yet is: if I do rely on grouping at search time, in combination with the Stats component (which is the intent of that index), and limit the results to 1 document per group, will the computed statistics take those duplicates into account or not? Shortly, how well does the Stats component behave when combined with hits collapsing?

I had firstly implemented my solution using overwriteDupes becau
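One alternative worth considering for bulk imports like this is deduplicating on the client before pushing the batch, so the index never has to issue the deletes at all. A minimal sketch, similar in spirit to Solr's MD5Signature (the field names are hypothetical, and it assumes the batch's signature set fits in client memory):

```python
import hashlib

def signature(doc, fields):
    """Content hash over the chosen fields; documents with identical
    values in those fields get identical signatures."""
    h = hashlib.md5()
    for f in fields:
        h.update(str(doc.get(f, "")).encode("utf-8"))
        h.update(b"\x00")  # field separator so ("ab","c") != ("a","bc")
    return h.hexdigest()

docs = [
    {"id": 1, "title": "hello", "body": "world"},
    {"id": 2, "title": "hello", "body": "world"},   # duplicate content
    {"id": 3, "title": "hello", "body": "other"},
]

# Keep the first occurrence of each signature; only `unique` is sent to Solr.
seen, unique = set(), []
for d in docs:
    sig = signature(d, ["title", "body"])
    if sig not in seen:
        seen.add(sig)
        unique.append(d)
```

This trades client memory for the index-time delete traffic described above; it obviously only covers duplicates within what the client has seen, not ones already in the index.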
Re: Problem with caps and star symbol
I am sending some XML responses to help understand the scenario.

Indexed term = ROLE_DELETE
Search term = roledelete
Response: status 0, QTime 4, q = name:roledelete, start 0, rows 10 - no documents returned.

Indexed term = ROLE_DELETE
Search term = role
Response: status 0, QTime 5, q = name:role - two documents returned, each containing:
  Mon May 30 13:09:14 BDST 2011
  Global Role for Deletion
  role:9223372036854775802
  ROLE_DELETE

Indexed term = ROLE_DELETE
Search term = role*
Response: status 0, QTime 4, q = name:role* - one document returned:
  Mon May 30 13:09:14 BDST 2011
  Global Role for Deletion
  role:9223372036854775802
  ROLE_DELETE

Indexed term = ROLE_DELETE
Search term = Role*
Response: status 0, QTime 4, q = name:Role* - no documents returned.

Indexed term = ROLE_DELETE
Search term = ROLE_DELETE*
Response: status 0, QTime 4, q = name:ROLE_DELETE* - no documents returned.

I am also adding an analysis HTML.

On Mon, May 30, 2011 at 7:19 AM, Erick Erickson wrote:
> I'd start by looking at the analysis page from the Solr admin page. That
> will give you an idea of the transformations the various steps carry out;
> it's invaluable!
>
> Best
> Erick
> On May 26, 2011 12:53 AM, "Saumitra Chowdhury" <saumi...@smartitengineering.com> wrote:
> > Hi all,
> > In my schema.xml I am using WordDelimiterFilterFactory,
> > LowerCaseFilterFactory, StopFilterFactory for the index analyzer and an
> > extra SynonymFilterFactory for the query analyzer. I am indexing a field
> > named 'name'. Now if a value with all caps like "NAME_BILL" is indexed, I am
> > able to get this as a search result with the terms "name_bill", "NAME_BILL",
> > "namebill", "namebill*", "nameb*" ... But for terms like the following:
> > "NAME_BILL*", "name_bill*", "namebill*", "NAME*" the result does not show
> > this document. Can anyone please explain why this is happening? In fact,
> > star "*" is not giving any result in many cases, especially if it is used
> > after the full value of a field.
> >
> > Portion of my schema is given below.
> > <fieldType class="solr.TextField" positionIncrementGap="100">
> >   <analyzer type="index">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> >       generateNumberParts="0" catenateWords="1" catenateNumbers="1"
> >       catenateAll="0"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >       words="stopwords.txt" enablePositionIncrements="true"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> >       generateNumberParts="0" catenateWords="1" catenateNumbers="1"
> >       catenateAll="0"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >       ignoreCase="true" expand="true"/>
> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >       words="stopwords.txt" enablePositionIncrements="true"/>
> >   </analyzer>
> > </fieldType>
> >
> > <fieldType class="solr.TextField" positionIncrementGap="100">
> >   <analyzer>
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> >       generateNumberParts="0" catenateWords="1" catenateNumbers="1"
> >       catenateAll="0"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> >       ignoreCase="true" expand="false"/>
> >     <filter class="solr.StopFilterFactory" ignoreCase="true"
> >       words="stopwords.txt"/>
> >   </analyzer>
> > </fieldType>
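The reported behaviour is consistent with wildcard terms bypassing the analyzer chain entirely. A rough sketch (not Solr's actual code, just the shape of the mechanism) of why "NAME_BILL*" and "NAME*" find nothing while "name*" does:

```python
import re

def index_tokens(value):
    # Very rough imitation of WordDelimiterFilter (generateWordParts=1,
    # catenateWords=1) followed by LowerCaseFilter on one input token:
    # split on delimiters and case changes, lowercase, add the catenation.
    parts = [p for p in re.split(r"[_\W]+|(?<=[a-z])(?=[A-Z])", value) if p]
    tokens = set(p.lower() for p in parts)
    if len(parts) > 1:
        tokens.add("".join(parts).lower())  # catenateWords="1"
    return tokens

def wildcard_matches(prefix_query, tokens):
    # Wildcard terms skip the analyzer chain, so the prefix is compared
    # verbatim against the already-lowercased index terms.
    prefix = prefix_query.rstrip("*")
    return any(t.startswith(prefix) for t in tokens)

tokens = index_tokens("NAME_BILL")   # {"name", "bill", "namebill"}
```

Because the index holds only "name", "bill" and "namebill", any wildcard prefix containing uppercase letters or the underscore ("NAME*", "NAME_BILL*", "name_bill*") matches no term, while "name*" or "namebill*" do, exactly as observed.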