If search matches index in the middle of filter chain, will result return?
Hi all,

I am using Solr 3.4 with Windows 7 and Jetty. When I search on a field, the Analysis page in Solr shows the search string matching the index in the middle of the filter chain rather than at the end.

Here is the schema:

  <fieldType name="substring_search" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.ASCIIFoldingFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.CommonGramsFilterFactory" words="../../filters/stopwords.txt" ignoreCase="true"/>
      <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="20"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.ASCIIFoldingFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>

I am searching for an email address: off...@officeofficeoffice.com. If I search for any text under 20 characters, results are returned. But when I search for the whole string, off...@officeofficeoffice.com, no result is returned.

As you can see in the index part of the schema, the whole string matches the index chain before NGramFilterFactory. After NGram, however, no match is found.

Here are my questions:

- Is this behavior normal?
- In order to find off...@officeofficeoffice.com, do I have to make maxGramSize larger (like 70)?

Thank you in advance for all your support. This is a great community.
RE: If search matches index in the middle of filter chain, will result return?
Thanks Shawn. So to recap:

- Every match must be found after the entire chain, not in the middle of it.
- Suggested: the index and query chains should be the same.

In my situation, if I make both of them the same, the results may be misleading, because the query will also match other records that contain the same partial string. But your suggestion is wonderful. Thank you very much.

-----Original Message-----
From: Shawn Heisey [mailto:s...@elyograg.org]
Sent: 23 November 2011, 12:04 PM
To: solr-user@lucene.apache.org
Subject: Re: If search matches index in the middle of filter chain, will result return?

On 11/22/2011 7:54 PM, Ellery Leung wrote:
> I am searching for an email called: off...@officeofficeoffice.com. If I
> search any text under 20 characters, result will be returned. But when I
> search the whole string: off...@officeofficeoffice.com, no result return.
> As you all see in the schema in index part, when I search the whole
> string, it will match the index chain before NGramFilterFactory. But
> after NGram, no result found.
>
> Here are my questions:
> - Is this behavior normal?

I'm pretty sure that your query must match after the entire analyzer chain is done. I would expect that behavior to be normal.

> - In order to get off...@officeofficeoffice.com, does it mean that I have
> to make the maxGramSize larger (like 70)?

If you were to increase the maxGramSize to 70, you would get a match in this case, but your index might get a lot larger, depending on what's in your source data. That's probably not the right approach, though.

In general, you want to have your index and query analyzer chains exactly the same. There are some exceptions, but I don't think the NGram filter is one of them. The synonym filter and WordDelimiterFilter are examples where it is expected that your index and query analyzer chains will be different.

Add the NGram and CommonGram filters to the query chain, and everything should start working.
If you were to go with a single analyzer for both like the following, I think it would start working. You wouldn't even need to reindex, since you wouldn't be changing the index analyzer.

  <fieldType name="substring_search" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.ASCIIFoldingFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.CommonGramsFilterFactory" words="../../filters/stopwords.txt" ignoreCase="true"/>
      <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="20"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>

Regarding your NGram filter, I would actually increase the minGramSize to at least 2 and decrease the maxGramSize to something like 10 or 15, then reindex.

An additional note: CommonGrams may not be all that useful unless you are indexing large numbers of huge documents, like entire books. This particular fieldType is not suitable for full text anyway, since it uses KeywordTokenizer. Consider removing CommonGrams from this fieldType and reindexing. Unless you are dealing with large amounts of text, consider removing it from the entire schema. If you do remove it, it's usually not a good idea to replace it with a StopFilter. The index size reduction found in stopword removal is not usually worth the potential loss of recall.

Be prepared to test all reasonable analyzer combinations, rather than taking my word for it.

After reading the Hathi Trust blog, I tried CommonGrams on my own index. It actually made things slower, not faster. My typical document is only a few thousand bytes of metadata. The Hathi Trust is indexing millions of full-length books.

Thanks,
Shawn
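For readers following the thread, here is a rough Python sketch of why the full address stops matching. It is not Solr code: `ngrams` and `matches` are hypothetical helpers that only mimic what an n-gram filter produces from a single keyword token, and `longaddress@example.com` is a made-up value standing in for the real (elided) address. The idea it illustrates is that the query chain emits one long term, and that term can only be found if the index chain emitted a gram of exactly that length.

```python
def ngrams(text, min_gram, max_gram):
    """Return every substring of text whose length is in [min_gram, max_gram]."""
    return {
        text[start:start + size]
        for size in range(min_gram, max_gram + 1)
        for start in range(len(text) - size + 1)
    }


def matches(indexed_value, query, min_gram=1, max_gram=20):
    """Mimic the schema in the thread: n-grams at index time, one whole term at query time."""
    # Index side: keyword token -> trim -> lowercase -> n-grams.
    index_terms = ngrams(indexed_value.strip().lower(), min_gram, max_gram)
    # Query side: keyword token -> trim -> lowercase, no n-grams.
    query_term = query.strip().lower()
    return query_term in index_terms


value = "longaddress@example.com"  # hypothetical 23-character value

print(matches(value, "longaddress"))        # short substring: found
print(matches(value, value))                # whole value, longer than maxGramSize
print(matches(value, value, max_gram=70))   # whole value with a larger maxGramSize
```

With these made-up inputs the second call is the interesting one: the 23-character term is longer than any gram in the index, so nothing matches until the maximum gram size is raised (or, as suggested above, the query chain is changed).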
RE: Weird: Solr Search result and Analysis Result not match?
Thanks Erick, here are my responses:

1. Yes. What I want to achieve is that, with the index filtered by EdgeNGram and the query not filtered that way, I can search on a partial string.
2. Good suggestion, I will test it.
3. OK.
4. Thank you.
5/6. I will remove the synonyms and WordDelimiterFilterFactory from the query chain.
7. I will look at that using Luke. By the way, this is the first time I have heard of such a tool. Thank you.
8. Yes. I will check that again, thank you.

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: 8 November 2011, 9:52 PM
To: solr-user@lucene.apache.org; elleryle...@be-o.com
Subject: Re: Weird: Solr Search result and Analysis Result not match?

Several things:

1. You don't have EdgeNGramFilterFactory in your query analysis chain, is this intentional?
2. You have a LOT of stuff going on here, you might try making your analysis chain simpler and adding stuff back in until you see the error. Don't forget to re-index!
3. Analysis doesn't take into account query *parsing*, so it's possible to get a false sense of assurance when the analysis page matches your expectations.
4. Even though nothing jumps out at me except the Edge factory, nice job of including information.
5. It's unusual to expand synonyms both at query and index time, usually one or the other with index time preferred.
6. Same with WordDelimiterFilterFactory. If you put all the variants in the index, you don't need to put all the variants in the query and vice-versa.
7. Take a look at your actual contents, perhaps using Luke to ensure that what you expect to be in your index actually is.
8. You did re-index after your latest changes to your schema, right <G>?

All of this is a way of saying that I don't quite see what the problem is, but at least there are some avenues to explore.

Best
Erick

On Mon, Nov 7, 2011 at 9:29 PM, Ellery Leung elleryle...@be-o.com wrote:
> Hi all. I am using Solr 3.4 under Win 7.
> [...]
Weird: Solr Search result and Analysis Result not match?
Hi all.

I am using Solr 3.4 under Win 7. In the schema there is a multivalued field indexed in this way:

== Schema: ==

  <field name="myEvent" type="myCustomText" multiValued="true" indexed="true" stored="true" omitNorms="true"/>

  <fieldType name="myCustomText" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="../../filters/filter-synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.ASCIIFoldingFilterFactory"/>
      <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1" splitOnNumerics="1" stemEnglishPossessive="1" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" preserveOriginal="1"/>
      <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="true"/>
      <filter class="solr.PorterStemFilterFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="50" side="front"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="../../filters/filter-synonyms.txt" ignoreCase="true" expand="true"/>
      <filter class="solr.ASCIIFoldingFilterFactory"/>
      <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="1" splitOnNumerics="1" stemEnglishPossessive="1" generateWordParts="0" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" preserveOriginal="1"/>
      <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone"/>
      <filter class="solr.PorterStemFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>

== Actual index: ==

  <arr name="myEvent">
    <str>2284e2</str>
    <str>2284e4</str>
    <str>2284e5</str>
    <str>1911e2</str>
  </arr>

== Question: ==

Now when I do a search like this:

  myEvent:1911e2

it should match the 4th item. On the Full Interface page, however, it does not return any result, while on the Analysis page the matches are highlighted.

Using debug, the parsedquery is:

  MultiPhraseQuery(myEvent:"(1911e2 1911) (A e) 2")

parsedquery_toString:

  myEvent:"(1911e2 1911) (A e) 2"

Can anyone please help me with this?
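A note for readers trying to interpret the parsed query above: the three groups look like three consecutive token positions, which is consistent with a word-delimiter step splitting `1911e2` where digits meet letters. The Python below is only a loose illustration of that idea, not Solr's implementation; `split_on_transitions` and `query_positions` are hypothetical helpers, and the phonetic code `A` that appears next to `e` in the debug output is not modeled.

```python
import re


def split_on_transitions(token):
    """Split a token wherever a run of digits meets a run of letters."""
    return re.findall(r"[0-9]+|[A-Za-z]+", token)


def query_positions(token, preserve_original=True):
    """Return one list of alternative terms per token position.

    When the token is split and preserve_original is set, the unsplit token
    is kept as an alternative at the first position.
    """
    parts = split_on_transitions(token)
    positions = [[part] for part in parts]
    if preserve_original and len(parts) > 1:
        positions[0].insert(0, token)
    return positions


print(query_positions("1911e2"))
```

If this reading is right, the query is a phrase over three positions, so a document can only match when the index holds suitable terms at three consecutive positions. The Analysis page highlights individual term matches and may not reflect that positional requirement, which could explain why it looks like a match there.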
How to return exact set of multivalue field
Hi all,

I am using Solr 3.4 on Windows 7. Here is an example of a multivalued field:

  <doc>
    <arr name="field_name">
      <str>387</str>
      <str>386</str>
    </arr>
  </doc>
  <doc>
    <arr name="field_name">
      <str>387</str>
      <str>386</str>
    </arr>
  </doc>
  <doc>
    <arr name="field_name">
      <str>387</str>
      <str>386</str>
      <str>385</str>
      <str>382</str>
      <str>312</str>
      <str>311</str>
    </arr>
  </doc>

I am doing a search on field_name and want to return ONLY the records whose values are exactly 387 and 386 (the first and second records). Here is the query:

  field_name:(387 AND 386)

But this query returns all 3 records, which is wrong. I have also tried using it as a filter:

  field_name:(387 AND 386)

but it still doesn't work.

Is there any way to change this query so that it will ONLY return the first and second records?

Thank you in advance for any help.
RE: How to return exact set of multivalue field
Thank you very much for your help!

Follow-up question: what if the values are strings instead of numbers? While you can use [387 TO *] to find all numbers from 387 upwards, how do you find a specific set of strings?

Thank you again for any help here.

-----Original Message-----
From: dan sutton [mailto:danbsut...@gmail.com]
Sent: 20 October 2011, 6:09 PM
To: solr-user@lucene.apache.org; elleryle...@be-o.com
Subject: Re: How to return exact set of multivalue field

-field_name:[ * TO 384] +field_name:[385 TO 386] -field_name:[387 TO *]

On Thu, Oct 20, 2011 at 10:51 AM, Ellery Leung elleryle...@be-o.com wrote:
> Hi all I am using Solr 3.4 on Windows 7. Here is the example of a
> multivalue field:
>
>   <doc> <arr name="field_name"> <str>387</str> <str>386</str> </arr> </doc>
>   <doc> <arr name="field_name"> <str>387</str> <str>386</str> </arr> </doc>
>   <doc> <arr name="field_name"> <str>387</str> <str>386</str> <str>385</str>
>         <str>382</str> <str>312</str> <str>311</str> </arr> </doc>
>
> I am doing a search on field_name and JUST want to return record that IS
> 387 and 386 (the first and second record). Here is the query:
>
>   field_name:(387 AND 386)
>
> But this query return all 3 records, which is wrong. I have tried using
> filter: field_name:(387 AND 386) but it still doesn't work. Therefore I
> would like to ask, are there any way to change this query so that it
> will ONLY return first and second record?
>
> Thank you in advance for any help.
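The range-exclusion idea in the reply can be sketched as a small query builder. This is only an illustration under stated assumptions: `exact_set_query` is a hypothetical helper, the field is assumed to hold integers with numeric ordering, and open-ended `*` ranges are assumed to be available in the Solr version in use. Note that for the values 386 and 387 the excluded ranges have to sit just outside those values, that is, up to 385 and from 388.

```python
def exact_set_query(field, wanted):
    """Build a Lucene-syntax query for documents whose multivalued integer
    field contains every wanted value and no other value."""
    values = sorted(set(wanted))
    # Require each wanted value to be present.
    clauses = [f"+{field}:{value}" for value in values]
    # Exclude anything below the smallest and above the largest wanted value.
    clauses.append(f"-{field}:[* TO {values[0] - 1}]")
    clauses.append(f"-{field}:[{values[-1] + 1} TO *]")
    # Exclude any gaps between non-adjacent wanted values.
    for low, high in zip(values, values[1:]):
        if high - low > 1:
            clauses.append(f"-{field}:[{low + 1} TO {high - 1}]")
    return " ".join(clauses)


print(exact_set_query("field_name", [387, 386]))
```

For string values the same trick depends on lexicographic ordering, so it is much less convenient; that part of the follow-up question is not answered by this sketch.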
RE: solr Invalid Date in Date Math String/Invalid Date String
: [2006-12-22T00:00:00Z TO 2006-12-22T23:59:59Z]

Best
Erick

2011/5/27 Ellery Leung elleryle...@be-o.com:
> Thank you Mike. So I understand that now. But what about the other items
> that have values on both sides? They don't work at all.
>
> -----Original Message-----
> From: Mike Sokolov [mailto:soko...@ifactory.com]
> Sent: 27 May 2011, 10:23 PM
> To: solr-user@lucene.apache.org
> Cc: alucard001
> Subject: Re: solr Invalid Date in Date Math String/Invalid Date String
>
> The * endpoint for range terms wasn't implemented yet in 1.4.1. As a
> workaround, we use very large and very small values.
>
> -Mike
>
> On 05/27/2011 12:55 AM, alucard001 wrote:
>> Hi all
>>
>> I am using SOLR 1.4.1 (according to solr info), but no matter what date
>> field I use (date or tdate) defined in default schema.xml, I cannot do
>> a search in solr-admin analysis.jsp:
>>
>> fieldtype: date (or tdate)
>> fieldvalue(index): 2006-12-22T13:52:13Z (I type it in manually, no
>> trailing space)
>> fieldvalue(query):
>>
>> The only success case:
>>   2006-12-22T13:52:13Z
>>
>> All searches below failed:
>>   * TO NOW
>>   [* TO NOW]
>>   2006-12-22T00:00:00Z TO 2006-12-22T23:59:59Z
>>   2006\-12\-22T00\:00\:00Z TO 2006\-12\-22T23\:59\:59Z
>>   [2006-12-22T00:00:00Z TO 2006-12-22T23:59:59Z]
>>   [2006\-12\-22T00\:00\:00Z TO 2006\-12\-22T23\:59\:59Z]
>>   2006-12-22T00:00:00.000Z TO 2006-12-22T23:59:59.999Z
>>   2006\-12\-22T00\:00\:00\.000Z TO 2006\-12\-22T23\:59\:59\.999Z
>>   [2006-12-22T00:00:00.000Z TO 2006-12-22T23:59:59.999Z]
>>   [2006\-12\-22T00\:00\:00\.000Z TO 2006\-12\-22T23\:59\:59\.999Z]
>>   2006-12-22T00:00:00Z TO *
>>   2006\-12\-22T00\:00\:00Z TO *
>>   [2006-12-22T00:00:00Z TO *]
>>   [2006\-12\-22T00\:00\:00Z TO *]
>>   2006-12-22T00:00:00.000Z TO *
>>   2006\-12\-22T00\:00\:00\.000Z TO *
>>   [2006-12-22T00:00:00.000Z TO *]
>>   [2006\-12\-22T00\:00\:00\.000Z TO *]
>>   (vice versa)
>>
>> I get either an "Invalid Date in Date Math String" or an "Invalid Date
>> String" error.
>>
>> What's wrong with it? Can anyone please help me on that?
>>
>> Thank you.
-- View this message in context: http://lucene.472066.n3.nabble.com/solr-Invalid-Date-in-Date-Math-String-Invalid-Date-String-tp2991763p2991763.html Sent from the Solr - User mailing list archive at Nabble.com.
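The workaround Mike describes (very large and very small values in place of `*`) can be sketched as a small helper that always emits a closed range. Everything named here is an assumption for illustration: `date_range_query` is a hypothetical function, `created` is a made-up field name, and the two sentinel dates are arbitrary choices, not values taken from the thread.

```python
# Hypothetical sentinel endpoints used instead of the open-ended "*".
MIN_DATE = "0001-01-01T00:00:00Z"
MAX_DATE = "9999-12-31T23:59:59Z"


def date_range_query(field, start=None, end=None):
    """Build a closed Solr date range, substituting sentinels for missing ends."""
    lower = start if start is not None else MIN_DATE
    upper = end if end is not None else MAX_DATE
    return f"{field}:[{lower} TO {upper}]"


# "Everything on or after 22 December 2006" without using "*":
print(date_range_query("created", start="2006-12-22T00:00:00Z"))
# A single day, both endpoints explicit:
print(date_range_query("created", "2006-12-22T00:00:00Z", "2006-12-22T23:59:59Z"))
```

A string like this is query syntax, so it would normally be sent as a query (the `q` or `fq` parameter). The analysis page appears to treat its input as a single field value, which may be why range expressions typed there are rejected as invalid dates.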
Match in the process of filter, not end, does it mean not matching?
This is the schema:

  <fieldType name="textContains" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.ISOLatin1AccentFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.CommonGramsFilterFactory" words="../../filters/stopwords.txt" ignoreCase="true"/>
      <filter class="solr.ShingleFilterFactory" minShingleSize="2" maxShingleSize="30"/>
      <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="30"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <charFilter class="solr.MappingCharFilterFactory" mapping="../../filters/filter-mappings.txt"/>
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.ISOLatin1AccentFilterFactory"/>
      <filter class="solr.TrimFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
  </fieldType>

And there is a multiValued field:

  <field name="textContains_Something" type="textContains" multiValued="true" indexed="true" stored="true"/>

Now I want to search for this string:

  Merry Christmas and Happy New Year

On the Analysis page in Solr admin, the matching words are highlighted (in light blue) at LowerCaseFilterFactory, CommonGramsFilterFactory and ShingleFilterFactory. However, nothing is highlighted at NGramFilterFactory.

I then did a search in full-interface mode in Solr admin:

  textContains_Something:Merry Christmas and Happy New Year

It returns NO RESULT.

Does this mean that a match only counts after the tokenizer and all filters have run?

Thank you in advance for any help.
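One thing worth checking with a schema like this is the length of the query term against maxGramSize. The Python below is a simplified sketch, not Solr code: `query_term` and `can_match` are hypothetical helpers that assume the query chain yields a single trimmed, lowercased term (as KeywordTokenizer would) and that the last index-time step only emits grams between minGramSize and maxGramSize characters long. It ignores query-parser behaviour and the CommonGrams and Shingle steps.

```python
def query_term(query):
    """Approximate the query chain: one keyword token, trimmed and lowercased."""
    return query.strip().lower()


def can_match(query, min_gram=2, max_gram=30):
    """A necessary condition for a hit: the query term must be as long as
    some gram that the index chain could have produced."""
    return min_gram <= len(query_term(query)) <= max_gram


phrase = "Merry Christmas and Happy New Year"
print(len(query_term(phrase)))      # number of characters in the single query term
print(can_match(phrase))            # longer than maxGramSize=30
print(can_match("Happy New Year"))  # short enough to coincide with a gram
```

Under these assumptions the full phrase is 34 characters, which is longer than the largest gram in the index, so it cannot match even though earlier stages of the chain appear to line up on the Analysis page.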
HTMLStripTransformer will remove the content in XML??
I have an XML string like this:

  <?xml version="1.0" encoding="UTF-8"?><language><intl><![CDATA[hello]]></intl><loc><![CDATA[solr ]]></loc></language>

By using HTMLStripTransformer, I expect to get 'hello,solr'. But actually this transformer removes ALL THE TEXT INSIDE!

Did I do something silly, or is it a bug?

Thank you
RE: HTMLStripTransformer will remove the content in XML??
Got it. I now use solr.MappingCharFilterFactory to replace <![CDATA[ and ]]> with empty strings first, and then HTMLStripCharFilterFactory to get hello and solr.

For future reference, here is part of schema.xml:

  <fieldType name="textMaxWord" class="solr.TextField">
    <analyzer type="index">
      <charFilter class="solr.MappingCharFilterFactory" mapping="mappings.txt"/>
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      ...

In mappings.txt (2 lines):

  "<![CDATA[" => ""
  "]]>" => ""

Restart Solr. It works.

Thank you

-----Original Message-----
From: bryan rasmussen [mailto:rasmussen.br...@gmail.com]
Sent: 27 May 2011, 4:20 PM
To: solr-user@lucene.apache.org; elleryle...@be-o.com
Subject: Re: HTMLStripTransformer will remove the content in XML??

I would expect that it doesn't understand CDATA and thinks of everything between < and > as a 'tag'.

Best Regards,
Bryan Rasmussen

On Fri, May 27, 2011 at 9:41 AM, Ellery Leung elleryle...@be-o.com wrote:
> I have an XML string like this:
>
>   <?xml version="1.0" encoding="UTF-8"?><language><intl><![CDATA[hello]]></intl><loc><![CDATA[solr ]]></loc></language>
>
> By using HTMLStripTransformer, I expect to get 'hello,solr'. But actual
> this transformer will remove ALL THE TEXT INSIDE!
>
> Did I do something silly, or is it a bug?
>
> Thank you
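The two-step fix can be imitated outside Solr to see why the order matters. The Python below is a rough stand-in, not the real filters: `naive_strip` is a hypothetical function that treats everything between `<` and `>` as a tag (Bryan's explanation of the behaviour), and `strip_cdata_then_tags` first removes the CDATA markers the way the two mapping rules do.

```python
import re

XML = ('<?xml version="1.0" encoding="UTF-8"?><language>'
       '<intl><![CDATA[hello]]></intl><loc><![CDATA[solr ]]></loc></language>')


def naive_strip(text):
    """Drop anything that looks like a tag: '<' up to the next '>'."""
    return re.sub(r"<[^>]*>", " ", text)


def strip_cdata_then_tags(text):
    """Remove the CDATA markers first, then strip the remaining tags."""
    without_markers = text.replace("<![CDATA[", "").replace("]]>", "")
    return naive_strip(without_markers)


# Stripping directly swallows each CDATA section, content included.
print(naive_strip(XML).split())
# Removing the markers first leaves the text between ordinary tags.
print(strip_cdata_then_tags(XML).split())
```

In the first call the `>` that ends `]]>` is the first one after `<![CDATA[`, so the whole section is treated as one tag and its text disappears; once the markers are gone, the text sits between ordinary tags and survives.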
Re: What is this error means?
Hi Israel,

Thank you for your response. However, I have tried both ini_set and setting _defaultTimeout to 6000, and the error still occurs with the same error message. Now, when I start building the index, the error pops up much sooner than it did before the change.

Do you have any other idea? Thank you in advance for your help.

Israel Ekpo wrote:
> Ellery,
>
> A preliminary look at the source code indicates that the error is
> happening because the solr server is taking longer than expected to
> respond to the client:
>
> http://code.google.com/p/solr-php-client/source/browse/trunk/Apache/Solr/Service.php
>
> The default timeout handed down to Apache_Solr_Service::_sendRawPost()
> is 60 seconds, since you were calling the addDocument() method. So if it
> took longer than that (1 minute), it will exit with that error message.
>
> You will have to increase the default value to something very high, like
> 10 minutes or so, on line 252 in the source code, since there is no way
> to specify that in the constructor or the addDocument method.
>
> Another alternative will be to update the default_socket_timeout in the
> php.ini file, or in the code using ini_set.
>
> I hope that helps
>
> On Tue, Jan 12, 2010 at 9:33 PM, Ellery Leung elleryle...@be-o.com wrote:
>> Hi, here is the stack trace:
>> [...]
Re: What is this error means?
Here is a workaround for this issue. On line 382 of SolrPhpClient/Apache/Solr/Service.php, I changed the code to:

  // Keep re-posting until Solr returns a non-empty response.
  // Note: there is no upper bound on the number of retries.
  while (true) {
      $str = file_get_contents($url, false, $this->_postContext);
      if (empty($str) == false) {
          break;
      }
  }
  $response = new Apache_Solr_Response($str, $http_response_header, $this->_createDocuments, $this->_collapseSingleValueArrays);

I found that, for some strange reason on Windows, when you post data to add to the index, Solr may not receive it. Therefore I added an infinite loop: if no response comes back ($str is empty), we post the data again.

Side effect: when I watch the console window, it sometimes prints:

  Failed to open stream: HTTP request failed!

I haven't researched that yet, but the index is built successfully.

Hope it helps someone.

Ellery Leung wrote:
> Hi Israel
>
> Thank you for your response. However, I use both ini_set and set the
> _defaultTimeout to 6000 but the error still occur with same error
> message. Now, when I start build the index, the error pops up much
> faster than changing it before. So do you have any idea?
> [...]
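For anyone adapting the workaround, a bounded version of the same retry idea might look like the Python sketch below. It is an illustration only and not part of SolrPhpClient: `post_with_retry` is a hypothetical helper, `send` stands for whatever function performs the HTTP POST and returns the response body, and the attempt limit and delay are arbitrary defaults.

```python
import time


def post_with_retry(send, max_attempts=5, delay_seconds=1.0):
    """Call send() until it returns a non-empty response or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response:  # a non-empty body counts as success
            return response
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # brief pause before posting again
    raise RuntimeError(f"no response after {max_attempts} attempts")


# Usage with a stand-in for the real POST: fails twice, then succeeds.
outcomes = iter(["", "", "<response/>"])
print(post_with_retry(lambda: next(outcomes), delay_seconds=0))
```

Capping the attempts means an indexing run fails with a clear error instead of spinning forever if the server never answers.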
Re: What is this error means?
Hi, here is the stack trace:

Fatal error: Uncaught exception 'Exception' with message '"0" Status: Communication Error' in C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php:385
Stack trace:
#0 C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php(652): Apache_Solr_Service->_sendRawPost('http://127.0.0', '<add allowDups=...')
#1 C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php(676): Apache_Solr_Service->add('<add allowDups=...')
#2 C:\nginx\html\apps\milio\lib\System\classes\SolrSearchEngine.class.php(221): Apache_Solr_Service->addDocument(Object(Apache_Solr_Document))
#3 C:\nginx\html\apps\milio\lib\System\classes\SolrSearchEngine.class.php(262): SolrSearchEngine->buildIndex(Array, 'key')
#4 C:\nginx\html\apps\milio\lib\System\classes\Indexer\Indexer.class.php(51): SolrSearchEngine->createFullIndex('contacts', Array, 'key', 'www')
#5 C:\nginx\html\apps\milio\lib\System\functions\createIndex.php(64): Indexer->create('www')
#6 {main}
  thrown in C:\nginx\html\lib\SolrPhpClient\Apache\Solr\Service.php on line 385

C:\nginx\html\apps\milio\htdocs\Contacts>pause
Press any key to continue . . .

Thanks for helping me.

Grant Ingersoll-6 wrote:
> Do you have a stack trace?
>
> On Jan 12, 2010, at 2:54 AM, Ellery Leung wrote:
>> When I am building the index for around 2 ~ 25000 records, sometimes I
>> came across with this error:
>>
>>   Uncaught exception Exception with message '0' Status: Communication Error
>> [...]
What is this error means?
When I am building the index for around 2 ~ 25000 records, I sometimes come across this error:

  Uncaught exception "Exception" with message '0' Status: Communication Error

I searched Google and Yahoo but found no answer.

I am currently committing documents to Solr after every 10 records fetched from a SQLite database with PHP 5.3.

Platform: Windows 7 Home
Web server: Nginx
Solr Specification Version: 1.4.0
Solr Implementation Version: 1.4.0 833479 - grantingersoll - 2009-11-06 12:33:40
Lucene Specification Version: 2.9.1
Lucene Implementation Version: 2.9.1 832363 - 2009-11-03 04:37:25
Solr hosted in Jetty 6.1.3

All of the above run on one single test machine.

Sometimes when I build the index it is created successfully, but sometimes the build just stops with the above error.

Any clue? Please help.

Thank you in advance.
What does it mean about this error message???
there_are_more_terms_than_documents_in_field_someField_but_its_impossible_to_sort_on_tokenized_fields

The index has probably been built and is running. I am using Solr 1.4. The error message is quite vague; it seems to be talking about something unrelated.

Can somebody please explain what it means?

Thank you in advance