Where does the Solr load balancer run?
Hi, I'm a little confused about Solr replication/load balancing. Where exactly does the load balancer run? Is it on the master node (I'm guessing not), on a slave, or somewhere else? Please let me know. Thanks. Regards, Sreehareesh KM
How to index data in a multiValued field with keys
*This might be a very simple question, but I can't figure it out after googling all day. I just want the data to show like this:*

record: [ {
  id: product001,
  name: iPhone case,
  title: {
    th: เคส ไอโฟน5 iphone5 Case วิบวับ ลายผสมมุกสีชมพู back case,
    en: iphone5 Case pinky pearl back case
  }
} ]

*and this is my schema.xml:*

<field name="title" type="text_th" indexed="true" stored="true" multiValued="true"/>

*this is my PHP code:*

<?php
require_once( 'SolrPhpClient/Apache/Solr/Service.php' );

$solr = new Apache_Solr_Service( 'localhost', '8983', './solr' );
if ( !$solr->ping() ) {
    echo 'Solr service is not responding';
    exit;
}

$parts = array(
    'spark_plug' => array(
        'id' => 11,
        'name' => 'Spark plug',
        'title' => array(
            'th' => 'เคส sdsdไอโฟน4 iphone4 Case วิบวับ ลายหอไอเฟลสุดเก๋ สีชมพูเข้ม ปปback case',
            'en' => 'New design Iphone 4 case with pink and beautiful back case',
        ),
        'model' => array( 'a' => 'Boxster', 'b' => '924' ),
        'price' => 25.00,
        'inStock' => true,
    ),
    'windshield' => array(
        'id' => 2,
        'name' => 'Windshield',
        'model' => '911',
        'price' => 15.50,
        'inStock' => false,
        'url' => 'http://store.weloveshopping.com/joeishiablex12',
    )
);

$documents = array();
foreach ( $parts as $item => $fields ) {
    $doc = new Apache_Solr_Document();
    foreach ( $fields as $key => $value ) {
        if ( is_array( $value ) ) {
            foreach ( $value as $datum ) {
                $doc->setMultiValue( $key, $datum );
            }
        } else {
            $doc->$key = $value;
        }
    }
    $documents[] = $doc;
}

try {
    $solr->addDocuments( $documents );
    $solr->commit();
    $solr->optimize();
} catch ( Exception $e ) {
    echo $e->getMessage();
}
?>
*but the response that I'm getting now is like below; as you see, it has no key (th or en) in it:*

record: [ {
  id: product001,
  name: iPhone case,
  title: {
    เคส ไอโฟน5 iphone5 Case วิบวับ ลายผสมมุกสีชมพู back case,
    iphone5 Case pinky pearl back case
  }
} ]

*Please help, million thanks, Chun.* -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-index-data-in-muliValue-field-with-key-tp4110653.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Searchquery on field that contains space
@iorixxx: thanks, your 2nd solution worked. The first one didn't (doesn't matter now); I got this:

<field name="title" type="prefix_full" indexed="true" stored="true"/>
<field name="title_search" type="prefix_full" indexed="true" stored="true"/>

With the first solution all queries work as expected, however with this: q=title_search:new%20yk* "new york" is still returned. -- View this message in context: http://lucene.472066.n3.nabble.com/Searchquery-on-field-that-contains-space-tp4110166p4110658.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Where does the Solr load balancer run?
Hi Sreehareesh, in master-slave replication there is no built-in LB. You need to provide one externally and configure it to rotate across the slave endpoints. On Fri, Jan 10, 2014 at 12:20 PM, Sreehareesh Kaipravan Meethaleveetil smeethalevee...@sapient.com wrote: Hi, I'm a little confused about Solr replication/load balancing. Where exactly does the load balancer run? Is it on the master node (I'm guessing not), on a slave, or somewhere else? Please let me know. Thanks. Regards, Sreehareesh KM -- Sincerely yours, Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
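The "rotate across slave endpoints" idea can be sketched as a tiny round-robin client. This is only an illustration with made-up host names: in practice you would usually put a real external load balancer (e.g. HAProxy or nginx) or SolrJ's LBHttpSolrServer in front of the slaves rather than hand-rolling this.

```python
from itertools import cycle

class SlaveRotator:
    """Round-robin rotation over a fixed list of Solr slave endpoints."""

    def __init__(self, slave_urls):
        # cycle() yields the endpoints forever, in order
        self._rotation = cycle(slave_urls)

    def next_endpoint(self):
        """Return the endpoint the next query should be sent to."""
        return next(self._rotation)

# Hypothetical slave URLs -- substitute your own.
rotator = SlaveRotator([
    "http://slave1:8983/solr",
    "http://slave2:8983/solr",
])
```

Each call to `rotator.next_endpoint()` hands back the next slave in turn, so queries spread evenly while all indexing still goes to the master.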
Sorting of facets
Hi, is it possible to sort the facet results by fields other than the facet field? E.g. I have 3 int fields: directory, pages, links. Because I want all unique directories, I have to use directory as the facet.field parameter. As far as I understand what I've read, I can now only sort the facet results by the number of appearances of each directory. But the TOP directories in my use case are those with the most links and most pages. Is there a way to sort my facet results by another field, or is it maybe possible to have a group of facet fields which can be sorted by each field inside the group? Thanks in advance, Markus
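In Solr 4.x, facet.sort only supports count and index order, so ordering buckets by a different field generally has to happen client-side (or via extra per-bucket queries). A minimal sketch of the client-side approach, with entirely invented numbers:

```python
# Facet counts as they might come back for facet.field=directory,
# plus a per-directory total of the "links" field (all values invented).
facet_counts = {"/news": 120, "/blog": 300, "/docs": 45}
links_per_directory = {"/news": 900, "/blog": 50, "/docs": 400}

def sort_buckets_by(facets, metric):
    """Re-order facet buckets by an external per-bucket metric, descending.

    Buckets missing from the metric sort last (treated as 0)."""
    return sorted(facets, key=lambda bucket: metric.get(bucket, 0), reverse=True)

top_directories = sort_buckets_by(facet_counts, links_per_directory)
# top_directories is now ordered by links, not by facet count
```

This trades an extra pass in the client for not fighting the facet.sort limitation; it only works when the per-bucket metric is available alongside the facet response.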
Boosting documents at index time, based on payloads
Hi, I'm not really sure how/if payloads work (I tried out Rafal Kuc's payload example in the Apache Solr 4 Cookbook and it did not do what I was expecting - see below what I was expecting to do, and please correct me if I was looking for the wrong droid). What I am trying to achieve is similar to the payload principle: give a certain term a boost value at index time. At query time, if searched by that term, that boost value should influence the scoring, with docs with bigger boost values being preferred to the ones with smaller boost values. Can this be achieved using payloads? I expect so, but then how should this behaviour be implemented? The basic recipe failed to work, so I'm a little confused. Thanks, Michael -- View this message in context: http://lucene.472066.n3.nabble.com/Boosting-documents-at-index-time-based-on-payloads-tp4110661.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to index data in a multiValued field with keys
It doesn't work like that - a multivalued field is like a list. PHP doesn't make a difference between a list and a map, but Solr does: you can't have keys in those fields. But based on the info you've provided, it looks more like you in fact need different analyzers to handle the English vs. the Thai text properly. You could try title_th and title_en and configure those fields according to your needs. -Stefan On Friday, January 10, 2014 at 10:06 AM, rachun wrote: *This might be a very simple question, but I can't figure it out after googling all day.* [...] *but the response that I'm getting now, as you see, has no key (th or en) in it.* *Please help, million thanks, Chun.* -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-index-data-in-muliValue-field-with-key-tp4110653.html Sent from the Solr - User mailing list archive at Nabble.com.
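Following Stefan's suggestion of per-language fields, the nested title map from the PHP example would need to be flattened into suffixed fields before indexing. A sketch of that transformation (the title_th/title_en suffix convention is an assumption based on his reply, and the Thai text is shortened):

```python
def flatten_language_map(doc):
    """Flatten nested per-language dicts like {'title': {'th': ..., 'en': ...}}
    into language-suffixed fields ('title_th', 'title_en'), since Solr's
    multivalued fields are lists and cannot carry keys."""
    flat = {}
    for field, value in doc.items():
        if isinstance(value, dict):
            for lang, text in value.items():
                flat["%s_%s" % (field, lang)] = text
        else:
            flat[field] = value
    return flat

solr_doc = flatten_language_map({
    "id": "product001",
    "name": "iPhone case",
    "title": {"th": "เคส ไอโฟน5", "en": "iphone5 Case pinky pearl back case"},
})
```

Each title_* field can then be given its own analyzer in schema.xml (e.g. text_th for the Thai field), and the client can reassemble the map when rendering results.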
Re: Analysis page broken on trunk?
Sorry for not getting back on this earlier - I've tried several fields with values from the example docs and that looks pretty okay to me; no change noticed. Can you share a screenshot or something like that? And perhaps the input and the field/fieldtype which don't work for you? -Stefan On Wednesday, January 8, 2014 at 2:24 PM, Markus Jelsma wrote: Hi - You will see on the left side each filter abbreviation but you won't see anything in the right container. No terms, positions, offsets, nothing. Markus -----Original message----- From: Stefan Matheis matheis.ste...@gmail.com Sent: Wednesday 8th January 2014 14:10 To: solr-user@lucene.apache.org Subject: Re: Analysis page broken on trunk? Hey Markus, I'm not up to date with the latest changes, but if you can describe how to reproduce it, I can try to verify that? -Stefan On Wednesday, January 8, 2014 at 12:44 PM, Markus Jelsma wrote: Hi - it seems the analysis page is broken on trunk, and it looks like our 4.5 and 4.6 builds are unaffected. Can anyone on trunk confirm this? Markus
Re: Solr 4.6.0: DocValues (distributed search)
In short, when running a distributed search every shard runs the query separately. Each shard's collector returns the topN (rows param) internal docIds of the matching documents. These topN docIds are converted to their uniqueKey in the BinaryResponseWriter and sent to the frontend core (the one that received the query). This conversion is implemented by a StoredFieldVisitor, meaning the uniqueKeys are read from their stored field and not from their docValues. As in our use case we have a high rows param, these conversions became a performance bottleneck. We implemented a user cache that stores the shard's uniqueKey docValues, which is a [docId, uniqueKey] mapping. This eliminates the need to access the stored field for these frequent conversions. You can have a look at the patch; feel free to comment: https://issues.apache.org/jira/browse/SOLR-5478 Best, Manuel On Thu, Jan 9, 2014 at 7:33 PM, ku3ia dem...@gmail.com wrote: Today I set up a simple SolrCloud with two shards. Seems the same. When I'm debugging a distributed search I can't catch a break-point at the Lucene codec file, but when I'm using faceted search everything looks fine - the debugger stops. Can anyone help me with my question? Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-6-0-DocValues-distributed-search-tp4110289p4110511.html Sent from the Solr - User mailing list archive at Nabble.com.
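The actual patch (SOLR-5478) lives inside Solr's Java code, but the [docId, uniqueKey] cache described above can be sketched language-neutrally. This toy version pre-loads the mapping once and then answers conversions from memory instead of visiting each document's stored fields:

```python
class UniqueKeyCache:
    """Toy model of a per-shard [docId -> uniqueKey] cache.

    'docvalues' stands in for the uniqueKey docValues column: a list whose
    index is the internal docId and whose value is that doc's uniqueKey."""

    def __init__(self, docvalues):
        self._keys = list(docvalues)  # loaded once, reused for every query

    def unique_keys_for(self, doc_ids):
        """Convert a collector's topN internal docIds to uniqueKeys."""
        return [self._keys[doc_id] for doc_id in doc_ids]

cache = UniqueKeyCache(["doc-a", "doc-b", "doc-c", "doc-d"])
top_n = cache.unique_keys_for([2, 0, 3])  # docIds in score order
```

The win is amortization: the column is read once per searcher, while the per-query work becomes plain list indexing, which matters when rows is very large.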
Re: Copying Index
Folks, any response to the below query would be highly appreciated. Thanks, Anand On 1/10/2014 11:04 AM, anand chandak wrote: Hi, I am testing the replication feature of Solr 4.x with a large index; unfortunately, the index that we had was in 3.x format. So to convert it into 4.x I copied (file system copy) the index and then ran the IndexUpgrader utility to convert it to 4.x format. The utility did what it is supposed to do and I had a 4.x index (verified it with CheckIndex). However, now when I am replicating, the upgraded index is not getting replicated, and I don't see any errors in the log file either. Can somebody throw some light on what the issue could be here? Thanks, Anand
Re: solr text analysis showing a red bar error
Hmmm, works on a 4.x Solr. Please paste the raw text you're putting in the entry field here, so I don't have to re-type it all from the image (can't cut/paste). What version of Solr are you using? Anything come out in the Solr log that looks suspicious? Best, Erick On Thu, Jan 9, 2014 at 7:52 AM, Umapathy S nsupat...@gmail.com wrote: Hi, I am new to Solr/Lucene. I am trying to do a text analysis on my index. The below error (screenshot) is shown when I increase the field value length. I have searched in vain for any length-specific restrictions in solr.TextField. There is no error text/exception thrown. [image: Inline images 1] The field is below:

<field name="text" type="text_general" stored="true" indexed="true"/>

The fieldtype is:

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Any help much appreciated. Thanks, Umapathy
Re: Search query on a field that contains a space
What's the purpose of having two fields, title and title_search? They are exactly the same, so it seems you could get rid of one. Just a nit. Erick As far as the analysis page is concerned, I suspect you took this definition out of your solrconfig.xml file:

<requestHandler name="/analysis/field" startup="lazy" class="solr.FieldAnalysisRequestHandler" />

PUT IT BACK ;). Really, this page will save you again and again and again. At least when I commented out this definition and tried using the analysis page, I got the same error. You may have taken out other things in your solrconfig.xml file that are needed for this to work, but this is the place to start. Best, Erick On Fri, Jan 10, 2014 at 4:31 AM, PeterKerk vettepa...@hotmail.com wrote: @iorixxx: thanks, your 2nd solution worked. [...] -- View this message in context: http://lucene.472066.n3.nabble.com/Searchquery-on-field-that-contains-space-tp4110166p4110658.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Copying Index
You have to be a bit patient. Folks in the US, at least, are barely awake yet. We are not a paid-support helpdesk... On to your problem: you haven't given enough information to say much. For instance, what do you mean by replication? Old-style master/slave replication? Do you see any errors in your logs? Or any attempt on the part of the slave (assuming M/S replication) to replicate? What is the state of your slave that you expect it to replicate? How are your slaves configured? Are they pointing to the master properly? If you're in SolrCloud mode, then it's a different story. You might review: http://wiki.apache.org/solr/UsingMailingLists Best, Erick On Fri, Jan 10, 2014 at 6:59 AM, anand chandak anand.chan...@oracle.com wrote: Folks, any response to the below query would be highly appreciated. [...]
Re: Copying Index
Erick, my apologies if I was rushing. Yes, it's an old-style master/slave replication. I was going through the Solr logs but don't see any errors as such. The slaves are correctly configured and pointing to the master correctly. One thing I noted: on adding any new document to the master, it gets replicated correctly to the slave, but the old index that I copied and upgraded to 4.x format is not getting replicated. I also ran the CheckIndex -f utility and don't see any issue there either. Thanks, Anand On 1/10/2014 6:30 PM, Erick Erickson wrote: You have to be a bit patient. Folks in the US, at least, are barely awake yet. We are not a paid-support helpdesk... [...]
leading wildcard characters
How do you disable leading wildcards in 4.x? The setAllowLeadingWildcard method is there in the parser, but nothing references the getter. Also, the edismax parser always enables it and provides no way to override it. Thanks, Peter
Re: Solr 4.6.0: DocValues (distributed search)
Manuel Le Normand wrote: In short, when running a distributed search every shard runs the query separately. [...] You can have a look at the patch; feel free to comment: https://issues.apache.org/jira/browse/SOLR-5478 Best, Manuel

Hi, Manuel! Many thanks for your post! I'll try your patch. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-6-0-DocValues-distributed-search-tp4110289p4110698.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Copying Index
OK, probably this is what's happening: there's no event that causes the slave to say "oh, my index is out of date." This assumes (and I haven't checked) that you have the same number of segments etc. after the upgrade to the 4.x format. So when you update a doc, that registers a change event that the slave recognizes as a changed index, and it pulls the doc down as part of a new segment. So I posit that eventually the entire index will be replicated in the new format as the segments get merged. You can probably force this by doing an optimize on the master. WARNING: this is theoretical; I'm not saying this from a deep understanding of the replication code, but it seems like a good story. Best, Erick On Fri, Jan 10, 2014 at 8:09 AM, anand chandak anand.chan...@oracle.com wrote: Erick, my apologies if I was rushing. Yes, it's an old-style master/slave replication. [...]
Re: solr text analysis showing a red bar error
I think it's the HTTP GET parameter length/size issue. I got to the maximum number of characters it allowed through "Field value (Index)". But when I added characters in "Field value (Query)", I got the red bar again. I had to reduce the characters in "Field value (Index)" to make it work. I was using Chrome, so I possibly hit that 2KB GET limit. No, the request never reached the Solr service (it was running on localhost). Thanks. On 10 January 2014 12:38, Erick Erickson erickerick...@gmail.com wrote: Hmmm, works on a 4.x Solr. Please paste the raw text you're putting in the entry field here, so I don't have to re-type it all from the image (can't cut/paste). What version of Solr are you using? Anything come out in the Solr log that looks suspicious? Best, Erick [...]
Re: solr text analysis showing a red bar error
Ah, OK. The analysis page in the admin screen is not really intended to analyze large text blocks. I suspect that if you're running into size limitations, you'll find the output pretty hard to read anyway. I almost always use it with pretty short text fragments, usually just a few words. FWIW, Erick On Fri, Jan 10, 2014 at 9:50 AM, Umapathy S nsupat...@gmail.com wrote: I think it's the HTTP GET parameter length/size issue. [...]
Re: leading wildcard characters
Hi Peter, Can you remove any occurrence of ReversedWildcardFilterFactory in schema.xml? (even if you don't use it) Ahmet On Friday, January 10, 2014 3:34 PM, Peter Keegan peterlkee...@gmail.com wrote: How do you disable leading wildcards in 4.X? The setAllowLeadingWildcard method is there in the parser, but nothing references the getter. Also, the Edismax parser always enables it and provides no way to override. Thanks, Peter
Re: DateField - Invalid JSON String Exception - converting Query Response to JSON Object
: Response: : {responseHeader={status=0,QTime=0,params={lowercaseOperators=true,sort=score : desc,cache=false,qf=content,wt=javabin,rows=100,defType=edismax,version=2,fl=*,score,start=0,q=White+Paper,stopwords=true,fq=type:White : Paper}},response={numFound=9,start=0,maxScore=0.61586785,docs=[SolrDocument{id=007, : type=White Paper, source=Documents, title=White Paper 003, body=White Paper : 004 Body, author=[Author 3], keywords=[Keyword 3], description=Vivamus : turpis eros, mime_type=pdf, _version_=1456609602022932480, : *publication_date=Wed : Jan 08 03:16:06 IST 2014*, score=0.61586785}]}, You are not looking at JSON data -- you are looking at a simple toString value from the QueryResponse Java object. It's not intended to be used for anything beyond debugging. If you want the raw JSON data from Solr, then you should either *not* use SolrJ (most of that code is for parsing the response into Java objects, and you apparently don't want that) ... or you should specify your own ResponseParser that will give you access to the raw stream of JSON:

class YourRawResponseParser extends ResponseParser {
  // ...
  processResponse(InputStream body, String encoding) {
    // ...
    // do some JSON processing of body
    // ...
  }
}

But this assumes you want the raw JSON values returned by Solr -- previously you mentioned that you were trying to *create* JSON using the data returned by Solr using a JSON-generating library -- in which case you may in fact want to use Solr's binary response format, get the structured Java object response, and then walk it, pulling out just the pieces of data from the response you want, and pass those specific values to your JSON generation library. It's hard to tell, because you haven't really elaborated on what it is you are trying to do -- all you've made clear is that you are getting invalid JSON from a method that was never meant to return JSON...
https://people.apache.org/~hossman/#xyproblem XY Problem Your question appears to be an XY Problem ... that is: you are dealing with X, you are assuming Y will help you, and you are asking about Y without giving more details about the X so that we can understand the full issue. Perhaps the best solution doesn't involve Y at all? See Also: http://www.perlmonks.org/index.pl?node_id=542341 -Hoss http://www.lucidworks.com/
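Hoss's first option (skip SolrJ's object parsing and work with Solr's JSON directly) amounts to requesting wt=json over plain HTTP and feeding the response body to a JSON parser. A sketch with a canned response body -- the values are invented for illustration; in practice the string would come from the HTTP response:

```python
import json

# Canned Solr wt=json response body (illustrative values only).
raw_body = '''{
  "responseHeader": {"status": 0, "QTime": 1},
  "response": {"numFound": 1, "start": 0,
               "docs": [{"id": "007", "type": "White Paper"}]}
}'''

parsed = json.loads(raw_body)
docs = parsed["response"]["docs"]
first_id = docs[0]["id"]
```

This avoids the QueryResponse toString trap entirely: the body is real JSON from the start, so no re-serialization step is needed.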
Re: leading wildcard characters
Removing ReversedWildcardFilterFactory had no effect. On Fri, Jan 10, 2014 at 10:48 AM, Ahmet Arslan iori...@yahoo.com wrote: Hi Peter, Can you remove any occurrence of ReversedWildcardFilterFactory in schema.xml? (even if you don't use it) Ahmet On Friday, January 10, 2014 3:34 PM, Peter Keegan peterlkee...@gmail.com wrote: How do you disable leading wildcards in 4.X? The setAllowLeadingWildcard method is there in the parser, but nothing references the getter. Also, the Edismax parser always enables it and provides no way to override. Thanks, Peter
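Until the parsers expose a working switch for this, one workaround is rejecting leading-wildcard terms on the client before the query ever reaches Solr. A rough sketch -- the regex is an approximation, not a full query-syntax parser:

```python
import re

# A term starting with * or ? right after the start of the string,
# whitespace, '(' or ':' is treated as a leading wildcard.
LEADING_WILDCARD = re.compile(r'(^|[\s(:])[*?]\S')

def has_leading_wildcard(query):
    """Return True if the query appears to contain a leading-wildcard term."""
    return bool(LEADING_WILDCARD.search(query))
```

Queries flagged by this check can be refused or rewritten before submission; trailing wildcards such as foo* pass through untouched.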
Re: Tracking down the input that hits an analysis chain bug
: The problem manifests as this sort of thing: : : Jan 3, 2014 6:05:33 PM org.apache.solr.common.SolrException log : SEVERE: java.lang.IllegalArgumentException: startOffset must be : non-negative, and endOffset must be >= startOffset, : startOffset=-1811581632,endOffset=-1811581632 Is there a stack trace in the log to go along with that? There should be. My suspicion is that since analysis errors like these are RuntimeExceptions, they may not be getting caught and re-thrown with as much context as they should -- so by the time they get logged (or returned to the client) there isn't any info about the problematic field value, let alone the uniqueKey. If we had a test case that reproduces this (i.e. with a mock tokenfilter that always throws a RuntimeException when a token matches "fail_now" or something) we could have some tests that assert indexing a doc with that token results in a useful error -- which should help ensure that useful error also gets logged (although I don't think we really have any easy way of asserting specific log messages at the moment) -Hoss http://www.lucidworks.com/
Re: Tracking down the input that hits an analysis chain bug
OK, patch forthcoming. On Fri, Jan 10, 2014 at 11:23 AM, Chris Hostetter hossman_luc...@fucit.org wrote: : The problem manifests as this sort of thing: [...] -Hoss http://www.lucidworks.com/
Re: Tracking down the input that hits an analysis chain bug
Is there a neighborhood of existing tests I should be visiting here? On Fri, Jan 10, 2014 at 11:27 AM, Benson Margulies bimargul...@gmail.com wrote: OK, patch forthcoming. [...]
Re: Tracking down the input that hits an analysis chain bug
: Is there a neighborhood of existing tests I should be visiting here?

You'll need a custom schema that refers to your new MockFailOnCertainTokensFilterFactory, so I would create a completely new test class somewhere in ...solr.update (you're testing that an update fails with a clean error). -Hoss http://www.lucidworks.com/
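For what it's worth, the mock-filter idea can be sketched without pulling in Lucene's APIs. The class below is a simplified, hypothetical stand-in: a real version would extend org.apache.lucene.analysis.TokenFilter and throw from incrementToken() when the CharTermAttribute matches the trigger token ("fail_now" here is just the placeholder name from the discussion):

```java
import java.util.List;

// Simplified stand-in for a mock TokenFilter that simulates an analysis
// bug: it walks a token stream and throws a RuntimeException when it
// sees a hypothetical trigger token. A test built around the real Lucene
// version would assert that indexing a document containing the trigger
// produces a useful error (field name, value, uniqueKey).
public class FailOnTokenFilter {
    private final String trigger;

    public FailOnTokenFilter(String trigger) {
        this.trigger = trigger;
    }

    // Consume tokens, failing on the trigger token.
    public void consume(List<String> tokens) {
        for (String token : tokens) {
            if (trigger.equals(token)) {
                throw new RuntimeException("mock analysis failure on token: " + token);
            }
        }
    }
}
```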
Indexing spatial fields into SolrCloud (HTTP)
I am porting an application from Lucene to Solr which makes use of spatial4j for distance searches. The Lucene version works correctly but I am having a problem getting the Solr version to work in the same way.

Lucene version:

SpatialContext geoSpatialCtx = SpatialContext.GEO;
geoSpatialStrategy = new RecursivePrefixTreeStrategy(new GeohashPrefixTree(geoSpatialCtx, GeohashPrefixTree.getMaxLevelsPossible()), DocumentFieldNames.LOCATION);

Point point = geoSpatialCtx.makePoint(lon, lat);
for (IndexableField field : geoSpatialStrategy.createIndexableFields(point)) {
    document.add(field);
}
// Store the field
document.add(new StoredField(geoSpatialStrategy.getFieldName(), geoSpatialCtx.toString(point)));

Solr version:

Point point = geoSpatialCtx.makePoint(lon, lat);
for (IndexableField field : geoSpatialStrategy.createIndexableFields(point)) {
    try {
        solrDocument.addField(field.name(), field.tokenStream(analyzer));
    } catch (IOException e) {
        LOGGER.error("Failed to add geo field to Solr index", e);
    }
}
// Store the field
solrDocument.addField(geoSpatialStrategy.getFieldName(), geoSpatialCtx.toString(point));

The server-side error is as follows:

Caused by: com.spatial4j.core.exception.InvalidShapeException: Unable to read: org.apache.lucene.spatial.prefix.PrefixTreeStrategy$CellTokenStream@0
    at com.spatial4j.core.io.ShapeReadWriter.readShape(ShapeReadWriter.java:48)
    at com.spatial4j.core.context.SpatialContext.readShape(SpatialContext.java:195)
    at org.apache.solr.schema.AbstractSpatialFieldType.parseShape(AbstractSpatialFieldType.java:142)

I've seen David Smiley's sample code, specifically the class SpatialDemoUpdateProcessorFactory, but I can't say that I was able to benefit from it at all. What I'm trying to do seems like it should be easy -- just to index a point for distance searching -- but I'm obviously missing something. Any ideas?
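One approach worth trying (a sketch, not a confirmed fix): instead of building the token stream client-side, let Solr run the spatial analysis by defining an RPT field in schema.xml and sending the point as a plain "lat,lon" string. The field and type names below are illustrative:

```
<!-- schema.xml: a recursive-prefix-tree spatial type; Solr analyzes the
     point itself, so the client only sends a "lat,lon" string -->
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
           geo="true" distErrPct="0.025" maxDistErr="0.000009" units="degrees"/>
<field name="location" type="location_rpt" indexed="true" stored="true"/>
```

The client side then reduces to something like solrDocument.addField("location", lat + "," + lon); with no analyzer or tokenStream() call at all.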
Thanks, Jim
Re: Tracking down the input that hits an analysis chain bug
Thanks, that's the recipe that I need. On Fri, Jan 10, 2014 at 11:40 AM, Chris Hostetter hossman_luc...@fucit.org wrote: You'll need a custom schema that refers to your new MockFailOnCertainTokensFilterFactory [...]
CSVResponseWriter and grouped results not working
With Solr 4.3.1 - it appears the CSVResponseWriter does not return any results if group=true. Is that correct or am I doing something wrong? I get results when not grouping. I wanted to verify before posting a feature request.
Re: CSVResponseWriter and grouped results not working
Same here with Solr 4.4.0: no results when group=true&wt=csv. -lianyi "Less isn't more; just enough is more." -Milton Glaser On Fri, Jan 10, 2014 at 2:37 PM, Matt Kleweno matt.klew...@gmail.com wrote: With Solr 4.3.1 - it appears the CSVResponseWriter does not return any results if group=true. [...]
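One workaround that may be worth trying (untested here): group.main=true flattens the grouped result back into an ordinary document list, which the CSV writer should be able to serialize. The field name "category" below is illustrative:

```
/select?q=*:*&group=true&group.field=category&group.main=true&wt=csv
```

Note that group.main=true returns only the top documents from each group, so it is not a full replacement for the grouped response format.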
Re: Perl Client for SolrCloud
I'm pretty interested in taking a stab at a Perl CPAN module for SolrCloud that is ZooKeeper-aware; it's the least I can do for Solr as a non-Java developer. :) A quick question though: how would I write the shard logic to behave similarly to Java's ZooKeeper-aware client? I'm able to get the hash/hex needed for each shard from clusterstate.json, but how do I know which field to hash on? I'm guessing I also need to read the collection's schema.xml from ZooKeeper to get uniqueKey, and then use that for sharding, or does the Java client take the sharding field as input? Looking for ideas here. Thanks! Tim

On 08/01/14 09:35 AM, Chris Hostetter wrote: : I couldn't find anyone which can connect to SolrCloud similar to SolrJ's : CloudSolrServer. : : Since I have a load balancer in front of 8 nodes, WebService::Solr[1] still : works fine. Right -- just because SolrJ is ZooKeeper-aware doesn't mean you can *only* talk to SolrCloud with SolrJ -- you can still use any HTTP client of your choice to connect to your Solr nodes in a round-robin fashion (or via a load balancer) if you wish -- just like with a non-SolrCloud deployment using something like master/slave. What you might want to consider is taking a look at something like Net::ZooKeeper to have a ZK-aware Perl client layer that could wrap WebService::Solr. -Hoss http://www.lucidworks.com/
Re: Perl Client for SolrCloud
: A quick question though: how would I write the shard logic to behave similar
: to Java's Zookeeper-aware client? I'm able to get the hash/hex needed for each
: shard from clusterstate.json, but how do I know which field to hash on?

The logic you're asking about is encapsulated in the DocRouter (which can be customized per collection). I'm not sure how the CloudSolrServer SolrJ client currently deals with knowing which DocRouter to use, but for a non-Java language that can't directly load the same classes a great first step would be:

* be configurable solely with a list of ZK addresses
* connect to ZK and per collection be continuously aware of:
  - the list of all live nodes as they go up/down
  - the list of leaders as shard elections happen
* for queries, route to a random live node
* for updates, route to any live leader

the most important part being the first 2 bullets. The last bullet is an optimization over just sending to a random node, because you increase the odds of hitting the correct leader for the doc in question regardless of which DocRouter is in use. -Hoss http://www.lucidworks.com/
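For reference, SolrCloud's default compositeId router hashes the uniqueKey with MurmurHash3 (x86 32-bit variant, seed 0) and maps the hash into the per-shard hash ranges published in clusterstate.json. Below is a self-contained sketch of that routing step; the equal-split ranges are illustrative only (real ranges come from clusterstate.json), and the exact byte handling in Solr's Hash class may differ in detail:

```java
import java.nio.charset.StandardCharsets;

// Sketch of compositeId-style routing: hash the uniqueKey with
// MurmurHash3 x86_32 (seed 0) and map it onto the signed 32-bit ring.
public class ShardRouter {

    // Reference MurmurHash3 x86 32-bit implementation.
    public static int murmurhash3_x86_32(byte[] data, int offset, int len, int seed) {
        final int c1 = 0xcc9e2d51, c2 = 0x1b873593;
        int h1 = seed;
        int roundedEnd = offset + (len & 0xfffffffc); // 4-byte blocks
        for (int i = offset; i < roundedEnd; i += 4) {
            // little-endian load
            int k1 = (data[i] & 0xff) | ((data[i + 1] & 0xff) << 8)
                   | ((data[i + 2] & 0xff) << 16) | (data[i + 3] << 24);
            k1 *= c1; k1 = Integer.rotateLeft(k1, 15); k1 *= c2;
            h1 ^= k1; h1 = Integer.rotateLeft(h1, 13); h1 = h1 * 5 + 0xe6546b64;
        }
        // tail (fallthrough is intentional)
        int k1 = 0;
        switch (len & 0x03) {
            case 3: k1 = (data[roundedEnd + 2] & 0xff) << 16;
            case 2: k1 |= (data[roundedEnd + 1] & 0xff) << 8;
            case 1: k1 |= data[roundedEnd] & 0xff;
                    k1 *= c1; k1 = Integer.rotateLeft(k1, 15); k1 *= c2; h1 ^= k1;
        }
        // finalization mix
        h1 ^= len;
        h1 ^= h1 >>> 16; h1 *= 0x85ebca6b;
        h1 ^= h1 >>> 13; h1 *= 0xc2b2ae35;
        h1 ^= h1 >>> 16;
        return h1;
    }

    // Map a document id into one of numShards equal hash ranges.
    public static int shardFor(String id, int numShards) {
        byte[] b = id.getBytes(StandardCharsets.UTF_8);
        int hash = murmurhash3_x86_32(b, 0, b.length, 0);
        long pos = (long) hash - Integer.MIN_VALUE; // shift into 0 .. 2^32-1
        return (int) (pos * numShards / (1L << 32));
    }
}
```

A Perl port would only need the hash function plus the actual shard ranges read from clusterstate.json, which is exactly the clusterstate data Tim mentioned already having.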