Re: Starts with Query
Thanks Jack for the valuable response. Actually, I am trying to match *any* numeric pattern at the start of each document. I don't know which documents are in the index; I just want the documents whose title starts with any digit.
IndexWriter in Lucene/Solr 3.5 is slower?
We are upgrading our search infrastructure from Lucene 2.3.1 to Lucene 3.5. I am in the process of load testing, and I found that Lucene 2.3.1 could index 32,000 docs per second, whereas Lucene 3.5 could index only around 17,000 docs per second. Both of them use the standard analyzer and the default settings. Is 3.5 slower because it indexes more detail, thereby resulting in faster search? Ours is a log management product, and the speed of indexing is highly important. Cutting the long story short: will the slower indexing of 3.5 result in higher search speed? If not, what else should I fine-tune to improve the indexing speed?

-- With Thanks and Regards, Ramprakash Ramamoorthy, Engineer Trainee, Zoho Corporation. +91 9626975420
RE: Starts with Query
If you are not searching for a specific digit and want to match all documents that start with any digit, you could, as part of the indexing process, add another field, say startsWithDigit, and set it to true if the title begins with a digit. All you need to do at query time then is query for startsWithDigit:true.

Thanks,
Afroz

From: nutchsolruser
Sent: 6/14/2012 11:03 PM
To: solr-user@lucene.apache.org
Subject: Re: Starts with Query

Thanks Jack for the valuable response. Actually, I am trying to match *any* numeric pattern at the start of each document. I don't know which documents are in the index; I just want the documents whose title starts with any digit.
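[A minimal SolrJ sketch of Afroz's index-time flag, assuming a boolean startsWithDigit field has been added to the schema; the field names, URL, and values here are illustrative, not from the thread:]

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class StartsWithDigitIndexer {
        public static void main(String[] args) throws Exception {
            SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
            String title = "7th Grade Math";  // example title
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");
            doc.addField("title", title);
            // computed once at index time; query later with fq=startsWithDigit:true
            doc.addField("startsWithDigit",
                !title.isEmpty() && Character.isDigit(title.charAt(0)));
            server.add(doc);
            server.commit();
        }
    }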
Re: IndexWriter in Lucene/Solr 3.5 is slower?
BTW, have you changed the MergePolicy and MergeScheduler settings as well? Since Lucene 3.x/3.5, there have been new MergePolicy and MergeScheduler implementations available, like TieredMergePolicy and ConcurrentMergeScheduler.

Regards,
Pravesh
Re: Starts with Query
It's not necessary to do this. You can simply take advantage of the fact that the digits form a strictly ordered, contiguous block in Unicode (0-9, immediately followed by ':'), so you can use a range query:

    (f)q={!frange l=0 u=\: incl=true incu=false}title

This finds all documents where any token from the title field starts with a digit, so if you want to find only documents where the whole title starts with a digit, you need a second field with a string or untokenized text type. Use the copyField directive then, as Jack Krupansky already suggested in a previous reply.

Greetings,
Kuli

On 15.06.2012 08:38, Afroz Ahmad wrote:

> If you are not searching for a specific digit and want to match all documents that start with any digit, you could, as part of the indexing process, add another field, say startsWithDigit, and set it to true if the title begins with a digit. All you need to do at query time then is query for startsWithDigit:true.
>
> Thanks,
> Afroz
>
> From: nutchsolruser
> Sent: 6/14/2012 11:03 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Starts with Query
>
> Thanks Jack for the valuable response. Actually, I am trying to match *any* numeric pattern at the start of each document. I don't know which documents are in the index; I just want the documents whose title starts with any digit.
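[For reference, a minimal schema sketch of the untokenized companion field Kuli and Jack describe; the field and type names are illustrative, not from the thread:]

    <!-- the searchable, tokenized title -->
    <field name="title" type="text" indexed="true" stored="true"/>
    <!-- untokenized copy, so the digit test applies to the whole title -->
    <field name="title_str" type="string" indexed="true" stored="false"/>
    <copyField source="title" dest="title_str"/>

[The range query from Kuli's reply would then point at title_str instead of title.]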
Re: IndexWriter in Lucene/Solr 3.5 is slower?
On Fri, Jun 15, 2012 at 12:20 PM, pravesh suyalprav...@yahoo.com wrote:

> BTW, have you changed the MergePolicy and MergeScheduler settings as well? Since Lucene 3.x/3.5, there have been new MergePolicy and MergeScheduler implementations available, like TieredMergePolicy and ConcurrentMergeScheduler.
>
> Regards,
> Pravesh

Thanks for the reply, Pravesh. Yes, I initially used the default TieredMergePolicy and later set the merge policy in both versions to LogByteSizeMergePolicy in order to maintain congruence. But Lucene 3.5 still lagged behind by approximately 2x.

-- With Thanks and Regards, Ramprakash Ramamoorthy, Engineer Trainee, Zoho Corporation. +91 9626975420
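[For reference, a minimal Lucene 3.5 sketch of wiring up the merge policy and scheduler being discussed; the merge factor, RAM buffer size, and index path are illustrative values, not settings from this thread:]

    import java.io.File;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.ConcurrentMergeScheduler;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.LogByteSizeMergePolicy;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class WriterSetup {
        public static void main(String[] args) throws Exception {
            Directory dir = FSDirectory.open(new File("/tmp/index")); // path illustrative
            IndexWriterConfig cfg = new IndexWriterConfig(Version.LUCENE_35,
                new StandardAnalyzer(Version.LUCENE_35));
            LogByteSizeMergePolicy mp = new LogByteSizeMergePolicy();
            mp.setMergeFactor(10);                 // match the 2.3.1 setup for a fair comparison
            cfg.setMergePolicy(mp);
            cfg.setMergeScheduler(new ConcurrentMergeScheduler()); // merges run off the indexing threads
            cfg.setRAMBufferSizeMB(256.0);         // a larger buffer often raises docs/sec
            IndexWriter writer = new IndexWriter(dir, cfg);
            // ... add documents ...
            writer.close();
        }
    }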
Re: DIH idle in transaction forever
Btw, I removed the batchSize, but performance is better with batchSize=1. I haven't done further testing to see what the best setting is, but the difference between setting it to 1 and not setting it at all is almost double the indexing time (~20 minutes vs. ~37 minutes).

On Thu, Jun 14, 2012 at 4:49 PM, Jasper Floor jasper.fl...@m4n.nl wrote:

Actually, readOnly=true makes things worse. What it does (among other things) is:

    c.setTransactionIsolation(Connection.TRANSACTION_READ_UNCOMMITTED);

which leads to:

    Caused by: org.postgresql.util.PSQLException: Cannot change transaction isolation level in the middle of a transaction.

because the connection is idle in transaction. I found this issue: https://issues.apache.org/jira/browse/SOLR-2045 . Patching DIH with the code they suggest seems to work.

mvg, Jasper

On Thu, Jun 14, 2012 at 4:36 PM, Dyer, James james.d...@ingrambook.com wrote:

Try readOnly=true in the dataSource configuration. This causes several defaults to get set in the JDBC connection and will often solve problems like this (see http://wiki.apache.org/solr/DataImportHandler#Configuring_JdbcDataSource). Also, try a batch size of 0 to let your JDBC driver pick what it thinks is optimal. This might be better than 1.

There is also an issue in that DIH doesn't explicitly close the ResultSet but relies on closing the connection to implicitly close the child objects. I know that when I tried using DIH with Derby a while back this at least caused some log warnings, and it wouldn't work at all without readOnly=false. Not sure about PostgreSQL.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311

-----Original Message-----
From: Jasper Floor [mailto:jasper.fl...@m4n.nl]
Sent: Thursday, June 14, 2012 8:21 AM
To: solr-user@lucene.apache.org
Subject: DIH idle in transaction forever

Hi all,

It seems that DIH always holds two connections open to the database. One of them is almost always 'idle in transaction'. It may sometimes seem to do a little work, but then it goes idle again.

Datasource definition:

    <dataSource name="df-stream-store-ds" jndiName="java:ext_solr_datafeeds_dba"
                type="JdbcDataSource" autoCommit="false" batchSize="1" />

We have a datasource defined in the JNDI:

    <no-tx-datasource>
      <jndi-name>ext_solr_datafeeds_dba</jndi-name>
      <security-domain>ext_solr_datafeeds_dba_realm</security-domain>
      <connection-url>jdbc:postgresql://db1.live.mbuyu.nl/datafeeds</connection-url>
      <min-pool-size>0</min-pool-size>
      <max-pool-size>5</max-pool-size>
      <transaction-isolation>TRANSACTION_READ_COMMITTED</transaction-isolation>
      <driver-class>org.postgresql.Driver</driver-class>
      <blocking-timeout-millis>3</blocking-timeout-millis>
      <idle-timeout-minutes>5</idle-timeout-minutes>
      <new-connection-sql>SELECT 1</new-connection-sql>
      <check-valid-connection-sql>SELECT 1</check-valid-connection-sql>
    </no-tx-datasource>

If we set autocommit to true, we get an OOM on indexing, so that is not an option. Does anyone have any idea why this happens? I would guess that DIH doesn't close the connection, but reading the code I can't be sure of this. The ResultSet object should close itself once it reaches the end.

mvg, Jasper
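[A note on what batchSize does under the hood: DIH's JdbcDataSource hands batchSize to the JDBC driver as the statement fetch size, and the PostgreSQL driver only streams rows with a server-side cursor when autocommit is off and a positive fetch size is set; otherwise it buffers the whole result set in memory, which matches the OOM seen with autoCommit=true. A minimal plain-JDBC sketch of that behavior; the connection details, credentials, and table name are illustrative:]

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class CursorFetchDemo {
        public static void main(String[] args) throws Exception {
            Connection c = DriverManager.getConnection(
                "jdbc:postgresql://localhost/datafeeds", "user", "secret");
            c.setAutoCommit(false);   // PostgreSQL only streams with autocommit off...
            Statement st = c.createStatement();
            st.setFetchSize(500);     // ...and a positive fetch size; 1 forces a round trip per row
            ResultSet rs = st.executeQuery("SELECT * FROM feed_items");
            while (rs.next()) {
                // hand each row to the indexer
            }
            rs.close();
            st.close();
            c.close();                // closing the connection also ends the open transaction
        }
    }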
FileListEntityProcessor limit at 11 files?
Hello,

I'm using the DIH to index some PDFs. Everything works fine for the first 11 files, but after indexing 11 PDFs the process stops, independently of which PDFs are being indexed or the directory structure (recursive=true). The Lucene index for these 11 documents is valid. Is there anything like a FileListEntityProcessor limit that can be set?

Regards,
Roland
Re: FilterCache - maximum size of document set
Test first, of course, but a slave on 3.6 and a master on 3.5 should be fine. If you're getting evictions with the cache settings that high, you really want to look at why. Note in particular that using NOW in your filter queries virtually guarantees that they won't be re-used, as per the link I sent yesterday.

Best,
Erick

On Fri, Jun 15, 2012 at 1:15 AM, Pawel Rog pawelro...@gmail.com wrote:

It can be true that the filter cache max size is set to too high a value. We looked at evictions and hit rate earlier. Maybe you are right that evictions are not always unwanted. Some time ago we ran tests: there is not a large difference in hit rate between a filter maxSize of 4000 (hit rate about 85%) and 16000 (hit rate about 91%). I think using an LFU cache could also be helpful, but that requires me to migrate to 3.6. Do you think it is reasonable to use a slave on version 3.6 and a master on 3.5? Once again, thanks for your help.

-- Pawel

On Thu, Jun 14, 2012 at 7:22 PM, Erick Erickson erickerick...@gmail.com wrote:

Hmmm, your maxSize is pretty high; it may just be that you've set this much higher than is wise. The maxSize setting governs the number of entries. I'd start with a much lower number here and monitor the solr/admin page for both hit ratio and evictions. Well, and size too: 16,000 entries puts a ceiling of, what, 48G on it? Ouch! It sounds like what's happening here is that you're just accumulating more and more fqs over the course of the evening and blowing memory. Not all fqs will be that big; there are some heuristics in there to store just the document numbers for sparse filters, but maxDocs/8 is pretty much the upper bound. Evictions are not necessarily a bad thing; the hit ratio is what's important here. And if you're using a bare NOW in your filter queries, you're probably never re-using them anyway; see: http://www.lucidimagination.com/blog/2012/02/23/date-math-now-and-filter-queries/ I really question whether this limit is reasonable, but you know your situation best.

Best,
Erick

On Wed, Jun 13, 2012 at 5:40 PM, Pawel Rog pawelro...@gmail.com wrote:

Thanks for your response. Yes, maybe you are right. I thought that filters could be larger than 3M. Do all kinds of filters use a BitSet? Moreover, the maxSize of the filterCache is set to 16000 in my case. There are evictions during day traffic but not during night traffic. The version of Solr I use is 3.5. I haven't used Memory Analyzer yet; could you write more details about it?

-- Regards, Pawel

On Wed, Jun 13, 2012 at 10:55 PM, Erick Erickson erickerick...@gmail.com wrote:

Hmmm, I think you may be looking at the wrong thing here. Generally, a filterCache entry will be maxDocs/8 bytes (plus some overhead), so in your case they really shouldn't be all that large, on the order of 3M per filter. That shouldn't vary based on the number of docs that match the fq; it's just a bitset. To see if that makes any sense, take a look at the admin page and the number of evictions in your filterCache. If that is 0, you're probably already using all the memory you're going to in the filterCache during the day. But you haven't indicated what version of Solr you're using; I'm going from a relatively recent 3.x knowledge base. Have you put a memory analyzer against your Solr instance to see where the memory is being used?

Best,
Erick

On Wed, Jun 13, 2012 at 1:05 PM, Pawel pawelmis...@gmail.com wrote:

Hi, I have a Solr index with about 25M documents. I optimized the FilterCache size to reach the best performance (considering the traffic characteristics that my Solr handles). I see that the only way to limit the size of the filter cache is to set the number of document sets that Solr can cache. There is no way to set a memory limit (e.g., 2GB, 4GB, or something like that). When I process standard traffic (during the day), everything is fine. But when Solr handles the night traffic (and the characteristics of the requests change), some problems appear: there is a JVM out-of-memory error. I know what the reason is. Some filters on some fields are quite poor filters; they return 15M documents or even more. You could say 'just put that into q'. I tried to put those filters into the query part, but then the statistics of request processing time (during the day) became much worse. Reducing the filter cache maxSize is also not a good solution, because during the day the cached filters are very, very helpful. You may be interested in the type of filters that I use. These are range filters (I tried standard range filters and frange), e.g., price:[* TO 1]. Some fqs with price can return a few thousand results (e.g., price:[40 TO 50]), but some (e.g., price:[* TO 1]) can return millions of documents. I'd also like to avoid a solution which introduces strict ranges that the user can choose. Have you any suggestions what I can do? Is there any way to limit
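[A rough worst-case check of the numbers in this thread, assuming one bit per document per cached filter entry:]

    25,000,000 docs / 8 bits per byte ≈ 3.1 MB per cached bitset
    16,000 entries × 3.1 MB ≈ 50 GB, which is the ~48G ceiling Erick mentions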
SolrCloud subdirs in conf bootstrap dir
Hi,

We'd like to create subdirectories for each collection in our conf bootstrap directory, for cleaner maintenance and to avoid having to include the collection name in each configuration file. However, it is not working:

    2012-06-15 11:31:08,483 ERROR [solr.core.CoreContainer] - [main] - : org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /configs/COLLECTION_NAME/solrconfig.xml

The solrconfig.xml is in bootstrap_conf/dirname/solrconfig.xml, and solr.xml's solrconfig attribute points to the proper file. A better question might be: how can I nicely maintain multiple collection configuration directories in SolrCloud?

Thanks,
Markus
Re: Dedupe and overwriteDupes setting
Hi,

My solrconfig dedupe setting is as follows:

    <updateRequestProcessorChain name="dedupe">
      <processor class="org.apache.solr.update.processor.SignatureUpdateProcessorFactory">
        <bool name="enabled">true</bool>
        <bool name="overwriteDupes">false</bool>
        <str name="signatureField">dupesign</str>
        <str name="fields">title,url</str>
        <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory" />
      <processor class="solr.RunUpdateProcessorFactory" />
    </updateRequestProcessorChain>

Even though overwriteDupes is set to false, search query results show that the contents are overwritten. Is this because there are duplicate contents in Solr and the query results display only the latest entry among the duplicates? I actually need the date field not to be overwritten. Please help.

Thanks,
Shameema
Re: IndexWriter in Lucene/Solr 3.5 is slower?
On Fri, Jun 15, 2012 at 12:50 PM, Ramprakash Ramamoorthy youngestachie...@gmail.com wrote:

> On Fri, Jun 15, 2012 at 12:20 PM, pravesh suyalprav...@yahoo.com wrote:
>
>> BTW, have you changed the MergePolicy and MergeScheduler settings as well? Since Lucene 3.x/3.5, there have been new MergePolicy and MergeScheduler implementations available, like TieredMergePolicy and ConcurrentMergeScheduler.
>>
>> Regards,
>> Pravesh
>
> Thanks for the reply, Pravesh. Yes, I initially used the default TieredMergePolicy and later set the merge policy in both versions to LogByteSizeMergePolicy in order to maintain congruence. But Lucene 3.5 still lagged behind by approximately 2x.

Can someone help me with this please?

-- With Thanks and Regards, Ramprakash Ramamoorthy, Engineer Trainee, Zoho Corporation. +91 9626975420
Re: Building a heat map from geo data in index
So I've tried this a bit, but I can't get it to look quite right. What I was doing up until now was taking the center point of the geohash cell as the location for the value I am getting from the index. Doing this, you end up with what appear to be islands (using HeatMap.js currently). I guess what I would like to do is take this information and generate a static image so I can quickly prototype some things. Are there any good Java-based heatmap tools? Also, if anyone has done this before, any thoughts on how to do this would really be appreciated.

On Mon, Jun 11, 2012 at 12:52 PM, Jamie Johnson jej2...@gmail.com wrote:

Yeah, I'll have to play with it to see how useful it is; I really don't know at this point. On another note, we are already using some binning like is described in the wiki you sent, specifically http://code.google.com/p/javageomodel/, for other purposes. Not sure if that could be used or not; guess I'd have to think on it harder.

On Mon, Jun 11, 2012 at 12:04 PM, Tanguy Moal tanguy.m...@gmail.com wrote:

Yes, it looks interesting and is not too difficult to do. However, the length of the geohashes gives you very little control over the size of the regions to colorize. Quoting Wikipedia:

    geohash length   km error
    1                ±2500
    2                ±630
    3                ±78
    4                ±20
    5                ±2.4
    6                ±0.61
    7                ±0.076
    8                ±0.019

This is also interesting: http://wiki.openstreetmap.org/wiki/QuadTiles But it does what you're looking for, somehow :)

-- Tanguy

2012/6/11 Jamie Johnson jej2...@gmail.com:

If you look at the Stack response from David, he had suggested breaking the geohash up into pieces and then using a prefix for refining precision. I hadn't imagined limiting this to a particular area, just limiting it based on the prefix (which would be based on the user's zoom level or something), allowing the information to become more precise as the user zooms in. That seemed a very reasonable approach to the problem.

On Mon, Jun 11, 2012 at 10:55 AM, Tanguy Moal tanguy.m...@gmail.com wrote:

There is definitely something interesting to do around geohashes. I'm wondering how one could map the N by N requested tiles to a range of geohashes (where the gap would be a function of N). What I am trying to say is that I don't know if a bijective function exists between tiles and geohash ranges. I don't even know if a contiguous range of geohashes ends up in a square box. Because if you can find such a function, then you could probably solve the issue by asking Solr for facet ranges on a geohash field. I don't know if that helps, but the topic is very interesting to me... Please share your findings, if any :-)

-- Tanguy

2012/6/11 Dmitry Kan dmitry@gmail.com:

So it sounds to me that the geohash is just a hash representation of lat,lon coordinates for easier referencing (see e.g. http://en.wikipedia.org/wiki/Geohash). I would probably start with something easier: having the bbox lat,lon coordinate pair of the top-left corner (or, in some coordinate systems, the bottom-left corner), break each bbox into cells of size w/N, h/N (probably equal numbers). Then you can loop over the cells and compute your facet counts with the bbox of each cell. You could then evolve this to geohashes if you want, but at least you would know where to start.

-- Dmitry

On Mon, Jun 11, 2012 at 4:48 PM, Jamie Johnson jej2...@gmail.com wrote:

That is certainly an option, but the collecting of the heat map data is really the question. I saw this http://stackoverflow.com/questions/8798711/solr-using-facets-to-sum-documents-based-on-variable-precision-geohashes but don't have a really good understanding of how this would be accomplished. I need to get a firmer understanding of geohashes, as my understanding is extremely lacking at this point.

On Mon, Jun 11, 2012 at 8:55 AM, Stefan Matheis matheis.ste...@googlemail.com wrote:

I'm not entirely sure that it has to be that complicated... What about using, for example, http://www.patrick-wied.at/static/heatmapjs/ ? You could collect all the geo-related data and do the (heat)map stuff on the client.

On Sunday, June 10, 2012 at 7:49 PM, Jamie Johnson wrote:

I had a request from a customer which to this point I have not seen much similar, so I figured I'd pose the question here. I've been asked if it is possible to build a heat map from the results of a query. I can imagine a process to do this through some post-processing, but that sounds very expensive for large/distributed indices, so I was wondering if, with all of the new geospatial support being added to Lucene/Solr, there is a way to do geospatial faceting. What I am imagining is a bounding box being defined and that box being broken into an N by N matrix, each cell of which would return counts so a heat map could be constructed. Any other thoughts on this would
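[For readers following the prefix idea from the Stack Overflow link: a hedged SolrJ 3.x sketch of variable-precision counting via facet prefixes. It assumes a field, here called geohash_prefixes (an invented name), that indexes every leading prefix of each point's geohash, e.g. via an EdgeNGramFilter; that field and its analysis chain are assumptions, not stock Solr. The URL and the "9q8" prefix are also illustrative:]

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.FacetField;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class HeatMapFacets {
        public static void main(String[] args) throws Exception {
            SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SolrQuery q = new SolrQuery("*:*");
            q.setRows(0);                                  // we only want the facet counts
            q.setFacet(true);
            q.addFacetField("geohash_prefixes");
            q.setFacetPrefix("geohash_prefixes", "9q8");   // prefix length ~ zoom level
            q.setFacetLimit(-1);
            q.setFacetMinCount(1);
            QueryResponse rsp = solr.query(q);
            for (FacetField.Count cell : rsp.getFacetField("geohash_prefixes").getValues()) {
                // cell.getName() is one geohash cell at the chosen precision;
                // decode its center point and use cell.getCount() as the heat value
            }
        }
    }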
Re: FilterCache - maximum size of document set
Thanks. I don't use NOW in queries; all my filters with timestamps are rounded to hundreds of seconds to increase the hit rate. The only problem could be the price filters, which can vary (users are unpredictable :P), but taking those filters out of fq or setting cache=false is also a bad idea ... checked it :) Load rose three times :)

-- Pawel

On Fri, Jun 15, 2012 at 1:30 PM, Erick Erickson erickerick...@gmail.com wrote:

Test first, of course, but a slave on 3.6 and a master on 3.5 should be fine. If you're getting evictions with the cache settings that high, you really want to look at why. Note in particular that using NOW in your filter queries virtually guarantees that they won't be re-used, as per the link I sent yesterday.

Best,
Erick

On Fri, Jun 15, 2012 at 1:15 AM, Pawel Rog pawelro...@gmail.com wrote:

It can be true that the filter cache max size is set to too high a value. We looked at evictions and hit rate earlier. Maybe you are right that evictions are not always unwanted. Some time ago we ran tests: there is not a large difference in hit rate between a filter maxSize of 4000 (hit rate about 85%) and 16000 (hit rate about 91%). I think using an LFU cache could also be helpful, but that requires me to migrate to 3.6. Do you think it is reasonable to use a slave on version 3.6 and a master on 3.5? Once again, thanks for your help.

-- Pawel

On Thu, Jun 14, 2012 at 7:22 PM, Erick Erickson erickerick...@gmail.com wrote:

Hmmm, your maxSize is pretty high; it may just be that you've set this much higher than is wise. The maxSize setting governs the number of entries. I'd start with a much lower number here and monitor the solr/admin page for both hit ratio and evictions. Well, and size too: 16,000 entries puts a ceiling of, what, 48G on it? Ouch! It sounds like what's happening here is that you're just accumulating more and more fqs over the course of the evening and blowing memory. Not all fqs will be that big; there are some heuristics in there to store just the document numbers for sparse filters, but maxDocs/8 is pretty much the upper bound. Evictions are not necessarily a bad thing; the hit ratio is what's important here. And if you're using a bare NOW in your filter queries, you're probably never re-using them anyway; see: http://www.lucidimagination.com/blog/2012/02/23/date-math-now-and-filter-queries/ I really question whether this limit is reasonable, but you know your situation best.

Best,
Erick

On Wed, Jun 13, 2012 at 5:40 PM, Pawel Rog pawelro...@gmail.com wrote:

Thanks for your response. Yes, maybe you are right. I thought that filters could be larger than 3M. Do all kinds of filters use a BitSet? Moreover, the maxSize of the filterCache is set to 16000 in my case. There are evictions during day traffic but not during night traffic. The version of Solr I use is 3.5. I haven't used Memory Analyzer yet; could you write more details about it?

-- Regards, Pawel

On Wed, Jun 13, 2012 at 10:55 PM, Erick Erickson erickerick...@gmail.com wrote:

Hmmm, I think you may be looking at the wrong thing here. Generally, a filterCache entry will be maxDocs/8 bytes (plus some overhead), so in your case they really shouldn't be all that large, on the order of 3M per filter. That shouldn't vary based on the number of docs that match the fq; it's just a bitset. To see if that makes any sense, take a look at the admin page and the number of evictions in your filterCache. If that is 0, you're probably already using all the memory you're going to in the filterCache during the day. But you haven't indicated what version of Solr you're using; I'm going from a relatively recent 3.x knowledge base. Have you put a memory analyzer against your Solr instance to see where the memory is being used?

Best,
Erick

On Wed, Jun 13, 2012 at 1:05 PM, Pawel pawelmis...@gmail.com wrote:

Hi, I have a Solr index with about 25M documents. I optimized the FilterCache size to reach the best performance (considering the traffic characteristics that my Solr handles). I see that the only way to limit the size of the filter cache is to set the number of document sets that Solr can cache. There is no way to set a memory limit (e.g., 2GB, 4GB, or something like that). When I process standard traffic (during the day), everything is fine. But when Solr handles the night traffic (and the characteristics of the requests change), some problems appear: there is a JVM out-of-memory error. I know what the reason is. Some filters on some fields are quite poor filters; they return 15M documents or even more. You could say 'just put that into q'. I tried to put those filters into the query part, but then the statistics of request processing time (during the day) became much worse. Reduction of Filter Cache maxSize
SolrCloud and split-brain
Hi,

How exactly does SolrCloud handle split-brain situations?

Imagine a cluster of 10 nodes. Imagine 3 of them being connected to the network by some switch, and imagine the out port of this switch dies. When that happens, these 3 nodes will be disconnected from the other 7 nodes, and we'll have 2 clusters: one with 3 nodes and one with 7 nodes. We'll have a split-brain situation.

Imagine we had 3 ZK nodes in the original 10-node cluster, 2 of which are connected to the dead switch and are thus aware only of the 3-node cluster now, and 1 ZK instance which is on a different switch and is thus aware only of the 7-node cluster.

At this point, how exactly does ZK make SolrCloud immune to split brain? Does LBHttpSolrServer play a key role here? (I see LBHttpSolrServer mentioned only once on http://wiki.apache.org/solr/SolrCloud, and with a question mark next to it.)

Thanks,
Otis
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm
Re: SolrCloud and split-brain
On 6/15/2012 12:49 PM, Otis Gospodnetic wrote:

> Hi, how exactly does SolrCloud handle split-brain situations? Imagine a cluster of 10 nodes. Imagine 3 of them being connected to the network by some switch, and imagine the out port of this switch dies. When that happens, these 3 nodes will be disconnected from the other 7 nodes, and we'll have 2 clusters: one with 3 nodes and one with 7 nodes. We'll have a split-brain situation. Imagine we had 3 ZK nodes in the original 10-node cluster, 2 of which are connected to the dead switch and are thus aware only of the 3-node cluster now, and 1 ZK instance which is on a different switch and is thus aware only of the 7-node cluster. At this point, how exactly does ZK make SolrCloud immune to split brain?

A quorum of N/2+1 nodes (integer division, i.e., a strict majority, so 2 out of 3) is required to operate. That's also the reason you need at least 3 to begin with.
StreamingUpdateSolrServer Connection Timeout Setting
Hi,

Does anybody know what the default connection timeout setting is for StreamingUpdateSolrServer? Can I explicitly set one, and how?

Thanks.
Re: SolrCloud and split-brain
ZooKeeper avoids split brain using Paxos (or something very like it; I can't remember whether they extended it or modified it, and/or what they call it). So you will only ever see one ZooKeeper cluster: the smaller partition will be down. There is a proof for Paxos, if I remember right.

ZooKeeper then acts as the system of record for Solr. Solr won't auto-form its own new little clusters; *the* cluster is modeled in ZooKeeper, and that's the cluster. So Solr does not find itself organizing new mini-clusters on partition splits. When we lose our connection to ZooKeeper, update requests are no longer accepted, because we may have a stale cluster view and not know it for a long period of time.

On Jun 15, 2012, at 12:49 PM, Otis Gospodnetic wrote:

> Hi, how exactly does SolrCloud handle split-brain situations? Imagine a cluster of 10 nodes. Imagine 3 of them being connected to the network by some switch, and imagine the out port of this switch dies. When that happens, these 3 nodes will be disconnected from the other 7 nodes, and we'll have 2 clusters: one with 3 nodes and one with 7 nodes. We'll have a split-brain situation. Imagine we had 3 ZK nodes in the original 10-node cluster, 2 of which are connected to the dead switch and are thus aware only of the 3-node cluster now, and 1 ZK instance which is on a different switch and is thus aware only of the 7-node cluster. At this point, how exactly does ZK make SolrCloud immune to split brain? Does LBHttpSolrServer play a key role here? (I see LBHttpSolrServer mentioned only once on http://wiki.apache.org/solr/SolrCloud, and with a question mark next to it.)
>
> Thanks,
> Otis
> Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

- Mark Miller
lucidimagination.com
Re: SolrCloud and split-brain
Hi,

> ZooKeeper avoids split brain using Paxos (or something very like it; I can't remember whether they extended it or modified it, and/or what they call it). So you will only ever see one ZooKeeper cluster: the smaller partition will be down. There is a proof for Paxos, if I remember right. ZooKeeper then acts as the system of record for Solr. Solr won't auto-form its own new little clusters; *the* cluster is modeled in ZooKeeper, and that's the cluster. So Solr does not find itself organizing new mini-clusters on partition splits. When we lose our connection to ZooKeeper, update requests are no longer accepted, because we may have a stale cluster view and not know it for a long period of time.

Does this work even when outside clients (apps for indexing or searching) send their requests directly to individual nodes? Let's use the example from my email where we end up with 2 groups of nodes: a 7-node group with 2 ZK nodes on the same network and a 3-node group with 1 ZK node on the same network. If a client sends a request to a node in the 7-node group, what happens? And if a client sends a request to a node in the 3-node group, what happens?

Yury wrote:
> A quorum of N/2+1 nodes is required to operate (that's also the reason you need at least 3 to begin with)

N=3 (ZK nodes), right? So in that case we need at least 3/2+1 = 2.5 ZK nodes to operate. So in my example, neither the 7-node group nor the 3-node group will operate (does that mean request rejection or something else?) because neither sees 2.5 ZK nodes?

Thanks,
Otis
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

On Jun 15, 2012, at 12:49 PM, Otis Gospodnetic wrote:

> Hi, how exactly does SolrCloud handle split-brain situations? Imagine a cluster of 10 nodes. Imagine 3 of them being connected to the network by some switch, and imagine the out port of this switch dies. When that happens, these 3 nodes will be disconnected from the other 7 nodes, and we'll have 2 clusters: one with 3 nodes and one with 7 nodes. We'll have a split-brain situation. Imagine we had 3 ZK nodes in the original 10-node cluster, 2 of which are connected to the dead switch and are thus aware only of the 3-node cluster now, and 1 ZK instance which is on a different switch and is thus aware only of the 7-node cluster. At this point, how exactly does ZK make SolrCloud immune to split brain? Does LBHttpSolrServer play a key role here? (I see LBHttpSolrServer mentioned only once on http://wiki.apache.org/solr/SolrCloud, and with a question mark next to it.)
>
> Thanks,
> Otis
> Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

- Mark Miller
lucidimagination.com
Re: SolrCloud and split-brain
On Jun 15, 2012, at 1:44 PM, Otis Gospodnetic wrote:

> Does this work even when outside clients (apps for indexing or searching) send their requests directly to individual nodes? Let's use the example from my email where we end up with 2 groups of nodes: a 7-node group with 2 ZK nodes on the same network and a 3-node group with 1 ZK node on the same network.

The 3-node group with 1 ZK would not have a functioning ZK, so it would stop accepting updates. If it could serve a complete view of the index, though, it would still serve searches.

The 7-node group would have a working ZK it could talk to, and it would continue to accept updates as long as a node for a shard for that hash range is up. It would also, of course, serve searches.

In this case, hitting a box in the 3-node group for searches would start becoming stale. A smart client would no longer hit those boxes, though. If you have a 'dumb' client or load balancer, then yes, you would have to remove the bad nodes from rotation. We could improve this or make the behavior configurable. At least initially, though, we figured it was better if we kept serving searches even when we cannot talk to ZooKeeper.

> If a client sends a request to a node in the 7-node group, what happens? And if a client sends a request to a node in the 3-node group, what happens?

- Mark Miller
lucidimagination.com
Re: StreamingUpdateSolrServer Connection Timeout Setting
The API doc for version 3.6.0 is available here:
http://lucene.apache.org/solr/api/org/apache/solr/client/solrj/impl/StreamingUpdateSolrServer.html

I think the default is coming from your OS if you are not setting it explicitly.

--
 Sami Siren

On Fri, Jun 15, 2012 at 8:22 PM, Kissue Kissue kissue...@gmail.com wrote:

> Hi, does anybody know what the default connection timeout setting is for StreamingUpdateSolrServer? Can I explicitly set one, and how? Thanks.
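[A minimal SolrJ 3.x sketch of setting the timeouts explicitly; since StreamingUpdateSolrServer extends CommonsHttpSolrServer, its HttpClient timeout setters should apply. The URL, queue size, thread count, and timeout values are illustrative:]

    import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;

    public class TimeoutSetup {
        public static void main(String[] args) throws Exception {
            StreamingUpdateSolrServer server =
                new StreamingUpdateSolrServer("http://localhost:8983/solr", 20, 4);
            server.setConnectionTimeout(5000); // ms allowed to establish the TCP connection
            server.setSoTimeout(30000);        // ms to wait on an open socket before timing out
        }
    }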
Re: SolrCloud and split-brain
Ola,

Thanks Mark!

> Does this work even when outside clients (apps for indexing or searching) send their requests directly to individual nodes? Let's use the example from my email where we end up with 2 groups of nodes: a 7-node group with 2 ZK nodes on the same network and a 3-node group with 1 ZK node on the same network.
> The 3-node group with 1 ZK would not have a functioning zk - so it would stop accepting updates. If it could serve a complete view of the index, it would though, for searches.

So in this case, would the information in this 1 ZK node tell the 3 Solr nodes whether they have all the index data or whether some shards are missing (i.e., were only on nodes in the other 7-node group)? And if the nodes figure out they don't have all the index data, will they reject search requests? Or will they accept and perform searches but return responses that tell the client that the searched index was not complete?

> The 7-node group would have a working ZK it could talk to, and it would continue to accept updates as long as a node for a shard for that hash range is up. It would also of course serve searches.

Right, so if the node for the shard where a doc is supposed to go is in that 3-node group, then the indexing request will be rejected. Is this correct?

> In this case, hitting a box in the 3-node group for searches would start becoming stale. A smart client would no longer hit those boxes though. If you have a 'dumb' client or load balancer, then yes - you would have to remove the bad nodes from rotation.

Aha, yes and yes.

> We could improve this or make the behavior configurable. At least initially though, we figured it was better if we kept serving searches even when we cannot talk to zookeeper.

Makes sense. Do responses carry something to alert the client that something is rotten in the state of cluster?

Thanks,
Otis
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm
Re: SolrCloud and split-brain
On Jun 15, 2012, at 2:12 PM, Otis Gospodnetic wrote:

> Makes sense. Do responses carry something to alert the client that something is rotten in the state of cluster?

No, I don't think so - we should probably add that to the header, similar to how I assume partial results will work. Feel free to fire up a JIRA issue for that.

- Mark Miller
lucidimagination.com
Re: SolrCloud and split-brain
Thanks Mark, will open an issue in a bit. But I think the following is the real meat of the question about split brain and SolrCloud, especially when it comes to how indexing is handled during split brain:

> Does this work even when outside clients (apps for indexing or searching) send their requests directly to individual nodes? Let's use the example from my email where we end up with 2 groups of nodes: a 7-node group with 2 ZK nodes on the same network and a 3-node group with 1 ZK node on the same network.
> The 3-node group with 1 ZK would not have a functioning zk - so it would stop accepting updates. If it could serve a complete view of the index, it would though, for searches.

So in this case, would the information in this 1 ZK node tell the 3 Solr nodes whether they have all the index data or whether some shards are missing (i.e., were only on nodes in the other 7-node group)? And if the nodes figure out they don't have all the index data, will they reject search requests? Or will they accept and perform searches but return responses that tell the client that the searched index was not complete?

> The 7-node group would have a working ZK it could talk to, and it would continue to accept updates as long as a node for a shard for that hash range is up. It would also of course serve searches.

Right, so if the node for the shard where a doc is supposed to go is in that 3-node group, then the indexing request will be rejected. Is this correct?

Otis
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

----- Original Message -----
From: Mark Miller markrmil...@gmail.com
To: solr-user solr-user@lucene.apache.org
Cc:
Sent: Friday, June 15, 2012 2:22 PM
Subject: Re: SolrCloud and split-brain

On Jun 15, 2012, at 2:12 PM, Otis Gospodnetic wrote:

> Makes sense. Do responses carry something to alert the client that something is rotten in the state of cluster?

No, I don't think so - we should probably add that to the header, similar to how I assume partial results will work. Feel free to fire up a JIRA issue for that.

- Mark Miller
lucidimagination.com
WordBreak and default dictionary crash Solr
Is this a configuration problem or a bug? We use two dictionaries: default (spellcheckerFreq) and solr.WordBreakSolrSpellChecker. When a query contains 2 misspellings, one corrected by the default dictionary and the other corrected by the wordbreak dictionary (strawberryn shortcake), Solr crashes with the error below. It doesn't matter which dictionary is checked first.

    java.lang.NullPointerException
        at org.apache.solr.handler.component.SpellCheckComponent.toNamedList(SpellCheckComponent.java:566)
        at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:177)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:204)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1555)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:263)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:662)

Multiple errors corrected by the SAME dictionary (either wordbreak or default) do not crash Solr. Here is an excerpt from our solrconfig.xml:

    <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
      <str name="queryAnalyzerFieldType">textSpell</str>
      <lst name="spellchecker">
        <str name="name">wordbreak</str>
        <str name="classname">solr.WordBreakSolrSpellChecker</str>
        <str name="field">spell</str>
        <str name="combineWords">true</str>
        <str name="breakWords">true</str>
        <int name="maxChanges">1</int>
      </lst>
      <lst name="spellchecker">
        <str name="name">default</str>
        <str name="field">spell</str>
        <str name="spellcheckIndexDir">spellcheckerFreq</str>
        <str name="buildOnOptimize">true</str>
      </lst>
    </searchComponent>

    <requestHandler name="/select" class="solr.SearchHandler">
      <lst name="defaults">
        ...
        <str name="spellcheck.dictionary">wordbreak</str>
        <str name="spellcheck.dictionary">default</str>
        <str name="spellcheck.count">3</str>
        <str name="spellcheck.collate">true</str>
        <str name="spellcheck.onlyMorePopular">false</str>
      </lst>
    </requestHandler>
Re: SolrCloud and split-brain
On Jun 15, 2012, at 3:21 PM, Otis Gospodnetic wrote:

> Thanks Mark, will open an issue in a bit. But I think the following is the real meat of the question about split brain and SolrCloud, especially when it comes to how indexing is handled during split brain: Does this work even when outside clients (apps for indexing or searching) send their requests directly to individual nodes? Let's use the example from my email where we end up with 2 groups of nodes: a 7-node group with 2 ZK nodes on the same network and a 3-node group with 1 ZK node on the same network. The 3-node group with 1 ZK would not have a functioning zk - so it would stop accepting updates. If it could serve a complete view of the index, it would though, for searches. So in this case, would the information in this 1 ZK node tell the 3 Solr nodes whether they have all the index data or whether some shards are missing (i.e., were only on nodes in the other 7-node group)? And if the nodes figure out they don't have all the index data, will they reject search requests? Or will they accept and perform searches but return responses that tell the client that the searched index was not complete?

The 1 ZK node will not function, so the 3 Solr nodes will not accept updates. If there is one replica for each shard available, search will still work. I don't think partial results have been committed yet for distrib search. In that case, we will put something in the header to indicate that a full copy of the index was not available. I think we can also add something to the header if we know we cannot talk to ZooKeeper, to let the client know it could be seeing stale state. Smart clients that talk to ZooKeeper would see those nodes appear as down in ZooKeeper and stop trying to talk to them.

> The 7-node group would have a working ZK it could talk to, and it would continue to accept updates as long as a node for a shard for that hash range is up. It would also of course serve searches. Right, so if the node for the shard where a doc is supposed to go is in that 3-node group, then the indexing request will be rejected. Is this correct?

It depends on what is available, but you will need at least one replica for each shard available; e.g., your partition needs to have one copy of the index, otherwise updates are rejected if there are no nodes hosting a shard of that hash range. So if a replica made it into the larger partition, you will be fine; it will become the leader.

> Otis
> Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm
>
> ----- Original Message -----
> From: Mark Miller markrmil...@gmail.com
> To: solr-user solr-user@lucene.apache.org
> Cc:
> Sent: Friday, June 15, 2012 2:22 PM
> Subject: Re: SolrCloud and split-brain
>
> On Jun 15, 2012, at 2:12 PM, Otis Gospodnetic wrote:
>
> Makes sense. Do responses carry something to alert the client that something is rotten in the state of cluster?
>
> No, I don't think so - we should probably add that to the header, similar to how I assume partial results will work. Feel free to fire up a JIRA issue for that.
>
> - Mark Miller
> lucidimagination.com

- Mark Miller
lucidimagination.com
RE: WordBreak and default dictionary crash Solr
Carrie,

Thank you for trying out new features! I'm pretty sure you've found a bug here. Could you tell me whether you're using a build from Trunk or Solr_4x? Also, do you know the svn revision or the Jenkins build # (or timestamp) you're working from?

Could you try using DirectSolrSpellChecker instead of IndexBasedSpellChecker for your default dictionary? (In Trunk and the 4.x branch, the Solr example now uses DirectSolrSpellChecker as its default.) It could be that this is a problem related to using WordBreakSolrSpellChecker with the older IndexBasedSpellChecker, so if you have better luck with DirectSolrSpellChecker, that would be helpful in homing in on the exact problem.

Also, judging from the line that is failing, could it be that you're using a build based on an svn revision before r1346489 (Trunk) or before r1346499 (Branch_4x)? See https://issues.apache.org/jira/browse/SOLR-2993 . Shortly after the initial commit of this feature, a bug similar to the one you're reporting was fixed with those subsequent revisions.

James Dyer
E-Commerce Systems
Ingram Content Group
(615) 213-4311

-----Original Message-----
From: Carrie Coy [mailto:c...@ssww.com]
Sent: Friday, June 15, 2012 2:46 PM
To: solr-user@lucene.apache.org
Subject: WordBreak and default dictionary crash Solr

Is this a configuration problem or a bug? We use two dictionaries: default (spellcheckerFreq) and solr.WordBreakSolrSpellChecker. When a query contains 2 misspellings, one corrected by the default dictionary and the other corrected by the wordbreak dictionary (strawberryn shortcake), Solr crashes with the error below. It doesn't matter which dictionary is checked first.

    java.lang.NullPointerException
        at org.apache.solr.handler.component.SpellCheckComponent.toNamedList(SpellCheckComponent.java:566)
        at org.apache.solr.handler.component.SpellCheckComponent.process(SpellCheckComponent.java:177)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:204)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:240)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1555)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:263)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:662)

Multiple errors corrected by the SAME dictionary (either wordbreak or default) do not crash Solr. Here is an excerpt from our solrconfig.xml:

    <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
      <str name="queryAnalyzerFieldType">textSpell</str>
      <lst name="spellchecker">
        <str name="name">wordbreak</str>
        <str name="classname">solr.WordBreakSolrSpellChecker</str>
        <str name="field">spell</str>
        <str name="combineWords">true</str>
        <str name="breakWords">true</str>
        <int name="maxChanges">1</int>
      </lst>
      <lst name="spellchecker">
        <str name="name">default</str>
        <str name="field">spell</str>
        <str name="spellcheckIndexDir">spellcheckerFreq</str>
        <str name="buildOnOptimize">true</str>
      </lst>
    </searchComponent>

    <requestHandler name="/select" class="solr.SearchHandler">
      <lst name="defaults">
        ...
        <str name="spellcheck.dictionary">wordbreak</str>
        <str name="spellcheck.dictionary">default</str>
        <str name="spellcheck.count">3</str>
        <str name="spellcheck.collate">true</str>
        <str name="spellcheck.onlyMorePopular">false</str>
      </lst>
    </requestHandler>
Re: How to boost a field with another field's value?
Actually, I have a title field that I am searching for my query term, and the documents have a rating field that I want to boost the results by, so that higher-rated items appear before lower-rated documents. I am also boosting results on another field using bq:

    q=summer&df=title&bq=sponsored:true^5.0&qf=rating^2.0&defType=dismax

However, when I use qf to boost the results by rating, Solr tries to match the query in the rating field. How can I accomplish boosting by rating using query-time boosting?
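[For reference: dismax matches the query terms against every field listed in qf, which is why rating^2.0 there triggers term matching on the rating field. A field's value is normally folded into the score via bf (boost function) instead; a hedged sketch of such a request, with illustrative parameter values:]

    q=summer&defType=dismax&qf=title&bq=sponsored:true^5.0&bf=rating^2.0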
Writing index files that have the right owner
I have been putting together an application that uses Quartz to run several indexing jobs in sequence using SolrJ and Tomcat on Windows. I would like the Quartz job to do the following:

1. Delete the index directories from the cores so each indexing job starts fresh with empty indexes to populate.
2. Start the Tomcat server.
3. Run the indexing job.
4. Stop the Tomcat server.
5. Copy the index directories to an archive.

Steps 2-5 work fine, but I haven't been able to find a way to delete the index directories from within Java. I also can't delete them from a Windows command shell window: I get an error message that says "Access is denied." The reason for this is that the index directories and files have the owner BUILTIN\Administrators. Although I am an administrator on this machine, the fact that these files have a different owner means that I can only delete them in a Windows command shell window if I start it with "Run as administrator."

I spent a bunch of time today trying every Java function and Windows shell command I could find that would let me change the owner of these files, grant my user account the capability to delete the files, etc. Nothing I tried worked, likely because along with not having permission to delete the files, I also don't have permission to give myself permission to delete the files.

At a certain point I stopped wondering how to change the files' owner or permissions and started wondering why the files have BUILTIN\Administrators as owner, and the permissions associated with that owner, in the first place. Is there somewhere in the Solr or Tomcat configuration files, or in the SolrJ code, where I can set who the owner of files written to the index directories should be?

Thanks,
Mike
Re: Solr Search Count Variance
The variance is most likely due to the fact that your text field is analyzed differently than the source fields you include in your dismax qf. For example, some of them may be string fields with no analysis. So, fewer of those fields are matching your query terms when using dismax. Look at the results of both queries, and then try querying on the specific fields of a document that is found by the traditional Lucene/Solr query parser but not found using dismax.

-- Jack Krupansky

-----Original Message-----
From: mechravi25
Sent: Friday, June 15, 2012 1:16 AM
To: solr-user@lucene.apache.org
Subject: Solr Search Count Variance

Hi all,

When we give a search request to Solr, the part of the request URL having the search query is as follows:

    /select/?qf=name%5e2.3+text+r_name%5e0.3+id%5e0.3+xid%5e0.3&fl=*&f.tFacet.facet.mincount=1&facet.field=tFacet&f.rFacet.facet.mincount=1&facet.field=rFacet&facet=true&hl.fl=*&hl=true&rows=10&start=0&q=test+Log&debugQuery=on

We find the number of documents returned to be 5000 (approx.). Here, it makes use of the standard handler, and we get the parsed query as follows:

    <str name="parsedquery">(text:Cxx1 text:test) (text:Dyy3 text:Log)</str>
    <str name="parsedquery_toString">(text:Cxx1 text:test) (text:Dyy3 text:Log)</str>

Here, text is the default field; it is used by the standard handler and is the destination field for all the other fields.

In the same way, when we alter the above URL to fetch the results using the dismax handler:

    /select/?qf=name%5e2.3+text+r_name%5e0.3+id%5e0.3+xid%5e0.3&qt=dismax&fl=*&f.tFacet.facet.mincount=1&facet.field=tFacet&f.rFacet.facet.mincount=1&facet.field=rFacet&facet=true&hl.fl=*&hl=true&rows=10&start=0&q=test+Log&debugQuery=on

we find the number of documents found to be 710, and the parsed query is as follows:

    <str name="parsedquery">+((DisjunctionMaxQuery((xid:test^0.3 | id:test^0.3 | ((r_name:Cxx1 r_name:test)^0.3) | (text:Cxx1 text:test) | ((name:Cxx1 name:test)^2.3))) DisjunctionMaxQuery((xid:Log^0.3 | id:Log^0.3 | ((r_name:Dyy3 r_name:Log)^0.3) | (text:Dyy3 text:Log) | ((name:Dyy3 name:Log)^2.3~2) ()</str>
    <str name="parsedquery_toString">+(((xid:test^0.3 | id:test^0.3 | ((r_name:Cxx1 r_name:test)^0.3) | (text:Cxx1 text:test) | ((name:Cxx1 name:test)^2.3)) (xid:Log^0.3 | id:Log^0.3 | ((r_name:Dyy3 r_name:Log)^0.3) | (text:Dyy3 text:Log) | ((name:Dyy3 name:Log)^2.3)))~2) ()</str>

If we give the boosts like dismax in the q parameter for the standard handler, it works fine; i.e., the total number of documents fetched is 710. The query used is as follows:

    q:(name:test^2.3 AND name:Log^2.3) OR (text:test AND text:Log) OR (r_name:test^0.3 AND r_name:Log^0.3) OR (id:test^0.3 AND id:Log^0.3) OR (xid:test^0.3 AND xid:Log^0.3)

I have two doubts here:
1. Why is there a count difference of this extent between the standard and dismax handlers?
2. Does the dismax handler use an AND operation in the phrase query (when we use it with/without quotes)?

Can you please explain the same? Thanks in advance.
Re: SolrCloud and split-brain
Thanks Mark. The reason I asked this is that I saw mentions of SolrCloud being resilient to split brain because it uses ZooKeeper. However, if my half brain understands what split brain is, then I think that's not a completely true claim, because one can get unlucky and get a SolrCloud cluster partitioned in a way where one or even all partitions reject indexing (and update and deletion) requests if they do not have a complete index. In my example of a 10-node cluster that gets split into a 7-node and a 3-node partition, if neither partition ends up containing the full index (i.e., at least one copy of each shard), then neither partition will accept updates.

And here is one more Q:

* Imagine a client is adding documents and, for simplicity, imagine SolrCloud routes all these documents to the same shard; call it S.
* Imagine that both the 7-node and the 3-node partition end up with a complete index and thus both accept updates.
* This means that both the 7-node and the 3-node partition have at least one replica of shard S; let's call them S7 and S3.
* Now imagine the client sending documents for indexing happened to be sending documents to 2 nodes, say in round-robin fashion.
* And imagine that each of these 2 nodes ended up in a different partition.

The client now keeps sending docs to these 2 nodes, and both happily take and index documents into their own copies of S. To the client everything looks normal: all documents are getting indexed. But S7 and S3 are no longer the same; they contain different documents! Problem, no?

What happens when somebody fixes the cluster and all nodes are back in the same 10-node cluster? What happens to S7 and S3? Wouldn't SolrCloud have to implement bi-directional synchronization to fix things and unify S7 and S3? And if there are updates and deletes involved, things get even messier :(

Otis
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

----- Original Message -----
From: Mark Miller markrmil...@gmail.com
To: solr-user solr-user@lucene.apache.org
Cc:
Sent: Friday, June 15, 2012 5:07 PM
Subject: Re: SolrCloud and split-brain

On Jun 15, 2012, at 3:21 PM, Otis Gospodnetic wrote:

> Thanks Mark, will open an issue in a bit. But I think the following is the real meat of the question about split brain and SolrCloud, especially when it comes to how indexing is handled during split brain: Does this work even when outside clients (apps for indexing or searching) send their requests directly to individual nodes? Let's use the example from my email where we end up with 2 groups of nodes: a 7-node group with 2 ZK nodes on the same network and a 3-node group with 1 ZK node on the same network. The 3-node group with 1 ZK would not have a functioning zk - so it would stop accepting updates. If it could serve a complete view of the index, it would though, for searches. So in this case, would the information in this 1 ZK node tell the 3 Solr nodes whether they have all the index data or whether some shards are missing (i.e., were only on nodes in the other 7-node group)? And if the nodes figure out they don't have all the index data, will they reject search requests? Or will they accept and perform searches but return responses that tell the client that the searched index was not complete?

The 1 ZK node will not function, so the 3 Solr nodes will not accept updates. If there is one replica for each shard available, search will still work. I don't think partial results have been committed yet for distrib search. In that case, we will put something in the header to indicate that a full copy of the index was not available. I think we can also add something to the header if we know we cannot talk to ZooKeeper, to let the client know it could be seeing stale state. Smart clients that talk to ZooKeeper would see those nodes appear as down in ZooKeeper and stop trying to talk to them.

> The 7-node group would have a working ZK it could talk to, and it would continue to accept updates as long as a node for a shard for that hash range is up. It would also of course serve searches. Right, so if the node for the shard where a doc is supposed to go is in that 3-node group, then the indexing request will be rejected. Is this correct?

It depends on what is available, but you will need at least one replica for each shard available; e.g., your partition needs to have one copy of the index, otherwise updates are rejected if there are no nodes hosting a shard of that hash range. So if a replica made it into the larger partition, you will be fine; it will become the leader.

> Otis
> Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm
>
> ----- Original Message -----
> From: Mark Miller markrmil...@gmail.com
> To: solr-user solr-user@lucene.apache.org
> Cc:
> Sent: Friday, June