Interesting, that did work. Do you or anyone else have any ideas on what I should look at? While soft commit is not a requirement in my project, my understanding is that it should help performance. On the same index, I will be doing a large number of both queries and updates.
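For anyone following along, the upload pattern under discussion (a chunked CSV upload with the soft commit switched off per request) can be sketched roughly as below. This is an illustration only, not the SolrJ code used in this project: the function names, the base URL, and the collection name are hypothetical, and the sketch only builds the payloads and the request URL rather than talking to Solr. It assumes the standard `commit` and `softCommit` request parameters on the update handler.

```python
from urllib.parse import urlencode


def chunk_csv(header, rows, chunk_size):
    """Yield CSV payloads of at most chunk_size rows, each repeating the header."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(rows), chunk_size):
        # Every chunk carries the header line so it can be posted on its own.
        yield "\n".join([header] + rows[start:start + chunk_size]) + "\n"


def update_url(base_url, collection, soft_commit):
    """Build an update URL that leaves hard commits to the server-side autoCommit.

    base_url and collection are placeholders (assumptions for this sketch).
    """
    params = {
        "commit": "false",
        "softCommit": "true" if soft_commit else "false",
    }
    return f"{base_url}/{collection}/update?{urlencode(params)}"


if __name__ == "__main__":
    rows = [f"{i},product-{i}" for i in range(1, 6)]
    url = update_url("http://localhost:8983/solr", "collection3", soft_commit=False)
    for payload in chunk_csv("id,name", rows, chunk_size=2):
        # Each payload would be POSTed to url with a CSV content type.
        print(url, len(payload.splitlines()) - 1, "rows")
```

Changing `chunk_size` here corresponds to the 500, 1000, and 200,000 values mentioned in this thread. Flipping `soft_commit` makes it easy to compare the two configurations in a repeatable test.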
If I have to disable autoCommit, should I increase the chunk size?

Of course, I will have to run a larger-scale test tomorrow, but I saw this problem fairly consistently in my smaller test.

In a previous experiment, I applied the SOLR-4816 patch that someone indicated might help. I also reduced the CSV upload chunk size to 500. Things seemed to get a little better, but the upload still eventually hung.

I also see SOLR-5081, but I don't know whether that is my issue. At least in my test, the index writes are not parallel as in the ticket.

-Kevin

On Tue, Aug 13, 2013 at 8:40 PM, Jason Hellman <jhell...@innoventsolutions.com> wrote:

> While I don't have a past history of this issue to use as reference, if I
> were in your shoes I would consider trying your updates with softCommit
> disabled. My suspicion is you're experiencing some issue with the
> transaction logging and how it's managed when your hard commit occurs.
>
> If you can give that a try and let us know how that fares we might have
> some further input to share.
>
>
> On Aug 13, 2013, at 11:54 AM, Kevin Osborn <kevin.osb...@cbsi.com> wrote:
>
> > I am using Solr Cloud 4.4. It is pretty much a base configuration. We have
> > 2 servers and 3 collections. Collection1 is 1 shard, and Collection2 and
> > Collection3 both have 2 shards. Both servers are identical.
> >
> > So, here is my process: I do a lot of queries on Collection1 and
> > Collection2. I then do a bunch of inserts into Collection3. I am doing CSV
> > uploads. I am also doing custom shard routing. All the products in a single
> > upload will have the same shard key. All Solr interaction is through SolrJ
> > with full Zookeeper awareness. My uploads are also using soft commits.
> >
> > I tried this on a record set of 936 products. Everything worked fine. I
> > then sent over a record set of 300k products. The upload into Collection3
> > is chunked. I tried both 1000 and 200,000 with similar results.
> > The first upload to Solr would just hang. There would simply be no
> > response from Solr. A few of the products from this request would make it
> > into the index, but not many.
> >
> > In this state, queries continued to work, but deletes did not.
> >
> > My only solution was to kill each Solr process.
> >
> > As an experiment, I did the large catalog first. First, I reset everything.
> > With a chunk size of 1000, about 110,000 out of 300,000 records made it
> > into Solr before the process hung. Again, queries worked, but deletes did
> > not and I had to kill Solr. It hung after about 30 seconds. Timing-wise,
> > this is at about the second autocommit cycle, given the default autocommit
> > of 15 seconds. I am not sure if this is related or not.
> >
> > As an additional experiment, I ran the entire test with just a single node
> > in the cluster. This time, everything ran fine.
> >
> > Does anyone have any ideas? Everything is pretty default. These servers are
> > Azure VMs, although I have seen similar behavior running two Solr instances
> > on a single internal server as well.
> >
> > I had also noticed similar behavior before with Solr 4.3. It definitely has
> > something to do with the clustering, but I am not sure what. And I don't
> > see any error message (or really anything else) in the Solr logs.
> >
> > Thanks.
> >
> > --
> > *KEVIN OSBORN*
> > LEAD SOFTWARE ENGINEER
> > CNET Content Solutions
> > OFFICE 949.399.8714
> > CELL 949.310.4677      SKYPE osbornk
> > 5 Park Plaza, Suite 600, Irvine, CA 92614