Re: Upgraded to 4.10.3, highlighting performance unusably slow
We ran into this as well on 4.10.3 (not related to an upgrade). It was identified during load testing, when a small percentage of queries would take more than 20 seconds to return. We were able to isolate it by rerunning the same query multiple times: regardless of cache hits, the queries would still take a long time to return. We used this method to narrow the performance problem down to a small number of very large records (many, many fields in a single record).

We fixed it by turning on hl.requireFieldMatch on the query, so that only fields that have an actual hit are passed through the highlighter.

Hopefully this helps,
Jaime Spicciati

On Sat, May 2, 2015 at 8:20 PM, Joel Bernstein wrote:
> Hi,
>
> Can you also include the details of your research that narrowed the issue
> to the highlighter?
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Sat, May 2, 2015 at 5:27 PM, Ryan, Michael F. (LNG-DAY) <
> michael.r...@lexisnexis.com> wrote:
>
> > Are you able to identify if there is a particular part of the code that
> > is slow?
> >
> > A simple way to do this is to use the jstack command (assuming your
> > server has the full JDK installed). You can run it like this:
> > /path/to/java/bin/jstack PID
> >
> > If you run that a bunch of times while your highlight query is running,
> > you might be able to spot the hotspot. Usually I'll do something like
> > this to see the stacktrace for the thread running the query:
> > /path/to/java/bin/jstack PID | grep SearchHandler -B30
> >
> > A few more questions:
> > - What response times are you seeing before and after the upgrade? Is
> > "unusably slow" 1 second, 10 seconds...?
> > - If you run the exact same query multiple times, is it consistently
> > slow, or is it only slow on the first run?
> > - While the query is running, do you see high user CPU on your server,
> > high IO wait, or both? (You can check this with the top command or
> > vmstat command on Linux.)
> >
> > -Michael
> >
> > -----Original Message-----
> > From: Cheng, Sophia Kuen [mailto:sophia_ch...@hms.harvard.edu]
> > Sent: Saturday, May 02, 2015 4:13 PM
> > To: solr-user@lucene.apache.org
> > Subject: Upgraded to 4.10.3, highlighting performance unusably slow
> >
> > Hello,
> >
> > We recently upgraded Solr from 3.8.0 to 4.10.3. We saw that this
> > upgrade caused an incredible slowdown in our searches. We were able to
> > narrow it down to the highlighting. The slowdown is extreme enough that
> > we are holding back our release until we can resolve this. Our research
> > indicated that using term vectors and the FastVectorHighlighter was the
> > way to go; however, this still does nothing for the performance. I
> > think we may be overlooking a crucial configuration, but cannot figure
> > it out. I was hoping for some guidance and help. Sorry for the long
> > email; I wanted to provide enough information.
> >
> > Our documents are largely dynamic fields, and so we have been using '*'
> > as the field for highlighting. This is the same setting we used in
> > prior versions of Solr.
> > The dynamic fields are of type 'text', and we added customizations to
> > the 'text' type in schema.xml:
> >
> > <fieldType name="text" class="solr.TextField"
> >     storeOffsetsWithPositions="true" termVectors="true"
> >     termPositions="true" termOffsets="true">
> >   <analyzer type="index">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.StopFilterFactory"
> >         words="stopwords.txt" enablePositionIncrements="true"/>
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> >         generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> >         catenateAll="0" splitOnCaseChange="1"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SnowballPorterFilterFactory" language="English"
> >         protected="protwords.txt"/>
> >   </analyzer>
> >   <analyzer type="query">
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.StopFilterFactory"
> >         words="stopwords.txt" enablePositionIncrements="true"/>
> >     <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> >         generateNumberParts="1" catenateWords="0" catenateNumbers="0"
> >         catenateAll="0" splitOnCaseChange="1"/>
> >     <filter class="solr.LowerCaseFilterFactory"/>
> >     <filter class="solr.SnowballPorterFilterFactory" language="English"
> >         protected="protwords.txt"/>
> >   </analyzer>
> > </fieldType>
> >
> > One of the two dynamic fields we use:
> >
> > <dynamicField name="..." type="text" indexed="true" stored="true"
> >     required="false" multiValued="true"/>
> >
> > In our solrConfig.xml file, we have:
> >
> > <requestHandler name="..." class="solr.SearchHandler">
> >   <lst name="defaults">
> >     <str name="echoParams">explicit</str>
> >     <int name="...">13</int>
> >     <str name="...">true</str>
> >     <str name="...">true</str>
> >   </lst>
> >   <arr name="last-components">
> >     <str>tvComponent</str>
> > ...
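A minimal SolrJ sketch of the fix described in this thread: hl.requireFieldMatch combined with the FastVectorHighlighter that the termVectors/termPositions/termOffsets settings above enable. The host, core name, and query string are placeholders.

import java.util.List;
import java.util.Map;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class HighlightCheck {
    public static void main(String[] args) throws SolrServerException {
        // Placeholder host/core; point this at your own instance.
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrQuery q = new SolrQuery("some search terms");    // placeholder query
        q.setHighlight(true);
        q.set("hl.fl", "*");                                 // highlight across the dynamic fields
        q.set("hl.requireFieldMatch", "true");               // only fields with a hit reach the highlighter
        q.set("hl.useFastVectorHighlighter", "true");        // needs termVectors/termPositions/termOffsets

        QueryResponse rsp = solr.query(q);
        Map<String, Map<String, List<String>>> snippets = rsp.getHighlighting();
        System.out.println(snippets);
    }
}

With hl.fl=*, hl.requireFieldMatch keeps the highlighter from processing every stored field in the very large records, which is where the 20-second responses in this thread were coming from.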
Re: Java.net.socketexception: broken pipe Solr 4.10.2
We ran into this during our indexing process running on 4.10.3. After increasing ZooKeeper timeouts, client timeouts, and socket timeouts, and implementing retry logic in our loading process, the thing that worked was changing the hard commit timing. We were performing a hard commit every 5 minutes, and after a couple of hours of loading data some of the shards would start going down because they would time out with ZooKeeper and/or close connections. Changing the timeouts just moved the problem later in the ingest process.

Through a combination of decreasing the hard commit interval to 15 seconds and migrating to the G1 garbage collector, we are able to prevent ingest failures. For us, the periodic stop-the-world garbage collections were causing connections to be closed and other nasty things, such as ZooKeeper timeouts that would cause recovery to kick in. (Soft commits are turned off until the full ingest/baseline completes.) I believe that until a hard commit is issued, Solr keeps the data in memory, which explains the nasty garbage collections we were experiencing.

The other change we made which may have helped is that we ensured the socket timeouts were in sync between the Jetty instance running Solr and the SolrJ client loading the data. During some of our batch updates Solr would take a couple of minutes to respond, and I believe in some instances the socket would be closed on the server side (the maxIdleTime setting in Jetty).

Hope this helps,
Jaime Spicciati

On Tue, Apr 14, 2015 at 9:26 AM, vsilgalis wrote:
> Right now index size is about 10GB on each shard (yes, I could use more
> RAM), but I'm looking more for a step-up rather than a step-down
> approach. I will try adding more RAM to these machines as my next step.
>
> 1. Zookeeper is external to these boxes, in a three-node cluster with
> more than enough RAM to keep everything off disk.
>
> 2. OS disk cache: when I add more RAM, I will just add it as RAM for the
> machine and not to the Java heap, unless that is something you recommend.
>
> 3. Java heap looks good so far; GC is minimal as far as I can tell, but
> I can look into this some more.
>
> 4. We do have 2 cores per machine, but the second core is a joke (10MB).
>
> note: zkClientTimeout is set to 30 for safety's sake.
>
> java settings:
>
> -XX:+CMSClassUnloadingEnabled -XX:+AggressiveOpts
> -XX:+ParallelRefProcEnabled -XX:+CMSParallelRemarkEnabled
> -XX:CMSMaxAbortablePrecleanTime=6000 -XX:CMSTriggerPermRatio=80
> -XX:CMSInitiatingOccupancyFraction=50 -XX:+UseCMSInitiatingOccupancyOnly
> -XX:CMSFullGCsBeforeCompaction=1 -XX:PretenureSizeThreshold=64m
> -XX:+CMSScavengeBeforeRemark -XX:ParallelGCThreads=4 -XX:ConcGCThreads=4
> -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:MaxTenuringThreshold=8
> -XX:TargetSurvivorRatio=90 -XX:SurvivorRatio=4 -XX:NewRatio=3
> -XX:-UseSuperWord -Xmx5588m -Xms1596m
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Java-net-socketexception-broken-pipe-Solr-4-10-2-tp4199484p4199561.html
> Sent from the Solr - User mailing list archive at Nabble.com.
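A sketch of the timeout-sync part with the stock SolrJ client; the host and the values are illustrative. The hard-commit side lives in solrconfig.xml (an autoCommit maxTime on the order of 15000 ms, typically with openSearcher=false during a bulk load).

import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class LoaderClient {
    public static void main(String[] args) {
        // Placeholder host/core; values are illustrative.
        HttpSolrServer loader = new HttpSolrServer("http://solr-host:8983/solr/collection1");
        loader.setConnectionTimeout(15000); // ms to establish the connection
        loader.setSoTimeout(120000);        // socket read timeout; keep in sync with Jetty's
                                            // maxIdleTime so the server doesn't close sockets
                                            // the client still considers live
    }
}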
Re: Leading Wildcard Support (ReversedWildcardFilterFactory)
Thanks for the quick response. The index I am currently testing with has the following configuration, which is the default for text_general_rev:

The field type is solr.TextField
maxFractionAsterisk=.33
maxPosAsterisk=3
maxPosQuestion=2
withOriginal=true

Through additional review I think it *might* be working as expected, even though the Analysis tab and the debugQuery parsed query led me to think otherwise. If I look at the explain section of the debugQuery output for a query that actually gets a hit, I see the word(s) come back in reversed order with the "\u0001" prefix character, so the actual hit against the inverted index appears to be correct even though the parsed query doesn't reflect this. Is it safe to say that things are in fact working correctly?

Thanks again

On Thu, Feb 26, 2015 at 3:34 PM, Jack Krupansky wrote:
> Please post your field type... or at least confirm a comparison to the
> example in the javadoc:
>
> http://lucene.apache.org/solr/4_10_3/solr-core/org/apache/solr/analysis/ReversedWildcardFilterFactory.html
>
> -- Jack Krupansky
>
> On Thu, Feb 26, 2015 at 2:38 PM, jaime spicciati <
> jaime.spicci...@gmail.com> wrote:
>
> > All,
> >
> > I am currently using 4.10.3 running SolrCloud.
> >
> > I have configured my index analyzer to leverage
> > solr.ReversedWildcardFilterFactory with various settings for
> > maxFractionAsterisk, maxPosAsterisk, etc. Currently I am running with
> > the defaults (i.e. not configured).
> >
> > Using the Analysis capability in the Solr admin UI, I see the "Field
> > Value (Index)" tokens going in correctly, in both normal and reversed
> > order. However, on the "Field Value (Query)" side it is not generating
> > a reversed token as expected (no matter where I place the * in the
> > leading position of the search term). I also confirmed through the
> > Query capability with debugQuery turned on that the parsed query is
> > not reversed as expected.
> >
> > From my current understanding, you do not need anything configured on
> > the query analyzer to make leading wildcards work as expected with
> > ReversedWildcardFilterFactory. The default query parser will know to
> > look at the index analyzer and leverage the
> > ReversedWildcardFilterFactory configuration if the term contains a
> > leading wildcard. (This is what I have read.)
> >
> > Without uploading my entire configuration to this email, I was hoping
> > someone could point me in the right direction, because I am at a loss
> > at this point.
> >
> > Thanks!
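The same check can be scripted with SolrJ instead of the admin UI; the host, field name, and query below are made up for illustration.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class WildcardDebug {
    public static void main(String[] args) throws SolrServerException {
        HttpSolrServer solr = new HttpSolrServer("http://localhost:8983/solr/collection1");

        SolrQuery q = new SolrQuery("title_rev:*ility"); // hypothetical field of the reversed type
        q.set("debugQuery", "true");

        QueryResponse rsp = solr.query(q);
        // If the reversed path is taken, the explain entries for hits show the
        // reversed form with the \u0001 marker, as observed above.
        System.out.println(rsp.getDebugMap().get("parsedquery"));
        System.out.println(rsp.getDebugMap().get("explain"));
    }
}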
Leading Wildcard Support (ReversedWildcardFilterFactory)
All,

I am currently using 4.10.3 running SolrCloud.

I have configured my index analyzer to leverage solr.ReversedWildcardFilterFactory with various settings for maxFractionAsterisk, maxPosAsterisk, etc. Currently I am running with the defaults (i.e. not configured).

Using the Analysis capability in the Solr admin UI, I see the "Field Value (Index)" tokens going in correctly, in both normal and reversed order. However, on the "Field Value (Query)" side it is not generating a reversed token as expected (no matter where I place the * in the leading position of the search term). I also confirmed through the Query capability with debugQuery turned on that the parsed query is not reversed as expected.

From my current understanding, you do not need anything configured on the query analyzer to make leading wildcards work as expected with ReversedWildcardFilterFactory. The default query parser will know to look at the index analyzer and leverage the ReversedWildcardFilterFactory configuration if the term contains a leading wildcard. (This is what I have read.)

Without uploading my entire configuration to this email, I was hoping someone could point me in the right direction, because I am at a loss at this point.

Thanks!
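A toy illustration of why the reversed tokens make leading wildcards cheap (this is just the idea, not Solr's internal code): reversing the indexed token turns a leading-wildcard pattern into an ordinary prefix match.

public class ReversedWildcardIdea {
    public static void main(String[] args) {
        // "\u0001" stands in for the marker ReversedWildcardFilterFactory
        // prepends to reversed tokens so they can't collide with real terms.
        String term = "capability";
        String reversedToken = "\u0001" + new StringBuilder(term).reverse(); // what gets indexed

        // A leading-wildcard pattern like "*ility", reversed, becomes a prefix:
        String pattern = "*ility";
        String prefix = "\u0001" + new StringBuilder(pattern.substring(1)).reverse(); // "\u0001ytili"

        System.out.println(reversedToken.startsWith(prefix)); // true: a fast prefix lookup
    }
}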
Question about session affinity and SolrCloud
All,

This is my current understanding of how SolrCloud load balancing works: within SolrCloud, for a cluster with more than one shard and at least one replica, the ZooKeeper-aware SolrJ client uses LBHttpSolrServer, which round-robins across the replicas and leaders in the cluster. In turn, the node that performs the distributed query (which can be a leader or a replica) may then go to the leader or a replica of each shard, again round robin via LBHttpSolrServer.

If this is correct, then in a SolrCloud instance with, say, one replica, the initial query from a user may go to the leader of shard 1, and when the user paginates to the second page the subsequent query may go to the replica of shard 1. This seems inefficient from a caching perspective: the queryResultCache and possibly the filterCache would need to be reloaded.

From what I can find, there does not appear to be any option for session affinity within SolrCloud query execution?

Thanks!
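One possible workaround, sketched with SolrJ (this is not a built-in affinity feature, and the hosts and core names are made up): send the request to a fixed node and pin the fan-out with an explicit shards parameter derived from the user's session.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class PinnedQuery {
    public static void main(String[] args) throws SolrServerException {
        // Hypothetical hosts/cores: target a fixed node and name the exact
        // replica cores so the distributed fan-out skips the round-robin choice.
        HttpSolrServer node = new HttpSolrServer("http://solr-a:8983/solr/collection1");

        SolrQuery q = new SolrQuery("*:*");
        q.set("shards",
              "solr-a:8983/solr/collection1_shard1_replica1,"
            + "solr-b:8983/solr/collection1_shard2_replica1");

        System.out.println(node.query(q).getResults().getNumFound());
    }
}

Reusing the same replica list for a session keeps queryResultCache and filterCache hits on the same cores, at the cost of doing the balancing and failure handling yourself.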
SolrCloud multi-datacenter failover?
All,

At my current customer we have developed a custom federator that federates queries between Endeca and Solr, to ease the transition from an extremely large (TBs of data) Endeca index to Solr. (Endeca is similar to Solr in terms of search, faceted navigation, etc.)

During this transition we need to support multi-datacenter failover, which we have historically handled via load balancers with the appropriate failover configurations (think F5). We are currently playing our data loads into multiple datacenters to ensure data consistency. (Each datacenter has a stand-alone SolrCloud instance with its own redundancy/failover.)

I am curious how the community handles multi-datacenter failover at the presentation layer (datacenter A goes down and we want to fail over to B). SolrCloud will handle failures within a single datacenter, but I haven't seen a definitive answer on how to handle failover across datacenters. At this point the only two options I can come up with are:

1) Fail the entire datacenter if SolrCloud goes offline (GUI/index/etc. go offline). This is problematic because some portion of user activity will fail; queries that are in transit will not complete.

2) Implement failover at the custom federator level. In doing so we would need to detect a failure of datacenter A within our federator, query datacenter B to fulfill the user request, and then potentially fail the entire datacenter A once all in-flight transactions against A have been fulfilled.

Since we are looking up the active Solr instance via ZooKeeper (SolrCloud) per datacenter, I don't see any reasonable means of failing over to another datacenter if a given SolrCloud instance goes down. Any thoughts are welcome at this point.

Thanks
Jaime
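For option 2, a bare-bones sketch of what federator-level failover could look like, with one CloudSolrServer per datacenter's ZooKeeper ensemble; the ZooKeeper hosts and collection name are placeholders.

import java.net.MalformedURLException;

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrException;

public class FederatorFailover {
    private final CloudSolrServer dcA;
    private final CloudSolrServer dcB;

    public FederatorFailover() throws MalformedURLException {
        // Each datacenter runs its own ZooKeeper ensemble (placeholder hosts).
        dcA = new CloudSolrServer("zk-a1:2181,zk-a2:2181,zk-a3:2181");
        dcB = new CloudSolrServer("zk-b1:2181,zk-b2:2181,zk-b3:2181");
        dcA.setDefaultCollection("collection1"); // placeholder collection
        dcB.setDefaultCollection("collection1");
    }

    public QueryResponse query(SolrQuery q) throws SolrServerException {
        try {
            return dcA.query(q);              // primary datacenter
        } catch (SolrServerException | SolrException e) {
            return dcB.query(q);              // fall back to datacenter B
        }
    }
}

Queries already in flight against A will still fail when A goes down, so this narrows, but does not eliminate, the window described in option 1.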