wildcard matches in EnumField - what do I need to change in code to enable wildcard matches?
Hi all,

In my index, I have an EnumField called "severity". This is its configuration in enumsConfig.xml:

<enum name="severity">
  <value>Not Available</value>
  <value>Low</value>
  <value>Medium</value>
  <value>High</value>
  <value>Critical</value>
</enum>

My index contains documents with these values. When I search for severity:High, I get results. But when I search for severity:H*, I get no results. What do I need to change in the Solr code to enable wildcard matches in EnumField (or any other field)?

Thanks.
Re: Percolator feature
Hi,

There's https://github.com/flaxsearch/luwak, which isn't integrated into Solr yet but could be added as a SearchComponent with a bit of work. It's running off a Lucene fork at the moment, but I cut a 4.8 branch at Berlin Buzzwords which I will push to GitHub later today.

Alan Woodward
www.flax.co.uk

On 28 May 2014, at 21:44, Jorge Luis Betancourt Gonzalez wrote:
> Is there some workaround in the Solr ecosystem to get something similar to the percolator feature offered by Elasticsearch? Greetings!
Solr GeoHash Field (Solr 4.5)
Hi,

I've been reading up a lot on what David has written about geohash fields and would like to use them. I'm trying to create a nice way to display cluster counts of geo points on a Google map. It's naturally not going to be possible to send 40k markers' worth of information over the wire to cluster, so I figured geohash would be perfect. I'm running Solr 4.5. I've seen this: https://github.com/dsmiley/SOLR-2155 -- would this be what I use? It looks like it's really old, and I noticed that there is now a solr.GeoHash core field. However, the documentation at https://wiki.apache.org/solr/SpatialSearchDev says:

"Solr includes the field type solr.GeoHashField but it unfortunately doesn't realize any of the intrinsic properties of the geohash to its advantage. *You shouldn't use it.* Instead, check out http://wiki.apache.org/solr/SpatialSearch#SOLR-2155. The main feature is multi-valued field support."

Does this mean that there isn't any way to use geohash with my version of Solr? Should I just implement a multi-valued field and add all of the multi-value fields myself? (Also, can you confirm that for doing clustering I'm on the right track in using geohash? I don't need anything perfect; I just want to be able to break up the markers into groups.)

Thanks
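For the clustering part, the appeal of geohashes is that truncating a hash to a shorter prefix gives a coarser grid cell, so markers can be grouped and counted by prefix. A self-contained sketch of the standard geohash encoding (this is the generic algorithm, not Solr or SOLR-2155 code):

```java
// Standard geohash encoding: interleave longitude/latitude range-halving
// bits and emit base-32 characters. Truncating the result to a shorter
// prefix yields a larger bounding cell, which is what makes prefix-based
// clustering work.
public class GeoHash {
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    static String encode(double lat, double lon, int precision) {
        double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
        StringBuilder hash = new StringBuilder();
        boolean evenBit = true;  // even bit positions encode longitude
        int bit = 0, idx = 0;
        while (hash.length() < precision) {
            if (evenBit) {
                double mid = (lonMin + lonMax) / 2;
                if (lon >= mid) { idx = idx * 2 + 1; lonMin = mid; }
                else            { idx = idx * 2;     lonMax = mid; }
            } else {
                double mid = (latMin + latMax) / 2;
                if (lat >= mid) { idx = idx * 2 + 1; latMin = mid; }
                else            { idx = idx * 2;     latMax = mid; }
            }
            evenBit = !evenBit;
            if (++bit == 5) {            // 5 bits per base-32 character
                hash.append(BASE32.charAt(idx));
                bit = 0;
                idx = 0;
            }
        }
        return hash.toString();
    }
}
```

Two points whose 5-character hashes share a prefix fall in the same roughly 5 km cell, so counting markers per prefix gives the cluster counts directly.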
search using Ngram.
Hi All,

We are using EdgeNGramFilterFactory for searching with minGramSize=3; as per the business logic, autofill suggestions should appear on entering 3 characters in the search filter. While searching for a contact named "Bill Moor", the value does not get listed when we type 'Bill M', but when we type 'Bill Moo' or 'Bill' it suggests 'Bill Moor'. Clearly, the tokens are not generated when there is a space in between. We cannot set minGramSize=1, as that would generate many tokens and slow performance. Do we have a solution, without using NGram, to generate tokens on entering 3 characters? Please suggest.

Thanks,
--Gurfan

--
View this message in context: http://lucene.472066.n3.nabble.com/search-using-Ngram-tp4138596.html
Sent from the Solr - User mailing list archive at Nabble.com.
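The symptom above follows from how edge n-grams are produced per whitespace token. A minimal illustration (this mimics what EdgeNGramFilterFactory does after whitespace tokenization; it is not Solr code):

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: builds the edge n-grams each whitespace token would
// contribute, to show why the query "Bill M" finds nothing when
// minGramSize=3 -- the trailing token "M" yields no grams at all.
public class EdgeNGramDemo {
    static List<String> edgeNGrams(String text, int minGramSize) {
        List<String> grams = new ArrayList<>();
        for (String token : text.split("\\s+")) {
            for (int len = minGramSize; len <= token.length(); len++) {
                grams.add(token.substring(0, len));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        System.out.println(edgeNGrams("Bill Moor", 3)); // grams stored at index time
        System.out.println(edgeNGrams("Bill M", 3));    // "M" contributes nothing
    }
}
```

Indexing "Bill Moor" stores [Bil, Bill, Moo, Moor]; the typed prefix "Bill M" produces only [Bil, Bill], so there is no gram for the second word to match until a third character is typed.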
SolrCloud distributed indexing
Hi,

How do I achieve distributed indexing in SolrCloud? I have an external ZooKeeper, with two separate machines acting as leaders. In researching further, I found that as of now we are specifying the port in our update call, and if that leader is down, ZooKeeper does not forward the request to the other leader for indexing; instead the call fails. As I understand it, this is because of the port I have specified, but then how do I achieve this requirement? I have tried the following:

http://localhost:/solr/collection1/update?update.processor=distrib&self=localhost:/solr&shards=localhost:8983/solr,localhost:7574/solr,localhost:/solr

but this is not working. Can someone outline the steps or redirect me to proper notes where I can go through the steps?
Re: SolrCloud distributed indexing
If you are using Java to index/query, then use CloudSolrServer, which accepts the ZooKeeper connection string as a constructor parameter, and it will take care of routing requests and failover.

--
Regards,
Shalin Shekhar Mangar.
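For reference, a minimal SolrJ 4.x sketch of this approach. The ZooKeeper host string, collection name, and class name are placeholders; this needs a running SolrCloud cluster to actually do anything:

```java
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.common.SolrInputDocument;

// Sketch only: connecting via the ZooKeeper ensemble (rather than a fixed
// host:port) lets SolrJ discover the cluster state, route each update to
// the correct shard leader, and fail over when a node goes down.
public class CloudIndexer {
    public static void main(String[] args) throws Exception {
        CloudSolrServer server = new CloudSolrServer("zkhost1:2181,zkhost2:2181");
        server.setDefaultCollection("collection1");

        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "1");
        server.add(doc);      // routed to the right leader automatically
        server.commit();
        server.shutdown();
    }
}
```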
RE: Error enquiry- exceeded limit of maxWarmingSearchers=2
Hi,

Thanks for your valuable inputs. Below are my code and config from solrconfig.xml. The index update is successful, but I am not able to see any data from the Solr admin console -- I can only see the data in the Solr admin GUI after a Tomcat restart (Solr is running in Tomcat in my case). What could be the issue? Any help here is highly appreciated.

private void addToSolr(List<SolrInputDocument> c) throws SolrServerException, IOException {
    if (!c.isEmpty()) {
        try {
            solr.add(c);
            logger.info("Commit size after Add=" + c.size());
        } finally {
            // renew lock
        }
    }
}

autoCommit config in solrconfig.xml:

<autoCommit>
  <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
  <maxDocs>1</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
</autoSoftCommit>

A few more questions:
2) If I use solrServer.add(docList, commitWithin), should I do solrServer.commit() also?

Thanks & Regards,
Arjun M

-----Original Message-----
From: ext Shawn Heisey [mailto:s...@elyograg.org]
Sent: Wednesday, May 28, 2014 6:36 PM
To: solr-user@lucene.apache.org
Subject: Re: Error enquiry- exceeded limit of maxWarmingSearchers=2

On 5/28/2014 3:45 AM, M, Arjun (NSN - IN/Bangalore) wrote:
> Also, is there a way to check if autowarming completed, or to make the next commit wait till the previous commit finishes?

With Solr, probably not. There might be a statistic available from an admin handler that I don't know about, but as far as I know, your code must be aware of approximately how long a commit is likely to take, and not send another commit until you can be sure that the previous commit is done. This includes the commitWithin parameter on an update request. Now that I've just said that, you *can* do an all-documents query with rows=0 and look for a change in numFound. An update might actually result in no change to numFound, so you would need to build in a time-based exit to the loop that looks for numFound changes.
In the case of commits done automatically by the configuration (autoCommit and/or autoSoftCommit), there is definitely no way to detect when a previous commit is done. The general recommendation with Solr 4.x is to have autoCommit enabled with openSearcher=false, with a relatively short maxTime -- from 5 minutes down to 15 seconds, depending on indexing rate. These commits will not open a new searcher, and they will not make new documents visible. For commits that affect which documents are visible, you need to determine how long you can possibly stand to go without seeing new data that has been indexed. Once you know that time interval, you can use it to do a manual commit, or you can set up autoSoftCommit with that interval. It is not at all unusual to have an autoCommit time interval that's shorter than autoSoftCommit. This blog post mentions SolrCloud, but it is also applicable to Solr 4.x when NOT running in cloud mode: http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/ Thanks, Shawn
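The numFound-polling idea described above can be sketched as follows. The query itself is abstracted behind a LongSupplier (the class and method names are illustrative, not Solr APIs); in practice you would back it with a rows=0 query for *:* via SolrJ:

```java
import java.util.function.LongSupplier;

// Sketch of the polling approach: watch numFound (supplied here by a
// LongSupplier so the logic is testable in isolation) until it changes
// or a deadline passes. Remember the caveat above: an update may leave
// numFound unchanged, hence the time-based exit.
public class CommitWatcher {
    static boolean waitForNumFoundChange(LongSupplier numFound, long before,
                                         long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            if (numFound.getAsLong() != before) {
                return true;   // a new searcher is serving the updated index
            }
            Thread.sleep(pollMillis);
        }
        return false;          // timed out: the commit may still have happened
    }
}
```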
RE: Email Notification for Sucess/Failure of Import Process.
I am not using DIH to index data; I use post.jar to load .XML files into Solr. I am not sure whether I can still use DIH's importstart and importend?

-----Original Message-----
From: Stefan Matheis [mailto:matheis.ste...@gmail.com]
Sent: Wednesday, May 28, 2014 11:55 AM
To: solr-user@lucene.apache.org
Subject: Re: Email Notification for Sucess/Failure of Import Process.

How about using DIH's EventListeners? http://wiki.apache.org/solr/DataImportHandler#EventListeners

-Stefan

On Wednesday, May 28, 2014 at 5:31 PM, EXTERNAL Taminidi Ravi (ETI, Automotive-Service-Solutions) wrote:
> Hi, I am using the XML file for indexing in Solr. I am planning to make this process more automated: creating the XML file and loading it into Solr. I'd like to get an email once the process is completed. Is there any way this can be achieved in Solr? I am not seeing much on configuring notifications in Solr. Also, I am trying DIH using MS SQL -- can someone help by sharing a data-config.xml, if you are already using one for MSSQL, with a few basic steps? Thanks, Ravi
Re: search using Ngram.
Sounds like you are tokenizing your string when you don't really want to. Either you want all queries to search only against prefixes of the whole value without tokenization, or you need to produce several copyFields with different analysis applied and use dismax to let Solr know which should rank higher. Or you could use the Suggester component or one of the other bolt-on autocomplete components instead. Maybe you should post your current field definition and let us know specifically what you're trying to achieve?

Michael Della Bitta
Applications Developer, appinions inc.
Re: Percolator feature
We've definitely looked at Luwak before... nice to hear it might be brought closer into the Solr ecosystem!
Re: Error enquiry- exceeded limit of maxWarmingSearchers=2
On 5/29/2014 4:18 AM, M, Arjun (NSN - IN/Bangalore) wrote:
> Index update is successful but I am not able to see any data from solr admin console. What could be the issue?

The code snippet does not include a commit. I am not really clear on what a value of -1 does for maxTime here. I suspect that it effectively disables autoSoftCommit. If that's the case, then there is nothing at all in your code or your config that will open a new searcher -- that option is set to false in your autoCommit.

If you want Solr to automatically do commits to make documents visible, I think you should configure a maxTime value for autoSoftCommit, and make it as long as you can possibly stand to not have new documents available. Then you won't have to worry about commits in your code at all.

> 2) If I use solrServer.add(docList, commitWithin), should I do solrServer.commit() also?

No. The commitWithin would do a soft commit for you once that much time has elapsed since indexing started (or the last commit with openSearcher=true), so you would not need to do a commit(). My opinion is that you should not combine manual commits with autoSoftCommit. Depending on exactly what your needs are, you might want to use commitWithin, and have autoSoftCommit as a last guarantee against errors in your indexing process.
Thanks, Shawn
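A solrconfig.xml sketch of the setup described above -- hard commits frequently without opening a searcher, soft commits on the visibility interval you can tolerate (the 60-second value is illustrative, not a recommendation):

```
<autoCommit>
  <maxTime>15000</maxTime>            <!-- hard commit: flushes to disk, truncates tlog -->
  <openSearcher>false</openSearcher>  <!-- does NOT make new documents visible -->
</autoCommit>
<autoSoftCommit>
  <maxTime>60000</maxTime>            <!-- soft commit: opens a searcher, docs become visible -->
</autoSoftCommit>
```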
Re: wildcard matches in EnumField - what do I need to change in code to enable wildcard matches?
On 5/29/2014 12:50 AM, Elran Dvir wrote:
> When I search for severity:High, I get results. But when I search for severity:H*, I get no results. What do I need to change in Solr code to enable wildcard matches in EnumField (or any other field)?

I would suspect that enum fields are not actually stored as text. They are likely stored in the index as an integer, with the Solr schema being the piece that knows what the strings are for each of the numbers. I don't think a wildcard match is possible.

Looking at the code for the EnumFieldValue class (added by SOLR-5084), I do not see any way to match the string value based on a wildcard or substring. If you want to use wildcard matches, you'll need to switch the field to StrField or TextField, and make sure that all of your code is strict about the values that can end up in the field.

Thanks,
Shawn
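A schema.xml sketch of the StrField alternative suggested above (field and type names are illustrative). With a plain string field, severity:H* becomes an ordinary wildcard query -- but note that nothing enforces the fixed value list any more; that discipline moves into your indexing code:

```
<fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
<field name="severity" type="string" indexed="true" stored="true"/>
```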
RE: Error enquiry- exceeded limit of maxWarmingSearchers=2
Thanks Shawn... Just one more question: can both autoCommit and autoSoftCommit be enabled? If both are enabled, which one takes precedence?

Thanks & Regards,
Arjun M
Re: wildcard matches in EnumField - what do I need to change in code to enable wildcard matches?
At a minimum, the doc is too skimpy to say whether this should work or whether it is forbidden. That said, I wouldn't have expected wildcards to be supported for enum fields, since they are really storing small integers. Ditto for regular expressions on enum fields.

See: https://cwiki.apache.org/confluence/display/solr/Working+with+Enum+Fields

-- Jack Krupansky
Re: Offline Indexes Update to Shard
Hi,

On Wed, May 28, 2014 at 4:25 AM, Vineet Mishra clearmido...@gmail.com wrote:
> Has anyone tried building offline indexes with EmbeddedSolrServer and posting them to shards?

What do you mean by "posting it to shards"? How is that different from copying them manually to the right location in the FS? Could you please elaborate?

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

FYI, I am done building the indexes but am looking for a way to post these index files to the shards. Copying the indexes manually to each shard's replica is possible and works fine, but I don't want to go with that approach. Thanks!
Re: Solr High GC issue
Hi Bihan,

That's a lot of parameters, and without trying one can't really give you very specific and good advice. If I had to suggest something quickly, I'd say:
* go back to the basics - remove most of those params and stick with the basic ones. Look at GC and tune slowly by changing/adding params one at a time.
* consider using G1 GC with the most recent Java 7.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

On Thu, May 29, 2014 at 1:36 AM, bihan.chandu bihan.cha...@gmail.com wrote:
> Hi All
>
> I am currently using Solr 3.6.1 and my system handles a lot of requests. Now we are facing a high GC issue in the system. Please find the memory parameters of my Solr system below. Can someone help me identify whether there is any relationship between my memory parameters and the GC issue?
>
> MEM_ARGS=-Xms7936M -Xmx7936M -XX:NewSize=512M -XX:MaxNewSize=512M -Xss1024k -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSParallelRemarkEnabled -XX:+AggressiveOpts -XX:LargePageSizeInBytes=2m -XX:+UseLargePages -XX:MaxTenuringThreshold=15 -XX:-UseAdaptiveSizePolicy -XX:PermSize=256M -XX:MaxPermSize=256M -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGC -Xloggc:${GCLOG} -XX:-OmitStackTraceInFastThrow -XX:+DisableExplicitGC -XX:-BindGCTaskThreadsToCPUs -verbose:gc -XX:StackShadowPages=20
>
> Thanks
> Bihan
Re: Transfer Existing Index to Core with Clean Index
I managed to figure it out! I did a full commit to the index by:

1. Creating an update.xml file with the commands:

<commit/>
<optimize/>

... with the pasted index in the data folder.

2. Running the command from the web browser: host:port/solr/update?commit=true
RE: Solr High GC issue
You will probably also want to get some better visibility into what is going on with your JVM and GC. The easiest way is to enable some GC logging options; the following additional options will give you a good deal of information in the GC logs:

-Xloggc:$JETTY_LOGS/gc.log -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -XX:+PrintTenuringDistribution -XX:+PrintClassHistogram -XX:+PrintHeapAtGC -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime -XX:+PrintPromotionFailure -XX:+PrintAdaptiveSizePolicy -XX:+PrintTLAB -XX:PrintFLSStatistics=1 -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=10m

You may find you have a particular portion of your heap which is undersized. Using G1GC with adaptive sizing is a very handy way to deal with the memory usage in Solr, which can be somewhat difficult to tune optimally using the traditional static ratios (what works well for data ingestion is probably not optimal for searching). Once you have a baseline of logs using your existing JVM sizings and the additional logging options above, you might try switching from CMS to G1GC with adaptive sizing, removing all the static tunings for tenuring and ratios, and compare against a very minimal G1GC config:

-XX:+UseG1GC -XX:+UseAdaptiveSizePolicy -XX:MaxGCPauseMillis=1000

Configuring the JMX interface is another way to get real-time views into what is going on, using the jconsole or jvisualvm tools.
openSearcher, default commit settings
Hi,

1. openSearcher (autoCommit)

According to the Apache Solr Reference Guide, autoCommit/openSearcher is set to false by default:
https://cwiki.apache.org/confluence/display/solr/UpdateHandlers+in+SolrConfig

But on Solr v4.8.1, if openSearcher is omitted from the autoCommit config, new searchers are opened and warmed after auto-commits. Is this behaviour intended, or is the wiki wrong?

2. openSearcher and other default commit settings

From previous posts, I know it's not possible to disable commits completely in the Solr config (without coding). But is there a way to configure the default settings of hard/explicit commits for the update handler? If not, it would make sense to have a configuration mechanism. Currently, a simple commit call seems to be hard-wired with the following options:

commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

There's no server-side option, e.g. to set openSearcher=false as a default or invariant (cf. searchHandler) to prevent new searchers from opening. I have found that at times it is necessary to have better server- or infrastructure-side controls for updates/commits, especially in agile teams. Client/UI developers do not necessarily have complete Solr knowledge, and unintended commits from misbehaving client-side updates may become the norm (e.g. 10 times per minute!).

Regards,
Boon

-
Boon Low
Search Engineer, DCT Family History
RE: autowarming queries
Thanks for looking into this. These are our static queries. We only see one of them getting executed. If it fails to execute the others, shouldn't it show an error in the log?

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <str name="fq">field1:abc</str>
      <str name="fq">-field2:xyz</str>
      <str name="facet">true</str>
      <str name="facet.mincount">1</str>
      <str name="sort">busdate_i desc</str>
      <str name="facet.field">field1</str>
      <str name="facet.field">field2</str>
      <str name="facet.field">field3</str>
      <str name="facet.field">field4</str>
      <str name="facet.field">field5</str>
      <str name="facet.field">field6</str>
      <str name="facet.field">field7</str>
      <str name="facet.field">field8</str>
      <str name="facet.field">field9</str>
      <str name="facet.field">field10</str>
      <str name="facet.field">field11</str>
      <str name="facet.field">field12</str>
      <str name="facet.field">field13</str>
      <str name="facet.field">field14</str>
      <str name="facet.field">field15</str>
      <str name="facet.missing">true</str>
      <str name="facet.sort">index</str>
      <str name="facet.limit">50</str>
      <str name="facet.offset">0</str>
      <str name="stats">true</str>
      <str name="stats.field">field16</str>
      <str name="stats.field">field17</str>
      <str name="stats.field">field18</str>
    </lst>
    <lst>
      <str name="q">*:*</str>
      <str name="fq">field1:abc</str>
      <str name="fq">-field2:xyz</str>
      <str name="facet">true</str>
      <str name="facet.mincount">1</str>
      <str name="sort">busdate_i desc</str>
      <str name="facet.field">field1</str>
      <str name="facet.field">field2</str>
      <str name="facet.field">field3</str>
      <str name="facet.field">field4</str>
      <str name="facet.field">field5</str>
      <str name="facet.field">field6</str>
      <str name="facet.field">field7</str>
      <str name="facet.field">field8</str>
      <str name="facet.field">field9</str>
      <str name="facet.field">field10</str>
      <str name="facet.field">field11</str>
      <str name="facet.field">field12</str>
      <str name="facet.field">field13</str>
      <str name="facet.field">field14</str>
      <str name="facet.field">field15</str>
      <str name="facet.missing">true</str>
      <str name="facet.sort">count</str>
      <str name="facet.limit">50</str>
      <str name="facet.offset">0</str>
      <str name="stats">true</str>
      <str name="stats.field">field16</str>
      <str name="stats.field">field17</str>
      <str name="stats.field">field18</str>
    </lst>
    <lst>
      <str name="indent">on</str>
      <str name="echoHandler">true</str>
      <str name="shards.info">false</str>
      <str name="shards.tolerant">false</str>
      <str name="df">text</str>
      <str name="defType">lucene</str>
      <str name="q">*:*</str>
      <str name="fq">field1:abc</str>
      <str name="fq">-field2:xyz</str>
      <str name="q.op">AND</str>
      <str name="facet">true</str>
      <str name="facet.mincount">1</str>
      <str name="rows">75</str>
      <str name="start">00</str>
      <str name="facet.field">field1</str>
      <str name="field1.facet.missing">true</str>
      <str name="field1.facet.sort">index</str>
      <str name="field1.facet.limit">25</str>
      <str name="field1.facet.offset">0</str>
      <str name="facet.field">field2</str>
      <str name="field2.facet.missing">true</str>
      <str name="field2.facet.sort">count</str>
      <str name="field2.facet.limit">25</str>
      <str name="field2.facet.offset">0</str>
      <str name="facet.field">field3</str>
Safeguards for stray commands from deleting solr data
Hi,

What are the ways to prevent someone from executing random delete commands against Solr? Like:

curl http://solr.com:8983/solr/core/update?commit=true -H "Content-Type: text/xml" --data-binary '<delete><query>*:*</query></delete>'

I understand we can do IP-based access control (change /etc/jetty.xml). Is there anything Solr provides out of the box?

Thanks!
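As far as I know, Solr 4.x itself has no out-of-the-box switch for this, so one workaround is to put a reverse proxy or servlet filter in front of /update and reject delete-by-query payloads that match everything. A minimal sketch of the detection logic (the class and method names are made up for illustration; this is not a Solr API):

```java
// Hypothetical request-body check one might run in a reverse proxy or
// servlet filter in front of /update. It flags XML delete commands whose
// query matches all documents. Illustration only, not Solr code.
public class DeleteGuard {
    static boolean isDeleteAll(String body) {
        // Strip whitespace so "<delete> <query>*:*</query> </delete>" is caught too.
        String compact = body.replaceAll("\\s+", "");
        return compact.contains("<delete><query>*:*</query>");
    }
}
```

A filter using this would return 403 for flagged bodies and pass everything else through; a determined caller can still work around it (e.g. a range query matching all docs), so IP restrictions or container-level auth remain the stronger control.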
Re: Solr High GC issue
Hi All,

Thanks for the suggestions. I will implement these changes and let you know.

Regards,
Bihan
Re: wildcard matches in EnumField - what do I need to change in code to enable wildcard matches?
And I'm not even sure what the actual use case is here. I mean, the values of an enum field must be defined in advance, so if you think a value starts with "H", just eyeball that static list and see that the only predefined value starting with "H" is "High" -- then you can simply replace your * with "igh" and the problem is solved, right? Or is there something (or a lot) more to your use case that you haven't disclosed?

That said, there might be some value in having Solr do the wildcard lookup against the predefined list of values and then search for the matching value. The wildcard or regex could match more than one predefined value, which might actually be nice for selecting a set of enum values in a query: an OR of enum values. But we need to consider the real use case before knowing whether this makes any sense. I can imagine interesting use cases, but my personal imagination is not at issue for this particular thread.

-- Jack Krupansky
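The wildcard-expansion idea suggested above can also be done client-side today: expand the pattern against the known enum values, then query the matching literals as an OR. A sketch (all names are illustrative; this is not a Solr API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Client-side expansion of a wildcard pattern against a known enum value
// list, producing an OR query over the matching literals. Illustration
// only; names are made up, not Solr APIs.
public class EnumWildcard {
    static String expand(String field, String pattern, List<String> enumValues) {
        // Translate simple wildcards (* and ?) into a regex, quoting the rest.
        String regex = ("\\Q" + pattern + "\\E")
                .replace("*", "\\E.*\\Q")
                .replace("?", "\\E.\\Q");
        Pattern p = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        for (String v : enumValues) {
            if (p.matcher(v).matches()) {
                hits.add("\"" + v + "\"");   // quote values that contain spaces
            }
        }
        return hits.isEmpty() ? null : field + ":(" + String.join(" OR ", hits) + ")";
    }
}
```

For the severity field, expand("severity", "H*", values) yields a query on the literal "High", which the existing EnumField handles fine.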
Re: Solr High GC issue
Agreed, that is a LOT of options. First, check the defaults and remove any flags that are setting something to the default. You can see all the flags and their default values with this command:

java -XX:+PrintFlagsFinal -version

For example, the default for ParallelGCThreads is 8, so you do not need to set that.

We set a fairly large new generation, about 1/4 of the heap. 512 MB is way too small. Solr allocates a lot of objects that are only used to handle one HTTP request. You want all of those to fit in the new space, even when there are simultaneous requests. If new is not big enough, they will be allocated in tenured space and will cause more frequent major GCs. For an 8G heap, we use a 2G new size.

wunder

On May 29, 2014, at 7:20 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote:

Hi Bihan, That's a lot of parameters, and without trying, one can't really give you very specific and good advice. If I had to suggest something quickly, I'd say:
* Go back to the basics: remove most of those params and stick with the basic ones. Look at GC and tune slowly by changing/adding params one at a time.
* Consider using G1 GC with the most recent Java 7.

Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr & Elasticsearch Support * http://sematext.com/

On Thu, May 29, 2014 at 1:36 AM, bihan.chandu bihan.cha...@gmail.com wrote:

Hi All, I am currently using Solr 3.6.1 and my system handles a lot of requests. Now we are facing a high GC issue in the system. Please find the memory parameters of my Solr system below. Can someone help me identify whether there is any relationship between my memory parameters and the GC issue?
MEM_ARGS=-Xms7936M -Xmx7936M -XX:NewSize=512M -XX:MaxNewSize=512M -Xss1024k -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=80 -XX:+UseCMSInitiatingOccupancyOnly -XX:+CMSParallelRemarkEnabled -XX:+AggressiveOpts -XX:LargePageSizeInBytes=2m -XX:+UseLargePages -XX:MaxTenuringThreshold=15 -XX:-UseAdaptiveSizePolicy -XX:PermSize=256M -XX:MaxPermSize=256M -XX:SurvivorRatio=4 -XX:TargetSurvivorRatio=90 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGC -Xloggc:${GCLOG} -XX:-OmitStackTraceInFastThrow -XX:+DisableExplicitGC -XX:-BindGCTaskThreadsToCPUs -verbose:gc -XX:StackShadowPages=20 Thanks Bihan
Re: Error enquiry- exceeded limit of maxWarmingSearchers=2
On 5/29/2014 7:52 AM, M, Arjun (NSN - IN/Bangalore) wrote:
Thanks Shawn... Just one more question.. Can both autoCommit and autoSoftCommit be enabled? If both are enabled, which one takes precedence?

Yes, and it's a very common configuration. If you do enable both, you want openSearcher to be false on autoCommit, so that your hard commits are not making documents visible. That is a job for autoSoftCommit.

If you use openSearcher=false on autoCommit, then the question of which one takes precedence actually has no meaning, because the two kinds of commits will be doing different things. Read this until you completely understand it:

http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Thanks, Shawn
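Put concretely, that advice corresponds to a solrconfig.xml fragment along these lines (the interval values are illustrative, not recommendations):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- hard commit: flushes index data to stable storage,
       but does not open a new searcher (no visibility change) -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- soft commit: makes newly indexed documents visible to searches -->
  <autoSoftCommit>
    <maxTime>15000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

With this split, durability and visibility are handled by different commit types, so neither "takes precedence" over the other.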
Re: Solr GeoHash Field (Solr 4.5)
On IRC you said you found out the answers before I came along. For everyone else's benefit:
* Solr's "documentation" is essentially the "Solr Reference Guide". Only look at the wiki as a secondary source.
* See "location_rpt" in the example schema.xml, which supports multi-valued spatial data. It's the evolution of SOLR-2155.
* For clustering, see: http://wiki.apache.org/solr/SpatialClustering

~ David Smiley Freelance Apache Lucene/Solr Search Consultant/Developer http://www.linkedin.com/in/davidwsmiley

On Thu, May 29, 2014 at 4:18 AM, Chris Atkinson chrisa...@gmail.com wrote:

Hi, I've been reading up a lot on what David has written about GeoHash fields and would like to use them. I'm trying to create a nice way to display cluster counts of geo points on a Google map. It's naturally not going to be possible to send 40k markers' worth of information over the wire to cluster... so I figured GeoHash would be perfect. I'm running Solr 4.5. I've seen this: https://github.com/dsmiley/SOLR-2155 Would this be what I use? It looks really old, and I noticed that there is now a solr.GeoHash core field... However, the documentation at https://wiki.apache.org/solr/SpatialSearchDev says: "Solr includes the field type solr.GeoHashField but it unfortunately doesn't realize any of the intrinsic properties of the geohash to its advantage. *You shouldn't use it.* Instead, check out http://wiki.apache.org/solr/SpatialSearch#SOLR-2155. The main feature is multi-valued field support." Does this mean that there isn't any way to use GeoHash with my version of Solr? Should I just implement a multi-valued field and add all of the values myself? (Also, can you confirm that for clustering, I'm on the right track using GeoHash? I don't need anything perfect. I just want to be able to break up the markers into groups.) Thanks
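For reference, the "location_rpt" type in the 4.x example schema looks roughly like this (attribute values vary slightly between 4.x releases, and the field name below is illustrative):

```xml
<fieldType name="location_rpt" class="solr.SpatialRecursivePrefixTreeFieldType"
           geo="true" distErrPct="0.025" maxDistErr="0.000009" units="degrees"/>
<field name="geo" type="location_rpt" indexed="true" stored="true" multiValued="true"/>
```

Under the hood this field type indexes each point as a set of grid-prefix tokens (geohash-style), which is what makes multi-valued spatial data and prefix-based grouping workable.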
Re: openSearcher, default commit settings
On 5/29/2014 9:21 AM, Boon Low wrote:
1. openSearcher (autoCommit). According to the Apache Solr reference, autoCommit/openSearcher is set to false by default. https://cwiki.apache.org/confluence/display/solr/UpdateHandlers+in+SolrConfig But on Solr v4.8.1, if openSearcher is omitted from the autoCommit config, new searchers are opened and warmed after auto-commits. Is this behaviour intended, or is the wiki wrong?

I am reasonably certain that the default for openSearcher, if it is not specified, will always be true. My understanding and your actual experience say that the documentation is wrong.

Additional note: The docs for autoSoftCommit are basically a footnote on autoCommit, which I think is a mistake -- it should have its own section, and the docs should mention that openSearcher does not apply.

I think the code confirms this. From SolrConfig.java:

protected UpdateHandlerInfo loadUpdatehandlerInfo() {
  return new UpdateHandlerInfo(get("updateHandler/@class", null),
      getInt("updateHandler/autoCommit/maxDocs", -1),
      getInt("updateHandler/autoCommit/maxTime", -1),
      getBool("updateHandler/autoCommit/openSearcher", true),
      getInt("updateHandler/commitIntervalLowerBound", -1),
      getInt("updateHandler/autoSoftCommit/maxDocs", -1),
      getInt("updateHandler/autoSoftCommit/maxTime", -1),
      getBool("updateHandler/commitWithin/softCommit", true));
}

2. openSearcher and other default commit settings. From previous posts, I know it's not possible to disable commits completely in Solr config (without coding). But is there a way to configure the default settings of hard/explicit commits for the update handler? If not, it makes sense to have a configuration mechanism. Currently, a simple commit call seems to be hard-wired with the following options:

commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}

There's no server-side option, e.g. to set openSearcher=false as a default or invariant (cf. searchHandler) to prevent new searchers from opening.
I found that at times it is necessary to have better server- or infrastructure-side controls for updates/commits, especially in agile teams. Client/UI developers do not necessarily have complete Solr knowledge. Unintended commits from misbehaving client-side updates may be the norm (e.g. 10 times per minute!).

Since you want to handle commits automatically, you'll want to educate your developers and tell them that they should never send commits -- let Solr handle it. If the code that talks to Solr is Java and uses SolrJ, you might want to consider using forbidden-apis in your project so that a build will fail if the commit method gets used. https://code.google.com/p/forbidden-apis/

Thanks, Shawn
RE: SolrCloud: facet range option f.field.facet.mincount=1 omits buckets on response
Hi all, At the moment I am reviewing the code to determine whether this is a legitimate bug that needs to be filed as a JIRA ticket. Any insight or recommendation is appreciated. Including the replication steps as text:

Solr versions where the issue was replicated:
* 4.5.1 (Linux)
* 4.8.1 (Windows + Cygwin)

Replicating:
1. Created a two-shard environment, no replication (https://cwiki.apache.org/confluence/display/solr/Getting+Started+with+SolrCloud)
a. Download the Solr distribution from http://lucene.apache.org/solr/downloads.html
b. Unzipped solr-4.8.1.zip to a temporary location: SOLR_DIST_HOME
c. Ran once so the SolrCloud jars get unpacked: java -jar start.jar
d. Create nodes
i. cd SOLR_DIST_HOME
ii. Via Windows Explorer, copied example to node1
iii. Via Windows Explorer, copied example to node2
e. Start nodes
i. Start node 1:
cd node1
java -DzkRun -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -jar start.jar
ii. Start node 2:
cd node2
java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
f. Fed sample documents
i. Out of the box:
curl http://localhost:8983/solr/update?commit=true -H "Content-Type: text/xml" -d @mem.xml
curl http://localhost:7574/solr/update?commit=true -H "Content-Type: text/xml" -d @monitor2.xml
ii. Created a copy of mem.xml as mem2.xml; modified identifiers, names, and prices, and fed it:
curl http://localhost:8983/solr/update?commit=true -H "Content-Type: text/xml" -d @mem2.xml

<add>
  <doc>
    <field name="id">COMPANY1</field>
    <field name="name">COMPANY1 Device</field>
    <field name="manu">COMPANY1 Device Mfg</field>
    ...
    <field name="price">190</field>
    ...
  </doc>
  <doc>
    <field name="id">COMPANY2</field>
    <field name="name">COMPANY2 flatscreen</field>
    <field name="manu">COMPANY2 Device Mfg.</field>
    ...
    <field name="price">200.00</field>
    ...
  </doc>
  <doc>
    <field name="id">COMPANY3</field>
    <field name="name">COMPANY3 Laptop</field>
    <field name="manu">COMPANY3 Device Mfg.</field>
    ...
    <field name="price">800.00</field>
    ...
  </doc>
</add>

2.
Query **without** f.price.facet.mincount=1; counts and buckets are OK:

http://localhost:8983/solr/collection1/select?q=*:*&fl=id,price&sort=id+asc&facet=true&facet.range=price&f.price.facet.range.start=0&f.price.facet.range.end=1000&f.price.facet.range.gap=50&f.price.facet.range.other=all&f.price.facet.range.include=upper&spellcheck=false&hl=false

Only six documents have prices:

<lst name="facet_ranges">
  <lst name="price">
    <lst name="counts">
      <int name="0.0">0</int>
      <int name="50.0">1</int>
      <int name="100.0">0</int>
      <int name="150.0">3</int>
      <int name="200.0">0</int>
      <int name="250.0">1</int>
      <int name="300.0">0</int>
      <int name="350.0">0</int>
      <int name="400.0">0</int>
      <int name="450.0">0</int>
      <int name="500.0">0</int>
      <int name="550.0">0</int>
      <int name="600.0">0</int>
      <int name="650.0">0</int>
      <int name="700.0">0</int>
      <int name="750.0">1</int>
      <int name="800.0">0</int>
      <int name="850.0">0</int>
      <int name="900.0">0</int>
      <int name="950.0">0</int>
    </lst>
    <float name="gap">50.0</float>
    <float name="start">0.0</float>
    <float name="end">1000.0</float>
    <int name="before">0</int>
    <int name="after">0</int>
    <int name="between">2</int>
  </lst>
</lst>

Note: the value in <int name="between"> changes with every other refresh of the query.

3. Use of f.price.facet.mincount=1; missing bucket <int name="250.0">1</int>:

http://localhost:8983/solr/collection1/select?q=*:*&fl=id,price&sort=id+asc&facet=true&facet.range=price&f.price.facet.range.start=0&f.price.facet.range.end=1000&f.price.facet.range.gap=50&f.price.facet.range.other=all&f.price.facet.range.include=upper&spellcheck=false&hl=false&f.price.facet.mincount=1

<lst name="facet_ranges">
  <lst name="price">
    <lst name="counts">
      <int name="50.0">1</int>
      <int name="150.0">3</int>
      <int name="750.0">1</int>
    </lst>
    <float name="gap">50.0</float>
    <float name="start">0.0</float>
    <float name="end">1000.0</float>
    <int
RE: Solr High GC issue
bihan.chandu [bihan.cha...@gmail.com] wrote:
I am currently using Solr 3.6.1 and my system handles a lot of requests. Now we are facing a high GC issue in the system.

Maybe it would help to get an idea of what is causing all the allocations?
- How many documents are in your index?
- How many queries/sec?
- How long does a typical query take?
- How many cores does your machine have?
- How many documents are returned per query?
- How much faceting do you perform? How many unique terms per facet field?

- Toke Eskildsen
Re: SolrCloud: facet range option f.field.facet.mincount=1 omits buckets on response
On 5/29/2014 12:06 PM, Ronald Matamoros wrote: Hi all, At the moment I am reviewing the code to determine if this is a legitimate bug that needs to be set as a JIRA ticket. Any insight or recommendation is appreciated. snip Note: the value in int name=between changes with every other refresh of the query. Whenever distributed search results change from one query to the next, it's almost always caused by having documents with the same uniqueKey in more than one shard. Solr is able to remove these duplicates from the results, but there are other aspects of distributed searching that cannot be dealt with when there are duplicate documents. This leads to problems like numFound changing from one request to the next. To avoid these problems with SolrCloud, you'll likely want to create a new collection and set its router to compositeId. This ensures that indexed documents are distributed to shards according to the hash of their uniqueKey, not imported directly into the node where you made the update request. It's possible that my guess here is completely wrong, but this is usually the problem. Thanks, Shawn
PDFStreamEngine returning a NULL pointer error
I am wondering the best way to debug an error I am getting in Solr. The error is below; as far as I can tell, pdfbox cannot read a font and returns a null pointer, which is passed to Tika and then to Solr. Even though it is only a warning, this appears to terminate the indexing, and I get an error that the indexing could not complete. My question is: how do I determine the name and directory of the offending file, and is there a way to configure either Solr or Tika not to terminate the indexing on a null pointer? Or is this a completely different problem? Thanks for any help or advice!

5/23/2014 9:56:09 AM WARN PDFStreamEngine java.lang.NullPointerException 5/23/2014 9:56:09 AM WARN PDFStreamEngine java.lang.NullPointerException 5/23/2014 9:56:09 AM WARN PDFStreamEngine java.io.IOException: Error: Could not find font(COSName{Rx142}) in map={Rx133=org.apache.pdfbox.pdmodel.font.PDTrueTypeFont@f1f3dd, Rx136=org.apache.pdfbox.pdmodel.font.PDTrueTypeFont@c15066, Rx138=org.apache.pdfbox.pdmodel.font.PDTrueTypeFont@1858b31, Rx110=org.apache.pdfbox.pdmodel.font.PDTrueTypeFont@233dfd, Rx02=org.apache.pdfbox.pdmodel.font.PDTrueTypeFont@186de83} 5/23/2014 9:56:09 AM WARN PDFStreamEngine java.lang.NullPointerException 5/23/2014 9:56:09 AM WARN PDFStreamEngine java.lang.NullPointerException 5/23/2014 9:56:09 AM WARN PDFStreamEngine java.lang.NullPointerException 5/23/2014 9:56:09 AM WARN PDFStreamEngine java.lang.NullPointerException 5/23/2014 9:56:09 AM WARN PDFStreamEngine java.lang.NullPointerException 5/23/2014 9:56:09 AM WARN PDFStreamEngine java.io.IOException: Error: Could not find font(COSName{Rx302}) in map={Rx110=org.apache.pdfbox.pdmodel.font.PDTrueTypeFont@233dfd, Rx02=org.apache.pdfbox.pdmodel.font.PDTrueTypeFont@186de83, Rx266=org.apache.pdfbox.pdmodel.font.PDTrueTypeFont@845fc8} 5/23/2014 9:56:09 AM WARN PDFStreamEngine java.lang.NullPointerException 5/23/2014 9:56:09 AM WARN PDFStreamEngine java.lang.NullPointerException 5/23/2014 9:56:09 AM WARN PDFStreamEngine
java.io.IOException: Error: Could not find font(COSName{Rx302}) in map={Rx110=org.apache.pdfbox.pdmodel.font.PDTrueTypeFont@233dfd, Rx02=org.apache.pdfbox.pdmodel.font.PDTrueTypeFont@186de83, Rx266=org.apache.pdfbox.pdmodel.font.PDTrueTypeFont@845fc8} 5/23/2014 9:56:09 AM WARN PDFStreamEngine java.lang.NullPointerException 5/23/2014 9:56:09 AM WARN PDFStreamEngine java.lang.NullPointerException 5/23/2014 9:57:23 AM WARN COSDocument Warning: You did not close a PDF Document -- View this message in context: http://lucene.472066.n3.nabble.com/PDFStreamEngine-returning-a-NULL-pointer-error-tp4138722.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: aliasing for Stats component
Thanks Shalin. I will have a look at this. Currently we are using 4.3.1, so it should not be much trouble to patch it. Regards, Mohit

On Wed, May 28, 2014 at 6:57 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:

Support for keys, tagging and excluding filters in StatsComponent was added with SOLR-3177 in v4.8.0. You can specify e.g. stats.field={!key=xyz}id and the output will use xyz instead of id.

On Wed, May 28, 2014 at 1:55 PM, Mohit Jain mo...@bloomreach.com wrote:

Hi, In a Solr request one can specify aliasing for returned fields using key:fl_name in the fl param. I was looking at the stats component and found that similar support is not available. I do not want to expose internal field names to the external world. The plan is to do it in fl fashion instead of post-processing the response at an external layer. I was wondering whether the exclusion of this feature is by choice, or it just has not been added yet. Thanks, Mohit

-- Regards, Shalin Shekhar Mangar.
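If you'd rather not have clients send the key syntax themselves, the alias can also live in a request handler's defaults in solrconfig.xml. A hypothetical sketch (the handler name and field names are made up for illustration):

```xml
<requestHandler name="/products" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="stats">true</str>
    <!-- expose the internal field "internal_price" to clients under the key "price" -->
    <str name="stats.field">{!key=price}internal_price</str>
  </lst>
</requestHandler>
```

That way the internal field name never needs to appear in client requests at all.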
Solr: IndexNotFoundException: no segments* file HdfsDirectoryFactory
I'm trying to write some integration tests against SolrCloud, for which I'm setting up a Solr instance backed by a ZooKeeper and pointing it to a namenode (all in memory, using the Hadoop testing utilities and JettySolrRunner). I'm getting the following error when I try to create a collection (btw, the exact same configuration works just fine in dev with SolrCloud).

org.apache.lucene.index.IndexNotFoundException: no segments* file found in NRTCachingDirectory(HdfsDirectory@2ea2a4e4 lockFactory=org.apache.solr.store.hdfs.HdfsLockFactory@4cf0e472; maxCacheMB=192.0 maxMergeSizeMB=16.0): files: [HdfsDirectory@6bf4fc1c lockFactory=org.apache.solr.store.hdfs.hdfslockfact...@51115f81-write.lock]

The error occurs during collection creation, precisely when Solr tries to open a searcher on the new index. There are no segment files in the index directory on HDFS, so this error is expected when opening a searcher on the index, but I thought the segments file was created the first time (when a collection is being created). After some debugging, I noticed that the IndexWriter is being initialized explicitly in APPEND mode, overriding the default CREATE_OR_APPEND mode, which means the segments file won't be created unless one already exists. I'm not sure why this is the case, and I may also be going down the wrong path with this error. Again, this only happens in my in-memory SolrCloud setup. Can someone help me with this? Thanks

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-IndexNotFoundException-no-segments-file-HdfsDirectoryFactory-tp4138737.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: overseer queue clogged
We were running Solr 4.2, and are in the process of upgrading. I believe that the particular scenario that was clogging our queue was resolved in 4.7.1 - https://issues.apache.org/jira/browse/SOLR-5811 -- View this message in context: http://lucene.472066.n3.nabble.com/overseer-queue-clogged-tp4047878p4138746.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Error enquiry- exceeded limit of maxWarmingSearchers=2
Hi Shawn, Thanks a lot for your nice explanation. Now I understand the difference between autoCommit and autoSoftCommit. My config now looks like this:

<autoCommit>
  <maxDocs>1</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>15000</maxTime>
</autoSoftCommit>

With this, I am now getting a different error:

org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: version conflict for 140142167803912812800030383128128 expected=1469497192978841608 actual=1469497212082847746

What could be the reason?

Thanks & Regards, Arjun M

-Original Message- From: ext Shawn Heisey [mailto:s...@elyograg.org] Sent: Thursday, May 29, 2014 10:14 PM To: solr-user@lucene.apache.org Subject: Re: Error enquiry- exceeded limit of maxWarmingSearchers=2

On 5/29/2014 7:52 AM, M, Arjun (NSN - IN/Bangalore) wrote:
Thanks Shawn... Just one more question.. Can both autoCommit and autoSoftCommit be enabled? If both are enabled, which one takes precedence?

Yes, and it's a very common configuration. If you do enable both, you want openSearcher to be false on autoCommit, so that your hard commits are not making documents visible. That is a job for autoSoftCommit. If you use openSearcher=false on autoCommit, then the question of which one takes precedence actually has no meaning, because the two kinds of commits will be doing different things. Read this until you completely understand it: http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

Thanks, Shawn
DataImportHandler while Replication
Hi, What would happen to a DataImportHandler that is set up on the master while the slave is in the process of replicating the index? Is there any way to configure the DataImportHandler to not do anything while replication is in progress, and/or to disable replication before the DataImportHandler starts its process? Please share your thoughts. Best, Robin
Re: Wordbreak spellchecker excessive breaking.
James, Thanks for clearly stating this; I was not able to find it documented anywhere. Yes, I am using it with another spellchecker (Direct) with collation on. I will try maxChanges and let you know. On a side note, whenever I change the spellchecker parameters, I need to rebuild the index and delete the Solr data directory first, as my Tomcat instance would not even start. Can you let me know why? Thanks.

On Tue, May 27, 2014 at 12:21 PM, Dyer, James james.d...@ingramcontent.com wrote:

You can do this if you set it up like in the main Solr example:

<lst name="spellchecker">
  <str name="name">wordbreak</str>
  <str name="classname">solr.WordBreakSolrSpellChecker</str>
  <str name="field">name</str>
  <str name="combineWords">true</str>
  <str name="breakWords">true</str>
  <int name="maxChanges">10</int>
</lst>

The combineWords and breakWords flags let you tell it which kind of wordbreak correction you want. maxChanges controls the maximum number of words it can break one word into, or the maximum number of words it can combine. It is reasonable to set this to 1 or 2.

The best way to use this is in conjunction with a regular spellchecker like DirectSolrSpellChecker. When used together with the collation functionality, it should take a query like "mob ile" and, depending on what actually returns results from your data, suggest either "mobile" or perhaps "mob lie", or both. The one thing it cannot do is fix a transposition or misspelling and combine or break words in one shot. That is, it cannot detect that "mob lie" should become "mobile".

James Dyer Ingram Content Group (615) 213-4311

-Original Message- From: S.L [mailto:simpleliving...@gmail.com] Sent: Saturday, May 24, 2014 4:21 PM To: solr-user@lucene.apache.org Subject: Wordbreak spellchecker excessive breaking.
I am using the Solr wordbreak spellchecker, and the issue is that when I search for a term like "mob ile", expecting the wordbreak spellchecker to return a suggestion for "mobile", it breaks the search term into letters like "m o b". I have two issues with this behavior.
1. How can I make Solr combine "mob ile" into "mobile"?
2. Notwithstanding the fact that my search term "mob ile" is being broken incorrectly into individual letters, I realize that the wordbreak is needed in certain cases. How do I control the wordbreak so that it does not break terms into letters like "m o b", which seems like excessive breaking to me?
Thanks.