Re: severe problems with soft and hard commits in a large index
Do you see any (a lot?) of warming searchers on deck, i.e. what value of N in: "PERFORMANCE WARNING: Overlapping onDeckSearchers=N"?

On Wed, May 6, 2015 at 10:58 AM, adfel70 adfe...@gmail.com wrote:

> Hello
> I have a cluster of 16 shards, 3 replicas. the cluster indexes nested documents. it currently has 3 billion documents overall (parent and children). each shard has around 200 million docs. size of each shard is 250GB. this runs on 12 machines. each machine has 4 SSD disks and 4 solr processes. each process has 28GB heap. each machine has 196GB RAM. [...]
> Thank you.

--
Dmitry Kan
Luke Toolbox: http://github.com/DmitryKey/luke
Blog: http://dmitrykan.blogspot.com
Twitter: http://twitter.com/dmitrykan
SemanticAnalyzer: www.semanticanalyzer.info
Re: severe problems with soft and hard commits in a large index
On Wed, 2015-05-06 at 00:58 -0700, adfel70 wrote:

> each shard has around 200 million docs. size of each shard is 250GB. this runs on 12 machines. each machine has 4 SSD disks and 4 solr processes. each process has 28GB heap. each machine has 196GB RAM.
> [...]
> 1. heavy GCs when soft commit is performed (methods 1,2) or when hard commit openSearcher=true is performed. these GCs cause heavy latency (average latency is 3 secs. latency during the problem is 80secs)

Sanity check: Are you sure the pauses are due to garbage collection? You have a fairly large heap and, judging from your previous post "problem with facets - out of memory exception", you are doing non-trivial faceting. Are you using DocValues, as Marc suggested?

- Toke Eskildsen, State and University Library, Denmark
What is the best practice to Backup and delete a core from SOLR Master-Slave architecture
Hi,

I am a newbie to SOLR. I have set up a Master-Slave configuration with SOLR 4.0. I am trying to identify the best way to back up an old core and then delete it, so as to free up space on the disk. I did find the information on how to unload a core and delete the indexes from the core:

  Unload - http://localhost:8983/solr/admin/cores?action=UNLOAD&core=core0
  Delete indexes - http://localhost:8983/solr/admin/cores?action=UNLOAD&core=core0&deleteIndex=true

What is the best approach to remove the old core?

* Approach 1: Unload the core on both the Master and Slave servers AND delete the index only from the Master server (retain the indexes on the Slave server as a backup). If I am retaining the indexes on the Slave server, is there a way to bring those back to the Master server at a later point?
* Approach 2: Unload and delete the indexes from both the Master and Slave servers. Before deleting, take a backup of the data dir of the old core from the file system. I am not sure if this is even possible?

Is there any better way of doing this? Please let me know.

Thanks
Sangeetha
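For Approach 2, one way to take that backup without copying files by hand is the replication handler's backup command, available since Solr 1.4; the core name and backup location below are assumptions for illustration:

  # Ask Solr to snapshot the core's index into a backup directory
  curl "http://localhost:8983/solr/core0/replication?command=backup&location=/backups"

  # A snapshot.<timestamp> directory appears under /backups once the backup
  # completes; after that the core can be unloaded with deleteIndex=true.

Restoring on a 4.0 master is then a matter of copying the snapshot contents back into a core's data/index directory while that core is down.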
Re: SolrCloud collection properties
We currently have many custom properties defined in core.properties which are used in our solrconfig.xml, e.g.

  <str name="enabled">${solr.enable.cachewarming:true}</str>

Now we want to migrate to SolrCloud and want to define these properties per collection. But defining properties when creating a collection just writes them into the core.properties of the created cores. This is a pain, because we have a lot of properties and you have to specify each as a URL parameter. Furthermore, it seems that these properties are not propagated to the cores of new shards if you e.g. split a shard - error-prone.

As you already mentioned, we could resolve these properties ourselves by using many configsets instead of just one. My question was whether it is possible to use just one configset in this case and specify collection-specific properties at the collection level? That seems to me the better way to handle the configuration complexity.

Markus

2015-05-06 3:48 GMT+02:00 Erick Erickson erickerick...@gmail.com:

> _What_ properties? Details matter. And how do you do this now? Assuming you do this with separate conf directories, these are then just configsets in Zookeeper and you can have as many of them as you want. Problem here is that each one of them is a complete set of schema and config files; AFAIK the configset is the finest granularity that you have OOB.
> Best,
> Erick
>
> On Tue, May 5, 2015 at 6:55 AM, Markus Heiden markus.hei...@s24.com wrote:
>> Hi, we are trying to migrate from Solr 4.10 to SolrCloud 4.10. I understood that SolrCloud uses collections as an abstraction over the cores. What I am missing is a possibility to store collection-specific properties in Zookeeper. Using property.foo=bar in CREATE URLs just sets core-specific properties which are not distributed, e.g. if I migrate a shard from one node to another. How do I define collection-specific properties (to be used in solrconfig.xml and schema.xml) which get distributed with the collection to all nodes?
>> Why do I try that? Currently we have different cores whose structure is identical, but each has some specific properties. I would like to have a single configuration for them in Zookeeper, from which I want to create different collections that just differ in the values of some properties.
>> Markus
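For readers following along, the behaviour Markus describes comes from passing property.* parameters on the CREATE call; the names and values below are illustrative only:

  curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycoll&numShards=2&replicationFactor=2&collection.configName=shared_conf&property.solr.enable.cachewarming=false"

Each created core gets solr.enable.cachewarming=false written into its own core.properties. As Erick notes, the configset is the finest collection-level granularity available out of the box, so there is no single place in ZooKeeper for such per-collection values in this Solr version.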
Re: severe problems with soft and hard commits in a large index
1. Yes, I'm sure the pauses are due to GCs. I monitor the cluster and continuously receive metrics from the system and from the Java processes. I see clearly that when a soft commit is triggered, major GCs start occurring (sometimes recurring on the same process) and latency rises. I use the CMS GC and JDK 1.7.75.

2. My previous post was about another use case, but nevertheless I have configured DocValues on the faceted fields.

Toke Eskildsen wrote
> On Wed, 2015-05-06 at 00:58 -0700, adfel70 wrote:
>> each shard has around 200 million docs. size of each shard is 250GB. [...]
> Sanity check: Are you sure the pauses are due to garbage collection? You have a fairly large heap and, judging from your previous post "problem with facets - out of memory exception", you are doing non-trivial faceting. Are you using DocValues, as Marc suggested?
> - Toke Eskildsen, State and University Library, Denmark
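If it helps to line the pauses up with commit events, GC logging can be enabled with standard HotSpot flags for JDK 7 (the log path is an assumption):

  -verbose:gc
  -XX:+PrintGCDetails
  -XX:+PrintGCDateStamps
  -XX:+PrintGCApplicationStoppedTime
  -Xloggc:/var/log/solr/gc.log

Comparing the pause timestamps in gc.log with the commit lines in the Solr log shows directly whether each latency spike coincides with a collection.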
Re: Finding out optimal hash ranges for shard split
Nope, there is no way to find that out without actually doing the split. If you have composite keys then you could also split using the prefix of a composite id via the split.key parameter.

On Wed, May 6, 2015 at 9:32 AM, anand.mahajan an...@zerebral.co.in wrote:

> Looks like it's not possible to find out the optimal hash ranges for a split before you actually split it. So the only way out is to keep splitting the large sub-shards?

--
Regards,
Shalin Shekhar Mangar.
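For reference, a split.key split is a normal SPLITSHARD call with a route-key prefix instead of a shard name; the collection name and key below are made up for illustration:

  curl "http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=collection1&split.key=2013Ford!"

Solr locates the shard owning that prefix's hash range and splits it so that the documents sharing the prefix end up in their own sub-shard.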
Re: Solr 5.0 - uniqueKey case insensitive ?
Ah, I remember seeing this when we first started using Solr (which was 4.0, because we needed SolrCloud). I never got around to filing an issue for it (oops!), but we have a note in our schema to leave the key field a normal string (like Bruno, we had tried to lowercase it, which failed). We didn't really know Solr in those days and hadn't really thought about it since then, but Hoss' and Erick's explanations make perfect sense now! Since shard routing is (basically) done on hashes of the unique key, if I have 2 documents which are the same but have values HELLO and hello, they might well hash to completely different shards, so the update logistics would be horrible.

Bruno, why do you need to lowercase at all then? You said in your example that your client application always supplies pn and it is always uppercase, so presumably all adds/updates could be done directly on that field (as a normal string with no lowercasing). Where does the case insensitivity come in - is it only for searching? If so, couldn't you add a search field (called id) and update your app to search using that (or make that your default search field; I guess it depends whether your calling app explicitly uses the pn field name in its searches)?

On 6 May 2015 at 01:55, Erick Erickson erickerick...@gmail.com wrote:

> Well, "working fine" may be a bit of an overstatement. That has never been officially supported, so it just happened to work in 3.6. As Chris points out, if you're using SolrCloud then this will _not_ work, as routing happens early in the process, i.e. before the analysis chain gets the token, so various copies of the doc will exist on different shards.
> Best,
> Erick
>
> On Mon, May 4, 2015 at 4:19 PM, Bruno Mannina bmann...@free.fr wrote:
>> Hello Chris,
>> yes I confirm that on my SOLR 3.6 it has worked fine for several years, and each doc added with the same code is updated, not added.
>> To be more clear, I receive docs with a field named pn which is the uniqueKey, and it is always uppercase, so I must define in my schema.xml:
>>
>>   <field name="id" type="string" multiValued="false" indexed="true" required="true" stored="true"/>
>>   <field name="pn" type="text_general" multiValued="true" indexed="true" stored="false"/>
>>   ...
>>   <uniqueKey>id</uniqueKey>
>>   ...
>>   <copyField source="id" dest="pn"/>
>>
>> but the application that uses Solr already exists, so it queries the pn field, not id; I cannot change that. And in each doc I receive there is no id field, just a pn field, and I cannot change that either. So there is a problem, no? I must import an id field and query a pn field, but I have only a pn field for import...
>>
>> Le 05/05/2015 01:00, Chris Hostetter a écrit :
>>> : On SOLR3.6, I defined a string_ci field like this:
>>> :
>>> : <fieldType name="string_ci" class="solr.TextField" sortMissingLast="true" omitNorms="true">
>>> :   <analyzer>
>>> :     <tokenizer class="solr.KeywordTokenizerFactory"/>
>>> :     <filter class="solr.LowerCaseFilterFactory"/>
>>> :   </analyzer>
>>> : </fieldType>
>>> :
>>> : <field name="pn" type="string_ci" multiValued="false" indexed="true" required="true" stored="true"/>
>>>
>>> I'm really surprised that field would have worked for you (reliably) as a uniqueKey field even in Solr 3.6. The best practice for something like what you describe has always (going back to Solr 1.x) been to use a copyField to create a case-insensitive copy of your uniqueKey for searching.
>>>
>>> If, for some reason, you really want case-insensitive *updates* (so a doc with id "foo" overwrites a doc with id "FOO") then the only reliable way to make something like that work is to do the lowercasing in an UpdateProcessor, to ensure it happens *before* the docs are distributed to the correct shard, and so the correct existing doc is overwritten (even if you aren't using SolrCloud).
>>>
>>> -Hoss
>>> http://www.lucidworks.com/
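A sketch of the UpdateProcessor approach Hoss describes, using StatelessScriptUpdateProcessorFactory (available since Solr 4.0); the chain name, script file name, and the assumption that the uniqueKey is called id are mine, not from the thread.

In solrconfig.xml:

  <updateRequestProcessorChain name="lowercase-id">
    <processor class="solr.StatelessScriptUpdateProcessorFactory">
      <str name="script">lowercase-id.js</str>
    </processor>
    <processor class="solr.LogUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>

In conf/lowercase-id.js:

  // Lowercase the uniqueKey before routing, so "FOO" and "foo" overwrite each other
  function processAdd(cmd) {
    var doc = cmd.solrDoc;
    var id = doc.getFieldValue("id");
    if (id != null) {
      doc.setField("id", id.toString().toLowerCase());
    }
  }
  // The script processor expects these handlers to exist; no-ops here
  function processDelete(cmd) {}
  function processMergeIndexes(cmd) {}
  function processCommit(cmd) {}
  function processRollback(cmd) {}
  function finish() {}

Because the script runs before the distributed update processor, the lowercased id is what gets hashed for routing, which is exactly the property Hoss says is required.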
severe problems with soft and hard commits in a large index
Hello

I have a cluster of 16 shards, 3 replicas. The cluster indexes nested documents. It currently has 3 billion documents overall (parent and children). Each shard has around 200 million docs, and the size of each shard is 250GB. This runs on 12 machines; each machine has 4 SSD disks and 4 Solr processes. Each process has a 28GB heap. Each machine has 196GB RAM.

I perform periodic indexing throughout the day. Each indexing cycle adds around 1.5 million docs. I keep the indexing load light - 2 processes with bulks of 20 docs.

My use case demands that each indexing cycle become visible only when the whole cycle finishes. I tried various methods of using soft and hard commits:

1. Auto hard commit with time=10secs (openSearcher=false) and an explicit soft commit when the indexing finishes.
2. Auto soft commit with time=10/30/60secs during the indexing.
3. No soft commit at all, just auto hard commit with time=10secs during the indexing (openSearcher=false) and an explicit hard commit with openSearcher=true when the cycle finishes.

With all methods I encounter pretty much the same problem:

1. Heavy GCs when a soft commit is performed (methods 1, 2) or when a hard commit with openSearcher=true is performed. These GCs cause heavy latency (average latency is 3 secs; latency during the problem is 80 secs).
2. If indexing cycles come too often, so that soft commits or hard commits (openSearcher=true) occur at small intervals one after another (around 5-10 minutes), I start getting many OOM exceptions.

Thank you.
New core on Solr Cloud
Hi. This is my first experience with Solr Cloud. I installed three Solr nodes with three ZooKeeper instances, and they seem to have started well. Now I have to create a new replicated core, and I'm trying to find out how to do it. I found many examples about how to create shards and cores, but I have to create one core with only one shard, replicated on all three nodes (so basically I want to have the same data on all three nodes). Could you help me understand the correct way to do this, please? Thank you very much! Bye
Solr not getting Start. Error : Could not find the main class: org.apache.solr.util.SolrCLI
Hello,

When I start solr-5.1.0 on Ubuntu 12.04 with

  /bin/var/www/solr-5.0.0/bin ./solr start

Solr starts and shows:

  Started Solr server on port 8983 (pid=14457). Happy searching!

But when I open http://localhost:8983/solr/ it does not come up. Then I checked the status with

  /bin/var/www/solr-5.0.0/bin ./solr status

and at the end I got this error:

  Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/solr/util/SolrCLI : Unsupported major.minor version 51.0
          at java.lang.ClassLoader.defineClass1(Native Method)
          at java.lang.ClassLoader.defineClass(ClassLoader.java:643)
          at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
          at java.net.URLClassLoader.defineClass(URLClassLoader.java:277)
          at java.net.URLClassLoader.access$000(URLClassLoader.java:73)
          at java.net.URLClassLoader$1.run(URLClassLoader.java:212)
          at java.security.AccessController.doPrivileged(Native Method)
          at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
          at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
          at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
          at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
  Could not find the main class: org.apache.solr.util.SolrCLI. Program will exit.

  please visit http://localhost:8983/solr

The same thing happens when starting Solr in SolrCloud mode. Please help me with this.

--
Thanks & Regards,
Mayur Champaneria
PHP Developer ( MMT )
Vertex Softwares
Re: Finding out optimal hash ranges for shard split
Okay - thanks for the confirmation, Shalin. Could this be a feature request for the Collections API - a "split shard dry run" API that accepts a sub-shard count as a request param and returns the optimal shard ranges for the number of sub-shards requested, along with the respective document counts for each of the sub-shards? Users could then use those shard ranges for the actual split.
Re: New core on Solr Cloud
Ok, I found out that on Solr 5.1 the creation of a new core/collection is done with the bin/solr script. So I created a new collection with this command:

  ./solr create_collection -c test -replicationFactor 3

Is this the correct way? Thank you very much, Bye!

2015-05-06 10:02 GMT+02:00 shacky shack...@gmail.com:

> Hi. This is my first experience with Solr Cloud. I installed three Solr nodes with three ZooKeeper instances and they seemed to start well. [...] Could you help me to understand what is the correct way to make this, please? Thank you very much! Bye
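That is the intended way on 5.x. One way to double-check the resulting layout (assuming the default port) is the Collections API:

  curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=test&wt=json"

The response should show a single shard1 with three replicas, one per node, each becoming active once recovery finishes.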
ZooKeeperException: Could not find configName for collection
Hi list.

I created a new collection on my new SolrCloud installation. The new collection is shown and replicated on all three nodes, but on the first node (and only on that one) I get this error:

  new_core: org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection new_core found:null

I cannot see any core named new_core on that node, and I also tried to remove it:

  root@index1:/opt/solr# ./bin/solr delete -c new_core
  Connecting to ZooKeeper at zk1,zk2,zk3
  ERROR: Collection new_core not found!

Could you help me, please? Thank you very much! Bye
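That error generally means ZooKeeper has a collection (or a leftover core registration) with no configset linked to it. One possible fix is to link a configset explicitly with the zkcli script that ships with Solr 5.x; the paths, ZK hosts, and configset name below are assumptions:

  ./server/scripts/cloud-scripts/zkcli.sh -zkhost zk1:2181,zk2:2181,zk3:2181 \
    -cmd linkconfig -collection new_core -confname my_config

After linking a configset (or removing the stale /collections/new_core node from ZooKeeper), restart or reload the affected node.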
Re: Finding out optimal hash ranges for shard split
Hi Anand,

The nature of the hash function (murmur3) should lead to an approximately uniform distribution of documents across sub-shards. Have you investigated why, if at all, the sub-shards are not balanced? Do you use composite keys, e.g. abc!id1, which cause the imbalance?

I don't think there is a (cheap) way to implement what you are asking in the current scheme of things, because unless we go through each id and calculate the hash, we have no way of knowing the optimal distribution. However, if we were to store the hash of the key as a separate field in the index, then it should be possible to binary-search for hash ranges which lead to an approximately equal distribution of docs in sub-shards. We could implement something like that inside Solr.

On Wed, May 6, 2015 at 4:42 PM, anand.mahajan an...@zerebral.co.in wrote:

> Okay - Thanks for the confirmation Shalin. Could this be a feature request in the Collections API - that we have a split shard dry run API that accepts a sub-shards count as a request param and returns the optimal shard ranges [...] The users can then use these shard ranges for the actual split?

--
Regards,
Shalin Shekhar Mangar.
Re: Solr not getting Start. Error : Could not find the main class: org.apache.solr.util.SolrCLI
UnsupportedClassVersionError means you have an old JDK. Use a more recent one.

Markus

2015-05-06 12:59 GMT+02:00 Mayur Champaneria ma...@matchmytalent.com:

> Hello,
> When I start solr-5.1.0 on Ubuntu 12.04, Solr starts and shows "Started Solr server on port 8983 (pid=14457). Happy searching!", but http://localhost:8983/solr/ does not come up, and ./solr status ends with:
>
>   Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/solr/util/SolrCLI : Unsupported major.minor version 51.0
>   [...]
>   Could not find the main class: org.apache.solr.util.SolrCLI. Program will exit.
>
> The same thing happens when starting Solr in SolrCloud mode. Please help me with this.
>
> --
> Thanks & Regards,
> Mayur Champaneria
> PHP Developer ( MMT )
> Vertex Softwares
Will field type change require complete re-index?
Hi,

I have been using Solr for some time now, and by mistake I used String for my date fields. The format of the string is like this: 2015-05-05T13:24:10Z

Now, if I need to change the field type from String to date, will this require a complete reindex?

Vishal Sharma
Team Leader, SFDC
T: +1 302 753 5105
E: vish...@grazitti.com
www.grazitti.com
Solr with logstash solr_http output plugin and geoip filter
Hi,

I'm currently using Solr to index a moderate amount of information with the help of logstash and the solr_http contrib output plugin. Solr is receiving documents, I've got banana as a web interface, and I am running it with a schemaless core. I'm feeding documents via the contrib plugin solr_http and logstash. One of the filters I'm using is geoip, with the following setup:

  geoip {
    source => "subject_ip"
    database => "/opt/logstash/vendor/geoip/GeoLiteCity.dat"
    target => "geoip"
    fields => ["latitude", "longitude"]
  }

However this created a string field called geoip with the value:

  {latitude=2.0, longitude=13.0, location=[2.0, 13.0]}

This is meant to become three sub-fields:

  geoip.latitude = 2.0
  geoip.longitude = 13.0
  geoip.location = 2.0, 13.0

The above setup worked with logstash feeding into elasticsearch, resulting in geoip.location being populated correctly as a field itself. Given that it did work with ES, I assume the first issue is that Solr either does not know how to parse a value into additional fields with values, OR I simply have not configured Solr correctly (I'm betting on the latter).

I have only been using Solr for about 8 hours (installed today); I had to try something, as no amount of tweaking would resolve the indexing performance issues I had with ES - I'm now indexing the same amount of data into Solr at near real-time on the exact same machine that was running ES, where indexing would stop after about 2 hours.

The whole point of the geoip field is to get geoip.location, which will be the location field used by bettermap on the banana web interface. I am not running SiLK. I am running Solr 5.1 and logstash 1.4.

Regards,
Daniel
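An untested sketch of one possible workaround: since solr_http sends an event's top-level fields as flat document fields, the nested geoip values could be promoted to top-level fields before the output stage. In logstash 1.4 config syntax (the field names match the setup above; whether dotted field names survive the solr_http plugin is an open question):

  filter {
    mutate {
      # Promote nested geoip values to flat fields that reach Solr individually
      rename => [ "[geoip][latitude]",  "geoip.latitude",
                  "[geoip][longitude]", "geoip.longitude",
                  "[geoip][location]",  "geoip.location" ]
    }
  }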
Re: Multiple index.timestamp directories using up disk space
We use the following merge settings on SSDs, running on physical machines with a Linux OS:

  <mergeFactor>10</mergeFactor>
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy"/>
  <mergeScheduler class="org.apache.lucene.index.ConcurrentMergeScheduler">
    <int name="maxThreadCount">3</int>
    <int name="maxMergeCount">15</int>
  </mergeScheduler>
  <ramBufferSizeMB>64</ramBufferSizeMB>

Not sure if it's very aggressive, but it's something we keep to prevent deleted documents taking up too much space in our index.

Is there some error message that Solr logs when the rename and deletion of the directories fails? If so, we could monitor our logs to get a better idea of the root cause. At present we can only react when things go wrong, based on disk-space alarms.

Thanks,
Rishi.
Re: severe problems with soft and hard commits in a large index
Thank you for the detailed answer.

How can I decrease the impact of opening a searcher in such a large index? Especially the impact of the heap usage that causes OOM.

Regarding GC tuning - I am doing that. Here are the params I use:

  -XX:+AggressiveOpts
  -XX:+UseLargePages
  -XX:+ParallelRefProcEnabled
  -XX:+CMSParallelRemarkEnabled
  -XX:CMSMaxAbortablePrecleanTime=6000
  -XX:CMSTriggerPermRatio=80
  -XX:CMSInitiatingOccupancyFraction=70
  -XX:+UseCMSInitiatingOccupancyOnly
  -XX:CMSFullGCsBeforeCompaction=1
  -XX:PretenureSizeThreshold=64m
  -XX:+CMSScavengeBeforeRemark
  -XX:+UseConcMarkSweepGC
  -XX:MaxTenuringThreshold=8
  -XX:TargetSurvivorRatio=90
  -XX:SurvivorRatio=4
  -XX:NewRatio=2
  -Xms16g
  -Xmx28g

Any input on this?

How many documents per shard are recommended? Note that I use nested documents. The total collection size is 3 billion docs; the number of parent docs is 600 million, the rest are children.

Shawn Heisey-2 wrote
> On 5/6/2015 1:58 AM, adfel70 wrote:
>> I have a cluster of 16 shards, 3 replicas. the cluster indexes nested documents. it currently has 3 billion documents overall (parent and children). [...]
> I personally would configure autoCommit on a five minute (maxTime of 300000) interval with openSearcher=false. [...]
> You should use the CMS collector for your GC tuning, not G1. If you can reduce the number of documents in each shard, G1 might work well.
> Thanks,
> Shawn
Re: severe problems with soft and hard commits in a large index
I don't see any of these. I've seen them before in other clusters and uses of Solr, but I don't see any of these messages here.

Dmitry Kan-2 wrote
> Do you see any (a lot?) of warming searchers on deck, i.e. the value of N in: PERFORMANCE WARNING: Overlapping onDeckSearchers=N
>
> On Wed, May 6, 2015 at 10:58 AM, adfel70 <adfel70@...> wrote:
>> Hello
>> I have a cluster of 16 shards, 3 replicas. the cluster indexes nested documents. it currently has 3 billion documents overall (parent and children). [...]
>> Thank you.
>
> --
> Dmitry Kan
> Luke Toolbox: http://github.com/DmitryKey/luke
> Blog: http://dmitrykan.blogspot.com
> Twitter: http://twitter.com/dmitrykan
> SemanticAnalyzer: www.semanticanalyzer.info
getting frequent CorruptIndexException and inconsistent data though core is active
Hi,

I'm getting:

  org.apache.lucene.index.CorruptIndexException: liveDocs.count()=2000699 info.docCount()=2047904 info.getDelCount()=47207 (filename=_ney_1g.del)

This just happened for the 4th time in 2 weeks, each time in a different core. It usually happens when a replica tries to recover: it reports that it succeeded, and then the CorruptIndexException is thrown while trying to open a searcher. The core is marked as active, so queries can get redirected there, and this causes data inconsistency for users. This occurs with Solr 4.10.3; it should be noted that I use nested docs.

ANOTHER problem is that replicas can end up with inconsistent numbers of docs with no exception being reported. This usually occurs when one of the replicas goes down during indexing. What I end up with is the leader being on an older version than the replicas, or having fewer docs than the replicas. Switching leaders (stopping the leader so that another replica becomes the leader) does not fix the problem. This occurs both in Solr 4.10.3 and in Solr 4.8.
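Not mentioned in the thread, but the standard offline diagnostic for this kind of error is Lucene's CheckIndex tool, run while the node is stopped; the jar version and index path below are assumptions:

  # Report which segments are broken; run only against a stopped core
  java -cp lucene-core-4.10.3.jar org.apache.lucene.index.CheckIndex /var/solr/collection1/data/index

  # Adding -fix drops unreadable segments, losing the documents inside them

On a replicated collection it is usually safer to delete the bad replica's data directory and let it re-sync from the leader than to run -fix in place.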
Re: /suggest through SolrJ?
Exactly Tommaso, I was referring to that! I wrote another mail on the dev mailing list; I will open a Jira issue for it!
Cheers

2015-04-29 12:16 GMT+01:00 Tommaso Teofili tommaso.teof...@gmail.com:

> 2015-04-27 19:22 GMT+02:00 Alessandro Benedetti benedetti.ale...@gmail.com:
>> Just had the very same problem, and I confirm that it is currently quite a mess to manage suggestions in SolrJ!
>
> I have to go with manual JSON parsing, or a very not nice NamedList API mess (see an example in JR Oak [1][2]).
> Regards,
> Tommaso
> p.s.: note that this applies to the Solr 4.7.1 API, but reading the thread it seems the problem is still there.
> [1]: https://github.com/apache/jackrabbit-oak/blob/trunk/oak-solr-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/solr/query/SolrQueryIndex.java#L318
> [2]: https://github.com/apache/jackrabbit-oak/blob/trunk/oak-solr-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/solr/query/SolrQueryIndex.java#L370
>
>> 2015-02-02 12:17 GMT+00:00 Jan Høydahl jan@cominvent.com:
>>> Using the /suggest handler wired to SuggestComponent, the SpellCheckResponse objects are not populated. The reason is that QueryResponse looks for a top-level element named "spellcheck":
>>>
>>>   else if ( "spellcheck".equals( n ) )  {
>>>     _spellInfo = (NamedList<Object>) res.getVal( i );
>>>     extractSpellCheckInfo( _spellInfo );
>>>   }
>>>
>>> Earlier the suggester was the same as the Spell component, but now, with its own component, suggestions are put in "suggest". I think we're lacking a SuggestResponse.java for parsing suggest responses..??
>>>
>>> --
>>> Jan Høydahl, search solution architect
>>> Cominvent AS - www.cominvent.com
>>>
>>> 26. sep. 2014 kl. 07.27 skrev Clemens Wyss DEV clemens...@mysign.ch:
>>>> Thx to you two. Just in case anybody else is trying to do this: the following SolrJ code corresponds to the HTTP request GET http://localhost:8983/solr/solrpedia/suggest?q=atmo of "Solr in Action" (chapter 10):
>>>>
>>>>   SolrServer server = new HttpSolrServer( "http://localhost:8983/solr/solrpedia" );
>>>>   SolrQuery query = new SolrQuery( "atmo" );
>>>>   query.setRequestHandler( "/suggest" );
>>>>   QueryResponse queryresponse = server.query( query );
>>>>   ...
>>>>   queryresponse.getSpellCheckResponse().getSuggestions();
>>>>
>>>> Original message - From: Shawn Heisey [mailto:s...@elyograg.org] - Sent: Thursday, 25 September 2014 17:37 - To: solr-user@lucene.apache.org - Subject: Re: /suggest through SolrJ?
>>>>
>>>>> On 9/25/2014 8:43 AM, Erick Erickson wrote:
>>>>>> You can call anything from SolrJ that you can call from a URL. SolrJ has lots of convenience stuff to set particular parameters, parse the response, etc. But in the end it's communicating with Solr via a URL. Take a look at something like SolrQuery for instance. It has a nice command setFacetPrefix. Here's the entire method:
>>>>>>
>>>>>>   public SolrQuery setFacetPrefix( String field, String prefix )
>>>>>>   {
>>>>>>     this.set( FacetParams.FACET_PREFIX, prefix );
>>>>>>     return this;
>>>>>>   }
>>>>>>
>>>>>> which is really this.set( "facet.prefix", prefix ); All it's really doing is setting a SolrParams key/value pair, which is equivalent to facet.prefix=blahblah on a URL. As I remember, there's a setPath method that you can use to set the destination for the request to suggest (or maybe /suggest). It's something like that.
>>>>>
>>>>> Yes, like Erick says, just use SolrQuery for most accesses to Solr on arbitrary URL paths with arbitrary URL parameters. The set method is how you include those parameters. The SolrQuery method Erick was talking about at the end of his email is setRequestHandler(String), and you would set that to "/suggest". Full disclosure about what this method actually does: it also sets the qt parameter, but with the modern example Solr config the qt parameter doesn't do anything -- you must actually change the URL path on the request, which this method will do if the value starts with a forward slash.
>>>>> Thanks,
>>>>> Shawn

--
Benedetti Alessandro
Visiting card: http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience - 1794 England
Re: Solr not getting Start. Error : Could not find the main class: org.apache.solr.util.SolrCLI
On 5/6/2015 6:37 AM, Markus Heiden wrote:
> UnsupportedClassVersionError means you have an old JDK. Use a more recent one.

Specifically, "Unsupported major.minor version 51.0" means you are trying to use Java 6 (1.6.0) to run a program that requires Java 7 (1.7.0). Solr 4.8 and later (including the 5.x versions) requires Java 7. If you're looking for the absolute minimum requirements, you only need the JRE, not the JDK.

Thanks,
Shawn
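A quick way to check and fix this from the shell; the JDK path below is an example for Ubuntu, and any Java 7 JRE works:

  # Shows the JVM the solr script will pick up; "1.6.0_xx" explains the error
  java -version

  # Point Solr at a Java 7 runtime instead
  export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
  export PATH="$JAVA_HOME/bin:$PATH"

The bin/solr script honors JAVA_HOME, so exporting it before running ./solr start or ./solr status is enough.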
Re: Finding out optimal hash ranges for shard split
Yes - I'm using 2-level composite ids, and that has caused the imbalance for some shards. It's cars data, and the composite ids are of the form year-make!model-and-a-couple-of-other-specifications, e.g. 2013Ford!Edge!123456 - but there are just far too many 2013 or 2011 Ford cars, and they all go and occupy the same shards. This was done because co-location of these docs is required for a few of the search requirements - to avoid hitting all shards all the time. All queries do have the year and make combination specified, so it's easy to work out the target shard for a query.

Regarding storing the hash against each document and then querying to find out the optimal ranges - could it be done so that Solr maintains incremental counters for each hash in the shard's range, and the Collections SplitShard API then uses these internally to propose the optimal shard ranges for the split?
Re: Will field type change require complete re-index?
On 5/6/2015 7:03 AM, Vishal Sharma wrote:
> Now, If I need to change the field type to date from String will this require complete reindex?

Yes, it absolutely will require a complete reindex. A change like that will probably result in errors on queries until a reindex is done. You may even need to completely delete the index directory and restart Solr before doing your reindex, to get rid of the old segments that have information incompatible with your new schema.

http://wiki.apache.org/solr/HowToReindex

Thanks,
Shawn
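For reference, the target schema would look something like this; the field name is an assumption, and TrieDateField (the stock date type in Solr 4.x/5.x) parses exactly the 2015-05-05T13:24:10Z format quoted above:

  <fieldType name="date" class="solr.TrieDateField" precisionStep="0" positionIncrementGap="0"/>
  <field name="created" type="date" indexed="true" stored="true"/>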
Re: severe problems with soft and hard commits in a large index
On 5/6/2015 1:58 AM, adfel70 wrote:
> I have a cluster of 16 shards, 3 replicas. the cluster indexes nested documents. it currently has 3 billion documents overall (parent and children). each shard has around 200 million docs. size of each shard is 250GB. this runs on 12 machines. each machine has 4 SSD disks and 4 solr processes. each process has 28GB heap. each machine has 196GB RAM.
> I perform periodic indexing throughout the day. each indexing cycle adds around 1.5 million docs. I keep the indexing load light - 2 processes with bulks of 20 docs.
> My use case demands that each indexing cycle will be visible only when the whole cycle finishes.
> I tried various methods of using soft and hard commits:

I personally would configure autoCommit on a five minute (maxTime of 300000) interval with openSearcher=false. The use case you have outlined (not seeing changes until the indexing is done) demands that you do NOT turn on autoSoftCommit, and that you do one manual commit at the end of indexing, which could be either a soft commit or a hard commit. I would recommend a soft commit.

Because it is the openSearcher part of a commit that's very expensive, you can successfully do autoCommit with openSearcher=false on an interval like 10 or 15 seconds and not see much in the way of immediate performance loss. That commit is still not free, not only in terms of resources, but in terms of java heap garbage generated. The general advice with commits is to do them as infrequently as you can, which applies to ANY commit, not just those that make changes visible.

> with all methods I encounter pretty much the same problem:
> 1. heavy GCs when soft commit is performed (methods 1,2) or when hard commit openSearcher=true is performed. these GCs cause heavy latency (average latency is 3 secs. latency during the problem is 80secs)
> 2. if indexing cycles come too often, which causes soft commits or hard commits (openSearcher=true) to occur at small intervals one after another (around 5-10 minutes), I start getting many OOM exceptions.

If you're getting OOM, then either you need to change things so Solr requires less heap memory, or you need to increase the heap size. Changing things might be either the config or how you use Solr.

Are you tuning your garbage collection? With a 28GB heap, tuning is not optional. It's so important that the startup scripts in 5.0 and 5.1 include it, even though the default max heap is 512MB.

Let's do some quick math on your memory. You have four instances of Solr on each machine, each with a 28GB heap. That's 112GB of memory allocated to Java. With 196GB total, you have approximately 84GB of RAM left over for caching your index.

A 16-shard index with three replicas means 48 cores. Divide that by 12 machines and that's 4 replicas on each server, presumably one in each Solr instance. You say that the size of each shard is 250GB, so you've got about a terabyte of index on each server, but only 84GB of RAM for caching. Even with SSD, that's not going to be anywhere near enough cache memory for good Solr performance.

All these memory issues, including GC tuning, are discussed on this wiki page: http://wiki.apache.org/solr/SolrPerformanceProblems

One additional note: By my calculations, each filterCache entry will be at least 23MB in size. This means that if you are using the filterCache and the G1 collector, you will not be able to avoid humongous allocations, which is any allocation larger than half the G1 region size. The max configurable G1 region size is 32MB. You should use the CMS collector for your GC tuning, not G1. If you can reduce the number of documents in each shard, G1 might work well.

Thanks,
Shawn
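A minimal solrconfig.xml sketch of the strategy described above; the values mirror the suggestion, adjust to taste:

  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- Flush segments every 5 minutes for durability; never opens a searcher -->
    <autoCommit>
      <maxTime>300000</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>
    <!-- No autoSoftCommit block: visibility is controlled explicitly -->
  </updateHandler>

with one explicit soft commit when an indexing cycle finishes, e.g.:

  curl "http://localhost:8983/solr/collection1/update?commit=true&softCommit=true"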
Trying to get AnalyzingInfixSuggester to work in Solr?
I'm trying to get the AnalyzingInfixSuggester to work, but I'm not successful. I'd be grateful if someone can point me to a working example.

Problem: My content is product descriptions, similar to a BestBuy or NewEgg catalog. My problem is that I'm getting only single words in the suggester results. E.g. if I type 'len', I get suggester results like 'Lenovo' but not 'Lenovo laptop' or anything larger/longer than a single word.

There is a suggestion here: http://blog.mikemccandless.com/2013/06/a-new-lucene-suggester-based-on-infix.html that the search at http://jirasearch.mikemccandless.com/search.py?index=jira is powered by the AnalyzingInfixSuggester. If this is true, that suggester returns more than a few words in its results, but I don't get that with my setup, i.e. on my setup I get only single words.

My configuration is:

  <searchComponent class="solr.SpellCheckComponent" name="suggest">
    <lst name="spellchecker">
      <str name="name">suggest</str>
      <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
      <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingInfixLookupFactory</str>
      <str name="field">text</str>
      <float name="threshold">0.005</float>
      <str name="buildOnCommit">true</str>
      <str name="suggestAnalyzerFieldType">text_general</str>
      <bool name="exactMatchFirst">true</bool>
    </lst>
  </searchComponent>

  <requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
    <lst name="defaults">
      <str name="spellcheck">true</str>
      <str name="spellcheck.dictionary">suggest</str>
      <str name="spellcheck.onlyMorePopular">true</str>
      <str name="spellcheck.count">5</str>
      <str name="spellcheck.collate">true</str>
    </lst>
    <arr name="components">
      <str>suggest</str>
    </arr>
  </requestHandler>

I copy the contents of all of my fields to a single field called 'text'. The 'text_general' type is exactly as in the Solr examples: http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/example-DIH/solr/db/conf/schema.xml?view=markup

I'd be grateful if anyone can help me. I don't know what to look at. Thank you in advance.
O. O.
Completion Suggester in Solr
Hi,

Is there an equivalent of ElasticSearch's Completion Suggester in Solr? I am a user of both Solr and ES, in different projects. I am not able to find a solution in Solr where I can use:

1) an FSA structure
2) multiple terms as synonyms
3) a weight assigned to each document based on certain heuristics, e.g. popularity score, user search history, etc.

Any kind of help, pointers to relevant examples and documentation is highly appreciated. Thanks in advance.

Pradeep
A defect in Schema API with Add a New Copy Field Rule?
Hi Everyone,

I am using the Schema API to add a new copy field per: https://cwiki.apache.org/confluence/display/solr/Schema+API#SchemaAPI-AddaNewCopyFieldRule

Unlike the other Add APIs, this one will not fail if you add an existing copy field object. In fact, when I call the API over and over, the item appears over and over in the schema.xml file, like so:

  <copyField source="author" dest="text"/>
  <copyField source="author" dest="text"/>
  <copyField source="author" dest="text"/>
  <copyField source="author" dest="text"/>

Is this the expected behaviour or a bug? As a side question, is there any harm in having multiple copyFields like I ended up with? A final question: why is there no "Replace a Copy Field"? Is this by design due to some limitation, or was the API just never implemented?

Thanks
Steve
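For anyone reproducing this, the calls look like the following against a managed schema on 5.x; the core name is an assumption, and delete-copy-field is also the documented way to clean up the duplicates afterwards:

  # Adding the same rule repeatedly is what produces the duplicate entries
  curl -X POST -H 'Content-type:application/json' \
    http://localhost:8983/solr/mycore/schema \
    --data-binary '{ "add-copy-field": { "source": "author", "dest": "text" } }'

  # Removing the rule; repeat until all duplicates are gone
  curl -X POST -H 'Content-type:application/json' \
    http://localhost:8983/solr/mycore/schema \
    --data-binary '{ "delete-copy-field": { "source": "author", "dest": "text" } }'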
5.1.0 Heatmap + Geotools
Hi - I'm very interested in the new heatmap capability of Solr 5.1.0. Has anyone looked at combining GeoTools' HeatmapProcess method with this data? I'm trying this now, but I keep getting an empty image from the GridCoverage2D object. Any pointers/tips? Thank you!

-Joe
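For context, the Solr side of this is a facet request; a sketch assuming a spatial RPT field named geo (ints2D is the default output format, png is the alternative):

  curl "http://localhost:8983/solr/mycoll/select?q=*:*&rows=0&facet=true&facet.heatmap=geo&facet.heatmap.format=ints2D&wt=json"

The response carries a 2D counts grid plus its bounds (gridLevel, columns, rows, minX/maxX/minY/maxY) - the inputs one would need when building an image or GridCoverage2D from it.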
Re: Trying to get AnalyzingInfixSuggester to work in Solr?
Yes, textSuggest is of type text_general, with the definition below:

  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" sortMissingLast="true" omitNorms="true">
    <analyzer type="index">
      <tokenizer class="solr.ClassicTokenizerFactory"/>
      <filter class="solr.ClassicFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
      <filter class="solr.ShingleFilterFactory" maxShingleSize="5" outputUnigrams="true"/>
    </analyzer>
    <analyzer type="query">
      <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
      <tokenizer class="solr.ClassicTokenizerFactory"/>
      <filter class="solr.ClassicFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
      <filter class="solr.ShingleFilterFactory" maxShingleSize="5" outputUnigrams="true"/>
    </analyzer>
  </fieldType>

Rajesh.

On Wed, May 6, 2015 at 4:50 PM, O. Olson olson_...@yahoo.it wrote:

> Thank you Rajesh for responding so quickly. I tried it again with a restart and a reimport and I still cannot get this to work, i.e. I'm seeing no difference. I'm wondering how you define 'textSuggest' in your schema. In my case I use the field 'text', defined as:
>
>   <field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
>
> I'm wondering if your 'textSuggest' is of type text_general?
> Thank you again for your help
> O. O.
>
> Rajesh Hazari wrote
>> I just tested your config with my schema and it worked. my config: [...]
Re: Trying to get AnalyzingInfixSuggester to work in Solr?
Thank you Rajesh. I think I got a bit of help from the answer at http://stackoverflow.com/a/29743945. While that example sort of worked for me, I've not had the time to test what works and what doesn't. So far I have found that I need the field in my searchComponent to be of type 'string'; in my original example I had this as text_general. Next, I used the suggest_string fieldType as defined in the StackOverflow answer. I also removed your queryConverter, and it still works, so I think it's not needed.

Thank you very much,
O. O.

Rajesh Hazari wrote
> I just tested your config with my schema and it worked. my config: [...]
Re: Trying to get AnalyzingInfixSuggester to work in Solr?
Have you seen this? I tried to make something end-to-end with assorted gotchas identified.

Best,
Erick

On Wed, May 6, 2015 at 3:09 PM, O. Olson olson_...@yahoo.it wrote:

> Thank you Rajesh. I think I got a bit of help from the answer at http://stackoverflow.com/a/29743945. [...] I also removed your queryConverter, and it still works, so I think it's not needed.
> Thank you very much,
> O. O.
Re: solr 3.6.2 under tomcat 8 missing corename in path
On 5/6/2015 2:29 PM, Tim Dunphy wrote:
> I'm trying to set up an old version of Solr for one of our drupal developers. Apparently only versions 1.x or 3.x will work with the current version of drupal. I'm setting up solr 3.6.2 under tomcat. And I'm getting this error when I start tomcat and surf to the /solr/admin URL:
>
>   HTTP Status 404 - missing core name in path
>   type Status report
>   message missing core name in path
>   description The requested resource is not available. The URL must include the core name.

Your defaultCoreName is "collection1", and I'm guessing you don't have a core named collection1. Try browsing to just /solr instead of /solr/admin ... you should get a list of links for valid cores, each of which will take you to the admin page for that core. Probably what you will find is that when you click on one of those links, you will end up on /solr/corename/admin.jsp as the URL in your browser.

Thanks,
Shawn
solr 3.6.2 under tomcat 8 missing corename in path
I'm trying to set up an old version of Solr for one of our Drupal developers. Apparently only versions 1.x or 3.x will work with the current version of Drupal. I'm setting up Solr 3.6.2 under Tomcat. And I'm getting this error when I start Tomcat and surf to the /solr/admin URL:

HTTP Status 404 - missing core name in path
type Status report
message missing core name in path
description The requested resource is not available.

I have solr living in /opt:

# ls -ld /opt/solr
lrwxrwxrwx. 1 root root 17 May 6 12:48 /opt/solr -> apache-solr-3.6.2

And I have my cores located here:

# ls -ld /opt/solr/admin/cores
drwxr-xr-x. 3 root root 4096 May 6 14:37 /opt/solr/admin/cores

Just one core so far, until I can get this working.

# ls -l /opt/solr/admin/cores/
total 4
drwxr-xr-x. 5 root root 4096 May 6 14:08 collection1

I have this as my solr.xml file:

<solr persistent="false">
  <cores adminPath="/admin/cores" defaultCoreName="collection1">
    <core name="collection1" instanceDir="collection1"/>
  </cores>
</solr>

Which is located in these two places:

# ls -l /opt/solr/solr.xml /usr/local/tomcat/conf/Catalina/solr.xml
-rw-r--r--. 1 root root 169 May 6 14:38 /opt/solr/solr.xml
-rw-r--r--. 1 root root 169 May 6 14:38 /usr/local/tomcat/conf/Catalina/solr.xml

These are the contents of my /opt/solr directory:

# ls -l /opt/solr/
total 436
drwxr-xr-x. 3 root root 4096 May 6 14:37 admin
-rw-r--r--. 1 root root 176647 Dec 18 2012 CHANGES.txt
drwxr-xr-x. 3 root root 4096 May 6 12:48 client
drwxr-xr-x. 9 root root 4096 Dec 18 2012 contrib
drwxr-xr-x. 3 root root 4096 May 6 12:48 dist
drwxr-xr-x. 3 root root 4096 May 6 12:48 docs
-rw-r--r--. 1 root root 1274 May 6 13:28 elevate.xml
drwxr-xr-x. 11 root root 4096 May 6 12:48 example
-rw-r--r--. 1 root root 81331 Dec 18 2012 LICENSE.txt
-rw-r--r--. 1 root root 20828 Dec 18 2012 NOTICE.txt
-rw-r--r--. 1 root root 5270 Dec 18 2012 README.txt
-rw-r--r--. 1 root root 55644 May 6 13:27 schema.xml
-rw-r--r--. 1 root root 60884 May 6 13:27 solrconfig.xml
-rw-r--r--. 1 root root 169 May 6 14:38 solr.xml

Yet, when I bounce Tomcat, this is the result that I get:

HTTP Status 404 - missing core name in path
type Status report
message missing core name in path
description The requested resource is not available.

Can anyone tell me what I'm doing wrong? Thanks!! Tim -- GPG me!! gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
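Not part of Tim's message, but worth noting for anyone following along: with Solr 3.x under Tomcat, the file under conf/Catalina/localhost/ is normally a context fragment that points Tomcat at the war and at solr/home (where Solr then expects to find solr.xml and the core directories). A typical fragment, with illustrative paths:

<!-- /usr/local/tomcat/conf/Catalina/localhost/solr.xml (a context fragment, not Solr's cores file) -->
<Context docBase="/opt/solr/dist/apache-solr-3.6.2.war" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String" value="/opt/solr" override="true"/>
</Context>

Note that instanceDir in the cores file is resolved relative to solr/home, so collection1 would need to live directly under /opt/solr in this layout.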
Re: Trying to get AnalyzingInfixSuggester to work in Solr?
Thank you Rajesh for responding so quickly. I tried it again with a restart and a reimport and I still cannot get this to work, i.e. I'm seeing no difference. I'm wondering how you define 'textSuggest' in your schema? In my case I use the field 'text', which is defined as:

<field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>

I'm wondering if your 'textSuggest' is of type text_general? Thank you again for your help. O. O.

Rajesh Hazari wrote: I just tested your config with my schema and it worked.

-- View this message in context: http://lucene.472066.n3.nabble.com/Trying-to-get-AnalyzingInfixSuggester-to-work-in-Solr-tp4204163p4204208.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Trying to get AnalyzingInfixSuggester to work in Solr?
Just add the queryConverter definition in your solr config and you should see multiple-term suggestions. Also make sure you have ShingleFilterFactory as one of the filters in your schema field definition for text_general:

<filter class="solr.ShingleFilterFactory" maxShingleSize="5" outputUnigrams="true"/>

Rajesh.

On Wed, May 6, 2015 at 1:47 PM, O. Olson olson_...@yahoo.it wrote: Thank you Rajesh. I'm not familiar with the queryConverter. How do you wire it up to the rest of the setup? Right now, I just put it between the SpellCheckComponent and the RequestHandler.
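A sketch of where that filter sits in a text_general-style analyzer (index side only; this is an assumption about the schema, not a copy of Rajesh's):

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- emits word n-grams up to 5 tokens long, plus the single words -->
    <filter class="solr.ShingleFilterFactory" maxShingleSize="5" outputUnigrams="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>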
Re: Trying to get AnalyzingInfixSuggester to work in Solr?
I just tested your config with my schema and it worked. my config:

<searchComponent class="solr.SpellCheckComponent" name="suggest1">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingInfixLookupFactory</str>
    <str name="field">textSuggest</str>
    <float name="threshold">0.005</float>
    <str name="buildOnCommit">true</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <bool name="exactMatchFirst">true</bool>
  </lst>
</searchComponent>

<queryConverter name="queryConverter" class="org.apache.solr.spelling.SuggestQueryConverter"/>

<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest1">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="components">
    <str>suggest1</str>
  </arr>
</requestHandler>

http://localhost:8585/solr/collection1/suggest1?q=apple&rows=10&wt=json&indent=true

{
  "responseHeader":{
    "status":0,
    "QTime":2},
  "spellcheck":{
    "suggestions":[
      "apple",{
        "numFound":5,
        "startOffset":0,
        "endOffset":5,
        "suggestion":["<b>apple</b>",
          "<b>apple</b> and",
          "<b>apple</b> and facebook",
          "<b>apple</b> and facebook learn",
          "<b>apple</b> and facebook learn from"]},
      "collation","<b>apple</b>"]}}

Rajesh.

On Wed, May 6, 2015 at 2:48 PM, Rajesh Hazari rajeshhaz...@gmail.com wrote: Just add the queryConverter definition in your solr config and you should see multiple-term suggestions.

On Wed, May 6, 2015 at 1:47 PM, O. Olson olson_...@yahoo.it wrote: Thank you Rajesh. I'm not familiar with the queryConverter. How do you wire it up to the rest of the setup?

-- View this message in context: http://lucene.472066.n3.nabble.com/Trying-to-get-AnalyzingInfixSuggester-to-work-in-Solr-tp4204163p4204173.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Trying to get AnalyzingInfixSuggester to work in Solr?
Make sure you have this query converter defined in your config:

<queryConverter name="queryConverter" class="org.apache.solr.spelling.SuggestQueryConverter"/>

Thanks, Rajesh.

On Wed, May 6, 2015 at 12:39 PM, O. Olson olson_...@yahoo.it wrote: I'm trying to get the AnalyzingInfixSuggester to work but I'm not successful. I'd be grateful if someone can point me to a working example.

Problem: My content is product descriptions similar to a BestBuy or NewEgg catalog. My problem is that I'm getting only single words in the suggester results. E.g. if I type 'len', I get suggester results like 'Lenovo' but not 'Lenovo laptop' or anything larger/longer than a single word. There is a suggestion at http://blog.mikemccandless.com/2013/06/a-new-lucene-suggester-based-on-infix.html that the search at http://jirasearch.mikemccandless.com/search.py?index=jira is powered by the AnalyzingInfixSuggester. If that is true, then that suggester can return more than a few words in its results, but I don't get that with my setup, i.e. on my setup I get only single words. My configuration is:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingInfixLookupFactory</str>
    <str name="field">text</str>
    <float name="threshold">0.005</float>
    <str name="buildOnCommit">true</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <bool name="exactMatchFirst">true</bool>
  </lst>
</searchComponent>

<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

I copy the contents of all of my fields to a single field called 'text'. The 'text_general' type is exactly as in the solr examples: http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/example-DIH/solr/db/conf/schema.xml?view=markup I'd be grateful if anyone can help me. I don't know what to look at. Thank you in advance. O. O.

-- View this message in context: http://lucene.472066.n3.nabble.com/Trying-to-get-AnalyzingInfixSuggester-to-work-in-Solr-tp4204163.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Trying to get AnalyzingInfixSuggester to work in Solr?
Thank you Rajesh. I'm not familiar with the queryConverter. How do you wire it up to the rest of the setup? Right now, I just put it between the SpellCheckComponent and the RequestHandler, i.e. my config is:

<searchComponent class="solr.SpellCheckComponent" name="suggest">
  <lst name="spellchecker">
    <str name="name">suggest</str>
    <str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
    <str name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingInfixLookupFactory</str>
    <str name="field">text</str>
    <float name="threshold">0.005</float>
    <str name="buildOnCommit">true</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <bool name="exactMatchFirst">true</bool>
  </lst>
</searchComponent>

<queryConverter name="queryConverter" class="org.apache.solr.spelling.SuggestQueryConverter"/>

<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.onlyMorePopular">true</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>

Is this correct? I do not see any difference in my results, i.e. the suggestions are the same as before. O. O.

Rajesh Hazari wrote: make sure you have this query converter defined in your config: <queryConverter name="queryConverter" class="org.apache.solr.spelling.SuggestQueryConverter"/> Thanks, Rajesh.

-- View this message in context: http://lucene.472066.n3.nabble.com/Trying-to-get-AnalyzingInfixSuggester-to-work-in-Solr-tp4204163p4204173.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: A defect in Schema API with Add a New Copy Field Rule?
Hi Steve, It’s by design that you can copyField the same source/dest multiple times - according to Yonik (I'm not sure where this was discussed), this capability has been used in the past to effectively boost terms in the source field. The API isn’t symmetric here though: I’m guessing that deleting a multiply specified copy field rule will delete all of them, but this isn’t tested, so I’m not sure. There is no replace-copy-field command because copy field rules don’t have dependencies (i.e., nothing else in the schema refers to copy field rules), unlike fields, dynamic fields and field types, so delete-copy-field/add-copy-field works as one would expect. For fields, dynamic fields and field types, a delete followed by an add is not the same as a replace, since (dynamic) fields could have dependent copyFields, and field types could have dependent (dynamic) fields. delete-* commands are designed to fail if there are any existing dependencies, while the replace-* commands will maintain the dependencies if they exist. Steve

On May 6, 2015, at 6:44 PM, Steven White swhite4...@gmail.com wrote: Hi Everyone, I am using the Schema API to add a new copy field per: https://cwiki.apache.org/confluence/display/solr/Schema+API#SchemaAPI-AddaNewCopyFieldRule Unlike the other Add APIs, this one will not fail if you add an existing copy field object. In fact, when I call the API over and over, the item appears over and over in the schema.xml file, like so:

<copyField source="author" dest="text"/>
<copyField source="author" dest="text"/>
<copyField source="author" dest="text"/>
<copyField source="author" dest="text"/>

Is this the expected behaviour or a bug? As a side question, is there any harm in having multiple copyFields like I ended up with? A final question: why is there no Replace a Copy Field? Is this by design due to some limitation, or was the API just never implemented? Thanks, Steve
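In curl terms, the delete-then-add sequence Steve describes would look roughly like this (core name and URL are illustrative; per his caveat above, it is untested whether the delete removes all duplicate rules):

curl -X POST -H 'Content-type:application/json' \
  'http://localhost:8983/solr/collection1/schema' -d '{
    "delete-copy-field": { "source":"author", "dest":"text" },
    "add-copy-field":    { "source":"author", "dest":"text" }
  }'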
Re: Union and intersection methods in solr DocSet
Hey Chris, Thanks for the reply. The exception is an ArrayIndexOutOfBoundsException. It is coming because the searcher may return a BitDocSet for query1 and a SortedIntDocSet for query2 [which could be possible]. In that case, SortedIntDocSet doesn't implement intersection and will cause this exception. Thanks and regards, Gajendra Dadheech

On Thu, May 7, 2015 at 6:06 AM, Chris Hostetter hossman_luc...@fucit.org wrote:

: DocSet docset1 = searcher.getDocSet(query1);
: DocSet docset2 = searcher.getDocSet(query2);
:
: DocSet finalDocset = docset1.intersection(docset2);
:
: Is this a valid approach? Given the docset could be either a SortedIntDocSet or
: a BitDocSet, I am facing an ArrayIndexOutOfBoundsException when
: union/intersecting between different kinds of docsets.

as far as i know, that should be a totally valid usage -- since you didn't provide the details of the stack trace or the code you wrote that produced it, it's hard to guess why/where it's causing the exception. FWIW: SolrIndexSearcher has getDocSet methods that take multiple arguments which might be more efficient than doing the intersection directly (and are cache aware). If all you care about is the *size* of the intersection, see the SolrIndexSearcher.numDocs methods. -Hoss http://www.lucidworks.com/
Re: A defect in Schema API with Add a New Copy Field Rule?
On Wed, May 6, 2015 at 8:10 PM, Steve Rowe sar...@gmail.com wrote: It’s by design that you can copyField the same source/dest multiple times - according to Yonik (not sure where this was discussed), this capability has been used in the past to effectively boost terms in the source field.

Yep, it used to be relatively common. Perhaps the API could be cleaner, though, if we supported that by passing an optional numTimes or numCopies? Sane delete/overwrite options would then be easier, it seems. -Yonik
Re: Union and intersection methods in solr DocSet
: DocSet docset1 = searcher.getDocSet(query1);
: DocSet docset2 = searcher.getDocSet(query2);
:
: DocSet finalDocset = docset1.intersection(docset2);
:
: Is this a valid approach? Given the docset could be either a SortedIntDocSet or
: a BitDocSet, I am facing an ArrayIndexOutOfBoundsException when
: union/intersecting between different kinds of docsets.

as far as i know, that should be a totally valid usage -- since you didn't provide the details of the stack trace or the code you wrote that produced it, it's hard to guess why/where it's causing the exception. FWIW: SolrIndexSearcher has getDocSet methods that take multiple arguments which might be more efficient than doing the intersection directly (and are cache aware). If all you care about is the *size* of the intersection, see the SolrIndexSearcher.numDocs methods. -Hoss http://www.lucidworks.com/
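A sketch of the cache-aware alternatives Hoss mentions, assuming a SolrIndexSearcher in scope (method signatures are from the 4.x/5.x era and may vary by version):

import java.io.IOException;
import java.util.Arrays;
import org.apache.lucene.search.Query;
import org.apache.solr.search.DocSet;
import org.apache.solr.search.SolrIndexSearcher;

public class DocSetIntersectionSketch {
  // Let the searcher combine the queries itself instead of calling
  // docset1.intersection(docset2) on DocSets of possibly different classes.
  static DocSet intersect(SolrIndexSearcher searcher, Query q1, Query q2) throws IOException {
    return searcher.getDocSet(Arrays.asList(q1, q2));
  }

  // If only the size of the intersection matters, numDocs avoids
  // materializing the combined DocSet at all.
  static int intersectionSize(SolrIndexSearcher searcher, Query q1, Query q2) throws IOException {
    return searcher.numDocs(q1, searcher.getDocSet(q2));
  }
}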
Re: severe problems with soft and hard commits in a large index
On 5/6/2015 8:55 AM, adfel70 wrote: Thank you for the detailed answer. How can I decrease the impact of opening a searcher in such a large index? especially the impact of heap usage that causes OOM.

See the wiki link I sent. It talks about some of the things that require a lot of heap and ways you can reduce those requirements. The lists are nowhere near complete. http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap

regarding GC tuning - I am doing that. here are the params I use:

AggresiveOpts
UseLargePages
ParallelRefProcEnabled
CMSParallelRemarkEnabled
CMSMaxAbortablePrecleanTime=6000
CMDTriggerPermRatio=80
CMSInitiatingOccupancyFraction=70
UseCMSInitiatinOccupancyOnly
CMSFullGCsBeforeCompaction=1
PretenureSizeThreshold=64m
CMSScavengeBeforeRemark
UseConcMarkSweepGC
MaxTenuringThreshold=8
TargetSurvivorRatio=90
SurviorRatio=4
NewRatio=2
Xms16gb
Xmn28gb

This list seems to have come from re-typing the GC options. If this is a cut/paste, I would not expect it to work -- there are typos, and each option is missing part of its characters. Assuming that this is not a cut/paste, it is mostly similar to the CMS options that I once used for my own index: http://wiki.apache.org/solr/ShawnHeisey#CMS_.28ConcurrentMarkSweep.29_Collector

How many documents per shard are recommended? Note that I use nested documents. total collection size is 3 billion docs, number of parent docs is 600 million. the rest are children.

For the G1 collector, you'd want to limit each shard to about 100 million docs. I have no idea about limitations and capabilities where very large memory allocations are concerned with the CMS collector. Running the latest Java 8 is *strongly* recommended, no matter what collector you're using, because recent versions have incorporated GC improvements for large memory allocations. With Java 8u40 and later, the limitations for 16MB huge allocations on the G1 collector might not even apply. Thanks, Shawn
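For reference, a plausible reconstruction of what that list was probably meant to be, with the -XX: prefixes restored and the apparent typos corrected. This is a guess at intent, not a recommendation: an explicit -Xmx matching -Xms is assumed, and Xmn is omitted because a 28GB new generation cannot fit inside a 16GB heap, so that value must be a typo.

-Xms16g -Xmx16g
-XX:+UseConcMarkSweepGC
-XX:+UseLargePages
-XX:+AggressiveOpts
-XX:+ParallelRefProcEnabled
-XX:+CMSParallelRemarkEnabled
-XX:CMSMaxAbortablePrecleanTime=6000
-XX:CMSTriggerPermRatio=80
-XX:CMSInitiatingOccupancyFraction=70
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSFullGCsBeforeCompaction=1
-XX:PretenureSizeThreshold=64m
-XX:+CMSScavengeBeforeRemark
-XX:MaxTenuringThreshold=8
-XX:TargetSurvivorRatio=90
-XX:SurvivorRatio=4
-XX:NewRatio=2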
Re: What is the best practice to Backup and delete a core from SOLR Master-Slave architecture
Well, they're just files on disk. You can freely copy the index files around wherever you want. I'd do a few practice runs first, though. So:

1. unload the core (or otherwise shut it down).
2. copy the data directory and all subdirectories.
3. I'd also copy the conf directory, to ensure a consistent picture of the index when you restore it.
4. delete the core however you please.

Of course, before I did 4 I'd try bringing up the core on some other machine a few times, just to be sure you had all the necessary parts... Once you're confident of the process, you don't need to restore _every_ time. Best, Erick

On Wed, May 6, 2015 at 3:08 AM, sangeetha.subraman...@gtnexus.com sangeetha.subraman...@gtnexus.com wrote: Hi, I am a newbie to SOLR. I have setup Master Slave configuration with SOLR 4.0. I am trying to identify what is the best way to backup an old core and delete the same so as to free up space from the disk.
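In shell terms, Erick's steps might look like this on the master (host, core name, and paths are illustrative):

# 1: unload the core (stops Solr from touching the index files)
curl "http://localhost:8983/solr/admin/cores?action=UNLOAD&core=core0"
# 2 and 3: copy the data and conf directories somewhere safe
cp -rp /opt/solr/core0/data /backup/core0/data
cp -rp /opt/solr/core0/conf /backup/core0/conf
# 4: only after the backup has been verified on another machine
rm -rf /opt/solr/core0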
Re: New core on Solr Cloud
That should have put one replica on each machine; if it did, you're fine. Best, Erick

On Wed, May 6, 2015 at 3:58 AM, shacky shack...@gmail.com wrote: Ok, I found out that the creation of a new core/collection on Solr 5.1 is done with the bin/solr script. So I created a new collection with this command:

./solr create_collection -c test -replicationFactor 3

Is this the correct way? Thank you very much, Bye!

2015-05-06 10:02 GMT+02:00 shacky shack...@gmail.com: Hi. This is my first experience with Solr Cloud. I installed three Solr nodes with three ZooKeeper instances and they seemed to start well. Now I have to create a new replicated core and I'm trying to find out how I can do it. I found many examples about how to create shards and cores, but I have to create one core with only one shard, replicated on all three nodes (so basically I want to have the same data on all three nodes). Could you help me understand the correct way to do this, please? Thank you very much! Bye
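One way to confirm the placement is the Collections API's CLUSTERSTATUS action (host and collection name are illustrative):

curl "http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&collection=test&wt=json&indent=true"

Each replica in the response lists the node it lives on, so three replicas spread across the three node names would confirm the create_collection call did what was intended.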
Re: ZooKeeperException: Could not find configName for collection
Have you looked around at your directories on disk? I'm _not_ talking about the admin UI here. The default is core discovery mode, which recursively looks under solr_home and thinks there's a core wherever it finds a core.properties file. If you find such a thing, rename it or remove the directory. Another alternative would be to push a configset named new_core up to ZooKeeper; that might allow you to see (and then delete) the collection new_core belongs to. It looks like you tried to use the admin UI to create a core and it's all local, or something like that. Best, Erick

On Wed, May 6, 2015 at 4:00 AM, shacky shack...@gmail.com wrote: Hi list. I created a new collection on my new SolrCloud installation. The new collection is shown and replicated on all three nodes, but on the first node (and only on this one) I get this error:

new_core: org.apache.solr.common.cloud.ZooKeeperException:org.apache.solr.common.cloud.ZooKeeperException: Could not find configName for collection new_core found:null

I cannot see any core named new_core on that node, and I also tried to remove it:

root@index1:/opt/solr# ./bin/solr delete -c new_core
Connecting to ZooKeeper at zk1,zk2,zk3
ERROR: Collection new_core not found!

Could you help me, please? Thank you very much! Bye
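Pushing a configset up to ZooKeeper, as suggested above, would look roughly like this with the zkcli script shipped with Solr 5.x (the zkhost string and confdir are illustrative; basic_configs is one of the stock configsets):

./server/scripts/cloud-scripts/zkcli.sh -zkhost zk1:2181,zk2:2181,zk3:2181 \
  -cmd upconfig -confname new_core \
  -confdir /opt/solr/server/solr/configsets/basic_configs/conf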
Re: Solr 5.0 - uniqueKey case insensitive ?
Yes, thanks, it works now for me too. Daniel, my pn is always in uppercase and I index it always in uppercase. The problem (solved now after all your answers, thanks) was on the request side: if users query in lowercase, then Solr replies with no results, and that was not good. But now the problem is solved: I changed the name of the pn field to id in my source file, and in my schema I use a copy field named pn, and it works perfectly. Thanks a lot!!!

On 06/05/2015 09:44, Daniel Collins wrote: Ah, I remember seeing this when we first started using Solr (which was 4.0 because we needed SolrCloud). I never got around to filing an issue for it (oops!), but we have a note in our schema to leave the key field a normal string (like Bruno, we had tried to lowercase it, which failed). We didn't really know Solr in those days, and hadn't really thought about it since then, but Hoss' and Erick's explanations make perfect sense now! Since shard routing is (basically) done on hashes of the unique key, if I have 2 documents which are the same but have the values HELLO and hello, they might well hash to completely different shards, so the update logistics would be horrible. Bruno, why do you need to lowercase at all, then? You said in your example that your client application always supplies pn and it is always uppercase, so presumably all adds/updates could be done directly on that field (as a normal string with no lowercasing). Where does the case insensitivity come in, is it only for searching? If so, couldn't you add a search field (called id), and update your app to search using that (or make that your default search field; I guess it depends whether your calling app explicitly uses the pn field name in its searches).

On 6 May 2015 at 01:55, Erick Erickson erickerick...@gmail.com wrote: Well, working fine may be a bit of an overstatement. That has never been officially supported, so it just happened to work in 3.6. As Chris points out, if you're using SolrCloud then this will _not_ work, as routing happens early in the process, i.e. before the analysis chain gets the token, so various copies of the doc will exist on different shards. Best, Erick

On Mon, May 4, 2015 at 4:19 PM, Bruno Mannina bmann...@free.fr wrote: Hello Chris, yes I confirm on my SOLR 3.6 it has worked fine for several years, and each doc added with the same code is updated, not added. To be more clear, I receive docs with a field named pn and it's the uniqueKey, and it is always in uppercase, so I must define in my schema.xml:

<field name="id" type="string" multiValued="false" indexed="true" required="true" stored="true"/>
<field name="pn" type="text_general" multiValued="true" indexed="true" stored="false"/>
...
<uniqueKey>id</uniqueKey>
...
<copyField source="id" dest="pn"/>

but the application that uses Solr already exists, so it requests with the pn field, not id; I cannot change that. And in each doc I receive there is no id field, just a pn field, and I cannot change that either. So there is a problem, no? I must import an id field and request a pn field, but I have only a pn field for import...

On 05/05/2015 01:00, Chris Hostetter wrote:

: On SOLR3.6, I defined a string_ci field like this:
:
: <fieldType name="string_ci" class="solr.TextField"
:     sortMissingLast="true" omitNorms="true">
:   <analyzer>
:     <tokenizer class="solr.KeywordTokenizerFactory"/>
:     <filter class="solr.LowerCaseFilterFactory"/>
:   </analyzer>
: </fieldType>
:
: <field name="pn" type="string_ci" multiValued="false" indexed="true"
:     required="true" stored="true"/>

I'm really surprised that field would have worked for you (reliably) as a uniqueKey field even in Solr 3.6. The best practice for something like what you describe has always (going back to Solr 1.x) been to use a copyField to create a case-insensitive copy of your uniqueKey for searching. If, for some reason, you really want case-insensitive *updates* (so a doc with id foo overwrites a doc with id FOO), then the only reliable way to make something like that work is to do the lowercasing in an UpdateProcessor, to ensure it happens *before* the docs are distributed to the correct shard, and so the correct existing doc is overwritten (even if you aren't using SolrCloud). -Hoss http://www.lucidworks.com/
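A minimal sketch of the copyField best practice Hoss describes, reusing the string_ci type from earlier in the thread (the field names are illustrative):

<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="id_ci" type="string_ci" indexed="true" stored="false"/>
<uniqueKey>id</uniqueKey>
<copyField source="id" dest="id_ci"/>

Searches go against id_ci (case-insensitively), while updates and shard routing use the untouched string uniqueKey.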
Re: Solr cloud clusterstate.json update query ?
Gopal: Did you see my previous answer? Best, Erick

On Tue, May 5, 2015 at 9:42 PM, Gopal Jee zgo...@gmail.com wrote: about 2, live_nodes under zookeeper is an ephemeral node (please see zookeeper ephemeral nodes). So, once the connection from the solr zkClient to zookeeper is lost, these nodes will disappear automatically. AFAIK, clusterstate.json is updated by the overseer based on messages published to a queue in zookeeper by the solr zkclients. In case a solr node dies ungracefully, I am not sure how this event is reflected in clusterstate.json. Can someone shed some light on ungraceful solr shutdown and the consequent status update in clusterstate? I guess there would be some way, because all nodes in a cluster decide cluster state based on the watched clusterstate.json node. They will not be watching live_nodes to update their state. Gopal

On Wed, May 6, 2015 at 6:33 AM, Erick Erickson erickerick...@gmail.com wrote: about 1. This shouldn't be happening, so I wouldn't concentrate there first. The most common reason is that you have a short Zookeeper timeout and the replicas go into a stop-the-world garbage collection that exceeds the timeout. So the first thing to do is to see if that's happening. Here are a couple of good places to start: http://lucidworks.com/blog/garbage-collection-bootcamp-1-0/ http://wiki.apache.org/solr/ShawnHeisey#GC_Tuning_for_Solr 2. Partial answer is that ZK does a keep-alive type thing, and if the solr nodes it knows about don't reply, it marks those nodes as down. Best, Erick

On Tue, May 5, 2015 at 5:42 AM, Sai Sreenivas K sa...@myntra.com wrote: Could you clarify the following questions: 1. Is there a way to avoid all the nodes simultaneously getting into recovery state when bulk indexing happens? Is there an API to disable replication on one node for a while? 2. We recently changed the host name on nodes in solr.xml. But the old host entries still exist in clusterstate.json, marked as in the active state, though live_nodes has the correct information. Who updates clusterstate.json if a node goes down in an ungraceful fashion without notifying its down state? Thanks, Sai Sreenivas K
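To see the difference between the two ZooKeeper views being discussed, the stock zkCli.sh that ships with ZooKeeper can read both (the server address is illustrative):

# live_nodes entries are ephemeral: they vanish when a node's session dies
zkCli.sh -server zk1:2181 ls /live_nodes
# clusterstate.json is rewritten by the Overseer, so it can lag live_nodes
zkCli.sh -server zk1:2181 get /clusterstate.json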
Solr port went down on remote server
Hi, I have installed Solr on a remote server and started it on port 8983. I have bound my local machine's port 8983 to the remote server's port 8983 using *ssh* (Ubuntu OS). When I request suggestions from Solr on the remote server through calls from my local machine, sometimes it gives a response and sometimes it doesn't. I am not able to work out why this is. Is it a remote server binding issue, or did Solr go down? To investigate, I ran a crontab job using the telnet command to check the existence of the Solr port (8983). It works fine without throwing any connection refused error, yet I am still not able to detect the problem. Any help please..
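A forwarded port of the kind described is typically set up as in the sketch below. A silently dropped tunnel is a common cause of this sort of intermittent failure, and keepalives make the drop visible (user and host names are placeholders):

# -N: forward only, no remote command; the ServerAlive options make ssh
# notice a dead connection instead of leaving a half-open tunnel behind
ssh -N -L 8983:localhost:8983 \
    -o ServerAliveInterval=30 -o ServerAliveCountMax=3 \
    user@remote-server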
Limit the documents for each shard in solr cloud
Hi, Is it possible to restrict the number of documents per shard in Solr cloud? Let's say we have a Solr cloud with 4 nodes, and on each node we have one leader and one replica, so altogether we have 8 shard cores including replicas. Now I need to index my documents in such a way that each shard will have only 5 million documents. The total documents in the Solr cloud should be 20 million. Thanks, Jilani
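For what it's worth, the default compositeId router has no built-in per-shard cap, but with the implicit router the indexing client decides where each document goes, so a 5-million cap can be enforced at indexing time. A sketch (collection, field, and host names are illustrative):

# Create a collection whose shards are addressed explicitly by the indexer
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=capped&router.name=implicit&shards=shard1,shard2,shard3,shard4&router.field=shard_label&replicationFactor=2&maxShardsPerNode=2"
# Each document then carries a shard_label value (shard1..shard4); the indexing
# client counts documents per shard and moves on to the next shard at 5 million.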