Re: solr4.1 createNodeSet requires ip addresses?
Hi, I created a ticket and tried to describe it here: https://issues.apache.org/jira/browse/SOLR-4471 Actually, search speed, RAM and memory usage on Solr 4.x compared with 3.6 look good; only the network is blocked by full index copies from the slave. André

On 16.02.13 03:25, Mark Miller markrmil...@gmail.com wrote: For 4.2, I'll try and put in https://issues.apache.org/jira/browse/SOLR-4078 soon. Not sure about the behavior you're seeing - you might want to file a JIRA issue. - Mark

On Feb 15, 2013, at 8:17 PM, Gary Yngve gary.yn...@gmail.com wrote: Hi all, I've been unable to get the collections create API to work with createNodeSet containing hostnames, both localhost and external hostnames. I've only been able to get it working when using explicit IP addresses. It looks like ZooKeeper stores the IP addresses in clusterstate.json and live_nodes. Is it possible that SolrCloud is not doing any hostname resolution but is just looking for an explicit match with createNodeSet? This is kind of annoying, in that I am working with EC2 instances and consider it pretty lame to need to use elastic IPs for internal use. I'm hacking around it now (looking up the eth0 inet addr on each machine), but I'm not happy about it. Has anyone else found a better solution? The reason I want to specify explicit nodes for collections is so I can have just one ZK ensemble managing collections across different environments that will go up and down independently of each other. Thanks, Gary
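Gary's workaround - resolving each host to the address ZooKeeper actually registered before building the createNodeSet - can be scripted rather than done by hand. A minimal sketch, assuming the "host:port_solr" node-name convention that Solr 4.x publishes under live_nodes; the collection name, shard count and port below are placeholders:

```python
import socket
from urllib.parse import urlencode

def create_collection_url(base_url, name, hostnames, port=8983):
    # Resolve each hostname to the IP that SolrCloud registers under
    # live_nodes (entries look like "10.1.2.3:8983_solr" in Solr 4.x).
    node_set = ",".join(
        "%s:%d_solr" % (socket.gethostbyname(h), port) for h in hostnames
    )
    params = urlencode({
        "action": "CREATE",
        "name": name,
        "numShards": 2,          # placeholder value
        "createNodeSet": node_set,
    })
    return "%s/admin/collections?%s" % (base_url, params)

url = create_collection_url("http://localhost:8983/solr", "test", ["localhost"])
```

This only automates the hack Gary describes; it does not make Solr itself resolve hostnames.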
SpellCheck - Ignore list of words
Hi All, I have a use case where I have a list of words on which I don't want to perform spellcheck - like stemming ignores the words listed in the protwords.txt file. Any idea how it can be solved? Thanks, Hemant -- View this message in context: http://lucene.472066.n3.nabble.com/SpellCheck-Ignore-list-of-words-tp4041099.html Sent from the Solr - User mailing list archive at Nabble.com.
tlog file questions
Hi, I have some questions about tlog files and how they are managed. I'm using DIH to do incremental data loading; once a day I do a full refresh. These are the request parameters:

/dataimport?command=full-import&commit=true
/dataimport?command=delta-import&commit=true&optimize=false

I was expecting all the old tlog files to be removed when completing a delta/full import, but I see that these files remain. Actually the older files do get removed. Am I using the wrong parameters? Is there a different parameter to trigger the hard commit? Are there configuration parameters to control the number of tlog files to keep? Unfortunately I have very little space on my disks and I need to double-check space consumption. I'm using Solr 4. Thank you
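The two request URLs above lost their '&' separators somewhere in transit; building them programmatically avoids that class of mistake. A small sketch (the base URL and handler path are the stock examples, not Giovanni's actual setup):

```python
from urllib.parse import urlencode

def dih_url(base, command, **params):
    # DIH parameters are ordinary query-string parameters joined with '&'
    return "%s/dataimport?%s" % (base, urlencode(dict(command=command, **params)))

full = dih_url("http://localhost:8983/solr", "full-import", commit="true")
delta = dih_url("http://localhost:8983/solr", "delta-import",
                commit="true", optimize="false")
```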
Custom shard key, shard partitioning
Hi, By default SolrCloud partitions records by the hash of the uniqueKey field, but we want to do some tests and partition the records by a signed integer field while keeping the current uniqueKey unique. I've scanned through several issues concerning distributed indexing, custom hashing, shard policies etc., but I have not found any concise examples or documentation, or even a blog post, on this matter. How do we set up shard partitioning via a field other than the default uniqueKey? According to some older resolved issue, CloudSolrServer should be cloud-aware and send updates to the leader of the correct shards - how does it know this? Must we set up the same partitioning in the SolrServer client as well? If so, how? The apidocs do not reveal a lot when I look through them. I probably totally missed an issue or discussion or wiki page. Thanks, Markus
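What Markus is after - routing by a field other than uniqueKey - comes down to which value feeds the partitioning hash. A toy illustration of the idea only: crc32 stands in for the MurmurHash that Solr actually uses, and the shard count is made up:

```python
import zlib

def shard_for(routing_value, num_shards=4):
    # Hash the routing value and map it onto one of num_shards partitions.
    h = zlib.crc32(str(routing_value).encode("utf-8")) & 0xFFFFFFFF
    return "shard%d" % (h % num_shards + 1)

# Default behaviour: route by the uniqueKey.
default_shard = shard_for("doc-1234")

# Custom hashing: route by a signed-integer field instead, so all docs
# sharing that value co-locate on one shard regardless of uniqueKey.
shard_a = shard_for(42)
shard_b = shard_for(42)
```

Documents with the same routing value always land together, which is the property custom hashing buys you.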
Re: Custom Solr FunctionQuery Error
Hi! Although more than a year has passed, could I ask you, Parvin, what was your final approach? I have to deal with a similar problem (http://lucene.472066.n3.nabble.com/Combining-Solr-score-with-customized-user-ratings-for-a-document-td4040200.html), maybe a bit more difficult because it's a per-user score customization, but I would probably find your solution helpful. Thanks! Álvaro -- View this message in context: http://lucene.472066.n3.nabble.com/Custom-Solr-FunctionQuery-Error-tp3615899p4041113.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: get filterCache in Component
Chris, Mikhail, I'd like to avoid issuing a query and spare the cycles. In SOLR-4280 I only look for the smallest DocSet by iterating over them. I would tend to think it's cheaper than getDocSet() and perhaps cacheDocSet(). In case I would add non-user caches to the cacheMap and create a separate issue for that, would that break things? Be really bad? Thanks, Markus

-Original message- From: Mikhail Khludnev mkhlud...@griddynamics.com Sent: Fri 15-Feb-2013 21:17 To: solr-user solr-user@lucene.apache.org Subject: Re: get filterCache in Component Markus, I wonder why you need access to it. I've thought that the current searcher's methods (getDocSet(), cacheDocSet()) are enough to do everything. Anyway, if you wish: I just looked in the code and see that it's available via core.getInfoRegistry().get("filterCache"); it can lead to some problems, but should work.

On Fri, Feb 15, 2013 at 4:30 PM, Markus Jelsma markus.jel...@openindex.io wrote: Hi, I need to get the filterCache for SOLR-4280. I can create a new issue patching SolrIndexSearcher and adding the missing caches (non-user caches) to the cacheMap so they can be returned using getCache(String), but I'm not sure this is intended. It does work, but is this the right path? https://issues.apache.org/jira/browse/SOLR-4280 Thanks, Markus

-Original message- From: Markus Jelsma markus.jel...@openindex.io Sent: Thu 14-Feb-2013 13:18 To: solr-user@lucene.apache.org Subject: get filterCache in Component Hi, We need to get the filterCache in a Component, but SolrIndexSearcher.getCache(String name) does not return it. It seems the filterCache is not added to cacheMap and can therefore not be returned.

    SolrCache<Query,DocSet> filterCache = rb.req.getSearcher().getCache("filterCache");

will always return null. Can we get the filterCache via other means, or should it be added to the cacheMap so getCache can return it? Thanks, Markus

-- Sincerely yours, Mikhail Khludnev Principal Engineer, Grid Dynamics http://www.griddynamics.com mkhlud...@griddynamics.com
Re: Custom shard key, shard partitioning
Hi, I was able to implement custom hashing with the use of _shard_ field. It contains the name of shard a document should go to. Works fine. Maybe there's some other method to do the same with the use of solrconfig.xml, but I have not found any docs about it so far. Regards. On 18 February 2013 13:34, Markus Jelsma markus.jel...@openindex.io wrote: Hi, By defaut SolrCloud partitions records by the hash of the uniqueKey field but we want to do some tests and partition the records by a signed integer field but keep the current uniqueKey unique. I've scanned through several issues concerning distributed index, custom hashing, shard policies etc but i have not found some concise examples or documentation or even blog post on this matter. How do we set up shard partitioning via another than the default uniqueKey field? According to some older resolved issue CloudSolrServer should be cloud aware and send updates to the leader of the correct shards, how does it know this? Must we set up the same partitioning in SolrServer client as well? If so, how? The apidocs do not reveal a lot when i look through them. I probably totally missed an issue or discussion or wiki page. Thanks, Markus
Re: Updating data
Hi, I've got a problem. I have a JSON file:

[ {"id":5, "is_good":{"add":1}}, {"id":1, "is_good":{"add":1}}, {"id":2, "is_good":{"add":1}}, {"id":3, "is_good":{"add":1}} ]

Due to Tomcat stopping, only one of the docs (id:5) was added to Solr. Now if I try to post this file again for an update, it gives me a multivalue error because id:5 was already updated, and because of that the remaining ids are not being updated in Solr. I have 25 lakh (2.5 million) docs in a JSON file. Please give me some idea. -- View this message in context: http://lucene.472066.n3.nabble.com/Updating-data-tp4038492p4041123.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: tlog file questions
On 2/18/2013 4:57 AM, giovanni.bricc...@banzai.it wrote: I have some questions about tlog files and how they are managed. I'm using DIH to do incremental data loading; once a day I do a full refresh. These are the request parameters: /dataimport?command=full-import&commit=true and /dataimport?command=delta-import&commit=true&optimize=false. I was expecting all the old tlog files to be removed when completing a delta/full import, but I see that these files remain. Actually the older files do get removed. Am I using the wrong parameters? Is there a different parameter to trigger the hard commit? Are there configuration parameters to control the number of tlog files to keep? Unfortunately I have very little space on my disks and I need to double-check space consumption.

Your best option is to turn on autoCommit with openSearcher set to false. I use a maxDocs of 25000 and a maxTime of 300000 (five minutes). Every 25000 docs, Solr does a hard commit, but because openSearcher is false, it does not change the index at all from the perspective of a client. You would need to choose values appropriate for your installation.

<!-- the default high-performance update handler -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>25000</maxDocs>
    <maxTime>300000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <updateLog />
</updateHandler>

The hard commit does one important thing here - it closes the current tlog and starts a new one. Solr does not keep very many tlogs around, but if you do a full-import without any commits, the tlog will contain every single document you have. I actually do my index rebuilds in a build core and swap it to live when the rebuild is fully complete, but I have double-checked the docs available from a client, and they do not change until the full-import is done. Another thing - I would use optimize=false on both the full-import and the delta-import. The only real reason to do an optimize in a modern Solr version is to purge deleted documents. If you are doing a new full-import every day, then you don't have to worry about that, because the new index will not contain any deleted documents. It's true that an optimized index does slightly outperform one with many segments of varying sizes, but generally speaking the huge I/O overhead during the optimize is very detrimental to performance. Thanks, Shawn
SEVERE RecoveryStrategy Recovery failed - trying again... (9)
I am seeing the following error in my Admin console and the core/cloud status is taking forever to load: SEVERE RecoveryStrategy Recovery failed - trying again... (9). What causes this and how can I recover from this mode? Regards, Rohit
Re: SpellCheck - Ignore list of words
The 4.x based spellcheck process just looks in the index and enumerates the terms; there's no special sidecar index. So you'd probably have to create a different field that contains only the words you want to be returned as possibilities. Best, Erick On Mon, Feb 18, 2013 at 5:06 AM, Hemant Verma hemantverm...@gmail.com wrote: Hi All I have a use case where I have a list of words, on which I don't want to perform spellcheck. Like stemming ignores the words listed in protwords.txt file. Any idea, how it can be solved? Thanks Hemant -- View this message in context: http://lucene.472066.n3.nabble.com/SpellCheck-Ignore-list-of-words-tp4041099.html Sent from the Solr - User mailing list archive at Nabble.com.
Errors during index optimization on solrcloud
Hi, I'm running SolrCloud (Solr 4) with 1 core, 8 shards and ZooKeeper. My index is being updated every minute, so I'm running optimization once a day. Every time during the optimization there is an error:

SEVERE: shard update error StdNode: http://host:port/solr/core_name/
SEVERE: shard update error StdNode: http://host:port/solr/core_name/:org.apache.solr.common.SolrException: Server at http://host:port/solr/core_name/ returned non ok status:503, message:Service Unavailable

Any ideas what causes this error and how to avoid it? Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Errors-during-index-optimization-on-solrcloud-tp4041135.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SEVERE RecoveryStrategy Recovery failed - trying again... (9)
We need to see more of your logs to determine why - there should be some exceptions logged. - Mark On Feb 18, 2013, at 9:47 AM, Cool Techi cooltec...@outlook.com wrote: I am seeing the following error in my Admin console and the core/cloud status is taking forever to load: SEVERE RecoveryStrategy Recovery failed - trying again... (9). What causes this and how can I recover from this mode? Regards, Rohit
Re: Custom shard key, shard partitioning
Yeah, I think we are missing some docs on this… I think the info is in here: https://issues.apache.org/jira/browse/SOLR-2592 But it's not so easy to pick out - I'd been considering going through and writing up some wiki doc for that feature (unless I'm somehow missing it), but just been too busy with other stuff.. Concerning CloudSolrServer, there is a JIRA to make it hash and send updates to the right leader, but currently it still doesn't - it just favors leaders in general over non leaders currently. - Mark On Feb 18, 2013, at 7:34 AM, Markus Jelsma markus.jel...@openindex.io wrote: Hi, By defaut SolrCloud partitions records by the hash of the uniqueKey field but we want to do some tests and partition the records by a signed integer field but keep the current uniqueKey unique. I've scanned through several issues concerning distributed index, custom hashing, shard policies etc but i have not found some concise examples or documentation or even blog post on this matter. How do we set up shard partitioning via another than the default uniqueKey field? According to some older resolved issue CloudSolrServer should be cloud aware and send updates to the leader of the correct shards, how does it know this? Must we set up the same partitioning in SolrServer client as well? If so, how? The apidocs do not reveal a lot when i look through them. I probably totally missed an issue or discussion or wiki page. Thanks, Markus
Re: Errors during index optimization on solrcloud
Not sure - any other errors? An optimize once a day is a very heavy operation by the way! Be sure the gains are worth the pain you pay. - Mark On Feb 18, 2013, at 10:04 AM, adm1n evgeni.evg...@gmail.com wrote: Hi, I'm running SolrCloud (Solr4) with 1 core, 8 shards and zookeeper My index is being updated every minute, so I'm running optimization once a day. Every time during the optimization there is an error: SEVERE: shard update error StdNode: http://host:port/solr/core_name/ SEVERE: shard update error StdNode: http://host:port/solr/core_name/:org.apache.solr.common.SolrException: Server at http://host:port/solr/core_name/ returned non ok status:503, message:Service Unavailable Any ideas what is causes this error and how to avoid it? thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Errors-during-index-optimization-on-solrcloud-tp4041135.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: How can i instruct the Solr/ Solr Cell to output the original HTML document which was fed to it.?
Look at HTMLStripCharFilter, which accepts HTML as its source text, preserves all the HTML tags in the stored value, but then strips off the HTML tags for tokenization into terms. So you can search for the actual text terms, but the HTML will still be in the returned field value for highlighting. See: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.HTMLStripCharFilterFactory -- Jack Krupansky -Original Message- From: Divyanand Tiwari Sent: Monday, February 18, 2013 7:28 AM To: solr-user@lucene.apache.org Subject: How can i instruct the Solr/ Solr Cell to output the original HTML document which was fed to it.? Hi everyone, I am new to Solr and have not found a way to get back the original HTML document with hits highlighted in it. What configuration, and where, can I use to instruct Solr Cell/Tika so that it does not strip the tags of the HTML document in the content field? Any support would be greatly appreciated. Awaiting your quick reply. Thank you!!! -- Regards, Divyanand Tiwari
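A field type along these lines keeps the raw HTML in the stored value while indexing only the stripped text; the type name and the tokenizer/filter chain are illustrative choices, not the only possible ones:

```xml
<fieldType name="text_html" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- strips tags before tokenization; the stored value keeps the HTML -->
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```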
Re: SpellCheck - Ignore list of words
1. Create a copy of the field and add the exception list to it. 2. Or, add a second spell checker to your spellcheck search component that is a FileBasedSpellChecker with the exceptions in a simple text file. Then reference both spellcheckers with spellcheck.dictionary, with the FileBasedSpellChecker as the first. -- Jack Krupansky -Original Message- From: Hemant Verma Sent: Monday, February 18, 2013 2:06 AM To: solr-user@lucene.apache.org Subject: SpellCheck - Ignore list of words Hi All I have a use case where I have a list of words, on which I don't want to perform spellcheck. Like stemming ignores the words listed in protwords.txt file. Any idea, how it can be solved? Thanks Hemant -- View this message in context: http://lucene.472066.n3.nabble.com/SpellCheck-Ignore-list-of-words-tp4041099.html Sent from the Solr - User mailing list archive at Nabble.com.
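Jack's second option could look roughly like this in solrconfig.xml - the spellchecker names, the field, and the file name are placeholders; FileBasedSpellChecker reads one word per line from sourceLocation:

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- exceptions first: words listed here are treated as correctly spelled -->
  <lst name="spellchecker">
    <str name="name">exceptions</str>
    <str name="classname">solr.FileBasedSpellChecker</str>
    <str name="sourceLocation">spellcheck-exceptions.txt</str>
    <str name="spellcheckIndexDir">./spellcheckerFile</str>
  </lst>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="field">spell</str>
  </lst>
</searchComponent>
```

Then reference both on the request, e.g. spellcheck.dictionary=exceptions&spellcheck.dictionary=default.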
Re: Errors during index optimization on solrcloud
I think it's best to tweak merge parameters instead and amortize the cost of keeping down the number of segments. Deletes will be naturally expunged as documents come in and segments are merged. For 90% of use cases, this is the best way to go IMO. Even if you just want to get rid of deletes, look into expunge deletes - it merges just what's needed to get rid of deletes, which may not always mean a full optimize down to one segment. My advice on optimize would be to do it when you are not going to get any updates very often or for a long time. Otherwise it's best just to tune merge parameters and avoid optimize altogether. It's usually premature optimization that leads to the overuse of optimize, and it's usually unnecessary and quite costly. - Mark On Feb 18, 2013, at 11:12 AM, adm1n evgeni.evg...@gmail.com wrote: Thanks for your response. No, nothing else. Only those errors. By the way, what is the best practice for the optimization process - should it be done every so often (for example cron-based), or does it depend on the diff between the maxDoc and numDocs counts? -- View this message in context: http://lucene.472066.n3.nabble.com/Errors-during-index-optimization-on-solrcloud-tp4041135p4041157.html Sent from the Solr - User mailing list archive at Nabble.com.
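The expunge-deletes variant Mark mentions can be sent as an ordinary update message rather than a full optimize, e.g. in the XML update format:

```xml
<commit expungeDeletes="true"/>
```

or, equivalently, as commit=true&expungeDeletes=true parameters on an /update request. It merges only the segments that contain deletes instead of rewriting the whole index into one segment.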
Re: Updating data
Use set instead of add. See: http://wiki.apache.org/solr/UpdateJSON#Atomic_Updates -- Jack Krupansky -Original Message- From: anurag.jain Sent: Monday, February 18, 2013 6:09 AM To: solr-user@lucene.apache.org Subject: Re: Updating data Hi, I've got a problem. I have a JSON file: [ {"id":5, "is_good":{"add":1}}, {"id":1, "is_good":{"add":1}}, {"id":2, "is_good":{"add":1}}, {"id":3, "is_good":{"add":1}} ] Due to Tomcat stopping, only one of the docs (id:5) was added to Solr. Now if I try to post this file again for an update, it gives me a multivalue error because id:5 was already updated, and because of that the remaining ids are not being updated in Solr. I have 25 lakh (2.5 million) docs in a JSON file. Please give me some idea. -- View this message in context: http://lucene.472066.n3.nabble.com/Updating-data-tp4038492p4041123.html Sent from the Solr - User mailing list archive at Nabble.com.
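The difference between the two modifiers can be seen by building the payloads side by side - a sketch using the field names from the question above:

```python
import json

def atomic_update(doc_id, field, modifier, value):
    # Atomic-update message: {"id": ..., field: {modifier: value}}
    # "set" overwrites the stored value and is safe to replay;
    # "add" appends, and fails on a single-valued field that already
    # holds a value (the multivalue error seen above).
    return {"id": doc_id, field: {modifier: value}}

payload = json.dumps([
    atomic_update(5, "is_good", "set", 1),
    atomic_update(1, "is_good", "set", 1),
])
```

Because "set" is idempotent, re-posting the whole file after a crash does not error on the documents that already went through.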
Re: Solr search – Tika extracted text from PDF not return highlighting snippet
I am replying to this post because I am also facing a very similar issue. I am indexing the documents stored in a blob field of a MySQL database. I have described the whole setup in the following blog post: http://tuxdna.wordpress.com/2013/02/04/indexing-the-documents-stored-in-a-database-using-apache-solr-and-apache-tika/ Basically, the blob content is fetched from the database and then parsed by Tika and converted into text. All the fields in the database table get indexed properly except the blob field (which was processed by Tika). It doesn't show up in the Solr schema browser; there are no terms against the text field. I tried some permutations and combinations of the fields (in db-data-config.xml and schema.xml) and got it working. I now have two fields, text and text2, where text is indexed + stored, and text2 is neither. However, if I remove text2 from the configuration, I am back to the same problem, i.e. the field doesn't get indexed. I don't understand how the above workaround works. Can anyone give me pointers where I can explore further to understand this behaviour? Is it solvable using copyField? NOTE: I have described the configuration files and setup in the link above. Thanks in advance! :) /tuxdna -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-search-Tika-extracted-text-from-PDF-not-return-highlighting-snippet-tp3999647p4041180.html Sent from the Solr - User mailing list archive at Nabble.com.
RequestHandler init failure
When trying to use SolrEntityProcessor to do a data import from another Solr index (Solr 4.1), I added the following in solrconfig.xml:

<requestHandler name="/data" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

and created a new file data-config.xml with:

<dataConfig>
  <document>
    <entity name="sep" processor="SolrEntityProcessor"
            url="http://wolf:1Xnbdoq@myserver:8995/solr/"
            query="*:*" fl="id,md5_text,title,text"/>
  </document>
</dataConfig>

I got the following errors:

org.apache.solr.common.SolrException: RequestHandler init failure
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:794)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:607)
    at org.apache.solr.core.CoreContainer.createFromZk(CoreContainer.java:949)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1031)
    at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
    at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
Caused by: org.apache.solr.common.SolrException: RequestHandler init failure
    at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:168)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:731)
    ... 13 more
Caused by: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:438)
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:507)
    at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:581)
    at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:154)
    ... 14 more
Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.dataimport.DataImportHandler
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:422)
    ... 17 more
Feb 18, 2013 7:24:43 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.solr.common.SolrException: Unable to create core: collection1
    at org.apache.solr.core.CoreContainer.recordAndThrow(CoreContainer.java:1654)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:1039)
    at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)

I assume that this is because the jar file related to DataImportHandler is not included in the default Solr 4.1 distribution. Where can I find it? Thanks Ming
Re: Conditional Field Search without affecting score.
Thanks Erick, is this what you are pointing me to? http://.../solr/select?q=if(exists(title.3),(title.3:xyz),(title.0:xyz)) I believe I should be able to use boost along with proximity too. -- View this message in context: http://lucene.472066.n3.nabble.com/Conditional-Field-Search-without-affecting-score-tp4040657p4041188.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: RequestHandler init failure
Found it by myself. It's here: http://mirrors.ibiblio.org/maven2/org/apache/solr/solr-dataimporthandler/4.1.0/ Download and move the jar file to the solr-webapp/webapp/WEB-INF/lib directory, and the errors are all gone. Ming

On Mon, Feb 18, 2013 at 11:52 AM, Mingfeng Yang mfy...@wisewindow.com wrote: When trying to use SolrEntityProcessor to do a data import from another Solr index (Solr 4.1), I added the following in solrconfig.xml [...] I assume that it's because the jar file related to DataImportHandler is not included in the default Solr 4.1 distribution. Where can I find it? Thanks Ming
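An alternative to copying the jar into WEB-INF/lib is to point solrconfig.xml at the dist/ directory of the Solr distribution, which already contains the DIH jars. The relative path below assumes the stock example layout; adjust it to wherever the jars actually live:

```xml
<!-- in solrconfig.xml, above the requestHandler definitions -->
<lib dir="../../dist/" regex="solr-dataimporthandler-.*\.jar" />
```

This keeps the webapp untouched, so upgrading Solr doesn't silently drop the jar.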
Re: Reloading config to zookeeper
I hope my question is somewhat relevant to the discussion. I'm relatively new to ZK/SolrCloud, and I have a new environment configured with a ZK ensemble (3 nodes) running with SolrCloud. Things are running, yet I'm puzzled since I can't find the Solr config data on the ZooKeeper nodes. What is the default location? Thanks in advance! /michael -- View this message in context: http://lucene.472066.n3.nabble.com/Reloading-config-to-zookeeper-tp4021901p4041189.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Reloading config to zookeeper
@Marcin - Maybe I misunderstood your process, but I don't think you need to reload the collection on each node if you use the expanded Collections admin API, i.e. the following will propagate the reload across your cluster for you:

http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection

See http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collections_API On Mon, Feb 18, 2013 at 1:13 PM, mshirman mshir...@gmail.com wrote: I hope my question is somewhat relevant to the discussion. I'm relatively new to zk/SolrCloud, and I have new environment configured with an ZK ensemble (3 nodes) running with SolrCloud. Things are running, yet I'm puzzled since I can't find the Solr config data on zookeeper nodes. What is the default location? Thanks in advance! /michael -- View this message in context: http://lucene.472066.n3.nabble.com/Reloading-config-to-zookeeper-tp4021901p4041189.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: Is it possible to manually select a shard leader in a running SolrCloud?
Hey all, I feel having to unload the leader core to force an election is hacky, and as far as I know would still leave which node becomes the Leader to chance, ie I cannot guarantee NodeX becomes Leader 100% in all cases. Also, this imposes additional load temporarily. Is there a way to force the winner of the Election, and if not, is there a known feature-request for this? Cheers, Tim Vaillancourt -Original Message- From: Joseph Dale [mailto:joey.d...@gmail.com] Sent: Sunday, February 03, 2013 7:42 AM To: solr-user@lucene.apache.org Subject: Re: Is it possible to manually select a shard leader in a running SolrCloud? With solrclound all cores are collections. The collections API it just a wrapper to call the core api a million times with one command. to /solr/admin/cores?action=CREATEname=core1collection=core1shard=1 Basically your creating the shard again, after leader props have gone out. Solr will check ZK and find a core meeting that description, then simply get a copy of the index from the leader of that shard. On Feb 3, 2013, at 10:37 AM, Brett Hoerner br...@bretthoerner.com wrote: What is the inverse I'd use to re-create/load a core on another machine but make sure it's also known to SolrCloud/as a shard? On Sat, Feb 2, 2013 at 4:01 PM, Joseph Dale joey.d...@gmail.com wrote: To be more clear lets say bob it the leader of core 1. On bob do a /admin/cores?action=unloadname=core1. This removes the core/shard from bob, giving the other servers a chance to grab leader props. -Joey On Feb 2, 2013, at 11:27 AM, Brett Hoerner br...@bretthoerner.com wrote: Hi, I have a 5 server cluster running 1 collection with 20 shards, replication factor of 2. Earlier this week I had to do a rolling restart across the cluster, this worked great and the cluster stayed up the whole time. The problem is that the last node I restarted is now the leader of 0 shards, and is just holding replicas. 
I've noticed this node has an abnormally high load average, while the other nodes (which have the same number of shards, but more leaders on average) are fine. First, I'm wondering if that load could be related to being a 5x replica and 0x leader? Second, I was wondering if I could somehow flag single shards to re-elect a leader (or force a leader) so that I could more evenly distribute how many leader shards each physical server has running? Thanks.
Japanese mm parameter in Solr3.6.2 generated lots of results with big performance hit
In Solr3.6.1, using the text_ja field generated a huge number of results, which degraded performance significantly. Queries that were taking 15ms have gone up to 400ms. The other issue is that it is not honoring the rows parameter: the output results are not capped by the number of documents requested via rows=100, but are a lot more. Has anyone experienced this issue, and what is the solution to improve performance? Putting the index into RAM and cache did not have a significant impact. Thanks.
SolrCloud configuration in a zookeeper node
I'm relatively new to zk/SolrCloud, and I have a new environment configured with a ZK ensemble (3 nodes) running with SolrCloud. Things are running, yet I'm puzzled since I can't find the Solr config data on the zookeeper nodes. What is the default location? Thank you in advance! /michael
Re: SolrCloud configuration in a zookeeper node
/configs/collectionName You should be able to see this from the Solr admin console as well: Cloud > Tree > configs > collectionName Cheers, Tim On Mon, Feb 18, 2013 at 4:23 PM, mshirman mshir...@gmail.com wrote: I'm relatively new to zk/SolrCloud, and I have a new environment configured with a ZK ensemble (3 nodes) running with SolrCloud. Things are running, yet I'm puzzled since I can't find the Solr config data on the zookeeper nodes. What is the default location? Thank you in advance! /michael
Re: Japanese mm parameter in Solr3.6.2 generated lots of results with big performance hit
Maybe you need to turn on autoGeneratePhraseQueries=true on your field type. And turn on debugQuery=true on your query to see what actually gets generated. Show us a typical query - the rows parameter should always work, unless it's written wrong. -- Jack Krupansky -----Original Message----- From: kirpakaro Sent: Monday, February 18, 2013 2:39 PM To: solr-user@lucene.apache.org Subject: Japanese mm parameter in Solr3.6.2 generated lots of results with big performance hit In Solr3.6.1, using the text_ja field generated a huge number of results, which degraded performance significantly. Queries that were taking 15ms have gone up to 400ms. The other issue is that it is not honoring the rows parameter: the output results are not capped by the number of documents requested via rows=100, but are a lot more. Has anyone experienced this issue, and what is the solution to improve performance? Putting the index into RAM and cache did not have a significant impact. Thanks.
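For reference, autoGeneratePhraseQueries is an attribute on the fieldType element in schema.xml. A hedged sketch of where it goes; the analyzer chain shown is illustrative, not taken from the poster's actual schema:

```xml
<!-- Illustrative only: a text_ja fieldType with autoGeneratePhraseQueries
     enabled, so multi-token terms from the Japanese tokenizer are queried
     as phrases instead of a loose OR of tokens. -->
<fieldType name="text_ja" class="solr.TextField"
           autoGeneratePhraseQueries="true" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
    <filter class="solr.CJKWidthFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

A full reindex is not required for this change, since it only affects query parsing, but the core does need a reload.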
RE: SEVERE RecoveryStrategy Recovery failed - trying again... (9)
There is no error I can see in the logs; my shards are divided over three machines. The cloud runs fine when I don't bring up one of the nodes; the moment I start that particular node, the cloud stops responding. Feb 19, 2013 5:22:22 AM org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener newSearcher INFO: Loading spell index for spellchecker: default Feb 19, 2013 5:22:22 AM org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener newSearcher INFO: Loading spell index for spellchecker: wordbreak Feb 19, 2013 5:22:22 AM org.apache.solr.core.SolrCore registerSearcher INFO: [cmn] Registered new searcher Searcher@3b47788d main{StandardDirectoryReader(segments_1dvf:1488121 _2acm(4.1):C13967428/87404 _62w6(4.1):C259989/31792 _8ehw(4.1):C405062/57136 _8um4(4.1):C228434/26526 _a0i1(4.1):C171825/43653 _bgu3(4.1):C315311/30246 _ao6h(4.1):C176468/44702 _b7uu(4.1):C97823/27124 _bjzb(4.1):C77280/8476 _bra3(4.1):C142681/21340 _bzpo(4.1):C198058/23506 _c0jh(4.1):C18201/8171 _c307(4.1):C37984/5305 _c2e0(4.1):C22300/9788 _c1o6(4.1):C23523/8630 _c3hl(4.1):C12034/2871 _c3kw(4.1):C5821/971 _c3l6(4.1):C1106 _c3lh(4.1):C707/1 _c3lu(4.1):C509/2 _c3mf(4.1):C482/1 _c3m5(4.1):C374/2 _c3mc(4.1):C164/2 _c3mh(4.1):C64/3 _c3mi(4.1):C49 _c3mj(4.1):C25 _c3mk(4.1):C12)} Feb 19, 2013 5:22:22 AM org.apache.solr.cloud.ZkController publish INFO: publishing core=cmn state=down Feb 19, 2013 5:22:22 AM org.apache.solr.cloud.ZkController publish INFO: numShards not found on descriptor - reading it from system property Feb 19, 2013 5:22:22 AM org.apache.solr.core.CoreContainer registerCore INFO: registering core: cmn Feb 19, 2013 5:22:22 AM org.apache.solr.cloud.ZkController register INFO: Register replica - core:cmn address:http://10.0.0.205:8080/solr collection:cmn shard:shard2 Feb 19, 2013 5:22:22 AM org.apache.solr.client.solrj.impl.HttpClientUtil createClient INFO: Creating new http client,
config:maxConnections=1&maxConnectionsPerHost=20&connTimeout=3&socketTimeout=3&retry=false Regards, Ayush Subject: Re: SEVERE RecoveryStrategy Recovery failed - trying again... (9) From: markrmil...@gmail.com Date: Mon, 18 Feb 2013 10:21:53 -0500 To: solr-user@lucene.apache.org We need to see more of your logs to determine why - there should be some exceptions logged. - Mark On Feb 18, 2013, at 9:47 AM, Cool Techi cooltec...@outlook.com wrote: I am seeing the following error in my Admin console and the core/cloud status is taking forever to load. SEVERE RecoveryStrategy Recovery failed - trying again... (9) What causes this and how can I recover from this mode? Regards, Rohit
TIMESTAMP
Hi all, I have a JSON file in which there is a field named last_login, and the value of that field is a timestamp. I want to store that value as a timestamp and do not want to change the field type. Now the question is how to store the timestamp so that when I need output in datetime format it gives datetime format, and whenever I need output in timestamp format it gives timestamp format. Please reply; it is very urgent --- I have to do this task today.
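Solr date fields expect ISO-8601 values (e.g. 1970-01-01T00:00:00Z), so one hedged option that leaves the field type untouched is to convert the epoch value outside Solr at index or display time. A sketch using GNU date; the field value 0 is purely illustrative:

```shell
# Hedged sketch: converting a Unix-epoch last_login value to the ISO-8601
# form that Solr date fields use. Requires GNU date (the -d "@..." syntax).
LAST_LOGIN=0   # epoch seconds, e.g. as read from the JSON file

# epoch -> datetime format (UTC)
AS_DATE=$(date -u -d "@${LAST_LOGIN}" +%Y-%m-%dT%H:%M:%SZ)

echo "$AS_DATE"
```

The reverse direction (datetime back to epoch) can be done the same way with date's +%s output format, so the stored value can be presented in either form on demand.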
Re: How can i instruct the Solr/ Solr Cell to output the original HTML document which was fed to it.?
Thank you for replying, sir! I have two queries related to this: 1) In this case, which request handler do I have to use? 'ExtractingRequestHandler' by default strips the HTML content, and the default handler 'UpdateRequestHandler' does not accept HTML content. 2) How can I 'Extract' and 'Index' the META information in the HTML document separately? Awaiting your reply. Thank you!
Re: How can i instruct the Solr/ Solr Cell to output the original HTML document which was fed to it.?
Use the standard update handler and pass the entire HTML page as literal text in a Solr XML document for the field that has the HTML strip filter, but be sure to escape the HTML syntax (angle brackets, ampersands, etc.). You'll have to process the meta information yourself. -- Jack Krupansky -----Original Message----- From: Divyanand Tiwari Sent: Monday, February 18, 2013 10:52 PM To: solr-user@lucene.apache.org Subject: Re: How can i instruct the Solr/ Solr Cell to output the original HTML document which was fed to it.? Thank you for replying, sir! I have two queries related to this: 1) In this case, which request handler do I have to use? 'ExtractingRequestHandler' by default strips the HTML content, and the default handler 'UpdateRequestHandler' does not accept HTML content. 2) How can I 'Extract' and 'Index' the META information in the HTML document separately? Awaiting your reply. Thank you!
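The escaping step Jack describes can be sketched like this; the helper function and sample value are hypothetical, and the escaped output would then be embedded as a field value inside the XML update message you POST to /update:

```shell
# Hedged sketch: entity-encode raw HTML before embedding it as a field
# value in a Solr XML update document. Only a minimal escape is shown.
escape_html() {
  # Order matters: '&' must be escaped before '<' and '>', or the
  # entities produced by the later substitutions would be double-escaped.
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

RAW='<p>Hello & welcome</p>'
ESCAPED=$(printf '%s' "$RAW" | escape_html)
echo "$ESCAPED"
```

The resulting string is safe to place inside a field element of an add/doc XML message; Solr un-escapes it on ingest, and the HTML strip filter then sees the original markup.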
Re: JMX generation number is wrong
Should I log a defect in Jira for this? Ari Maniatis On 14/02/13 6:50pm, Aristedes Maniatis wrote: I'm trying to monitor the state of a master-slave Solr4.1 cluster. I can easily get the generation number of the slaves using JMX like this: solr/{corename}/org.apache.solr.handler.ReplicationHandler/generation That works fine. However on the master, this number is always 1, which makes it rather hard to check whether the slaves are lagging behind. Is this a defect in the JMX properties in Solr, and should I file a Jira? Ari -- -- Aristedes Maniatis GPG fingerprint CBFB 84B4 738D 4E87 5E5C 5EFA EF6A 7D2E 3E49 102A