Re: Query parsing - difference between Analysis and parsedquery_toString output
q: manufacture_t:The Hershey Company^100 OR title_t:The Hershey Company^1000 Firstly, make sure that manufacture_t and title_t are of the text_general type, and let's use this approach instead of yours: q=The Hershey Company&q.op=AND&qf=manufacture_t title_t&defType=edismax -- View this message in context: http://lucene.472066.n3.nabble.com/Query-parsing-difference-between-Analysis-and-parsedquery-toString-output-tp4164851p4164884.html Sent from the Solr - User mailing list archive at Nabble.com.
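To make the parameter separation concrete, here is a small sketch (not from the thread) of building that edismax request URL in plain Java; the host and collection name are placeholders, and note that every parameter must be joined with '&':

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EdismaxUrl {
    // Build the edismax request discussed above; each parameter is joined with '&'.
    static String build(String base, String q, String qf) {
        return base + "/select"
                + "?q=" + URLEncoder.encode(q, StandardCharsets.UTF_8)
                + "&q.op=AND"
                + "&qf=" + URLEncoder.encode(qf, StandardCharsets.UTF_8)
                + "&defType=edismax";
    }

    public static void main(String[] args) {
        // "localhost:8983" and "collection1" are placeholder values.
        System.out.println(build("http://localhost:8983/solr/collection1",
                "The Hershey Company", "manufacture_t title_t"));
    }
}
```

With qf listing both fields and defType=edismax, the whole phrase is searched in each field rather than only the first term being bound to the field.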
Re: FW: Complex boost statement
Please try this: if(and(exists(query({!v=BUS_CITY:regina})),exists(BUS_IS_NEARBY)),20,1) -- View this message in context: http://lucene.472066.n3.nabble.com/Complex-boost-statement-tp4164572p4164885.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Custom JSON
check https://issues.apache.org/jira/browse/SOLR-6633 On Fri, Oct 17, 2014 at 5:35 PM, Alexandre Rafalovitch arafa...@gmail.com wrote: I wonder how hard it would be to write an URP to just copy JSON from the request into a store-only field? Regards, Alex On 17/10/2014 1:21 am, Noble Paul noble.p...@gmail.com wrote: The original JSON is not stored; the fields are extracted and the data is thrown away. On Fri, Oct 17, 2014 at 1:18 AM, Scott Dawson sc.e.daw...@gmail.com wrote: Noble, Thanks. You're right. I had some things incorrectly configured, but now I can put structured JSON into Solr using the out-of-the-box solrconfig.xml. One additional question: Is there any way to query Solr and receive the original structured JSON document in response? Or does the flattening process that happens during indexing obliterate the original structure with no way to reconstruct it? Thanks again, Scott On Thu, Oct 16, 2014 at 2:10 PM, Noble Paul noble.p...@gmail.com wrote: The end point /update/json/docs is enabled implicitly in Solr irrespective of the solrconfig.xml. In schemaless mode the fields are created automatically by Solr. If you have all the fields created in your schema.xml it will work. If you need an id field, please use a copy field to create one. --Noble On Thu, Oct 16, 2014 at 8:42 PM, Scott Dawson sc.e.daw...@gmail.com wrote: Hello, I'm trying to use the new custom JSON feature described in https://issues.apache.org/jira/browse/SOLR-6304. I'm running Solr 4.10.1. It seems that the new feature, or more specifically, the /update/json/docs endpoint, is not enabled out-of-the-box except in the schemaless example. Is there some dependence of the feature on schemaless mode? I've tried pulling the endpoint definition and related pieces of the example-schemaless solrconfig.xml and adding those to the standard solrconfig.xml in the main example, but I've run into a cascade of issues.
Right now I'm getting a "This IndexSchema is not mutable" exception when I try to post to the /update/json/docs endpoint. My real question is: what's the easiest way to get this feature up and running quickly, and is this documented somewhere? I'm trying to do a quick proof-of-concept to verify that we can move from our current flat JSON ingestion to a more natural use of structured JSON. Thanks, Scott Dawson -- Noble Paul
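For anyone hitting the same wall: the "This IndexSchema is not mutable" error usually means the update chain is trying to add unknown fields while the schema is still the classic file-based one. A sketch of the schemaless-example pieces involved, reproduced from memory of the 4.10 example configs, so verify the exact names against your own version before copying:

```xml
<!-- solrconfig.xml: switch to the managed, mutable schema -->
<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">true</bool>
  <str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>

<!-- Route /update/json/docs through the field-guessing update chain -->
<initParams path="/update/json/docs">
  <lst name="defaults">
    <str name="update.chain">add-unknown-fields-to-the-schema</str>
  </lst>
</initParams>
```

Alternatively, keep the classic schema and pre-declare every field the flattened JSON will produce, as Noble suggests above.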
Re: Query parsing - difference between Analysis and parsedquery_toString output [SOLVED]
Thanks guys for the quick reply. Adding ( ) to the query values resolved the issue! Tanya -- View this message in context: http://lucene.472066.n3.nabble.com/Query-parsing-difference-between-Analysis-and-parsedquery-toString-output-tp4164851p4164912.html Sent from the Solr - User mailing list archive at Nabble.com.
unstable results on refresh
Hello, I have a procedure that sends small data changes during the day to a SolrCloud cluster, version 4.8. The cluster is made of three nodes and three shards; each node contains two shards. The procedure has been running for days; I don't know when, but at some point one of the cores went out of sync, and so repeating the same query began to show small differences. The core graph was not useful; everything seemed active. I have solved the problem by reindexing everything, because the collection is quite small, but is there a way to fix this problem? Suppose I can figure out which core returns different results: is there a command to force that core to refetch the whole index from its master? Thanks Giovanni
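The thread does not answer this, but for reference: SolrCloud's CoreAdmin API has a REQUESTRECOVERY action that asks a core to re-sync from its shard leader (and, when it is too far behind, to replicate the full index). A sketch of building that call, with the node URL and core name as placeholder values:

```java
public class RecoveryUrl {
    // Build the CoreAdmin REQUESTRECOVERY call for a given node and core.
    // A core that is too far out of sync recovers by full replication
    // from the shard leader.
    static String requestRecovery(String nodeBaseUrl, String coreName) {
        return nodeBaseUrl + "/admin/cores?action=REQUESTRECOVERY&core=" + coreName;
    }

    public static void main(String[] args) {
        // "collection1_shard2_replica1" is a placeholder core name.
        System.out.println(requestRecovery("http://localhost:8983/solr",
                "collection1_shard2_replica1"));
    }
}
```

Issue the resulting URL against the node that hosts the out-of-sync core.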
Re: How to properly use Levenstein distance with ~ in Java
Ok, thank you for your response. But why can't I use '~'? On 20 October 2014 07:40, Ramzi Alqrainy ramzi.alqra...@gmail.com wrote: You can use the Levenshtein distance algorithm inside Solr without writing code by specifying the source of terms in solrconfig.xml:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="field">content</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

This example shows the results of a simple query that defines a query using the spellcheck.q parameter. The query also includes a spellcheck.build=true parameter, which needs to be called only once in order to build the index; spellcheck.build should not be specified for each request.

http://localhost:8983/solr/spellCheckCompRH?q=*:*&spellcheck.q=hell%20ultrashar&spellcheck=true&spellcheck.build=true

<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="hell">
      <int name="numFound">1</int>
      <int name="startOffset">0</int>
      <int name="endOffset">4</int>
      <arr name="suggestion"><str>dell</str></arr>
    </lst>
    <lst name="ultrashar">
      <int name="numFound">1</int>
      <int name="startOffset">5</int>
      <int name="endOffset">14</int>
      <arr name="suggestion"><str>ultrasharp</str></arr>
    </lst>
  </lst>
</lst>

Once the suggestions are collected, they are ranked by the configured distance measure (Levenshtein distance by default) and then by aggregate frequency. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-properly-use-Levenstein-distance-with-in-Java-tp4164793p4164883.html Sent from the Solr - User mailing list archive at Nabble.com. -- Pozdrawiam / Best regards Aleksander Sadecki
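As for '~': in Lucene query syntax, term~N is fuzzy matching driven by the same edit-distance idea. For intuition, here is a plain dynamic-programming Levenshtein distance in Java; this is illustrative only, not Solr's actual (automaton-based) implementation:

```java
public class Levenshtein {
    // Classic edit distance: insertions, deletions, and substitutions each cost 1.
    static int distance(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int sub = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                                   d[i - 1][j - 1] + sub);
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        // The suggestions above: "hell" -> "dell" and "ultrashar" -> "ultrasharp"
        System.out.println(distance("hell", "dell"));            // 1 (one substitution)
        System.out.println(distance("ultrashar", "ultrasharp")); // 1 (one insertion)
    }
}
```

Both suggested corrections in the example response are exactly one edit away, which is why they rank first.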
CoreAdminRequest in SolrCloud
Hi, I'm trying to get all shard statistics in a cloud configuration. I've used CoreAdminRequest, but the problem is I get statistics only for the shards (or cores) on one node (I have 2 nodes):

String zkHostString = "10.0.1.4:2181";
CloudSolrServer solrServer = new CloudSolrServer(zkHostString);
CoreAdminRequest request = new CoreAdminRequest();
request.setAction(CoreAdminAction.STATUS);
CoreAdminResponse cores = request.process(solrServer);
for (int i = 0; i < cores.getCoreStatus().size(); i++) {
  NamedList<Object> ll = cores.getCoreStatus().getVal(i);
  System.out.println(ll.toString());
}

Any idea? Regards, Nabil.
suggestion for new custom atomic update
Hi all, This is my use case: I have a stored field, field_a, which is atomically updated (let's say by inc). field_a is stored but not indexed due to the large number of distinct values it can have. I need to index field_b (I need facet and stats on it), which is not in the document but whose value is based on a calculation of the recent (e.g. summed) value of field_a. There is no way to do this nowadays. So I thought of a new method: custom atomic update. There will be a new interface in Solr:

public interface CustomAtomicUpdater {
  public void update(SolrInputDocument oldDoc, String fieldName, Object fieldVal);
}

There will be a new attribute for fields in schema.xml called customAtomicUpdateClass (and all supporting code, of course). The value is a class which is an implementation of CustomAtomicUpdater. In our example it will be defined for field_a. In the method getUpdatedDocument in DistributedUpdateProcessor.java, we will add handling of the custom case:

} else if ("custom".equals(key)) {
  updateField = true;
  SchemaField sf = schema.getField(sif.getName());
  String customAtomicUpdaterClassName = sf.getCustomAtomicUpdaterClass();
  if (customAtomicUpdaterClassName == null) {
    throw new SolrException(ErrorCode.BAD_REQUEST,
        "There is no customAtomicUpdaterClass defined for " + sif + ".");
  }
  CustomAtomicUpdater updater = schema.getResourceLoader()
      .newInstance(customAtomicUpdaterClassName, CustomAtomicUpdater.class);
  if (updater == null) {
    throw new SolrException(ErrorCode.BAD_REQUEST,
        "Was unable to create instance of " + customAtomicUpdaterClassName + ".");
  }
  updater.update(oldDoc, sif.getName(), fieldVal);
}

In my implementation I will sum field_a (old value + new value) and update field_b according to my logic. Example of use:

<add>
  <doc>
    <field name="field_a" update="custom">128</field>
  </doc>
</add>

What do you say about my suggestion? Thanks.
RE: CopyField from text to multi value
Thanks Walter! -Original Message- From: Walter Underwood [mailto:wun...@wunderwood.org] Sent: Monday, October 20, 2014 12:09 AM To: solr-user@lucene.apache.org Subject: Re: CopyField from text to multi value I think that info is available with termvectors. That should give a list of the query terms that matched each document, if I understand it correctly. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ On Oct 19, 2014, at 7:37 AM, Tomer Levi tomer.l...@nice.com wrote: Thanks again for the help. The use case is this: in my UI I would like to indicate which words led to every document in the response. It actually seems like a simple highlight case, but instead of getting the highlight result as a marked-up string ("this is a <br>long</br> string <br>with</br> text"), our UI team wants a list of words, i.e. [long, with]. So, I assumed that I can just tokenize the original text, copy the tokens into a new multi-value field, and ask Solr to highlight the multi-value field. That is my use case. Thanks again Tomer -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Sunday, October 19, 2014 5:18 PM To: solr-user@lucene.apache.org Subject: Re: CopyField from text to multi value This really feels like an XY problem, which I think Jack is alluding to. bq: I understand that the analysis chain is applied after the raw input was copied. I need to store the output of the analysis chain as a new multi-value field This statement is really confusing. You can't have the output of the analysis chain used as input to a copyField; it just doesn't work that way, which is what you seem to want to do with the second sentence. Then you bring shingles into the picture... So let's take Jack's suggestion and back up: tell us what the use-case you're trying to support is, rather than leaving us to guess what problem you're trying to solve.
Best, Erick On Sun, Oct 19, 2014 at 9:43 AM, Jack Krupansky j...@basetechnology.com wrote: As always, you need to first examine how you intend to query the fields before you dive into data modeling. In this case, is there any particular reason that you need the individual terms as separate values, as opposed to simply using a tokenized text field? -- Jack Krupansky From: Tomer Levi Sent: Sunday, October 19, 2014 9:07 AM To: solr-user@lucene.apache.org Subject: CopyField from text to multi value Hi, I would like to copy a textual field's content into a multi-value field. For example, let's say my field "text" contains: "I am a solr user". I would like to have a multi-value copyField with the following content: [I, am, a, solr, user] Thanks, Tomer Levi Software Engineer Big Data Group Product Technology Unit (T) +972 (9) 775-2693 tomer.l...@nice.com www.nice.com
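If the UI really just needs the words as a list, the simplest client-side version of Jack's point is to tokenize the text yourself rather than model it as a multi-value field. A sketch (naive whitespace tokenization only, unlike a real Solr analysis chain):

```java
import java.util.Arrays;
import java.util.List;

public class Tokenize {
    // Naive whitespace tokenization; a text_general analysis chain would
    // also lowercase and strip punctuation, which this does not.
    static List<String> tokens(String text) {
        return Arrays.asList(text.split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(tokens("I am a solr user"));
        // [I, am, a, solr, user]
    }
}
```

The UI could then intersect this list with the highlighted terms to get [long, with] without changing the schema.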
Solr replicas - stop replication and start again
Hello, I have a problem which I can't figure out how to solve. For a little scenario: I've set up a cluster with two nodes, one shard, and two replicas, with both nodes connected to an external ZooKeeper. Great, but now I want to stop replication for an amount of time (or more precisely, to stop it, and then to start it again). I need it because one of my replicas will be on a central server and the other one will be on a client server, and a lot of documents will be inserted into the replica on the central server; but before these documents are replicated, I want to process them, and after that to start the replication again. Is there a way of doing this? I thought of something like suspending the clients' connections directly from ZooKeeper; if you know something about that, it could probably be a solution. If you can see another configuration which could resolve this problem, feel free to share it with us. Thanks in advance, Andrei -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-replicas-stop-replication-and-start-again-tp4164931.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Custom JSON
Awesome. How long did it take? Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 On 20 October 2014 03:59, Noble Paul noble.p...@gmail.com wrote: check https://issues.apache.org/jira/browse/SOLR-6633 [...]
Re: CoreAdminRequest in SolrCloud
Hello Nabil, isn't that what should be expected? Cores are local to nodes, so you only get the core status from the node you're asking. Cluster status refers to the entire SolrCloud cluster, so you will get the status over all collections/nodes/shards[=cores]. Check the Core Admin REST interface for comparison. Cheers, --Jürgen On 20.10.2014 11:41, nabil Kouici wrote: [...] -- Mit freundlichen Grüßen/Kind regards/Cordialement vôtre/Atentamente/С уважением *i.A. Jürgen Wagner* Head of Competence Center Intelligence Senior Cloud Consultant Devoteam GmbH, Industriestr. 3, 70565 Stuttgart, Germany Phone: +49 6151 868-8725, Fax: +49 711 13353-53, Mobile: +49 171 864 1543 E-Mail: juergen.wag...@devoteam.com, URL: www.devoteam.de Managing Board: Jürgen Hatzipantelis (CEO) Address of Record: 64331 Weiterstadt, Germany; Commercial Register: Amtsgericht Darmstadt HRB 6450; Tax Number: DE 172 993 071
Re: CoreAdminRequest in SolrCloud
Querying all shards for a collection should look familiar; it's as though SolrCloud didn't even come into play: http://localhost:8983/solr/collection1/select?q=*:* If, on the other hand, you wanted to search just one shard, you can specify that shard, as in: http://localhost:8983/solr/collection1/select?q=*:*&shards=localhost:7574/solr If you want to search a group of shards, you can specify them together: http://localhost:8983/solr/collection1/select?q=*:*&shards=localhost:7574/solr,localhost:8983/solr Or you can specify a list of servers to choose from for load-balancing purposes by using the pipe symbol (|): http://localhost:8983/solr/collection1/select?q=*:*&shards=localhost:7574/solr|localhost:7500/solr nabil Kouici wrote: [...] -- View this message in context: http://lucene.472066.n3.nabble.com/CoreAdminRequest-in-SolrCloud-tp4164918p4164941.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr replicas - stop replication and start again
You can delete a replica from a shard by using this command: /admin/collections?action=DELETEREPLICA&collection=collection&shard=shard&replica=replica This deletes a replica from a given collection and shard. If the corresponding core is up and running, the core is unloaded and the entry is removed from the clusterstate. If the node/core is down, the entry is taken off the clusterstate, and if the core comes up later it is automatically unregistered. http://lucene.472066.n3.nabble.com/file/n4164943/Screen_Shot_2014-10-20_at_3.png Example: http://10.0.1.6:8983/solr/admin/collections?action=DELETEREPLICA&collection=test2&shard=shard2&replica=core_node3 -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-replicas-stop-replication-and-start-again-tp4164931p4164943.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr replicas - stop replication and start again
You can also delete a shard. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-replicas-stop-replication-and-start-again-tp4164931p4164945.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr replicas - stop replication and start again
Thank you for your answer. But how do you 'revive' the replica after that? I tried with add replica, but it creates another one... (a solr_node3_replica). An odd solution, but if you solved the problem by reviving the old replica, it could be a viable solution. Thank you, Andrei -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-replicas-stop-replication-and-start-again-tp4164931p4164946.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: CoreAdminRequest in SolrCloud
Hi Jürgen, As you can see, I'm not using a direct connection to a node; it's a CloudSolrServer. Do you have an example of how to get the cluster status from SolrJ? Regards, Nabil. Le Lundi 20 octobre 2014 13h44, Jürgen Wagner (DVT) juergen.wag...@devoteam.com a écrit : [...]
Re: CoreAdminRequest in SolrCloud
Hi Nabil, you can get /clusterstate.json from Zookeeper. Check CloudSolrServer.getZkStateReader(): http://lucene.apache.org/solr/4_10_1/solr-solrj/org/apache/solr/client/solrj/impl/CloudSolrServer.html Best regards, --Jürgen On 20.10.2014 15:16, nabil Kouici wrote: [...]
Re: CoreAdminRequest in SolrCloud
Thank you Jürgen for this link. However, this will not give the number of documents or the shard size. Regards, Nabil. Le Lundi 20 octobre 2014 15h23, Jürgen Wagner (DVT) juergen.wag...@devoteam.com a écrit : [...]
Re: Solr replicas - stop replication and start again
Another idea: I turned off the replica into which I want to insert data and then process it, and I started it again, BUT without -DzkHost or -DzkRun, so the newly started Solr instance ran standalone. I put my data into it, stopped it again, and started it with -DzkHost pointing to my ZooKeeper. But the problem is that ZooKeeper doesn't know about the changes from the new replica, and voila: no replication, no nothing. Any idea? Thank you, Andrei -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-replicas-stop-replication-and-start-again-tp4164931p4164954.html Sent from the Solr - User mailing list archive at Nabble.com.
Word Break Spell Checker Implementation algorithm
Hi, Could you please point me to a link where I can learn about the theory behind the implementation of the word break spell checker? We know that Solr's DirectSolrSpellChecker component uses the Levenshtein distance algorithm; what is the algorithm used behind the word break spell checker component? How does it detect the space that is needed if it doesn't use shingles? Thanks - David
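Not an answer on the theory, but for context: the component in question is solr.WordBreakSolrSpellChecker, and rather than indexing shingles it tries candidate splits and joins of the query terms and checks each candidate against the index. A configuration sketch (parameter names recalled from the 4.x reference guide, so verify before use):

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">content</str>
    <!-- try joining adjacent terms and splitting single terms -->
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <!-- maximum number of splits/joins attempted per term -->
    <int name="maxChanges">3</int>
  </lst>
</searchComponent>
```

Because candidates are verified against actual indexed terms, no extra shingle field is needed; maxChanges bounds the combinatorial search.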
Re: Recovering from Out of Mem
On 10/19/2014 11:32 PM, Ramzi Alqrainy wrote: You can create a script to ping on Solr every 10 sec. if no response, then restart it (Kill process id and run Solr again). This is the fastest and easiest way to do that on windows. I wouldn't do this myself. Any temporary problem that results in a long query time might result in a true outage while Solr restarts. If OOME is a problem, then you can deal with that by providing a program for Java to call when OOME occurs. Sending notification when ping times get excessive is a good idea, but I wouldn't make it automatically restart, unless you've got a threshold for that action so it only happens when the ping time is *REALLY* high. The real fix for OOME is to make the heap larger or to reduce the heap requirements by changing how Solr is configured or used. http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap Writing a program that has deterministic behavior in an out of memory condition is very difficult. The Lucene devs *have* done this hard work in the lower levels of IndexWriter and the specific Directory implementations, so that OOME doesn't cause *index corruption*. In general, once OOME happens, program operation (and in some cases the status of the most recently indexed documents) is completely undetermined. We can be sure that the data which has already been written to disk will be correct, but nothing beyond that. That's why it is considered better to crash the program and restart it for OOME. Thanks, Shawn
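The "program for Java to call when OOME occurs" that Shawn mentions is wired up with a JVM flag. A sketch, where the script path, mail command, and address are all placeholders:

```shell
# Start Solr with an OOME hook; %p expands to the JVM's pid.
java -Xmx4g -XX:OnOutOfMemoryError="/path/to/oom_notify.sh %p" -jar start.jar

# /path/to/oom_notify.sh - alert a human (and optionally restart) on OOME
#!/bin/sh
echo "Solr JVM $1 hit OutOfMemoryError on $(hostname)" | mail -s "Solr OOME" ops@example.com
```

The hook fires once, when the JVM first throws OutOfMemoryError, which makes it a more reliable trigger than polling ping times.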
Re: Solr replicas - stop replication and start again
Andrei, I'm wondering if you've considered using Classic replication for this use case. It seems better suited for it. Michael Della Bitta Senior Software Engineer o: +1 646 532 3062 appinions inc. “The Science of Influence Marketing” 18 East 41st Street New York, NY 10017 t: @appinions https://twitter.com/Appinions | g+: plus.google.com/appinions https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts w: appinions.com http://www.appinions.com/ On Mon, Oct 20, 2014 at 9:53 AM, andreic9203 andreic9...@gmail.com wrote: Another idea, I turned off the replica in which I want to insert data and then to process them, I started again, BUT, without -DzkHost, or -DzkRun, so the new started solr instance. I put my data into it, I stopped again, and I started with -DzkHost that points to my zoo keeper. But the problem is that the ZooKeeper doesn't know about the changes from the new replica, and voila, no replication, no nothing. Any idea? Thank you, Andrei -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-replicas-stop-replication-and-start-again-tp4164931p4164954.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr replicas - stop replication and start again
Hello Michael, Do you mean Solr's classic replication, i.e. the master-slave setup? Thank you, Andrei -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-replicas-stop-replication-and-start-again-tp4164931p4164965.html Sent from the Solr - User mailing list archive at Nabble.com.
SolrCloud use of min_rf through SolrJ
Hi all, I'm trying to make use of the min_rf (minimum replication factor) feature described in https://issues.apache.org/jira/browse/SOLR-5468. According to the ticket, all that is needed is to pass the min_rf param into the update request and get back the rf param from the response, or even easier, make use of CloudSolrServer.getMinAchievedReplicationFactor(). I'm using SolrJ's CloudSolrServer, but I couldn't find any way to pass min_rf using the available add() methods when sending a document to Solr, so I resorted to the following:

UpdateRequest req = new UpdateRequest();
req.setParam(UpdateRequest.MIN_REPFACT, "1");
req.add(doc);
UpdateResponse response = req.process(cloudSolrServer);
int rf = cloudSolrServer.getMinAchievedReplicationFactor("collection_name", response.getResponse());

Still, the returned rf value is always -1. How can I utilize min_rf through SolrJ? I'm using Solr 4.10.0 with a collection that has 2 replicas (one leader, one replica). Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-use-of-min-rf-through-SolrJ-tp4164966.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Recovering from Out of Mem
"That's why it is considered better to crash the program and restart it for OOME." In the end aren't you also saying the same thing, or did I misunderstand something? We don't get this issue on the master server (indexing). Our real concern is a slave where it happens sometimes (rarely), so it is not an obvious heap config issue; but when it happens, our failover doesn't even work (moving to another slave) as there is no error, so I just want a good way to know if there is an OOM and shift to a failover, or just have that server restarted. On Mon, Oct 20, 2014 at 7:25 PM, Shawn Heisey apa...@elyograg.org wrote: [...] -- Regards, Salman Akram
Re: Recovering from Out of Mem
i think we can agree that the basic requirement of *knowing* when the OOM occurs is the minimal requirement, triggering an alert (email, etc) would be the first thing to get into your script once you know when the OOM conditions are occuring you can start to get to the root cause or remedy (adjust heap sizes, or adjust the input side that is triggering the OOM). the correct remedy will obviously require some more deeper investigation into the actual solr usage at the point of OOM and the gc logs (you have these being generated too i hope). just bumping the Xmx because you hit an OOM during an abusive query is no guarantee of a fix and is likely going to cost you OS cache memory space which you want to leave available for holding the actual index data. the real fix would be cleaning up the query (if that is possible) fundamentally, its a preference thing, but i'm personally not a fan of auto restarts as the problem that triggered the original OOM (say an expensive poorly constructed query) may just come back and you get into an oscillating situation of restart after restart. i generally want a human involved when error conditions which should be outliers (like OOM) are happening From: Salman Akram salman.ak...@northbaysolutions.net Sent: Monday, October 20, 2014 08:47 To: Solr Group Subject: Re: Recovering from Out of Mem That's why it is considered better to crash the program and restart it for OOME. In the end aren't you also saying the same thing or I misunderstood something? We don't get this issue on master server (indexing). Our real concern is slave where sometimes (rare) so not an obvious heap config issue but when it happens our failover doesn't even work (moving to another slave) as there is no error so I just want a good way to know if there is an OOM and shift to a failover or just have that server restarted. 
On Mon, Oct 20, 2014 at 7:25 PM, Shawn Heisey apa...@elyograg.org wrote: On 10/19/2014 11:32 PM, Ramzi Alqrainy wrote: You can create a script to ping on Solr every 10 sec. if no response, then restart it (Kill process id and run Solr again). This is the fastest and easiest way to do that on windows. I wouldn't do this myself. Any temporary problem that results in a long query time might result in a true outage while Solr restarts. If OOME is a problem, then you can deal with that by providing a program for Java to call when OOME occurs. Sending notification when ping times get excessive is a good idea, but I wouldn't make it automatically restart, unless you've got a threshold for that action so it only happens when the ping time is *REALLY* high. The real fix for OOME is to make the heap larger or to reduce the heap requirements by changing how Solr is configured or used. http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap Writing a program that has deterministic behavior in an out of memory condition is very difficult. The Lucene devs *have* done this hard work in the lower levels of IndexWriter and the specific Directory implementations, so that OOME doesn't cause *index corruption*. In general, once OOME happens, program operation (and in some cases the status of the most recently indexed documents) is completely undetermined. We can be sure that the data which has already been written to disk will be correct, but nothing beyond that. That's why it is considered better to crash the program and restart it for OOME. Thanks, Shawn -- Regards, Salman Akram
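The "program for Java to call when OOME occurs" that Shawn mentions is wired up with a JVM startup flag. A minimal sketch of the launch command (the heap size and script path are assumptions, not from this thread):

```
java -Xmx4g \
     -XX:OnOutOfMemoryError="/opt/scripts/oom-alert.sh %p" \
     -jar start.jar
```

`%p` expands to the Solr process id. The script can send the alert and, if you really do want auto-restart, perform the kill/restart itself with whatever threshold logic you choose, keeping a human in the loop as suggested above.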
Verify if solr reload core is successful or not
Hi, How do I verify if a Solr core reload is successful or not? I use Solr 4.6. To reload the core I send the below request: http://hostname:7090/solr/admin/cores?action=RELOAD&core=core0&wt=json Also, is the above request synchronous (I mean, will the reload happen before the response is received), or does it happen after we get the response and we have to poll to see if the reload succeeded? Thanks, Prathik
Re: Solr replicas - stop replication and start again
You can add a new replicas but I think you can't revive the old one. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-replicas-stop-replication-and-start-again-tp4164931p4164988.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr replicas - stop replication and start again
Exactly. So what good is deleting my replica if I can't then put it back? This replica is supposed to contain data, older data but still updated, which I need; so deleting my replica, creating another one, copying all the documents and then adding the new documents and processing them seems to be work in vain... Still, thank you for your idea. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-replicas-stop-replication-and-start-again-tp4164931p4164991.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Verify if solr reload core is successful or not
When you hit a request in the browser http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0 you will receive this response:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1316</int>
  </lst>
</response>

That means that everything is fine. -- View this message in context: http://lucene.472066.n3.nabble.com/Verify-if-solr-reload-core-is-successful-or-not-tp4164981p4164996.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Word Break Spell Checker Implementation algorithm
WordBreakSolrSpellChecker offers suggestions by combining adjacent query terms and/or breaking terms into multiple words. It is a SpellCheckComponent enhancement, leveraging Lucene's WordBreakSpellChecker. It can detect spelling errors resulting from misplaced whitespace without the use of shingle-based dictionaries and provides collation support for word-break errors, including cases where the user has a mix of single-word spelling errors and word-break errors in the same query. It also provides shard support. Here is how it might be configured in solrconfig.xml: http://lucene.472066.n3.nabble.com/file/n4164997/Screen_Shot_2014-10-20_at_9.png Some of the parameters will be familiar from the discussion of the other spell checkers, such as name, classname, and field. New for this spell checker is combineWords, which defines whether words should be combined in a dictionary search (default is true); breakWords, which defines if words should be broken during a dictionary search (default is true); and maxChanges, an integer which defines how many times the spell checker should check collation possibilities against the index (default is 10). The spellchecker can be configured with a traditional checker (ie: DirectSolrSpellChecker). The results are combined and collations can contain a mix of corrections from both spellcheckers. Add It to a Request Handler Queries will be sent to a RequestHandler. If every request should generate a suggestion, then you would add the following to the requestHandler that you are using: http://lucene.472066.n3.nabble.com/file/n4164997/2.png For more details, you can read the below tutorial https://cwiki.apache.org/confluence/display/solr/Spell+Checking -- View this message in context: http://lucene.472066.n3.nabble.com/Word-Break-Spell-Checker-Implementation-algorithm-tp4164955p4164997.html Sent from the Solr - User mailing list archive at Nabble.com.
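The two screenshot links above no longer render in the archive. As a hedged reconstruction, a WordBreakSolrSpellChecker setup in solrconfig.xml typically looks like the following, based on the parameters the message describes (the field and handler names are assumptions):

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- A conventional checker can sit alongside the word-break checker -->
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <str name="field">name</str>
  </lst>
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">name</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">10</int>
  </lst>
</searchComponent>

<!-- Hook it into a request handler so every request can get suggestions -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
    <str name="spellcheck.count">10</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

Listing two `spellcheck.dictionary` values is what lets collations mix corrections from both checkers, as described above.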
RE: Run a query via SolrJ using Request URL
I found this very ancient bit of code, not sure it even works anymore, but you can give it a try. The problem isn't so much sending the request (if you've got the original query with params, you can call Solr through a plain old HTTP request); it's parsing the response that's the tedious bit without SolrJ. So the following code unmarshals the raw (binary?) response using the JavaBinCodec, and then makes a QueryResponse out of the NamedList.

DefaultHttpClient httpclient = new DefaultHttpClient();
HttpGet get = new HttpGet(queryUrl.toURI());
HttpResponse response1 = httpclient.execute(get);
StatusLine statusLine = response1.getStatusLine();
HttpEntity entity = response1.getEntity();
InputStream content = entity.getContent();
// NamedList<Object> namedList = (NamedList<Object>) new JavaBinCodec().unmarshal(queryUrl.openConnection().getInputStream());
NamedList<Object> namedList = (NamedList<Object>) new JavaBinCodec().unmarshal(content);
content.close();
QueryResponse resp = new QueryResponse(namedList, null);

-Original Message- From: Dickinson, Cliff [mailto:cdickin...@ncaa.org] Sent: Wednesday, October 15, 2014 9:37 AM To: solr-user@lucene.apache.org Subject: Run a query via SolrJ using Request URL I'm fairly new to Solr and have run into an issue that I cannot figure out how to solve. I'm trying to implement a Save Search requirement similar to bookmarking, to allow the same search to be run in the future. Once the original search is executed from within a Spring app, I use ClientUtils.toQueryString() to store a copy of the actual request URL that was sent to the Solr server in a database table (if save is requested). That part works great, but now I can't find anything in the SolrJ API that will allow me to run that query from the original URL rather than having to piece it together again via a SolrQuery object. Is there anything out there to run from this URL, or do I need to manually split it out and build the SolrQuery? Thanks in advance for the advice!
Cliff
Re: unstable results on refresh
Can you please provide us the exception when the shard goes out of sync ? Please monitor the logs. -- View this message in context: http://lucene.472066.n3.nabble.com/unstable-results-on-refresh-tp4164913p4165002.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr replicas - stop replication and start again
Yes, that's what I'm suggesting. It seems a perfect fit for a single shard collection with an offsite remote that you don't always want to write to. Michael Della Bitta Senior Software Engineer o: +1 646 532 3062 appinions inc. “The Science of Influence Marketing” 18 East 41st Street New York, NY 10017 t: @appinions https://twitter.com/Appinions | g+: plus.google.com/appinions https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts w: appinions.com http://www.appinions.com/ On Mon, Oct 20, 2014 at 10:41 AM, andreic9203 andreic9...@gmail.com wrote: Hello Michael, Do you want to say, the replication from solr, that with master-slave? Thank you, Andrei -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-replicas-stop-replication-and-start-again-tp4164931p4164965.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: unstable results on refresh
What are the differences on. The document count or things like facets? This could be important. Also, I think there was a similar thread on the mailing list a week or two ago, might be worth looking for it. Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 On 20 October 2014 04:49, Giovanni Bricconi giovanni.bricc...@banzai.it wrote: Hello I have a procedure that sends small data changes during the day to a solrcloud cluster, version 4.8 The cluster is made of three nodes, and three shards, each node contains two shards The procedure has been running for days; I don't know when but at some point one of the cores has gone out of synch and so repeating the same query has began to show small differences. The core graph was not useful, everything seemed active. I have solved the problem reindexing all, because the collection is quite small, but is there a way to fix this problem? Suppose I can figure out which core returns different results, is there a command to force that core to refetch the whole index from its master? Thanks Giovanni
javascript form data save to XML in server side
hello list, The functionality I would like to add to the existing /browse request handler is using a user interface (e.g., a web form) to collect the user's input. My approach is to add a JavaScript form into the velocity template; below is the code I added to the velocity template (for example):

<form id="myForm" action="ProcessData.php">
  First name: <input type="text" name="fname" value=""><br>
  Last name: <input type="text" name="lname" value=""><br>
</form>

And I am using this ProcessData.php to process the user input to generate an XML file on the server. My questions are: 1) how to make Solr run this ProcessData.php? It seems Solr does not support PHP? 2) Where should this ProcessData.php be placed in the Solr directory? I am a newbie in web programming. I tried very hard to catch up with it. Thank you. -- View this message in context: http://lucene.472066.n3.nabble.com/javascript-form-data-save-to-XML-in-server-side-tp4165025.html Sent from the Solr - User mailing list archive at Nabble.com.
Is there a problem with -Infinity as boost?
I am considering using a boost as follows: boost=log(qty) Where qty is the quantity in stock of a given product, i.e. qty could be 0, 1, 2, 3, … etc. The problem I see is that log(0) is -Infinity. Would this be a problem for Solr? For me it is not a problem because log(0) < log(1) < log(2) etc. I'd be grateful for any thoughts. One alternative is to use max, e.g. boost=max(log(qty), -1) But still this would cause Solr to compute the -Infinity and then discard it. So can I use an expression for boost that would result in -Infinity? Thank you O. O. -- View this message in context: http://lucene.472066.n3.nabble.com/Is-there-a-problem-with-Infinity-as-boost-tp4165036.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Is there a problem with -Infinity as boost?
The usual fix for this is log(1+qty). If you might have negative values, you can use log(max(1,qty)). wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ On Oct 20, 2014, at 3:04 PM, O. Olson olson_...@yahoo.it wrote: I am considering using a boost as follows: boost=log(qty) Where qty is the quantity in stock of a given product i.e. qty could be 0, 1, 2, 3, … etc. The problem I see is that log(0) is -Infinity. Would this be a problem for Solr? For me it is not a problem because log(0) < log(1) < log(2) etc. I'd be grateful for any thoughts. One alternative is to use max e.g. boost=max(log(qty), -1) But still this would cause Solr to compute the -Infinity and then discard it. So can I use an expression for boost that would result in -Infinity? Thank you O. O.
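A quick numeric check of the two options discussed here. Plain JavaScript's Math.log is the natural log while Solr's log() function is base 10, but the ordering argument is the same in any base:

```javascript
// Demo: why boost=log(qty) is risky and log(sum(qty,1)) is the usual fix.
const qtys = [0, 1, 2, 10];

const naive = qtys.map(q => Math.log(q));      // boost=log(qty)
const fixed = qtys.map(q => Math.log(1 + q));  // boost=log(sum(qty,1))

console.log(naive[0]); // -Infinity: a zero-stock product poisons the score
console.log(fixed[0]); // 0: a zero-stock product simply adds no boost
```

Both variants preserve the ordering log(0) < log(1) < log(2), but only the shifted form keeps every value finite, so Solr never has to multiply a score by -Infinity.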
Shared Directory for two Solr Clouds(Writer and Reader)
Hi Folks, Here are some my ideas to use shared file system with two separate Solr Clouds(Writer Solr Cloud and Reader Solr Cloud). I want to get your valuable feedbacks For prototype, I setup two separate Solr Clouds(one for Writer and the other for Reader). Basically big picture of my prototype is like below. 1. Reader and Writer Solr clouds share the same directory 2. Writer SolrCloud sends the openSearcher commands to Reader Solr Cloud inside postCommit eventHandler. That is, when new data are added to Writer Solr Cloud, writer Solr Cloud sends own openSearcher command to Reader Solr Cloud. 3. Reader opens searcher only when it receives openSearcher commands from Writer SolrCloud 4. Writer has own deletionPolicy to keep old commit points which might be used by running queries on Reader Solr Cloud when new searcher is opened on reader SolrCloud. 5. Reader has no update/no commits. Everything on reader Solr Cloud are read-only. It also creates searcher from directory not from indexer(nrtMode=false). That is, In Writer Solr Cloud, I added postCommit eventListner. Inside the postCommit eventListner, it sends own openSearcher command to reader Solr Cloud's own handler. Then reader Solr Cloud will create openSearcher directly without commit and return the writer's request. With this approach, Writer and Reader can use the same commit points in shared file system in synchronous way. When a Reader SolrCloud starts, it doesn't create openSearcher. Instead. Writer Solr Cloud listens the zookeeper of Reader Solr Cloud. Any change in the reader SolrCloud, writer sends openSearcher command to reader Solr Cloud. Does it make sense? Or am I missing some important stuff? any feedback would be very helpful to me. Thanks, Jae
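The postCommit hook described in step 2 would be registered in the writer's solrconfig.xml along these lines. This is only a sketch: the listener class below is hypothetical (you would implement it yourself to call the reader cloud's openSearcher handler), and the reader URL is an assumption:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- com.example.NotifyReaderListener is a hypothetical custom class,
       not a stock Solr listener; it would POST an open-searcher request
       to the reader cloud after each hard commit on the writer. -->
  <listener event="postCommit" class="com.example.NotifyReaderListener">
    <str name="readerUrl">http://reader-host:8983/solr</str>
  </listener>
</updateHandler>
```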
Re: javascript form data save to XML in server side
One possibility is to send to one of Solr's /update handlers from your page. It won't be straightforward unless you were POSTing a file to /update/extract, but it would be possible for a little bit of JavaScript onSubmit to format the data amenable to Solr. I've not done this myself but it'd be a cute simple example. Solr's admin UI does have a way to add docs that you might want to use as one possible example, though it's lower level than form fields like you're desiring. Erik On Oct 20, 2014, at 16:54, LongY zhangyulin8...@hotmail.com wrote: hello list, The functionality I would like to add the the existing /browse request handler is using a user interface (e.g.,webform) to collect the user's input. My approach is add a javascript form into the velocity template, below is the code I added to the velocity template(for example): form id=myForm, action=ProcessData.php First name: input type=text name=fname value=br Last name: input type=text name=lname value=br /form And I am using this ProcessData.php to process the user input to generate a XML in the server. My question is 1) how to make solr to run this ProcessData.php? It seems solr does not support php? 2) Where is this ProcessData.php going to be placed in the solr directory? I am a newbie in web programming. I tried very hard to catch up with it. Thank you. -- View this message in context: http://lucene.472066.n3.nabble.com/javascript-form-data-save-to-XML-in-server-side-tp4165025.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: javascript form data save to XML in server side
There is a couple of issues with what you are saying here: 1) You should not be exposing Solr directly to the internet. They would be able to delete all your records and do other damage. /browse endpoint is there to show-off what Solr can do, not to be used in production 2) Solr is Java, it does not run PHP. You can write a custom handler in Java, but you probably don't want to start your programming career from that (too hard). Or you can run Javascript to post back to Solr, but that still requires Solr to be publicly accessible - see 1) 3) If you don't care what language you are using and just want a nice UI that's backed by Solr, you could look at Spring Boot which has Spring Data integration with Solr. 4) Or you might be looking at a content management system that integrates Solr, such as Typo3 (which is PHP) So, you need to step back and think a bit harder about what you are trying to do. Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 On 20 October 2014 16:54, LongY zhangyulin8...@hotmail.com wrote: hello list, The functionality I would like to add the the existing /browse request handler is using a user interface (e.g.,webform) to collect the user's input. My approach is add a javascript form into the velocity template, below is the code I added to the velocity template(for example): form id=myForm, action=ProcessData.php First name: input type=text name=fname value=br Last name: input type=text name=lname value=br /form And I am using this ProcessData.php to process the user input to generate a XML in the server. My question is 1) how to make solr to run this ProcessData.php? It seems solr does not support php? 2) Where is this ProcessData.php going to be placed in the solr directory? I am a newbie in web programming. I tried very hard to catch up with it. Thank you. 
-- View this message in context: http://lucene.472066.n3.nabble.com/javascript-form-data-save-to-XML-in-server-side-tp4165025.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Shared Directory for two Solr Clouds(Writer and Reader)
Hi Jae, Sounds a bit complicated and messy to me, but maybe I'm missing something. What are you trying to accomplish with this approach? Which problems do you have that are making you look for non-straight forward setup? Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Mon, Oct 20, 2014 at 7:35 PM, Jaeyoung Yoon jaeyoungy...@gmail.com wrote: Hi Folks, Here are some my ideas to use shared file system with two separate Solr Clouds(Writer Solr Cloud and Reader Solr Cloud). I want to get your valuable feedbacks For prototype, I setup two separate Solr Clouds(one for Writer and the other for Reader). Basically big picture of my prototype is like below. 1. Reader and Writer Solr clouds share the same directory 2. Writer SolrCloud sends the openSearcher commands to Reader Solr Cloud inside postCommit eventHandler. That is, when new data are added to Writer Solr Cloud, writer Solr Cloud sends own openSearcher command to Reader Solr Cloud. 3. Reader opens searcher only when it receives openSearcher commands from Writer SolrCloud 4. Writer has own deletionPolicy to keep old commit points which might be used by running queries on Reader Solr Cloud when new searcher is opened on reader SolrCloud. 5. Reader has no update/no commits. Everything on reader Solr Cloud are read-only. It also creates searcher from directory not from indexer(nrtMode=false). That is, In Writer Solr Cloud, I added postCommit eventListner. Inside the postCommit eventListner, it sends own openSearcher command to reader Solr Cloud's own handler. Then reader Solr Cloud will create openSearcher directly without commit and return the writer's request. With this approach, Writer and Reader can use the same commit points in shared file system in synchronous way. When a Reader SolrCloud starts, it doesn't create openSearcher. Instead. Writer Solr Cloud listens the zookeeper of Reader Solr Cloud. 
Any change in the reader SolrCloud, writer sends openSearcher command to reader Solr Cloud. Does it make sense? Or am I missing some important stuff? any feedback would be very helpful to me. Thanks, Jae
Re: Shared Directory for two Solr Clouds(Writer and Reader)
I guess I'm not quite sure what the point is. So can you back up a bit and explain what problem this is trying to solve? Because all it really appears to be doing that's not already done with stock Solr is saving some disk space, and perhaps your reader SolrCloud is having some more cycles to devote to serving queries rather than indexing. So I'm curious why 1 standard SolrCloud with selective hard and soft commits doesn't satisfy the need and 2 If 1 is not reasonable, why older-style master/slave replication doesn't work. Unless there's a compelling use-case for this, it seems like there's a lot of complexity here for questionable value. Please note I'm not saying this is a bad idea. It would just be good to understand what problem it's trying to solve. I'm reluctant to introduce complexity without discussing the use-case. Perhaps the existing code could provide a good enough solution. Best, Erick On Mon, Oct 20, 2014 at 7:35 PM, Jaeyoung Yoon jaeyoungy...@gmail.com wrote: Hi Folks, Here are some my ideas to use shared file system with two separate Solr Clouds(Writer Solr Cloud and Reader Solr Cloud). I want to get your valuable feedbacks For prototype, I setup two separate Solr Clouds(one for Writer and the other for Reader). Basically big picture of my prototype is like below. 1. Reader and Writer Solr clouds share the same directory 2. Writer SolrCloud sends the openSearcher commands to Reader Solr Cloud inside postCommit eventHandler. That is, when new data are added to Writer Solr Cloud, writer Solr Cloud sends own openSearcher command to Reader Solr Cloud. 3. Reader opens searcher only when it receives openSearcher commands from Writer SolrCloud 4. Writer has own deletionPolicy to keep old commit points which might be used by running queries on Reader Solr Cloud when new searcher is opened on reader SolrCloud. 5. Reader has no update/no commits. Everything on reader Solr Cloud are read-only. 
It also creates searcher from directory not from indexer(nrtMode=false). That is, In Writer Solr Cloud, I added postCommit eventListner. Inside the postCommit eventListner, it sends own openSearcher command to reader Solr Cloud's own handler. Then reader Solr Cloud will create openSearcher directly without commit and return the writer's request. With this approach, Writer and Reader can use the same commit points in shared file system in synchronous way. When a Reader SolrCloud starts, it doesn't create openSearcher. Instead. Writer Solr Cloud listens the zookeeper of Reader Solr Cloud. Any change in the reader SolrCloud, writer sends openSearcher command to reader Solr Cloud. Does it make sense? Or am I missing some important stuff? any feedback would be very helpful to me. Thanks, Jae
Re: javascript form data save to XML in server side
I guess the admin UI for adding docs you mentioned is Data Import Handler. If I understand your reply correctly, the idea is to post the javascript form data from the webpage to /update/extract handler. Thank you for shedding some light. -- View this message in context: http://lucene.472066.n3.nabble.com/javascript-form-data-save-to-XML-in-server-side-tp4165025p4165059.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: javascript form data save to XML in server side
In the most recent Solr, there is a Documents page, next after the DataImportHandler page. That's got several different ways to add a document. Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 On 20 October 2014 22:08, LongY zhangyulin8...@hotmail.com wrote: I guess the admin UI for adding docs you mentioned is Data Import Handler. If I understand your reply correctly, the idea is to post the javascript form data from the webpage to /update/extract handler. Thank you for shedding some light. -- View this message in context: http://lucene.472066.n3.nabble.com/javascript-form-data-save-to-XML-in-server-side-tp4165025p4165059.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: javascript form data save to XML in server side
The solr users are trustworthy and it's only for internal use. The purpose of this form is to allow user to directly input data to be further indexed by solr. I am interested in this sentence from your reply which is Or you can run Javascript to post back to Solr. Please bear with me if I ask very simple question on the web programming. From my understanding, javascript is client-side programming language, how is it possible to post the form data back to Solr using javascript. Thank you. -- View this message in context: http://lucene.472066.n3.nabble.com/javascript-form-data-save-to-XML-in-server-side-tp4165025p4165061.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: javascript form data save to XML in server side
Most of the javascript frameworks (AngularJS, etc) allow you to post information back to the server. If you use gmail or yahoo mail or anything else, it's javascript that lets you send a message. So, if you completely trust your users, you can just have Javascript and Solr and nothing else. Though I would then make sure that Solr has all the fields as stored. Otherwise, if you ever need to reindex, you will not be able to retrieve index-only fields. Usually that's not an issue as Solr is NOT the primary storage (the database is), but you seem to want to make Solr the primary storage, so you have additional issues to keep in mind. Regards, Alex. Personal: http://www.outerthoughts.com/ and @arafalov Solr resources and newsletter: http://www.solr-start.com/ and @solrstart Solr popularizers community: https://www.linkedin.com/groups?gid=6713853 On 20 October 2014 22:22, LongY zhangyulin8...@hotmail.com wrote: The solr users are trustworthy and it's only for internal use. The purpose of this form is to allow users to directly input data to be further indexed by solr. I am interested in this sentence from your reply: Or you can run Javascript to post back to Solr. Please bear with me if I ask a very simple question on web programming. From my understanding, javascript is a client-side programming language; how is it possible to post the form data back to Solr using javascript? Thank you. -- View this message in context: http://lucene.472066.n3.nabble.com/javascript-form-data-save-to-XML-in-server-side-tp4165025p4165061.html Sent from the Solr - User mailing list archive at Nabble.com.
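A minimal sketch of the "Javascript posting back to Solr" idea, for the trusted-internal-users case. The collection name, the dynamic-field suffixes, and the id scheme are all assumptions, not from this thread:

```javascript
// Sketch: turn form values into a Solr JSON document.
// If Solr is the only copy of the data, every field should be stored="true".
function buildSolrDoc(fields) {
  return {
    id: `${fields.fname}-${fields.lname}-${Date.now()}`, // assumed id scheme
    fname_s: fields.fname,  // *_s assumes the example schema's dynamic fields
    lname_s: fields.lname,
  };
}

// Hedged: POST a one-document JSON array to the stock /update handler;
// "collection1" is an assumed core name.
async function postToSolr(doc) {
  const res = await fetch('/solr/collection1/update?commit=true', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify([doc]),
  });
  if (!res.ok) throw new Error(`Solr update failed: ${res.status}`);
}

// In an onSubmit handler:
//   postToSolr(buildSolrDoc({ fname: form.fname.value, lname: form.lname.value }));
```

buildSolrDoc is pure and easy to test; commit=true on every request is fine for low volume, but for real load you would batch documents and rely on autoCommit instead.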
Re: Verify if solr reload core is successful or not
What would be the response if the core reload failed due to incorrect configuration? Thanks, Prathik On Mon, Oct 20, 2014 at 11:24 PM, Ramzi Alqrainy ramzi.alqra...@gmail.com wrote: When you hit a request in the browser http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0 you will receive this response:

<?xml version="1.0" encoding="UTF-8"?>
<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1316</int>
  </lst>
</response>

That means that everything is fine. -- View this message in context: http://lucene.472066.n3.nabble.com/Verify-if-solr-reload-core-is-successful-or-not-tp4164981p4164996.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: javascript form data save to XML in server side
thank you very much. Alex. You reply is very informative and I really appreciate it. I hope I would be able to help others in this forum like you are in the future. -- View this message in context: http://lucene.472066.n3.nabble.com/javascript-form-data-save-to-XML-in-server-side-tp4165025p4165066.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Verify if solr reload core is successful or not
The response would be http://lucene.472066.n3.nabble.com/file/n4165076/Screen_Shot_2014-10-21_at_7.png -- View this message in context: http://lucene.472066.n3.nabble.com/Verify-if-solr-reload-core-is-successful-or-not-tp4164981p4165076.html Sent from the Solr - User mailing list archive at Nabble.com.
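The screenshot above is no longer available in the archive. In general, a failed RELOAD comes back with a non-zero status in the responseHeader plus an error section (and an HTTP 500), while success is status 0 as shown earlier in the thread. A small hedged helper for the wt=json form of the response:

```javascript
// Sketch: decide whether a CoreAdmin RELOAD (wt=json) succeeded.
// Success looks like {responseHeader: {status: 0, QTime: ...}};
// a failure carries a non-zero status and an "error" section.
function reloadSucceeded(response) {
  return Boolean(
    response &&
    response.responseHeader &&
    response.responseHeader.status === 0 &&
    !response.error
  );
}
```

Note this only tells you the reload request was accepted and completed; it does not by itself prove the new searcher is serving queries.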
Re: Shared Directory for two Solr Clouds(Writer and Reader)
In my case, injest rate is very high(above 300K docs/sec) and data are kept inserted. So CPU is already bottleneck because of indexing. older-style master/slave replication with http or scp takes long to copy big files from master/slave. That's why I setup two separate Solr Clouds. One for indexing and the other for query. Thanks, Jae On Mon, Oct 20, 2014 at 6:22 PM, Erick Erickson erickerick...@gmail.com wrote: I guess I'm not quite sure what the point is. So can you back up a bit and explain what problem this is trying to solve? Because all it really appears to be doing that's not already done with stock Solr is saving some disk space, and perhaps your reader SolrCloud is having some more cycles to devote to serving queries rather than indexing. So I'm curious why 1 standard SolrCloud with selective hard and soft commits doesn't satisfy the need and 2 If 1 is not reasonable, why older-style master/slave replication doesn't work. Unless there's a compelling use-case for this, it seems like there's a lot of complexity here for questionable value. Please note I'm not saying this is a bad idea. It would just be good to understand what problem it's trying to solve. I'm reluctant to introduce complexity without discussing the use-case. Perhaps the existing code could provide a good enough solution. Best, Erick On Mon, Oct 20, 2014 at 7:35 PM, Jaeyoung Yoon jaeyoungy...@gmail.com wrote: Hi Folks, Here are some my ideas to use shared file system with two separate Solr Clouds(Writer Solr Cloud and Reader Solr Cloud). I want to get your valuable feedbacks For prototype, I setup two separate Solr Clouds(one for Writer and the other for Reader). Basically big picture of my prototype is like below. 1. Reader and Writer Solr clouds share the same directory 2. Writer SolrCloud sends the openSearcher commands to Reader Solr Cloud inside postCommit eventHandler. 
That is, when new data are added to Writer Solr Cloud, writer Solr Cloud sends own openSearcher command to Reader Solr Cloud. 3. Reader opens searcher only when it receives openSearcher commands from Writer SolrCloud 4. Writer has own deletionPolicy to keep old commit points which might be used by running queries on Reader Solr Cloud when new searcher is opened on reader SolrCloud. 5. Reader has no update/no commits. Everything on reader Solr Cloud are read-only. It also creates searcher from directory not from indexer(nrtMode=false). That is, In Writer Solr Cloud, I added postCommit eventListner. Inside the postCommit eventListner, it sends own openSearcher command to reader Solr Cloud's own handler. Then reader Solr Cloud will create openSearcher directly without commit and return the writer's request. With this approach, Writer and Reader can use the same commit points in shared file system in synchronous way. When a Reader SolrCloud starts, it doesn't create openSearcher. Instead. Writer Solr Cloud listens the zookeeper of Reader Solr Cloud. Any change in the reader SolrCloud, writer sends openSearcher command to reader Solr Cloud. Does it make sense? Or am I missing some important stuff? any feedback would be very helpful to me. Thanks, Jae
Re: How to properly use Levenstein distance with ~ in Java
Because ~ is proximity matching. Lucene supports finding words that are within a specific distance of each other. To search for foo and bar within 4 words of each other: "foo bar"~4 Note that for proximity searches, exact matches are proximity zero, and word transpositions (bar foo) are proximity 1. A query such as "foo bar"~1000 is an interesting alternative to foo AND bar. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-properly-use-Levenstein-distance-with-in-Java-tp4164793p4165079.html Sent from the Solr - User mailing list archive at Nabble.com.