RE: SolrCloud never fully recovers after slow disks

2013-11-11 Thread Henrik Ossipoff Hansen
The joy was short-lived.

Tonight our environment was “down/slow” a bit longer than usual. It looks like 
two of our nodes never recovered, even though clusterstate says everything is 
active. All nodes are throwing this in the log (the nodes they have trouble 
reaching are the ones that are affected); the error occurs for several cores:

ERROR - 2013-11-11 09:16:42.735; org.apache.solr.common.SolrException; Error 
while trying to recover. 
core=products_se_shard1_replica2:org.apache.solr.client.solrj.SolrServerException:
 Timeout occured while waiting response from server at: 
http://solr04.cd-et.com:8080/solr
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:431)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at 
org.apache.solr.cloud.RecoveryStrategy.sendPrepRecoveryCmd(RecoveryStrategy.java:198)
at 
org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:342)
at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:219)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:150)
at java.net.SocketInputStream.read(SocketInputStream.java:121)
at 
org.apache.http.impl.io.AbstractSessionInputBuffer.fillBuffer(AbstractSessionInputBuffer.java:166)
at 
org.apache.http.impl.io.SocketInputBuffer.fillBuffer(SocketInputBuffer.java:90)
at 
org.apache.http.impl.io.AbstractSessionInputBuffer.readLine(AbstractSessionInputBuffer.java:281)
at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:92)
at 
org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:62)
at 
org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:254)
at 
org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:289)
at 
org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:252)
at 
org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:191)
at 
org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:300)
at 
org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:127)
at 
org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:717)
at 
org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:522)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:906)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:805)
at 
org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:784)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:365)
... 4 more

ERROR - 2013-11-11 09:16:42.736; org.apache.solr.cloud.RecoveryStrategy; 
Recovery failed - trying again... (30) core=products_se_shard1_replica2
--
Henrik Ossipoff Hansen
Developer, Entertainment Trading


On 10. nov. 2013 at 21.07.32, Henrik Ossipoff Hansen 
(h...@entertainment-trading.com) wrote:

Solr version is 4.5.0.

I have done some tweaking. Doubling my ZooKeeper timeout values in zoo.cfg and 
the ZooKeeper timeout in solr.xml seemed to somewhat minimize the problem, but 
it still occurred. I next stopped all larger batch indexing in the period 
where the issues happened, which also seemed to help somewhat. The next thing 
weirds me out a bit: I switched from Tomcat 7 to the Jetty that ships with 
Solr, and that actually seems to have fixed the last issues (together with 
stopping a few smaller updates - very few).

During the slow period in the night, I get something like this:

03:11:49 ERROR ZkController There was a problem finding the leader in 
zk:org.apache.solr.common.SolrException: Could not get leader props
03:06:47 ERROR Overseer Could not create Overseer node
03:06:47 WARN LeaderElector
03:06:47 WARN ZkStateReader ZooKeeper watch triggered, but Solr cannot talk to 
ZK
03:07:41 WARN RecoveryStrategy Stopping recovery for 
zkNodeName=solr04.cd-et.com:8080_solr_auto_suggest_shard1_replica2core=auto_suggest_shard1_replica2

After this, the cluster state seems to be fine, and I'm not being spammed with 
errors in the log files.

Bottom line is that the issues seem fixed for now, but I still find it weird 
that Solr was not able to fully recover.

// Henrik Ossipoff

-Original Message-
From: Mark Miller [mailto:markrmil...@gmail.com]
Sent: 10. november 2013 19:27
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud never fully recovers after slow disks

Re: SolrCloud never fully recovers after slow disks

2013-11-11 Thread Yago Riveiro
Hi,   

I sometimes get this exception too; the recovery goes into a loop, and I can 
only finish it by restarting the replica that has the stuck core.

In my case I have SSDs, but the replicas are 40 or 50 GB. If I have 3 replicas 
in recovery mode and they are replicating from the same node, I get this error.

My indexing rate is high too (~500 docs/s).

--  
Yago Riveiro
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Monday, November 11, 2013 at 8:27 AM, Henrik Ossipoff Hansen wrote:

 The joy was short-lived.
 [rest of quoted message snipped - quoted in full above]

Re: Solr timeout after reboot

2013-11-11 Thread michael.boom
Thank you, Peter!
Last weekend I was up until 4am trying to understand why Solr was starting
so, so slow, when I had given it enough memory to fit the entire index.
And then I remembered your trick used on the m3.xlarge machines, tried it
and it worked like a charm!
Thank you again!



-
Thanks,
Michael
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-timeout-after-reboot-tp4096408p4100254.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Merging shards and replicating changes in SolrCloud

2013-11-11 Thread michael.boom
Thanks for the comments, Shalin. I ended up doing just that: reindexing from
the ground up.



-
Thanks,
Michael
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Merging-shards-and-replicating-changes-in-SolrCloud-tp407p4100255.html
Sent from the Solr - User mailing list archive at Nabble.com.


spellcheck solr 4.3.1

2013-11-11 Thread Daniel Borup
Hey

I am running Solr 4.3.1 and am implementing spellcheck using 
solr.DirectSolrSpellChecker. Everything seems to be working fine, but I have 
one issue.

If I search for
http://localhost:8765/solr/MainIndex/spell?q=kim%20AND%20larsen

the result is some hits, and the spell component returns the following structure:

<lst name="spellcheck">
  <lst name="suggestions">
    <bool name="correctlySpelled">true</bool>
  </lst>
</lst>
I would have liked suggestions to be returned if any were found.

If I do a search for
http://localhost:8765/solr/MainIndex/spell?q=kim%20AND%20larsenn

with larsen spelled wrong (larsenn), the spell component returns the following:

<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="larsenn">
      <int name="numFound">1</int>
      <int name="startOffset">8</int>
      <int name="endOffset">15</int>
      <int name="origFreq">0</int>
      <arr name="suggestion">
        <lst>
          <str name="word">larsen</str>
          <int name="freq">12</int>
        </lst>
      </arr>
    </lst>
    <bool name="correctlySpelled">false</bool>
    <lst name="collation">
      <str name="collationQuery">kim AND larsen</str>
      <int name="hits">12</int>
      <lst name="misspellingsAndCorrections">
        <str name="kim">kim</str>
        <str name="larsenn">larsen</str>
      </lst>
    </lst>
  </lst>
</lst>

From my point of view this is correct, but if I do the same search as above as 
an OR search (http://localhost:8765/solr/MainIndex/spell?q=kim%20OR%20larsenn),
the spell component returns some results and:

<lst name="spellcheck">
  <lst name="suggestions">
    <bool name="correctlySpelled">true</bool>
  </lst>
</lst>

larsenn is now spelled correctly according to Solr; I cannot understand this 
behavior. Is there a setting to adjust the spell component so it always returns 
suggestions? Or a way to make suggestions work for an OR search with one 
misspelled word?






Med venlig hilsen / Best regards

Daniel Borup
Tel: (+45) 28 87 69 18
E-mail: d...@alpha-solutions.dk

Alpha Solutions A/S
Sølvgade 10, 1.sal, DK-1307 Copenhagen K
Tel: (+45) 70 20 65 38
Web: www.alpha-solutions.dk


** This message including any attachments may contain confidential and/or 
privileged information
intended only for the person or entity to which it is addressed. If you are not 
the intended recipient
you should delete this message. Any printing, copying, distribution or other 
use of this message is strictly prohibited.
If you have received this message in error, please notify the sender 
immediately by telephone
or e-mail and delete all copies of this message and any attachments from your 
system. Thank you.



Adding a server to an existing SOLR cloud cluster

2013-11-11 Thread ade-b
Hi 

We have a SolrCloud cluster of 3 Solr servers (v4.5.0, running under Tomcat)
with 1 shard. We added a new Solr server (v4.5.1) by simply starting Tomcat
and pointing it at the ZooKeeper ensemble used by the existing cluster. My
understanding was that this new server would handshake with ZooKeeper and
add itself as a replica to the existing cluster.

What has actually happened is that the server is in zookeeper's live_nodes,
but is not in the clusterstate.json file. It also does not have a
CORE/collection associated with it.

Any ideas? I assume I am missing a step. Do I have to manually create the
core on the new server?


Cheers
Ade



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Adding-a-server-to-an-existing-SOLR-cloud-cluster-tp4100275.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Adding a server to an existing SOLR cloud cluster

2013-11-11 Thread primoz . skale
Try manually creating shard replicas on the new server. I think the new 
server is only used automatically when you start your Solr server instance 
with the correct command-line option (i.e. -DnumShards) - I never liked 
this kind of behaviour. 

The server is not present in the clusterstate.json file because it contains 
no replicas - but it is a live node, as you have already stated.

Best regards,

Primoz



From:   ade-b adrian.bro...@gmail.com
To: solr-user@lucene.apache.org
Date:   11.11.2013 14:48
Subject:Adding a server to an existing SOLR cloud cluster



[quoted message snipped - see the original message above]



Re: Adding a server to an existing SOLR cloud cluster

2013-11-11 Thread ade-b
Thanks.

If I understand what you are saying, it should automatically register itself
with the existing cluster if we start SOLR with the correct command line
options. We tried adding the numShards option to the command line but still
get the same outcome.

We start the new SOLR server using 

/usr/bin/java
-Djava.util.logging.config.file=/mnt/ephemeral/apache-tomcat-7.0.47/conf/logging.properties
-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -server
-Xms256m -Xmx1024m -XX:+DisableExplicitGC
-Dsolr.solr.home=/mnt/ephemeral/solr -Dport=8080 -DhostContext=solr
-DnumShards=1 -DzkClientTimeout=15000 -DzkHost=zk ip address
-Djava.endorsed.dirs=/mnt/ephemeral/apache-tomcat-7.0.47/endorsed -classpath
/mnt/ephemeral/apache-tomcat-7.0.47/bin/bootstrap.jar:/mnt/ephemeral/apache-tomcat-7.0.47/bin/tomcat-juli.jar
-Dcatalina.base=/mnt/ephemeral/apache-tomcat-7.0.47
-Dcatalina.home=/mnt/ephemeral/apache-tomcat-7.0.47
-Djava.io.tmpdir=/mnt/ephemeral/apache-tomcat-7.0.47/temp
org.apache.catalina.startup.Bootstrap start

Regards
Ade



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Adding-a-server-to-an-existing-SOLR-cloud-cluster-tp4100275p4100286.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: spellcheck solr 4.3.1

2013-11-11 Thread Dyer, James
There are 2 parameters you want to consider:

First is spellcheck.maxResultsForSuggest.  Because you have an OR query, 
you'll get hits if only 1 query term is in the index.  This parameter lets you 
tune it to make it suggest if the query returns n or fewer hits.  My memory 
tells me, however, that if you leave this parameter out entirely, it will still 
return suggestions for OR queries with some misspelled words (false memory on 
my part?).  Possibly you have this set to 1?  Omitting it might be a better 
option.  See 
http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.maxResultsForSuggest 
.

Second is collateParam, which lets you override certain query parameters when 
the spellchecker is testing collations against the index.  For instance, if you 
have q.op=OR, the spellchecker will return collations that possibly only have 
1 correct term.  The reason is it simply checks whether a collation will return 
any hits.  So you can override this with spellcheck.collateParam.q.op=AND.  The 
same can be done for mm if using edismax.  See 
http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collateParam.XX .
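For illustration, both parameters combined on the /spell handler from the 
original mail might look like this (the parameter values are examples, not 
recommendations):

http://localhost:8765/solr/MainIndex/spell?q=kim%20OR%20larsenn&spellcheck=true&spellcheck.maxResultsForSuggest=5&spellcheck.collate=true&spellcheck.collateParam.q.op=AND

With this, suggestions are returned whenever the OR query yields 5 or fewer 
hits, and a collation is only accepted when all of its corrected terms match 
together.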

James Dyer
Ingram Content Group
(615) 213-4311

-Original Message-
From: Daniel Borup [mailto:d...@alpha-solutions.dk] 
Sent: Monday, November 11, 2013 7:38 AM
To: solr-user@lucene.apache.org
Subject: spellcheck solr 4.3.1

[quoted message snipped - see the original spellcheck solr 4.3.1 message above]



Re: SolrCloud never fully recovers after slow disks

2013-11-11 Thread Mark Miller
The socket read timeouts are actually fairly short for recovery - we should 
probably bump them up. Can you file a JIRA issue? It may be a symptom rather 
than a cause, but given a slow environment, bumping them up makes sense. 

- Mark

 On Nov 11, 2013, at 8:27 AM, Henrik Ossipoff Hansen 
 h...@entertainment-trading.com wrote:
 
 [quoted message snipped - see the full message above]

Re: Adding a server to an existing SOLR cloud cluster

2013-11-11 Thread primoz . skale
According to the wiki pages it should, but I have not really tried it yet 
- I like to do the bookkeeping myself :)

I am sorry, but someone with more knowledge of Solr will have to answer 
your question.

Primoz



From:   ade-b adrian.bro...@gmail.com
To: solr-user@lucene.apache.org
Date:   11.11.2013 15:44
Subject:Re: Adding a server to an existing SOLR cloud cluster



[quoted message snipped - see ade-b's reply above]



Re: Unit of dimension for solr field

2013-11-11 Thread eakarsu
Thanks Upayavira 

It seems to need too much work, and I will have several more fields that 
carry unit values. Is there a quicker way of implementing it?

We have the currency field that comes with Solr by default. Can we use it, 
creating a conversion-rate table for each field? What I am expecting from 
units is similar to the currency field.

Erol Akarsu




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unit-of-dimension-for-solr-field-tp4100209p4100295.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SolrCloud never fully recovers after slow disks

2013-11-11 Thread Henrik Ossipoff Hansen
I will file a JIRA later today.

What I don't get though (I haven't looked much into any actual Solr code) is 
that at this point our systems are running fine, so timeouts shouldn't be an 
issue. Those two nodes, though, are somehow left in a state where their 
response time is up to around 120k ms - which is fairly high - while everything 
else is running normally at this point.
--
Henrik Ossipoff Hansen
Developer, Entertainment Trading


On 11. nov. 2013 at 16.01.58, Mark Miller (markrmil...@gmail.com) wrote:

The socket read timeouts are actually fairly short for recovery - we should 
probably bump them up. Can you file a JIRA issue? It may be a symptom rather 
than a cause, but given a slow env, bumping them up makes sense.

- Mark

 [nested quote snipped - see Henrik's full message above]

Re: Unit of dimension for solr field

2013-11-11 Thread Ryan Cutter
I think Upayavira's suggestion of writing a filter factory fits what you're
asking for.  However, the other end of cleverness is to simply use
solr.TrieIntField and store everything in MB.  So for 1TB you'd
write 1048576 (1024 * 1024 MB).  A range query for 256MB to 1GB would be 
field:[256 TO 1024].

Conversion from MB to your displayed unit (2TB, for example) would happen
in the application layer.  But using trie ints would be simple and
efficient.
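
A minimal sketch of that application-layer conversion (the class and method 
names here are made up for illustration):

public class SizeUnits {
    // Normalize a human-readable size such as "512 GB" into megabytes,
    // ready to index into a TrieIntField (or TrieLongField for huge values).
    public static long toMegabytes(String text) {
        String[] parts = text.trim().split("\\s+");   // e.g. ["512", "GB"]
        double value = Double.parseDouble(parts[0]);
        String unit = parts[1].toUpperCase();
        long factor;
        if (unit.equals("MB"))      factor = 1L;
        else if (unit.equals("GB")) factor = 1024L;
        else if (unit.equals("TB")) factor = 1024L * 1024L;
        else throw new IllegalArgumentException("Unknown unit: " + unit);
        return Math.round(value * factor);
    }

    public static void main(String[] args) {
        System.out.println(toMegabytes("512 GB")); // 524288
        System.out.println(toMegabytes("1 TB"));   // 1048576
    }
}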

- Ryan


On Mon, Nov 11, 2013 at 7:06 AM, eakarsu eaka...@gmail.com wrote:

 [quoted message snipped - see eakarsu's message above]


Re: Solr 3.5 and 4.4 - ParsedQuery is different for q=*

2013-11-11 Thread Shawn Heisey
On 11/10/2013 10:12 PM, subacini Arunkumar wrote:
 We are upgrading from Solr 3.5 to Solr 4.4. The response from 3.5 and 4.4
 are different. I have attached request and response [highlighted the major
 difference in RED]
 
 
 Can you please let me know how to change parsedQuery from *MatchAllDocsQuery
 to text.*
 
 
 *Also, in solr 4.4 if fq or q param has * , we are having this issue.
 Otheriwse parsedQuery value is text:searchStr*

Your attachment never made it through, because most attachments cannot
be sent to the mailing list.  Nothing was colored red -- the list
doesn't really do HTML email.

I can't tell if you're using * for emphasis or whether all asterisks
were literally there.  I'm going to assume that you aren't trying to
emphasize things.  Apologies if I'm wrong.

I can tell you that q=* is not really a valid query for any version of
Solr.  If you meant all documents with the standard query parser, use
q=*:*, which is a special shortcut for all documents.  If you meant all
documents with the dismax or edismax query parser, then set q.alt to
*:* and either pass an empty q value, or don't include q at all.
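
For example, the edismax variant can be expressed as (handler path and rows
value are arbitrary):

/select?defType=edismax&q.alt=*:*&rows=10

Because q is omitted entirely, q.alt supplies the match-all query.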

I'm really confused about what your filter query is supposed to
accomplish.  Four asterisks, one of which is escaped?  I have no idea
what that is supposed to do.

To go much further, we'll need to know what you are trying to
accomplish.  We will also need to see the config of your /select handler
on both versions, the field name(s) that you are trying to search, as
well as info from schema.xml about the field(s) and any related
fieldType settings.

Thanks,
Shawn



Re: Unit of dimension for solr field

2013-11-11 Thread Jack Krupansky
A custom token filter may indeed be the right way to go, but an alternative 
is the combination of an update processor and a query preprocessor.

The update processor, which could be a JavaScript script, could normalize the 
string into a simple integer byte count. You might also want to keep separate 
fields, one for the raw string and one for the final byte count. A JavaScript 
script would be a lot easier to develop than a custom token filter.

A query preprocessor could do two things: first, the same string-to-byte-count 
normalization as the update processor, plus generating a range query. So, for 
example, a query for 0.5 TB could match 512 GB, 500 GB, etc., with something 
like [500000000000 TO 549999999999].

Technically, you could implement a query preprocessor as a plugin Solr search 
component, but if that sounds like too much effort, an application-level 
implementation would probably be easier to master.
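
If the update-processor route is taken, the solrconfig.xml wiring could look 
like the sketch below. StatelessScriptUpdateProcessorFactory is a standard 
Solr 4.x component; the chain name and the units.js script are hypothetical.

<updateRequestProcessorChain name="normalize-units">
  <!-- units.js would parse e.g. "512 GB" out of the raw field and
       write the computed byte count into a separate numeric field -->
  <processor class="solr.StatelessScriptUpdateProcessorFactory">
    <str name="script">units.js</str>
  </processor>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>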


-- Jack Krupansky

-Original Message- 
From: Ryan Cutter

Sent: Monday, November 11, 2013 10:18 AM
To: solr-user@lucene.apache.org
Subject: Re: Unit of dimension for solr field

[quoted message snipped - see Ryan Cutter's reply above]





Re: Unit of dimension for solr field

2013-11-11 Thread eakarsu
Ryan and Upayavira,

Do we have an example skeleton for this in schema.xml and solrconfig.xml, or
an example Java class that would help to build the UnitResolvingFilterFactory
class?

Thanks

Erol Akarsu



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unit-of-dimension-for-solr-field-tp4100209p4100303.html
Sent from the Solr - User mailing list archive at Nabble.com.


How to cancel a collection 'optimize'?

2013-11-11 Thread Hoggarth, Gil
We have an internal Solr collection with ~1 billion documents. It's
split across 24 shards and uses ~3.2TB of disk space. Unfortunately
we've triggered an 'optimize' on the collection (via a restarted browser
tab), which has raised the disk usage to 4.6TB, with 130GB left on the
disk volume.

 

As I fully expect Solr to use up all of the disk space, as the collection
is more than 50% of the disk volume, how can I cancel this optimize? And
separately, if I were to reissue with maxSegments=(high number, e.g. 40),
should I still expect the same disk usage? (I'm presuming so, as doesn't
it need to gather the whole index to determine which docs should go into
which segments?)
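
For reference, a capped optimize is issued against the update handler like 
this (host and collection names are placeholders):

http://host:8080/solr/collection/update?optimize=true&maxSegments=40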

 

Solr 4.4 on RHEL6.4, 160GB RAM, 5GB per shard.

 

(Great conference last week btw - so much to learn!)

 

 

Gil Hoggarth

Web Archiving Technical Services Engineer 

The British Library, Boston Spa, West Yorkshire, LS23 7BQ

Tel: 01937 546163

 



[Solr 4] Data grouping on weeks

2013-11-11 Thread Jamshaid Ashraf
Hi,

I'm new to Solr and want to group data by week. Is there any built-in
date rounding function so I can give a date to this function and have it
return the week of the year?

For example, if I query Solr with the date (01/01/2013), it should return
(1st week of 2013).

Like I have following documents in solr:

Doc1  CreatedDate: 1/1/2013 Data:ABC
Doc2  CreatedDate: 4/1/2013 Data:ABC
Doc3  CreatedDate: 3/2/2013 Data:ABC
Doc4  CreatedDate: 4/2/2013 Data:ABC
Doc5  CreatedDate: 12/2/2013 Data:ABC

Result should be:

2013 Week1 :2 records
2013 Week7 :2 records
2013 Week8 :1 record


Thanks in advance!

Jamshaid


Re: How to cancel a collection 'optimize'?

2013-11-11 Thread Otis Gospodnetic
Hi Gil,
(we spoke in Dublin, didn't we?)

Short of stopping Solr I have a feeling there isn't much you can
do... hmm. Or, I wonder if you could somehow get a thread dump,
get the PID of the thread (since I believe threads in Linux are run as
processes), and then kill that thread... Feels scary and I'm not sure
what this might do to the index, but maybe somebody else can jump in
and comment on this approach or suggest a better one.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Mon, Nov 11, 2013 at 10:44 AM, Hoggarth, Gil gil.hogga...@bl.uk wrote:
 [quoted message snipped - see the original question above]


Re: Adding a server to an existing SOLR cloud cluster

2013-11-11 Thread michael.boom
From my understanding, if your existing cluster already satisfies your
collection (live nodes >= number of shards * replication factor), there
wouldn't be any need for creating additional replicas on the new server,
unless you directly ask for them after startup.
I usually just add the machine to the cluster and then manually create the
replicas I need, for example as below.
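
A sketch of that manual step with the Solr 4.x Cores API (host, core, and
collection names are placeholders for your own):

http://newhost:8080/solr/admin/cores?action=CREATE&name=collection1_shard1_replica2&collection=collection1&shard=shard1

The new core registers itself with ZooKeeper and then shows up in
clusterstate.json as an additional replica of shard1.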



-
Thanks,
Michael
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Adding-a-server-to-an-existing-SOLR-cloud-cluster-tp4100275p4100313.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: [Solr 4] Data grouping on weeks

2013-11-11 Thread Erick Erickson
You're probably looking at date math, see:
http://lucene.apache.org/solr/4_5_1/solr-core/org/apache/solr/util/DateMathParser.html

You're probably going to be faceting to get these counts, see facet
ranges here:
http://wiki.apache.org/solr/SimpleFacetParameters#Facet_by_Range

So the start is something like date/YEAR, then gaps of +7DAYS or some such
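
Putting those two pieces together, a weekly breakdown over the CreatedDate
field from the question might look like this (all other values illustrative):

/select?q=*:*&rows=0&facet=true&facet.range=CreatedDate&facet.range.start=NOW/YEAR&facet.range.end=NOW&facet.range.gap=%2B7DAYS

Each returned range counts the documents whose CreatedDate falls into one
7-day bucket since the start of the year (the gap is URL-encoded because +
is significant in URLs).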

Best,
Erick


On Mon, Nov 11, 2013 at 10:51 AM, Jamshaid Ashraf jamshaid...@gmail.comwrote:

 [quoted message snipped - see the original question above]



RE: How to cancel a collection 'optimize'?

2013-11-11 Thread Hoggarth, Gil
Hi Otis, thanks for the response. I could stop the whole Solr service, as
there's no audience access to it yet, but might it be left in an
incomplete state and thus try to complete the optimisation when the service
is restarted?

[Yes, we did speak in Dublin - you can see we need that monitoring
service! Must set up the demo version, asap!]

-Original Message-
From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] 
Sent: 11 November 2013 16:02
To: solr-user@lucene.apache.org
Subject: Re: How to cancel a collection 'optimize'?

[quoted exchange snipped - see the messages above]

Re: Function query matching

2013-11-11 Thread Peter Keegan
I replaced the frange filter with the following filter and got the correct
number of results, and it was 3X faster:

select?qq={!edismax v='news' qf='title^2 body'}&scaledQ=scale(product(query($qq),1),0,1)&q={!func}sum(product(0.75,$scaledQ),product(0.25,field(myfield)))&fq={!edismax v='news' qf='title^2 body'}

Then I tried to simplify the query with parameter substitution, but 'fq'
didn't parse correctly:

select?qq={!edismax v='news' qf='title^2 body'}&scaledQ=scale(product(query($qq),1),0,1)&q={!func}sum(product(0.75,$scaledQ),product(0.25,field(myfield)))&fq=$qq

What is the proper syntax?

Thanks,
Peter


On Thu, Nov 7, 2013 at 2:16 PM, Peter Keegan peterlkee...@gmail.com wrote:

 I'm trying to use a normalized score in a query, as I described in a
 recent thread titled Re: How to get similarity score between 0 and 1 not
 relative score

 I'm using this query:
 select?qq={!edismax v='news' qf='title^2 body'}&scaledQ=scale(product(query($qq),1),0,1)&q={!func}sum(product(0.75,$scaledQ),product(0.25,field(myfield)))&fq={!frange l=0.001}$q

 Is there another way to accomplish this using dismax boosting?



 On Thu, Nov 7, 2013 at 12:55 PM, Jason Hellman 
 jhell...@innoventsolutions.com wrote:

 You can, of course, use a function range query:

 select?q=text:news&fq={!frange l=0 u=100}sum(x,y)


 http://lucene.apache.org/solr/4_5_1/solr-core/org/apache/solr/search/FunctionRangeQParserPlugin.html

 This will give you a bit more flexibility to meet your goal.

 On Nov 7, 2013, at 7:26 AM, Erik Hatcher erik.hatc...@gmail.com wrote:

  Function queries score (all) documents, but don't filter them.  All
 documents effectively match a function query.
 
Erik
 
  On Nov 7, 2013, at 1:48 PM, Peter Keegan peterlkee...@gmail.com
 wrote:
 
  Why does this function query return docs that don't match the embedded
  query?
  select?qq=text:news&q={!func}sum(query($qq),0)
 





Re: Function query matching

2013-11-11 Thread Yonik Seeley
On Mon, Nov 11, 2013 at 11:39 AM, Peter Keegan peterlkee...@gmail.com wrote:
 fq=$qq

 What is the proper syntax?

fq={!query v=$qq}
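
Substituted into the simplified query from the previous mail, the working
request would be:

select?qq={!edismax v='news' qf='title^2 body'}&scaledQ=scale(product(query($qq),1),0,1)&q={!func}sum(product(0.75,$scaledQ),product(0.25,field(myfield)))&fq={!query v=$qq}

The {!query} parser re-parses $qq as a full query instead of treating it as
a literal term.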

-Yonik
http://heliosearch.com -- making solr shine


RE: date range tree

2013-11-11 Thread Andreas Owen
Has someone at least got an idea how I could do a year/month date tree? 

In the Solr wiki it is mentioned that facet.date.gap=+1DAY,+2DAY,+3DAY,+10DAY
should create 4 buckets, but it doesn't work.


-Original Message-
From: Andreas Owen [mailto:a...@conx.ch] 
Sent: Donnerstag, 7. November 2013 18:23
To: solr-user@lucene.apache.org
Subject: date range tree

I would like to make a facet on a date field with the following tree:

2013
  4. Quarter
    December
    November
    October
  3. Quarter
    September
    August
    July
  2. Quarter
    June
    May
    April
  1. Quarter
    March
    February
    January
2012 ...
  (same as above)


So far I have this in solrconfig.xml:

 

<str name="facet.date">{!ex=last_modified,thema,inhaltstyp,doctype}last_modified</str>
<str name="facet.date.gap">+1MONTH</str>
<str name="facet.date.end">NOW/MONTH</str>
<str name="facet.date.start">NOW/MONTH-36MONTHS</str>
<str name="facet.date.other">after</str>

 

Can I do this in one query or do I need multiple queries? If I need several,
how would I do the second one and keep all the facet queries in the count?




Re: Function query matching

2013-11-11 Thread Peter Keegan
Thanks


On Mon, Nov 11, 2013 at 11:46 AM, Yonik Seeley yo...@heliosearch.comwrote:

 On Mon, Nov 11, 2013 at 11:39 AM, Peter Keegan peterlkee...@gmail.com
 wrote:
  fq=$qq
 
  What is the proper syntax?

 fq={!query v=$qq}

 -Yonik
 http://heliosearch.com -- making solr shine



qf match density?

2013-11-11 Thread Michael Tracey
While doing a search like:

q=great+gatsby&defType=edismax&qf=title^1.8

records with a title of "great gatsby / great gatsby" always score higher than 
records where "great gatsby" appears just a single time.

How do I express that a single match should be just as important as having the 
query match multiple times in the title field?

Thanks, m.


Nutch 1.7 + AJAX Solr returning ALL contents vs. SPECIFIC

2013-11-11 Thread Reyes, Mark
Hi:

I was encouraged to explore the Solr mailing list, specifically regarding the 
fl parameter. What is that parameter for, and can it accomplish my original 
task of crawling/indexing specific HTML components versus parsing the entire 
page?

My original question is listed below (previously on the Nutch mail list):

---
I’m using Nutch 1.7 to crawl/index the pages of my domain to Solr and 
JavaScript library AJAX Solr to capture that index as JSON, which would then 
print that to the front-end.
My question is: is it possible to have specific content returned (i.e. an H2 
tag and a p tag) on the search results page, versus all contents of that page?
---

Thanks again,
Mark





IMPORTANT NOTICE: This e-mail message is intended to be received only by 
persons entitled to receive the confidential information it may contain. E-mail 
messages sent from Bridgepoint Education may contain information that is 
confidential and may be legally privileged. Please do not read, copy, forward 
or store this message unless you are an intended recipient of it. If you 
received this transmission in error, please notify the sender by reply e-mail 
and delete the message and any attachments.

Re: Unit of dimension for solr field

2013-11-11 Thread eakarsu
Can DelimitedPayloadTokenFilterFactory be used to store unit-dimension
information? This factory class can store extra information for the field.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Unit-of-dimension-for-solr-field-tp4100209p4100345.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: dropping noise words and maintaining the relevancy

2013-11-11 Thread Susheel Kumar
Hello,

On dropping noise words, we have a scenario where we have to drop only the 
trailing noise words. For e.g. in 160 Associates LP, the noise words are 
Associates and LP, but we only want to drop LP, which is the trailing one.

If we use stop words, both words will be dropped, reducing the search key to 
160. 

Any suggestion?

Thanks in advance. 


-Original Message-
From: Susheel Kumar [mailto:susheel.ku...@thedigitalgroup.net] 
Sent: Thursday, October 31, 2013 9:59 PM
To: solr-user@lucene.apache.org
Subject: RE: dropping noise words and maintaining the relevancy

Thanks, Kranti. Nice suggestion. I'll try it out. 

-Original Message-
From: Kranti Parisa [mailto:kranti.par...@gmail.com]
Sent: Thursday, October 31, 2013 3:18 PM
To: solr-user@lucene.apache.org
Subject: Re: dropping noise words and maintaining the relevancy

One possible approach is you can populate the titles in a field (say
exactMatch) and point your search query to exactMatch:"160 Associates LP"
OR text:"160 Associates LP",
assuming that you have all the text populated into the field called text.

You can also use field-level boosting with the above query, for example:
exactMatch:"160 Associates LP"^10 OR text:"160 Associates LP"^5
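
A sketch of how such an exactMatch field could be wired up in schema.xml (the
source field name title is assumed; keyword tokenization plus lowercasing is
one common choice for whole-title matching, not the only one):

<fieldType name="string_exact" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="exactMatch" type="string_exact" indexed="true" stored="false"/>
<copyField source="title" dest="exactMatch"/>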


Thanks,
Kranti K. Parisa
http://www.linkedin.com/in/krantiparisa



On Thu, Oct 31, 2013 at 4:00 PM, Susheel Kumar  
susheel.ku...@thedigitalgroup.net wrote:

 Hello,

 We have a very particular requirement of dropping noise words (LP, 
 LLP, LLC, Corp, Corporation, Inc, Incoporation, PA, Professional 
 Association, Attorney at law, GP, General Partnership etc.) at the end 
 of search key but maintaining the relevancy. For e.g.

 If user search for 160 Associates LP, we want search to return in 
 their below relevancy order. Basically if exact / similar match is 
 present, it comes first followed by other results.

 160 Associates LP
 160 Associates
 160 Associates LLC
 160 Associates LLLP
 160 Hilton Associates

 If I handle this through Stop words then LP will get dropped from 
 search key and then all results will come but exact match will be 
 shown somewhere lower or deep.

 Regards and appreciate your help.
 Susheel



HTTP 500 error when invoking a REST client in Solr Analyzer

2013-11-11 Thread Dileepa Jayakody
Hi All,

I am working on a custom analyzer in Solr to post content to Apache Stanbol
for enhancement during indexing. To post content to Stanbol, inside my
custom analyzer's incrementToken() method I have written the code below,
based on the Jersey client API sample [1]:
public boolean incrementToken() throws IOException {
  if (!input.incrementToken()) {
    return false;
  }
  // Text of the current token (only the valid region of the term buffer)
  char[] buffer = charTermAttr.buffer();
  String content = new String(buffer, 0, charTermAttr.length());
  // POST the token text to the Stanbol enhancer and ask for RDF/XML back
  Client client = Client.create();
  WebResource webResource = client.resource("http://localhost:8080/enhancer");
  ClientResponse response = webResource.type("text/plain")
      .accept(new MediaType("application", "rdf+xml"))
      .post(ClientResponse.class, content);
  int status = response.getStatus();
  if (status != 200 && status != 201 && status != 202) {
    throw new RuntimeException("Failed : HTTP error code : " + response.getStatus());
  }
  // Replace the token text with the enhancement response
  String output = response.getEntity(String.class);
  System.out.println(output);
  charTermAttr.setEmpty();
  char[] newBuffer = output.toCharArray();
  charTermAttr.copyBuffer(newBuffer, 0, newBuffer.length);
  return true;
}

When testing the analyzer I always get an HTTP 500 response from the Stanbol
server and I cannot process the enhancement response properly. But I could
successfully execute the same Jersey client code in a standalone Java
application (in a main method) and retrieve the desired enhancement response
from Stanbol.

Any ideas why I always get an HTTP 500 error when invoking a REST endpoint
in a Solr analyzer? Could it be a permission problem in my Solr analyzer?
Appreciate your help. The stack trace is at [2].

Thanks,
Dileepa

[1]
https://blogs.oracle.com/enterprisetechtips/entry/consuming_restful_web_services_with

[2]
6424 [qtp918598659-11] ERROR org.apache.solr.core.SolrCore  –
java.lang.RuntimeException: Failed : HTTP error code : 500
at
com.solr.test.analyzer.ContentFilter.incrementToken(ContentFilter.java:70)
at
org.apache.solr.handler.AnalysisRequestHandlerBase.analyzeTokenStream(AnalysisRequestHandlerBase.java:179)
at
org.apache.solr.handler.AnalysisRequestHandlerBase.analyzeValue(AnalysisRequestHandlerBase.java:126)
at
org.apache.solr.handler.FieldAnalysisRequestHandler.analyzeValues(FieldAnalysisRequestHandler.java:221)
at
org.apache.solr.handler.FieldAnalysisRequestHandler.handleAnalysisRequest(FieldAnalysisRequestHandler.java:190)
at
org.apache.solr.handler.FieldAnalysisRequestHandler.doAnalysis(FieldAnalysisRequestHandler.java:101)
at
org.apache.solr.handler.AnalysisRequestHandlerBase.handleRequestBody(AnalysisRequestHandlerBase.java:59)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at
org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1859)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:703)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:406)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:195)
at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
at
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
at
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
at
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
at
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
at
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
at
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
at org.eclipse.jetty.server.Server.handle(Server.java:368)
at
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
at
org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
at
org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
at
org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
at
org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
at
org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
at

Re: Solr grouping performance problem

2013-11-11 Thread shamik
Thanks Joel, appreciate your help. Is Solr 4.6 due this year?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-grouping-performance-porblem-tp4098565p4100358.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: HTTP 500 error when invoking a REST client in Solr Analyzer

2013-11-11 Thread Dileepa Jayakody
This seems to be a weird intermittent issue when I use the Analysis UI
(http://localhost:8983/solr/#/collection1/analysis) for testing my Analyzer.
It works fine when I hard-code the input value in the Analyzer and index. I
gave the same input, "Tim Bernes Lee is a professor at MIT", both hard-coded
in the Analyzer class and from the Solr Analysis UI. The UI response failed
intermittently when I adjusted the field value.
It seems this could be a problem with the character encoding of the field
value.

Thanks,
Dileepa


On Tue, Nov 12, 2013 at 1:33 AM, Dileepa Jayakody dileepajayak...@gmail.com
 wrote:

 Hi All,

 I am working on a custom analyzer in Solr to post content to Apache
 Stanbol for enhancement during indexing. To post content to Stanbol, inside
 my custom analyzer's incrementToken() method I have written the code below,
 based on the Jersey client API sample [1]:

 public boolean incrementToken() throws IOException {
     if (!input.incrementToken()) {
         return false;
     }
     // use only the valid portion of the term buffer
     String content = new String(charTermAttr.buffer(), 0, charTermAttr.length());
     // NOTE: this creates a new Jersey client for every token; reusing one
     // instance would be considerably cheaper
     Client client = Client.create();
     WebResource webResource = client.resource("http://localhost:8080/enhancer");
     ClientResponse response = webResource.type("text/plain")
         .accept(new MediaType("application", "rdf+xml"))
         .post(ClientResponse.class, content);
     int status = response.getStatus();
     if (status != 200 && status != 201 && status != 202) {
         throw new RuntimeException("Failed : HTTP error code : "
             + response.getStatus());
     }

     // replace the current term with the enhancement response from Stanbol
     String output = response.getEntity(String.class);
     System.out.println(output);
     charTermAttr.setEmpty();
     char[] newBuffer = output.toCharArray();
     charTermAttr.copyBuffer(newBuffer, 0, newBuffer.length);
     return true;
 }

 When testing the analyzer I always get an HTTP 500 response from the Stanbol
 server and I cannot process the enhancement response properly. But I could
 successfully execute the same Jersey client code in a standalone Java
 application (in a main method) and retrieve the desired enhancement response
 from Stanbol.

 Any ideas why I always get an HTTP 500 error when invoking a REST endpoint
 from a Solr analyzer? Could it be a permission problem in my Solr analyzer?
 Appreciate your help.

 Thanks,
 Dileepa

 [1]
 https://blogs.oracle.com/enterprisetechtips/entry/consuming_restful_web_services_with


Re: Unit of dimension for solr field

2013-11-11 Thread Erick Erickson
You seem to be consistently missing the problem: your queries will not
work as expected. How would you do a range query without writing some
kind of custom code that looks at the payloads to determine the normalized
units?

The simplest way to do this is probably to have your ingestion side
normalize. Put the original (complete with units) in a field that has
indexed=false; this will only be used for display in the results list.

_Also_ add the normalized value to another field that you set
indexed=true and stored=false. That will allow range searches,
faceting, etc.
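
For illustration, a rough SolrJ sketch of that split (the field names and
the kg-to-grams normalization are only assumptions):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class NormalizingIndexer {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "item-1");
        // stored=true, indexed=false: shown verbatim in the results list
        doc.addField("weight_display", "5 kg");
        // indexed=true, stored=false: normalized value for ranges and facets
        doc.addField("weight_grams", 5000);
        server.add(doc);
        server.commit();
        server.shutdown();
    }
}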

HTH,
Erick


On Mon, Nov 11, 2013 at 2:36 PM, eakarsu eaka...@gmail.com wrote:

 Can DelimitedPayloadTokenFilterFactory be used to store unit dimension
 information? This factory class can store extra information for a field.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Unit-of-dimension-for-solr-field-tp4100209p4100345.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr grouping performance problem

2013-11-11 Thread Erick Erickson
In fact, there's some movement towards starting the release process this
week, stay tuned!

Erick


On Mon, Nov 11, 2013 at 4:12 PM, shamik sham...@gmail.com wrote:

 Thanks Joel, appreciate your help. Is Solr 4.6 due this year?



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Solr-grouping-performance-porblem-tp4098565p4100358.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr grouping performance problem

2013-11-11 Thread Shawn Heisey

On 11/11/2013 2:12 PM, shamik wrote:

Thanks Joel, appreciate your help. Is Solr 4.6 due this year?


The job of release manager for 4.6 has already been claimed.  There 
should be a release candidate posted on the dev list sometime on 
November 12th (tomorrow) in the USA timezones, unless a serious problem 
is discovered.


After the RC gets posted, there is a 72-hour voting period where 
committers vote whether or not to release that version.  If someone 
finds a problem that warrants a negative vote during that 72 hour 
period, it will be put on hold until the problem is fixed.  A new RC 
will eventually be made available and the 72-hour voting period will 
begin again.  When the vote finally passes, the release process will 
begin.  It typically takes 2-3 days after that before the official 
announcement is made.


What this means in real terms is that 4.6 will most likely be out before 
the end of November.  It would take a major series of bugs and problems 
to keep that from happening.


Because of the upcoming holiday madness, I think 4.7 is not likely to 
happen before next year.


Thanks,
Shawn



Re: How to cancel a collection 'optimize'?

2013-11-11 Thread Yonik Seeley
On Mon, Nov 11, 2013 at 11:28 AM, Hoggarth, Gil gil.hogga...@bl.uk wrote:
 I could stop the whole Solr service, as there's no audience access to it
 yet, but might it be left in an incomplete state and thus try to complete
 the optimisation when the service is restarted?

Should be fine.

Lucene has a write-once architecture... existing segment files are not
changed, and only deleted when a merge (producing a new segment
containing the old segment) has completed.  So if you stop things in
the middle of a commit/optimize, the index should always correctly
open on the last completed commit/optimize.

-Yonik
http://heliosearch.com -- making solr shine


Why do people want to deploy to Tomcat?

2013-11-11 Thread Alexandre Rafalovitch
Hello,

I keep seeing here and on Stack Overflow people trying to deploy Solr to
Tomcat. We don't usually ask why, we just help where we can.

But the question happens often enough that I am curious. What is the actual
business case? Is it because Tomcat is well known? Is it because other
apps are running under Tomcat and it is an ops requirement? Is it because
Tomcat gives something to Solr that Jetty does not?

It might be useful to know, especially since the Solr team is considering
making the server part into a black-box component. What use cases would
that break?

So, if somebody runs Solr under Tomcat (or needed to and gave up), let's
use this thread to collect this knowledge.

Regards,
   Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at
once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD book)


Indexing a token to a different field in a custom filter

2013-11-11 Thread Dileepa Jayakody
Hi All,

In my custom filter, I need to index the processed token into a different
field. The processed token is a Stanbol enhancement response.

The solution I have found so far is to use a Solr client (SolrJ) to add a
new document with my processed field into Solr. Below is a sample code
segment:

SolrServer server = new HttpSolrServer("http://localhost:8983/solr/");
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField("id", "id1", 1.0f);
doc1.addField("stanbolResponse", response);
try {
    server.add(doc1);
    server.commit();
} catch (SolrServerException | IOException e) { // add() also throws IOException
    e.printStackTrace();
}


This mechanism requires a new HTTP call to the local Solr server for every
token I process for the stanbolResponse field, and I feel it's not very
efficient.

Is there any alternative way to invoke an update request to add a new
field to the document being indexed from within the filter (without making
an explicit HTTP call using SolrJ)?

Thanks,
Dileepa
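
One alternative worth sketching (an assumption about the setup, not something
tested in this thread): move the enrichment out of the analyzer and into an
UpdateRequestProcessor, which can add fields to the in-flight document
without a second HTTP call to Solr. The field names below are hypothetical:

import java.io.IOException;

import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.update.AddUpdateCommand;
import org.apache.solr.update.processor.UpdateRequestProcessor;

public class StanbolEnrichProcessor extends UpdateRequestProcessor {

    public StanbolEnrichProcessor(UpdateRequestProcessor next) {
        super(next);
    }

    @Override
    public void processAdd(AddUpdateCommand cmd) throws IOException {
        SolrInputDocument doc = cmd.getSolrInputDocument();
        String content = (String) doc.getFieldValue("content"); // assumed source field
        doc.addField("stanbolResponse", enhance(content));      // no extra round trip
        super.processAdd(cmd);
    }

    private String enhance(String content) {
        // call Stanbol here, as in the Jersey code earlier in this thread
        return content;
    }
}

It would still need a matching UpdateRequestProcessorFactory and an
updateRequestProcessorChain entry in solrconfig.xml to be wired in.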


SolrCloud 4.5.1 and Zookeeper SASL

2013-11-11 Thread Sven Stark
Howdy.

We are testing an upgrade of Solr from 4.3 to 4.5.1. We're using SolrCloud,
and our problem is that the core no longer appears to be loaded.

We've set logging to DEBUG and we've found lots of these:

2013-11-12 06:30:43,339 [pool-2-thread-1-SendThread(our.zookeeper.com:2181)]
DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient -
Could not retrieve login configuration: java.lang.SecurityException: Unable
to locate a login configuration

Zookeeper is up and running.

Is there any doco on how to disable SASL? Or on what exactly was changed in
SolrCloud?

Much appreciated,
Sven
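
(Not an answer from this thread, just a pointer worth checking: ZooKeeper's
client-side SASL can normally be switched off with a JVM system property when
starting Solr, e.g.

java -Dzookeeper.sasl.client=false -jar start.jar

which stops the ZooKeeper client from looking for a JAAS login
configuration.)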


Re: eDisMax, multiple language support and stopwords

2013-11-11 Thread Liu Bo
Happy to see someone with a similar solution to ours.

We have a similar multi-language search feature, and we index different
language content into _fr, _en fields like you've done.

At search time, though, we need a language code as a parameter to specify
the language the client wants to search in, which is normally decided by the
website being visited, such as: qf=name description&language=en

In our search components we then find the right fields, name_en and
description_en, to search on.
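
Roughly, the field mapping looks like this (a simplified sketch, not our
exact code):

public class FieldLocalizer {
    /** Rewrites qf "name description" with language "en"
     *  into "name_en description_en". */
    public static String localizeFields(String qf, String language) {
        StringBuilder sb = new StringBuilder();
        for (String field : qf.trim().split("\\s+")) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(field).append('_').append(language);
        }
        return sb.toString();
    }
}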

We used to support searching across all languages at once but removed that
later: the site tells the customer which languages are supported, and we
don't think many visitors to our web sites know more than two languages and
need to search them at the same time.


On 7 November 2013 23:01, Tom Mortimer tom.m.f...@gmail.com wrote:

 Ah, thanks Markus. I think I'll just add the Boolean operators to the
 stopwords list in that case.

 Tom



 On 7 November 2013 12:01, Markus Jelsma markus.jel...@openindex.io
 wrote:

  This is an ancient problem. The issue here is your mm parameter: it gets
  confused because different numbers of tokens are filtered/emitted for the
  separate fields, so it is never going to work just like this. The easiest
  option is not to use the stop filter.
 
 
 
 http://lucene.472066.n3.nabble.com/Dismax-Minimum-Match-Stopwords-Bug-td493483.html
  https://issues.apache.org/jira/browse/SOLR-3085
 
  -Original message-
   From:Tom Mortimer tom.m.f...@gmail.com
   Sent: Thursday 7th November 2013 12:50
   To: solr-user@lucene.apache.org
   Subject: eDisMax, multiple language support and stopwords
  
   Hi all,
  
   Thanks for the help and advice I've got here so far!
  
    Another question - I want to support stopwords at search time, so that
    e.g. the query "oscar and wilde" is equivalent to "oscar wilde" (this
    is with lowercaseOperators=false). Fair enough, I have the stopword
    "and" in the query analyser chain.
  
   However, I also need to support French as well as English, so I've got
  _en
   and _fr versions of the text fields, with appropriate stemming and
   stopwords. I index French content into the _fr fields and English into
  the
   _en fields. I'm searching with eDisMax over both versions, e.g.:
  
    <str name="qf">headline_en headline_fr</str>
  
    However, this means I get no results for "oscar and wilde". The parsed
    query is:
  
   (+((DisjunctionMaxQuery((headline_fr:osca | headline_en:oscar))
   DisjunctionMaxQuery((headline_fr:and))
   DisjunctionMaxQuery((headline_fr:wild |
 headline_en:wild)))~3))/no_coord
  
    If I add "and" to the French stopwords list, I *do* get results, and
    the parsed query is:
  
   (+((DisjunctionMaxQuery((headline_fr:osca | headline_en:oscar))
   DisjunctionMaxQuery((headline_fr:wild |
 headline_en:wild)))~2))/no_coord
  
    This implies that the only solution is to have a minimal, shared
    stopwords list for all languages I want to support. Is this correct,
    or is there a way of supporting this kind of searching with
    per-language stopword lists?
  
   Thanks for any ideas!
  
   Tom
  
 




-- 
All the best

Liu Bo


Replicate Solr Cloud

2013-11-11 Thread Aji Jaya
Hi

I want to create a Solr cloud setup like this:

one Solr cloud in location A, and another Solr cloud in location B, where
the Solr cloud in location B replicates the Solr cloud in location A.

And if all nodes in Solr cloud A die, Solr cloud B should still work, and
vice versa.

Does anybody know how to create this?


Thanks


Re: Multi-core support for indexing multiple servers

2013-11-11 Thread Liu Bo
Like Erick said, merging data from different data sources can be very
difficult. SolrJ is much easier to use, but it may need another application
to handle the indexing process if you don't want to extend Solr much.

I eventually ended up with a customized request handler which uses SolrWriter
from the DIH package to index data.

That way I can fully control the index process. Much like with SolrJ, you
can write code to convert your data into SolrInputDocuments and then post
them to SolrWriter; SolrWriter handles the rest.
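
For comparison, a minimal sketch of the plain-SolrJ route Erick mentions
below (the URL, field names and sizes are just placeholders):

import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class MultiSourceIndexer {
    public static void main(String[] args) throws Exception {
        // queue up to 100 docs and flush with 4 threads for some parallelism
        ConcurrentUpdateSolrServer server = new ConcurrentUpdateSolrServer(
                "http://localhost:8983/solr/collection1", 100, 4);
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "emp-1");   // merge fields from each source here
        doc.addField("name", "Example Name");
        server.add(doc);
        server.commit();
        server.shutdown();
    }
}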


On 8 November 2013 21:46, Erick Erickson erickerick...@gmail.com wrote:

 Yep, you can define multiple data sources for use with DIH.

 Combining data from those multiple sources into a single
 index can be a bit tricky with DIH; personally I tend to prefer
 SolrJ, but that's mostly personal preference, especially if
 I want to get some parallelism going on.

 But whatever works

 Erick


 On Thu, Nov 7, 2013 at 11:17 PM, manju16832003 manju16832...@gmail.com
 wrote:

  Erick,
  Just a question :-), wouldn't it be easy to use DIH to pull data from
  multiple data sources?
 
  I do use DIH to do that comfortably. I have three data sources
   - MySQL
   - URLDataSource that returns XML from an .NET application
   - URLDataSource that connects to an API and return XML
 
  Here is part of the data-config data source settings:
  <dataSource type="JdbcDataSource" name="solr" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/employeeDB" batchSize="-1"
              user="root" password="root"/>
  <dataSource name="CRMServer" type="URLDataSource" encoding="UTF-8"
              connectionTimeout="5000" readTimeout="1"/>
  <dataSource name="ImageServer" type="URLDataSource" encoding="UTF-8"
              connectionTimeout="5000" readTimeout="1"/>
 
 
  Of course, in the application I do the same: to construct my results, I
  connect to MySQL and those two data sources.

  Basically we have two points of indexing:
   - Using DIH, for one-time indexing
   - In the application, whenever there is a transaction touching the
     details that we are storing in Solr
 
 
 
 
 
  --
  View this message in context:
 
 http://lucene.472066.n3.nabble.com/Multi-core-support-for-indexing-multiple-servers-tp4099729p4099933.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 




-- 
All the best

Liu Bo