Re: SPLITSHARD not working in SOLR-4.4.0

2013-10-16 Thread Shalin Shekhar Mangar
Sorry I misunderstood. That NPE can only happen if the uniqueKey is not
defined. The code already checks for a reader.fields() returning null.


On Wed, Oct 16, 2013 at 11:22 AM, Shalin Shekhar Mangar 
shalinman...@gmail.com wrote:

 Just to be clear, you had a required uniqueKey defined in the schema
 before you indexed any document, is that correct?

 It is possible to have an NPE in that line if there is an empty segment or
 if there are documents but no fields! I'm curious to understand how you
 ended up with an index like that.


 On Wed, Oct 16, 2013 at 11:01 AM, RadhaJayalakshmi 
 rlakshminaraya...@inautix.co.in wrote:

 Thanks for the response!!
 Yes, I have defined a unique key in the schema... Still it is throwing the
 same error.
 Is SPLITSHARD a new feature that is under development in Solr 4.4? Has
 anyone been able to split shards using SPLITSHARD successfully?




 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/SPLITSHARD-not-working-in-SOLR-4-4-0-tp4095623p4095789.html
 Sent from the Solr - User mailing list archive at Nabble.com.




 --
 Regards,
 Shalin Shekhar Mangar.




-- 
Regards,
Shalin Shekhar Mangar.


Re: limiting deep pagination

2013-10-16 Thread Furkan KAMACI
I just wonder: don't you implement a custom API that interacts with
Solr and limits such kinds of requests? (I know that you are asking about
how to do that in Solr, but I handle such situations in my custom search
APIs and want to learn what others do.)


On Wednesday, 9 October 2013, Michael Sokolov
msoko...@safaribooksonline.com wrote:
 On 10/8/13 6:51 PM, Peter Keegan wrote:

 Is there a way to configure Solr 'defaults/appends/invariants' such that
 the product of the 'start' and 'rows' parameters doesn't exceed a given
 value? This would be to prevent deep pagination.  Or would this require a
 custom requestHandler?

 Peter

 Just wondering -- isn't it the sum that you should be concerned about
rather than the product? Actually I think what we usually do is limit both
independently, with slightly different concerns, since, e.g., start=1,
rows=1000 causes memory problems if you have large fields in your results,
whereas start=1000, rows=1 may not actually be a problem.

 -Mike
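
For illustration, a minimal sketch of such a client-side guard, capping
start and rows independently as described above (the class name and limits
are illustrative, not from this thread):

  import org.apache.solr.client.solrj.SolrQuery;

  public class PaginationGuard {
      // illustrative caps; tune to your document sizes and heap
      private static final int MAX_START = 10000;
      private static final int MAX_ROWS = 1000;

      public static SolrQuery buildQuery(String q, int start, int rows) {
          if (start > MAX_START || rows > MAX_ROWS) {
              throw new IllegalArgumentException(
                  "pagination too deep: start=" + start + ", rows=" + rows);
          }
          SolrQuery query = new SolrQuery(q);
          query.setStart(start);
          query.setRows(rows);
          return query;
      }
  }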



Re: Regarding Solr Cloud issue...

2013-10-16 Thread primoz . skale
I sometimes also get null ranges when doing collections/cores API
actions (CREATE and/or UNLOAD, etc.). In 4.4.0 that was not easily fixed
because zkCli had problems with the putfile command, but in 4.5.0 it works
OK. All you have to do is download clusterstate.json from ZK (get
/clusterstate.json), fix the ranges to appropriate values, and upload the
file back to ZK with zkCli, as sketched below.

But why those null ranges happen at all is beyond me :)
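
A sketch of that procedure, assuming Solr 4.5.0's cloud-scripts/zkcli.sh and
ZooKeeper on localhost:9983 (hosts and paths here are illustrative):

  ./zkcli.sh -zkhost localhost:9983 -cmd getfile /clusterstate.json clusterstate.json
  (edit the null range values by hand, then)
  ./zkcli.sh -zkhost localhost:9983 -cmd putfile /clusterstate.json clusterstate.json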

Primoz



From:   Shalin Shekhar Mangar shalinman...@gmail.com
To: solr-user@lucene.apache.org
Date:   16.10.2013 07:37
Subject:Re: Regarding Solr Cloud issue...



I'm sorry I am not able to reproduce this issue.

I started 5 solr-4.4 instances.
I copied example directory into example1, example2, example3 and example4
cd example; java -Dbootstrap_confdir=./solr/collection1/conf
-Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar
cd example1; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
cd example2; java -Djetty.port=7575 -DzkHost=localhost:9983 -jar start.jar
cd example3; java -Djetty.port=7576 -DzkHost=localhost:9983 -jar start.jar
cd example4; java -Djetty.port=7577 -DzkHost=localhost:9983 -jar start.jar

After that I invoked:
http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection51&numShards=5&replicationFactor=1


I can see all shards having non-null ranges in clusterstate.


On Tue, Oct 15, 2013 at 8:47 PM, Chris christu...@gmail.com wrote:

 Hi Shalin,

 Thank you for your quick reply. I appreciate all the help.

 I started the SolrCloud servers first, with 5 nodes.

 Then I issued a command like the one below to create the shards:

 http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=5&replicationFactor=1

 http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4

 Please advise.

 Regards,
 Chris


 On Tue, Oct 15, 2013 at 8:07 PM, Shalin Shekhar Mangar 
 shalinman...@gmail.com wrote:

  How did you create these shards? Can you tell us how to reproduce the
  issue?
 
  Any shard in a collection with compositeId router should never have null
  ranges.
 
 
  On Tue, Oct 15, 2013 at 7:07 PM, Chris christu...@gmail.com wrote:
 
   Hi,

   I am using Solr 4.4 as cloud. While creating shards I see that the last
   shard has a range of null. I am not sure if this is a bug.

   I am stuck with having a null value for the range in clusterstate.json
   (attached below):

   "shard5":{
     "range":null,
     "state":"active",
     "replicas":{"core_node1":{
       "state":"active",
       "core":"Web_shard5_replica1",
       "node_name":"domain-name.com:1981_solr",
       "base_url":"http://domain-name.com:1981/solr",
       "leader":"true",
   "router":"compositeId"},

   I tried to use the ZooKeeper CLI to change this, but it was not able to. I
   tried to locate this file, but didn't find it anywhere.

   Can you please let me know how I can change the range from null to
   something meaningful? I have the range I need, so if I can find the file,
   maybe I can change it manually.

   My next question is - can we have a catch-all for ranges? I mean, if
   things don't match any other range then insert into this shard. Is this
   possible?

   Kindly advise.
   Chris
  
 
 
 
  --
  Regards,
  Shalin Shekhar Mangar.
 




-- 
Regards,
Shalin Shekhar Mangar.



Re: Cores with lot of folders with prefix index.XXXXXXX

2013-10-16 Thread primoz . skale
I will certainly try, but give me some time :)

Primoz



From:   Shalin Shekhar Mangar shalinman...@gmail.com
To: solr-user@lucene.apache.org
Date:   16.10.2013 07:05
Subject:Re: Cores with lot of folders with prefix index.XXX



I think that's an acceptable strategy. Can you put up a patch?


On Tue, Oct 15, 2013 at 2:32 PM, primoz.sk...@policija.si wrote:

 I have a question for the developers of Solr regarding the issue of
 left-over index folders when replication fails. Could this issue be
 resolved quickly if, when replication starts, Solr creates a flag file in
 the index.XXX folder, and when replication ends (and commits) this file is
 deleted? In this case, if a server is restarted (or on a schedule), it
 could quickly scan all the index.XXX folders and delete those (maybe not
 the last one or those relevant to the index.properties file) that still
 *contain* a flag file and are thus unfinished and uncommitted.

 I have not really looked at the code yet, so I may have a different view on
 the workings of replication. Would the solution I described at least
 address this issue?
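
 As an illustration only, a rough sketch of that flag-file idea in plain
 Java (hypothetical names; this is not Solr's actual replication code):

   import java.io.File;
   import java.io.IOException;

   public class IndexDirCleaner {

       private static final String FLAG = "replication.inprogress";

       // call when replication into indexDir starts
       public static void markInProgress(File indexDir) throws IOException {
           new File(indexDir, FLAG).createNewFile();
       }

       // call after the replicated index has been committed
       public static void markDone(File indexDir) {
           new File(indexDir, FLAG).delete();
       }

       // call on startup or on a schedule: drop index.* dirs still flagged
       public static void cleanup(File dataDir) {
           File[] children = dataDir.listFiles();
           if (children == null) return;
           for (File dir : children) {
               if (dir.isDirectory() && dir.getName().startsWith("index.")
                       && new File(dir, FLAG).exists()) {
                   File[] files = dir.listFiles();
                   if (files != null) {
                       for (File f : files) f.delete(); // index dirs are flat
                   }
                   dir.delete();
               }
           }
       }
   }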

 Best regards,

 Primoz





 From:   primoz.sk...@policija.si
 To: solr-user@lucene.apache.org
 Date:   11.10.2013 12:46
 Subject:Re: Cores with lot of folders with prefix index.XXX



 Thanks, I guess I was wrong after all in my last post.

 Primož




 From:   Shalin Shekhar Mangar shalinman...@gmail.com
 To: solr-user@lucene.apache.org
 Date:   11.10.2013 12:43
 Subject:Re: Cores with lot of folders with prefix index.XXX



 There are open issues related to extra index.XXX folders lying around if
 replication/recovery fails. See
 https://issues.apache.org/jira/browse/SOLR-4506


 On Fri, Oct 11, 2013 at 4:06 PM, Yago Riveiro
 yago.rive...@gmail.comwrote:

  The thread that you point to is about master/slave replication. Is this
  issue valid in a SolrCloud context?
 
  I checked the index.properties and indeed the variable index=index.X
  points to a folder; can the others be deleted without any scary side
  effects?
 
 
  --
  Yago Riveiro
  Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
 
 
  On Friday, October 11, 2013 at 11:31 AM, primoz.sk...@policija.si 
wrote:
 
   Do you have a lot of failed replications? Maybe those folders have
   something to do with this (please see the last answer at
   http://stackoverflow.com/questions/3145192/why-does-my-solr-slave-index-keep-growing
   ). If your disk space is valuable, check the index.properties file under
   the data folder and try to determine which folders can be safely deleted.
  
   Primož
  
  
  
  
   From: Yago Riveiro yago.rive...@gmail.com (mailto:
  yago.rive...@gmail.com)
   To: solr-user@lucene.apache.org (mailto:solr-user@lucene.apache.org)
   Date: 11.10.2013 12:13
   Subject: Re: Cores with lot of folders with prefix index.XXX
  
  
  
   I have SSDs, therefore my space is like gold; I can have 30% of my space
   wasted in failed replications, or replications that are not cleaned up.
  
   The question for me is whether this is normal behaviour or a bug. If it
   is normal behaviour I am in trouble, because an SSD with more than 512 GB
   is expensive.
  
   --
   Yago Riveiro
   Sent with Sparrow (http://www.sparrowmailapp.com/?sig)
  
  
   On Friday, October 11, 2013 at 11:03 AM,
 primoz.sk...@policija.si(mailto:
  primoz.sk...@policija.si) wrote:
  
    I think this is connected to replications being made? I also have quite
    a few of them, but currently I am not worried :)
   
  
  
  
 
 
 


 --
 Regards,
 Shalin Shekhar Mangar.





-- 
Regards,
Shalin Shekhar Mangar.



Re: Regarding Solr Cloud issue...

2013-10-16 Thread Shalin Shekhar Mangar
Chris, can you post your complete clusterstate.json? Do all shards have a
null range? Also, did you issue any core admin CREATE commands apart from
the create collection API?

Primoz, I was able to reproduce this but by doing an illegal operation.
Suppose I create a collection with numShards=5 and then I issue a core
admin create command such as:
http://localhost:8983/solr/admin/cores?action=CREATE&name=xyz&collection=mycollection51&shard=shard6

Then a shard6 is added to the collection with a null range. This is a bug
because we should never allow such a core admin create to succeed anyway.
I'll open an issue.



On Wed, Oct 16, 2013 at 11:49 AM, primoz.sk...@policija.si wrote:

 I sometimes also get null ranges when doing collections/cores API
 actions (CREATE and/or UNLOAD, etc.). In 4.4.0 that was not easily fixed
 because zkCli had problems with the putfile command, but in 4.5.0 it works
 OK. All you have to do is download clusterstate.json from ZK (get
 /clusterstate.json), fix the ranges to appropriate values, and upload the
 file back to ZK with zkCli.

 But why those null ranges happen at all is beyond me :)

 Primoz



 From:   Shalin Shekhar Mangar shalinman...@gmail.com
 To: solr-user@lucene.apache.org
 Date:   16.10.2013 07:37
 Subject:Re: Regarding Solr Cloud issue...



 I'm sorry I am not able to reproduce this issue.

 I started 5 solr-4.4 instances.
 I copied example directory into example1, example2, example3 and example4
 cd example; java -Dbootstrap_confdir=./solr/collection1/conf
 -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar
 cd example1; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
 cd example2; java -Djetty.port=7575 -DzkHost=localhost:9983 -jar start.jar
 cd example3; java -Djetty.port=7576 -DzkHost=localhost:9983 -jar start.jar
 cd example4; java -Djetty.port=7577 -DzkHost=localhost:9983 -jar start.jar

 After that I invoked:

 http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection51&numShards=5&replicationFactor=1


 I can see all shards having non-null ranges in clusterstate.


 On Tue, Oct 15, 2013 at 8:47 PM, Chris christu...@gmail.com wrote:

  Hi Shalin,
 
  Thank you for your quick reply. I appreciate all the help.
 
  I started the SolrCloud servers first, with 5 nodes.
 
  Then I issued a command like the one below to create the shards:
 
  http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=5&replicationFactor=1

  
 

  http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4
 
  Please advise.
 
  Regards,
  Chris
 
 
  On Tue, Oct 15, 2013 at 8:07 PM, Shalin Shekhar Mangar 
  shalinman...@gmail.com wrote:
 
   How did you create these shards? Can you tell us how to reproduce the
   issue?
  
    Any shard in a collection with compositeId router should never have null
    ranges.
  
  
   On Tue, Oct 15, 2013 at 7:07 PM, Chris christu...@gmail.com wrote:
  
     Hi,

     I am using Solr 4.4 as cloud. While creating shards I see that the last
     shard has a range of null. I am not sure if this is a bug.

     I am stuck with having a null value for the range in clusterstate.json
     (attached below):

     "shard5":{
       "range":null,
       "state":"active",
       "replicas":{"core_node1":{
         "state":"active",
         "core":"Web_shard5_replica1",
         "node_name":"domain-name.com:1981_solr",
         "base_url":"http://domain-name.com:1981/solr",
         "leader":"true",
     "router":"compositeId"},

     I tried to use the ZooKeeper CLI to change this, but it was not able to.
     I tried to locate this file, but didn't find it anywhere.

     Can you please let me know how I can change the range from null to
     something meaningful? I have the range I need, so if I can find the
     file, maybe I can change it manually.

     My next question is - can we have a catch-all for ranges? I mean, if
     things don't match any other range then insert into this shard. Is this
     possible?

     Kindly advise.
     Chris
   
  
  
  
   --
   Regards,
   Shalin Shekhar Mangar.
  
 



 --
 Regards,
 Shalin Shekhar Mangar.




-- 
Regards,
Shalin Shekhar Mangar.


Re: Regarding Solr Cloud issue...

2013-10-16 Thread primoz . skale
If I am not mistaken, the only way to create a new shard in a collection
in 4.4.0 was to use the cores API. That worked fine for me until I used
*other* cores API commands. Those usually produced null ranges.

In 4.5.0 this is fixed with the newly added createshard etc. commands in the
collections API, right?

Primoz



From:   Shalin Shekhar Mangar shalinman...@gmail.com
To: solr-user@lucene.apache.org
Date:   16.10.2013 09:06
Subject:Re: Regarding Solr Cloud issue...



Chris, can you post your complete clusterstate.json? Do all shards have a
null range? Also, did you issue any core admin CREATE commands apart from
the create collection API?

Primoz, I was able to reproduce this but by doing an illegal operation.
Suppose I create a collection with numShards=5 and then I issue a core
admin create command such as:
http://localhost:8983/solr/admin/cores?action=CREATE&name=xyz&collection=mycollection51&shard=shard6


Then a shard6 is added to the collection with a null range. This is a bug
because we should never allow such a core admin create to succeed anyway.
I'll open an issue.



On Wed, Oct 16, 2013 at 11:49 AM, primoz.sk...@policija.si wrote:

 I sometimes also get null ranges when doing collections/cores API
 actions (CREATE and/or UNLOAD, etc.). In 4.4.0 that was not easily fixed
 because zkCli had problems with the putfile command, but in 4.5.0 it works
 OK. All you have to do is download clusterstate.json from ZK (get
 /clusterstate.json), fix the ranges to appropriate values, and upload the
 file back to ZK with zkCli.

 But why those null ranges happen at all is beyond me :)

 Primoz



 From:   Shalin Shekhar Mangar shalinman...@gmail.com
 To: solr-user@lucene.apache.org
 Date:   16.10.2013 07:37
 Subject:Re: Regarding Solr Cloud issue...



 I'm sorry I am not able to reproduce this issue.

 I started 5 solr-4.4 instances.
 I copied example directory into example1, example2, example3 and example4
 cd example; java -Dbootstrap_confdir=./solr/collection1/conf
 -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar
 cd example1; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
 cd example2; java -Djetty.port=7575 -DzkHost=localhost:9983 -jar start.jar
 cd example3; java -Djetty.port=7576 -DzkHost=localhost:9983 -jar start.jar
 cd example4; java -Djetty.port=7577 -DzkHost=localhost:9983 -jar start.jar

 After that I invoked:

 http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection51&numShards=5&replicationFactor=1



 I can see all shards having non-null ranges in clusterstate.


 On Tue, Oct 15, 2013 at 8:47 PM, Chris christu...@gmail.com wrote:

  Hi Shalin,
 
  Thank you for your quick reply. I appreciate all the help.
 
  I started the SolrCloud servers first, with 5 nodes.
 
  Then I issued a command like the one below to create the shards:
 
  http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=5&replicationFactor=1
 
  http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4
 
  Please advise.
 
  Regards,
  Chris
 
 
  On Tue, Oct 15, 2013 at 8:07 PM, Shalin Shekhar Mangar 
  shalinman...@gmail.com wrote:
 
    How did you create these shards? Can you tell us how to reproduce the
    issue?
   
    Any shard in a collection with compositeId router should never have null
    ranges.
  
  
   On Tue, Oct 15, 2013 at 7:07 PM, Chris christu...@gmail.com wrote:
  
 Hi,

 I am using Solr 4.4 as cloud. While creating shards I see that the last
 shard has a range of null. I am not sure if this is a bug.

 I am stuck with having a null value for the range in clusterstate.json
 (attached below):

 "shard5":{
   "range":null,
   "state":"active",
   "replicas":{"core_node1":{
     "state":"active",
     "core":"Web_shard5_replica1",
     "node_name":"domain-name.com:1981_solr",
     "base_url":"http://domain-name.com:1981/solr",
     "leader":"true",
 "router":"compositeId"},

 I tried to use the ZooKeeper CLI to change this, but it was not able to.
 I tried to locate this file, but didn't find it anywhere.

 Can you please let me know how I can change the range from null to
 something meaningful? I have the range I need, so if I can find the file,
 maybe I can change it manually.

 My next question is - can we have a catch-all for ranges? I mean, if
 things don't match any other range then insert into this shard. Is this
 possible?

 Kindly advise.
 Chris
   
  
  
  
   --
   Regards,
   Shalin Shekhar Mangar.
  
 



 --
 Regards,
 Shalin Shekhar Mangar.




-- 
Regards,
Shalin Shekhar Mangar.



Re: SPLITSHARD not working in SOLR-4.4.0

2013-10-16 Thread RadhaJayalakshmi
Shalin,
It is working for me. As you rightly pointed out, I had defined the
UNIQUE_KEY field in the schema but forgot to mention this field in the
uniqueKey declaration. After I added this, it started working.
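
For reference, a minimal sketch of the two schema.xml pieces involved (the
field type and attributes here are illustrative):

  <field name="UNIQUE_KEY" type="string" indexed="true" stored="true" required="true"/>
  <uniqueKey>UNIQUE_KEY</uniqueKey>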
Another question I have with regard to SPLITSHARD is that we are not able to
control on which Tomcat nodes the split shards should be created.
While creating a collection, we can mention createNodeSet to set our
preference of Tomcat nodes on which the collection's slices should be
created.
But I don't find that feature in the SPLITSHARD API. Would you know whether
it is a limitation in Solr 4.4, or is there another means by which we can
achieve this?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SPLITSHARD-not-working-in-SOLR-4-4-0-tp4095623p4095809.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: [Indexing XML files in Solr with DataImportHandler]

2013-10-16 Thread kujta1
It is not indexing; it says there are no files indexed.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Indexing-XML-files-in-Solr-with-DataImportHandler-tp4095628p4095811.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: [Indexing XML files in Solr with DataImportHandler]

2013-10-16 Thread Gora Mohanty
On 16 October 2013 13:06, kujta1 kujtim.rahm...@gmail.com wrote:
 It is not indexing; it says there are no files indexed.

If you expect answers on the mailing list it might be best to provide
details here. From a quick glance at Stackoverflow, it looks like you
need a FileListEntityProcessor.

Searching Google turns up many examples of using a FileDataSource,
e.g., see:
http://java.dzone.com/news/data-import-handler-%E2%80%93-import
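
For example, a minimal data-config sketch along those lines (paths, entity
names, and XPath expressions are illustrative, not from the original post):

  <dataConfig>
    <dataSource type="FileDataSource"/>
    <document>
      <entity name="files" processor="FileListEntityProcessor"
              baseDir="/path/to/xml" fileName=".*\.xml"
              recursive="true" rootEntity="false">
        <entity name="record" processor="XPathEntityProcessor"
                url="${files.fileAbsolutePath}" forEach="/records/record">
          <field column="id" xpath="/records/record/id"/>
          <field column="title" xpath="/records/record/title"/>
        </entity>
      </entity>
    </document>
  </dataConfig>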

Regards,
Gora


Re: Debugging update request

2013-10-16 Thread michael.boom
Thanks Erick!

The version is 4.4.0.

I'm posting 100k-doc batches every 30-40 sec from each indexing client, and
sometimes two or more clients post in a very small timeframe. That's when I
think the deadlock happens.

I'll try to replicate the problem and check the thread dump.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Debugging-update-request-tp4095619p4095821.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Debugging update request

2013-10-16 Thread Chris Geeringh
I ran an import last night, and this morning my cloud wouldn't accept
updates. I'm running the latest 4.6 snapshot. I was importing with the
latest SolrJ snapshot, using the javabin transport with CloudSolrServer.

The cluster had indexed ~1.3 million docs before no further updates were
accepted, querying still working.

I'll run jstack shortly and provide the results.

On Wednesday, October 16, 2013, michael.boom wrote:

 Thanks Erick!

 The version is 4.4.0.

  I'm posting 100k-doc batches every 30-40 sec from each indexing client,
  and sometimes two or more clients post in a very small timeframe. That's
  when I think the deadlock happens.

 I'll try to replicate the problem and check the thread dump.



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Debugging-update-request-tp4095619p4095821.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: SPLITSHARD not working in SOLR-4.4.0

2013-10-16 Thread Shalin Shekhar Mangar
Thanks for clearing that up.

The way it is implemented, shard splitting must create the leaders of the
sub-shards on the same node as the leader of the parent shard. The locations
of the other replicas of the sub-shards are chosen at random. Split shard
doesn't support a createNodeSet parameter yet, but it'd make for a nice
improvement. Can you please open a JIRA issue?
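
For reference, the split itself is invoked through the Collections API along
these lines (collection and shard names are illustrative):

http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=mycollection&shard=shard1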


On Wed, Oct 16, 2013 at 1:00 PM, RadhaJayalakshmi 
rlakshminaraya...@inautix.co.in wrote:

 Shalin,
 It is working for me. As you rightly pointed out, I had defined the
 UNIQUE_KEY field in the schema but forgot to mention this field in the
 uniqueKey declaration. After I added this, it started working.
 Another question I have with regard to SPLITSHARD is that we are not able
 to control on which Tomcat nodes the split shards should be created.
 While creating a collection, we can mention createNodeSet to set our
 preference of Tomcat nodes on which the collection's slices should be
 created.
 But I don't find that feature in the SPLITSHARD API. Would you know whether
 it is a limitation in Solr 4.4, or is there another means by which we can
 achieve this?



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/SPLITSHARD-not-working-in-SOLR-4-4-0-tp4095623p4095809.html
 Sent from the Solr - User mailing list archive at Nabble.com.




-- 
Regards,
Shalin Shekhar Mangar.


Re: Regarding Solr Cloud issue...

2013-10-16 Thread Shalin Shekhar Mangar
If the initial collection was created with a numShards parameter (and hence
the compositeId router), then there is no way to create a new logical shard.
You can add replicas with the core admin API, but only to shards that
already exist. A new logical shard can only be created by splitting an
existing one.

The createshard API also has the same limitation -- it cannot create a
shard for a collection with compositeId router. It is supposed to be used
for collections with custom sharding (i.e. implicit router). In such
collections, there is no concept of a hash range and routing is done
explicitly by the user using the shards parameter in the request or by
sending the request to the target core/node directly.

So, in summary, attempting to add a new logical shard to a collection with
compositeId router via CoreAdmin APIs is wrong, unsupported and should be
disallowed. Adding replicas to existing logical shards is okay though.
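
For illustration, creating such a custom-sharded collection looks roughly
like this, assuming Solr 4.5's parameter names (collection and shard names
are illustrative):

http://localhost:8983/solr/admin/collections?action=CREATE&name=mycoll&router.name=implicit&shards=shardA,shardB&replicationFactor=1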


On Wed, Oct 16, 2013 at 12:56 PM, primoz.sk...@policija.si wrote:

 If I am not mistaken, the only way to create a new shard in a collection
 in 4.4.0 was to use the cores API. That worked fine for me until I used
 *other* cores API commands. Those usually produced null ranges.

 In 4.5.0 this is fixed with the newly added createshard etc. commands in
 the collections API, right?

 Primoz



 From:   Shalin Shekhar Mangar shalinman...@gmail.com
 To: solr-user@lucene.apache.org
 Date:   16.10.2013 09:06
 Subject:Re: Regarding Solr Cloud issue...



 Chris, can you post your complete clusterstate.json? Do all shards have a
 null range? Also, did you issue any core admin CREATE commands apart from
 the create collection API?

 Primoz, I was able to reproduce this but by doing an illegal operation.
 Suppose I create a collection with numShards=5 and then I issue a core
 admin create command such as:

 http://localhost:8983/solr/admin/cores?action=CREATE&name=xyz&collection=mycollection51&shard=shard6


 Then a shard6 is added to the collection with a null range. This is a bug
 because we should never allow such a core admin create to succeed anyway.
 I'll open an issue.



 On Wed, Oct 16, 2013 at 11:49 AM, primoz.sk...@policija.si wrote:

  I sometimes also get null ranges when doing collections/cores API
  actions (CREATE and/or UNLOAD, etc.). In 4.4.0 that was not easily fixed
  because zkCli had problems with the putfile command, but in 4.5.0 it works
  OK. All you have to do is download clusterstate.json from ZK (get
  /clusterstate.json), fix the ranges to appropriate values, and upload the
  file back to ZK with zkCli.
 
  But why those null ranges happen at all is beyond me :)
 
  Primoz
 
 
 
  From:   Shalin Shekhar Mangar shalinman...@gmail.com
  To: solr-user@lucene.apache.org
  Date:   16.10.2013 07:37
  Subject:Re: Regarding Solr Cloud issue...
 
 
 
  I'm sorry I am not able to reproduce this issue.
 
  I started 5 solr-4.4 instances.
  I copied example directory into example1, example2, example3 and example4
  cd example; java -Dbootstrap_confdir=./solr/collection1/conf
  -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar
  cd example1; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
  cd example2; java -Djetty.port=7575 -DzkHost=localhost:9983 -jar start.jar
  cd example3; java -Djetty.port=7576 -DzkHost=localhost:9983 -jar start.jar
  cd example4; java -Djetty.port=7577 -DzkHost=localhost:9983 -jar start.jar
 
  After that I invoked:
 
  http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection51&numShards=5&replicationFactor=1

 
 
  I can see all shards having non-null ranges in clusterstate.
 
 
  On Tue, Oct 15, 2013 at 8:47 PM, Chris christu...@gmail.com wrote:
 
   Hi Shalin,
  
   Thank you for your quick reply. I appreciate all the help.
  
   I started the SolrCloud servers first, with 5 nodes.
  
   Then I issued a command like the one below to create the shards:
  
   http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=5&replicationFactor=1
  
   http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4
  
   Please advise.
  
   Regards,
   Chris
  
  
   On Tue, Oct 15, 2013 at 8:07 PM, Shalin Shekhar Mangar 
   shalinman...@gmail.com wrote:
  
    How did you create these shards? Can you tell us how to reproduce the
    issue?
   
    Any shard in a collection with compositeId router should never have null
    ranges.
   
   
On Tue, Oct 15, 2013 at 7:07 PM, Chris christu...@gmail.com wrote:
   
 Hi,

 I am using Solr 4.4 as cloud. While creating shards I see that the last
 shard has a range of null. I am not sure if this is a bug.

 I am stuck with having a null value for the range in clusterstate.json
 (attached below):

 "shard5":{
   "range":null,
   "state":"active",
   "replicas":{"core_node1":{
     "state":"active",
     "core":"Web_shard5_replica1",
 

RE: ClusteringComponent under Tomcat 7

2013-10-16 Thread Lieberman, Ariel
Hi,

If I recall correctly, this problem relates to the class loader path.

Make sure that the ./lib (under solr home, where you've placed the jars) is
not also part of the Tomcat class loader path.
(In other words, Solr and Tomcat cannot share the same ./lib directories.)

-Ariel

-Original Message-
From: ravi koshal [mailto:ravikosha...@gmail.com] 
Sent: Tuesday, October 15, 2013 10:10 AM
To: solr-user@lucene.apache.org
Subject: Re: ClusteringComponent under Tomcat 7

Hi Lieberman,
I am facing the same issue. Were you able to resolve this?
I am able to see the Solr home, but the cores do not appear.
My stack trace is as follows:

org.apache.solr.common.SolrException: Error Instantiating SearchComponent, 
solr.clustering.ClusteringComponent failed to instantiate 
org.apache.solr.handler.component.SearchComponent
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:834)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:625)
at
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:524)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:559)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:241)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.util.concurrent.FutureTask.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source) Caused by: 
org.apache.solr.common.SolrException: Error Instantiating SearchComponent, 
solr.clustering.ClusteringComponent failed to instantiate 
org.apache.solr.handler.component.SearchComponent
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:547)
at org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:582)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2128)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2122)
at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2155)
at
org.apache.solr.core.SolrCore.loadSearchComponents(SolrCore.java:1177)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:762)
... 11 more
Caused by: java.lang.ClassCastException: class 
org.apache.solr.handler.clustering.ClusteringComponent
at java.lang.Class.asSubclass(Unknown Source)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:443)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:381)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:526)


Lieberman, Ariel Ariel.Lieberman at verint.com writes:

 
 Hi,
 
 I'm trying to run Solr 4.3 (and 4.4) with 
 -Dsolr.clustering.enabled=true
 
 I've copied all relevant jars to ./lib directory under the instance.
 
 With jetty it runs OK! But under Tomcat I receive the error (exception)
 below.
 
 Any idea/help?
 
 Thanks,
 
 -Ariel
 
 org.apache.solr.common.SolrException: Error Instantiating 
 SearchComponent, solr.clustering.ClusteringComponent failed to 
 instantiate
org.apache.solr.handler.component.SearchComponent
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:835)
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:629)
  at
org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:622)
  at org.apache.solr.core.CoreContainer.create(CoreContainer.java:657)
  at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364)
  at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356)
  at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
  at java.util.concurrent.FutureTask.run(Unknown Source)
  at java.util.concurrent.Executors$RunnableAdapter.call(Unknown
Source)
  at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
  at java.util.concurrent.FutureTask.run(Unknown Source)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
  at java.lang.Thread.run(Unknown Source) Caused by: 
 org.apache.solr.common.SolrException: Error Instantiating
SearchComponent,
 solr.clustering.ClusteringComponent failed to instantiate
org.apache.solr.handler.component.SearchComponent
  at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:551)
  at
org.apache.solr.core.SolrCore.createInitInstance(SolrCore.java:586)
  at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2173)
  at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2167)
  at org.apache.solr.core.SolrCore.initPlugins(SolrCore.java:2200)
  at
org.apache.solr.core.SolrCore.loadSearchComponents(SolrCore.java:1231)
  at org.apache.solr.core.SolrCore.<init>(SolrCore.java:766)
  ... 13 more
 Caused by: 

req info : SOLRJ and TermVector

2013-10-16 Thread elfu
hi,

Can I access TermVector information using SolrJ?


thx,
elfu
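
As far as I know, SolrJ 4.x has no typed accessor for term vectors, but you
can query a handler that has the TermVectorComponent configured and walk the
raw response. A minimal sketch, assuming the example /tvrh handler from
solrconfig.xml, fields indexed with termVectors="true", and an existing
SolrServer instance named server:

  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.response.QueryResponse;
  import org.apache.solr.common.util.NamedList;

  SolrQuery q = new SolrQuery("id:1");
  q.setRequestHandler("/tvrh");   // handler with the TermVectorComponent
  q.set("tv.tf", true);           // ask for term frequencies
  q.set("tv.df", true);           // ask for document frequencies
  QueryResponse rsp = server.query(q);

  // the component returns its data under the "termVectors" key
  NamedList<?> termVectors = (NamedList<?>) rsp.getResponse().get("termVectors");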


Re: Regarding Solr Cloud issue...

2013-10-16 Thread primoz . skale
Yep, you are right - I only created extra replicas with the cores API. For a
new shard I had to use the split shard command.

My apologies.

Primož



From:   Shalin Shekhar Mangar shalinman...@gmail.com
To: solr-user@lucene.apache.org
Date:   16.10.2013 10:45
Subject:Re: Regarding Solr Cloud issue...



If the initial collection was created with a numShards parameter (and hence
the compositeId router), then there is no way to create a new logical shard.
You can add replicas with the core admin API but only to shards that already
exist. A new logical shard can only be created by splitting an existing one.

The createshard API also has the same limitation -- it cannot create a
shard for a collection with compositeId router. It is supposed to be used
for collections with custom sharding (i.e. implicit router). In such
collections, there is no concept of a hash range and routing is done
explicitly by the user using the shards parameter in the request or by
sending the request to the target core/node directly.

So, in summary, attempting to add a new logical shard to a collection with
compositeId router via CoreAdmin APIs is wrong, unsupported and should be
disallowed. Adding replicas to existing logical shards is okay though.


On Wed, Oct 16, 2013 at 12:56 PM, primoz.sk...@policija.si wrote:

 If I am not mistaken, the only way to create a new shard in a collection
 in 4.4.0 was to use the cores API. That worked fine for me until I used
 *other* cores API commands. Those usually produced null ranges.

 In 4.5.0 this is fixed with the newly added createshard etc. commands in
 the collections API, right?

 Primoz



 From:   Shalin Shekhar Mangar shalinman...@gmail.com
 To: solr-user@lucene.apache.org
 Date:   16.10.2013 09:06
 Subject:Re: Regarding Solr Cloud issue...



 Chris, can you post your complete clusterstate.json? Do all shards have a
 null range? Also, did you issue any core admin CREATE commands apart from
 the create collection API?

 Primoz, I was able to reproduce this but by doing an illegal operation.
 Suppose I create a collection with numShards=5 and then I issue a core
 admin create command such as:

 http://localhost:8983/solr/admin/cores?action=CREATE&name=xyz&collection=mycollection51&shard=shard6



 Then a shard6 is added to the collection with a null range. This is a bug
 because we should never allow such a core admin create to succeed anyway.
 I'll open an issue.



 On Wed, Oct 16, 2013 at 11:49 AM, primoz.sk...@policija.si wrote:

  I sometimes also get null ranges when doing collections/cores API
  actions (CREATE and/or UNLOAD, etc.). In 4.4.0 that was not easily fixed
  because zkCli had problems with the putfile command, but in 4.5.0 it works
  OK. All you have to do is download clusterstate.json from ZK (get
  /clusterstate.json), fix the ranges to appropriate values, and upload the
  file back to ZK with zkCli.
 
  But why those null ranges happen at all is beyond me :)
 
  Primoz
 
 
 
  From:   Shalin Shekhar Mangar shalinman...@gmail.com
  To: solr-user@lucene.apache.org
  Date:   16.10.2013 07:37
  Subject:Re: Regarding Solr Cloud issue...
 
 
 
  I'm sorry I am not able to reproduce this issue.
 
  I started 5 solr-4.4 instances.
  I copied example directory into example1, example2, example3 and example4
  cd example; java -Dbootstrap_confdir=./solr/collection1/conf
  -Dcollection.configName=myconf -DzkRun -DnumShards=1 -jar start.jar
  cd example1; java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar
  cd example2; java -Djetty.port=7575 -DzkHost=localhost:9983 -jar start.jar
  cd example3; java -Djetty.port=7576 -DzkHost=localhost:9983 -jar start.jar
  cd example4; java -Djetty.port=7577 -DzkHost=localhost:9983 -jar start.jar
 
  After that I invoked:
 
  http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection51&numShards=5&replicationFactor=1


 
 
  I can see all shards having non-null ranges in clusterstate.
 
 
  On Tue, Oct 15, 2013 at 8:47 PM, Chris christu...@gmail.com wrote:
 
   Hi Shalin,
  
   Thank you for your quick reply. I appreciate all the help.
  
   I started the SolrCloud servers first, with 5 nodes.
  
   Then I issued a command like the one below to create the shards:
  
   http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=5&replicationFactor=1
  
   http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&replicationFactor=4
  
   Please advise.
  
   Regards,
   Chris
  
  
   On Tue, Oct 15, 2013 at 8:07 PM, Shalin Shekhar Mangar 
   shalinman...@gmail.com wrote:
  
    How did you create these shards? Can you tell us how to reproduce the
    issue?
   
    Any shard in a collection with compositeId router should never have null
    ranges.
   
   
    On Tue, Oct 15, 2013 at 7:07 PM, Chris christu...@gmail.com wrote:
   
 Hi,

 I am using Solr 4.4 as cloud. While creating shards I

Local Solr and Webserver-Solr act differently (and treated like or)

2013-10-16 Thread Stavros Delisavas
Hello Solr-Experts,

I am currently having a strange issue with my Solr queries. I am running
a small PHP/MySQL website that uses Solr for faster text searches in
name lists, movie titles, etc. Recently I noticed that the results on my
local development environment differ from those on my webserver. Both
use the 100% same MySQL database with identical Solr queries for
data import.
This is a sample query:

http://localhost:8080/solr/select/?q=title%3A%28into+AND+the+AND+wild*%29&version=2.2&start=0&rows=1000&indent=on&fl=titleid

It is autogenerated by a PHP script and 100% identical locally and on
my webserver. My local Solr gives me the expected results: all entries
that have the words "into AND the AND wild*" in them.
But my webserver acts as if I was looking for "into OR the OR
wild*", even though the query is the same (as shown above). That's why I
get useless (too many) results on the webserver side.

I don't know what could be the issue. I have tried to check the
config files, but I don't really know what to look for, so it is
overwhelming for me to search through this big file without knowing.

What could be the problem, where can I check/find it, and how can I solve
it?

In case additional information is needed, let me know please.

Thank you!

(Excuse my poor English, please. It's not my mother language.)


Re: Debugging update request

2013-10-16 Thread michael.boom
I got the trace from jstack.
I found references to a semaphore, but I'm not sure if this is what you meant.
Here's the trace:
http://pastebin.com/15QKAz7U



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Debugging-update-request-tp4095619p4095847.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Debugging update request

2013-10-16 Thread Chris Geeringh
Here is my jstack output... Lots of blocked threads.

http://pastebin.com/1ktjBYbf


On 16 October 2013 10:28, michael.boom my_sky...@yahoo.com wrote:

 I got the trace from jstack.
 I found references to a semaphore, but I'm not sure if this is what you meant.
 Here's the trace:
 http://pastebin.com/15QKAz7U



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Debugging-update-request-tp4095619p4095847.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Boosting a field with defType:dismax -- No results at all

2013-10-16 Thread uwe72
Hi there,

I want to boost a field; see below.

If I add defType=dismax I don't get results at all anymore.

What am I doing wrong?

Regards
Uwe

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="omitHeader">true</str>
    <str name="df">text</str>
    <str name="q.op">AND</str>

    <str name="spellcheck.dictionary">default</str>

    <str name="spellcheck">true</str>
    <str name="spellcheck.extendedResults">true</str>
    <str name="spellcheck.count">1</str>

    <str name="spellcheck.maxResultsForSuggest">100</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.collateExtendedResults">true</str>
    <str name="spellcheck.maxCollations">1</str>

    <str name="defType">dismax</str>
    <str name="qf">
       SignalImpl.baureihe^1011 text^0.1
    </str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850.html
Sent from the Solr - User mailing list archive at Nabble.com.


Timeout Errors while using Collections API

2013-10-16 Thread RadhaJayalakshmi
Hi,
My setup is:
Zookeeper ensemble - running with 3 nodes
Tomcats - 9 Tomcat instances are brought up, registering with ZooKeeper.

Steps:
1) I uploaded the Solr configuration (db_data_config, solrconfig, schema
XMLs) into ZooKeeper.
2) Now, I am trying to create a collection with the Collections API like
below:

http://miadevuser001.albridge.com:7021/solr/admin/collections?action=CREATE&name=Schwab_InvACC_Coll&numShards=1&replicationFactor=2&createNodeSet=localhost:7034_solr,localhost:7036_solr&collection.configName=InvestorAccountDomainConfig

Now, when I execute this command, I am getting the following error:

<response><lst name="responseHeader"><int name="status">500</int><int
name="QTime">60015</int></lst><lst name="error"><str
name="msg">createcollection the collection time out:60s</str><str
name="trace">org.apache.solr.common.SolrException: createcollection the
collection time out:60s
at
org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:175)
at
org.apache.solr.handler.admin.CollectionsHandler.handleResponse(CollectionsHandler.java:156)
at
org.apache.solr.handler.admin.CollectionsHandler.handleCreateAction(CollectionsHandler.java:290)
at
org.apache.solr.handler.admin.CollectionsHandler.handleRequestBody(CollectionsHandler.java:112)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
at
org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:611)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:218)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
at
org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:947)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
at
org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1009)
at
org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
at
org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:722)
</str><int name="code">500</int></lst></response>

Now, after I got this error, I am not able to do any operation on these
instances with the Collections API. It repeatedly gives the same timeout
error.
This setup was working fine 5 minutes back; suddenly it started throwing
these exceptions. Any ideas, please?






--
View this message in context: 
http://lucene.472066.n3.nabble.com/Timeout-Errors-while-using-Collections-API-tp4095852.html
Sent from the Solr - User mailing list archive at Nabble.com.


how does solr load plugins?

2013-10-16 Thread Liu Bo
Hi,

I wrote a plugin to index content, reusing our DAO layer, which is developed
using Spring.

What I am doing now is putting the plugin jar and all the other jars the DAO
layer depends on into the shared lib folder under solr home.

In the log, I can see all the jars are loaded through SolrResourceLoader
like:

INFO  - 2013-10-16 16:25:30.611; org.apache.solr.core.SolrResourceLoader;
Adding 'file:/D:/apache-tomcat-7.0.42/solr/lib/spring-tx-3.1.0.RELEASE.jar'
to classloader


Then I initialize the Spring context using:

ApplicationContext context = new
FileSystemXmlApplicationContext("/solr/spring/solr-plugin-bean-test.xml");


Then Spring will complain:

INFO  - 2013-10-16 16:33:57.432;
org.springframework.context.support.AbstractApplicationContext; Refreshing
org.springframework.context.support.FileSystemXmlApplicationContext@e582a85:
startup date [Wed Oct 16 16:33:57 CST 2013]; root of context hierarchy
INFO  - 2013-10-16 16:33:57.491;
org.springframework.beans.factory.xml.XmlBeanDefinitionReader; Loading XML
bean definitions from file
[D:\apache-tomcat-7.0.42\solr\spring\solr-plugin-bean-test.xml]
ERROR - 2013-10-16 16:33:59.944;
com.test.search.solr.spring.AppicationContextWrapper; Configuration
problem: Unable to locate Spring NamespaceHandler for XML schema namespace [
http://www.springframework.org/schema/context]
Offending resource: file
[D:\apache-tomcat-7.0.42\solr\spring\solr-plugin-bean-test.xml]

The Spring context requires spring-tx-3.1.xsd, which does exist
in spring-tx-3.1.0.RELEASE.jar under the
org\springframework\transaction\config\ package, but the program can't
find it even though it could load the Spring classes successfully.

The following won't work either.

ApplicationContext context = new
ClassPathXmlApplicationContext("classpath:spring/solr-plugin-bean-test.xml");
// the solr-plugin-bean-test.xml is packaged in plugin.jar as well

But when I put all the jars under TOMCAT_HOME/webapps/solr/WEB-INF/lib and
use

ApplicationContext context = new
ClassPathXmlApplicationContext("classpath:spring/solr-plugin-bean-test.xml");

everything works fine: I can initialize the Spring context and load DAO
beans to read data and then write it to the Solr index. But isn't modifying
solr.war a bad practice?

It seems SolrResourceLoader only loads classes from plugin jars, but these
jars are NOT on the classpath. Please correct me if I am wrong.

Is there any way to use resources in plugin jars, such as configuration
files?

BTW, is there any difference between the SolrResourceLoader and the Tomcat
webapp classloader?
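
One workaround that may help (a sketch only, untested here): Spring resolves
its XML namespace handlers and XSDs through the thread-context classloader,
which under Tomcat is the webapp's loader, not Solr's plugin loader.
Temporarily installing the plugin classloader as the context classloader
before building the context looks roughly like this:

  import org.apache.solr.core.SolrCore;
  import org.springframework.context.ApplicationContext;
  import org.springframework.context.support.ClassPathXmlApplicationContext;

  public static ApplicationContext buildContext(SolrCore core) {
      ClassLoader pluginLoader = core.getResourceLoader().getClassLoader();
      ClassLoader previous = Thread.currentThread().getContextClassLoader();
      Thread.currentThread().setContextClassLoader(pluginLoader);
      try {
          // the XML file is packaged in the plugin jar, as described above
          return new ClassPathXmlApplicationContext(
              "classpath:spring/solr-plugin-bean-test.xml");
      } finally {
          Thread.currentThread().setContextClassLoader(previous);
      }
  }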

-- 
All the best

Liu Bo


SolrCloud Query Balancing

2013-10-16 Thread michael.boom
I have set up a SolrCloud system with 3 shards, replicationFactor=3, on 3
machines along with 3 Zookeeper instances.

My web application makes queries to Solr specifying the hostname of one of
the machines, so that machine will always get the request and the other
ones will just serve as an aid.
So I would like to set up a load balancer that would fix that, balancing
the queries across all machines.
Maybe doing the same while indexing.

Would this be a good practice? Any recommended tools for doing that?

Thanks!



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: SolrCloud Query Balancing

2013-10-16 Thread Chris Geeringh
If your web application is SolrJ/Java based, use a CloudSolrServer
instance with the ZooKeeper host string. It will take care of load balancing
when querying and indexing, and will handle routing if a node goes down.


On 16 October 2013 10:52, michael.boom my_sky...@yahoo.com wrote:

 I have set up a SolrCloud system with 3 shards, replicationFactor=3, on 3
 machines along with 3 Zookeeper instances.

 My web application makes queries to Solr specifying the hostname of one of
 the machines, so that machine will always get the request and the other
 ones will just serve as an aid.
 So I would like to set up a load balancer that would fix that, balancing
 the queries across all machines.
 Maybe doing the same while indexing.

 Would this be a good practice? Any recommended tools for doing that?

 Thanks!



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: SolrCloud Query Balancing

2013-10-16 Thread michael.boom
Thanks!

I've read a little bit about that, but my app is PHP-based, so I'm afraid I
can't use that.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854p4095857.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Different document types in different collections OR same collection without sharing fields?

2013-10-16 Thread user 01
Can some expert users please leave a comment on this?


On Sun, Oct 6, 2013 at 2:54 AM, user 01 user...@gmail.com wrote:

  Using a single-node Solr instance, I need to search for, let's say,
 electronics items & grocery items. But I never want to search both of them
 together. When I search for electronics I don't expect a grocery item ever,
 & vice versa.

 Should I be defining both these document types within a single schema.xml,
 or should I use a different collection for each of these two (maintaining
 separate schema.xml & solrconfig.xml for each of the two)?

 I believe that if I add both to a single collection, without sharing
 fields among these two document types, I should be equally good as
 separating them into two collections (in terms of performance & all), as
 their indexes/filter caches would be totally independent of each other
 when they don't share fields?


 Also posted at SO: http://stackoverflow.com/q/19202882/530153



Re: SolrCloud Query Balancing

2013-10-16 Thread Henrik Ossipoff Hansen
What you could do (and what we do) is have a simple proxy in front of your
Solr instances. We, for example, run Nginx in front of all of our Tomcats,
and use Nginx's upstream capabilities as a simple load balancer for our
SolrCloud cluster.

http://wiki.nginx.org/HttpUpstreamModule

I'm sure other web servers have similar modules.
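
A minimal sketch of such an upstream block (hostnames and ports are
illustrative):

  upstream solrcloud {
      server 10.0.0.1:8983;
      server 10.0.0.2:8983;
      server 10.0.0.3:8983;
  }

  server {
      listen 80;
      location /solr/ {
          proxy_pass http://solrcloud;
      }
  }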

On 16/10/2013 at 12.08, michael.boom my_sky...@yahoo.com wrote:

Thanks!

I've read a little bit about that, but my app is PHP-based, so I'm afraid I
can't use that.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854p4095857.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Local Solr and Webserver-Solr act differently (and treated like or)

2013-10-16 Thread Erik Hatcher
What does the debug output from debugQuery=true say between the two?



On Oct 16, 2013, at 5:16, Stavros Delisavas stav...@delisavas.de wrote:

 Hello Solr-Experts,
 
 I am currently having a strange issue with my Solr queries. I am running
 a small PHP/MySQL website that uses Solr for faster text searches in
 name lists, movie titles, etc. Recently I noticed that the results on my
 local development environment differ from those on my webserver. Both
 use the 100% same MySQL database with identical Solr queries for
 data import.
 This is a sample query:
 
 http://localhost:8080/solr/select/?q=title%3A%28into+AND+the+AND+wild*%29&version=2.2&start=0&rows=1000&indent=on&fl=titleid
 
 It is autogenerated by a PHP script and 100% identical locally and on
 my webserver. My local Solr gives me the expected results: all entries
 that have the words "into AND the AND wild*" in them.
 But my webserver acts as if I was looking for "into OR the OR
 wild*", even though the query is the same (as shown above). That's why I
 get useless (too many) results on the webserver side.
 
 I don't know what could be the issue. I have tried to check the
 config files, but I don't really know what to look for, so it is
 overwhelming for me to search through this big file without knowing.
 
 What could be the problem, where can I check/find it, and how can I solve
 it?
 
 In case additional information is needed, let me know please.
 
 Thank you!
 
 (Excuse my poor English, please. It's not my mother language.)


Solr Copy field append values ?

2013-10-16 Thread vishgupt
Hi,
My schema is like this (external_id is a multivalued field):

<copyField source="upc" dest="external_id" />

I want to know whether the values of upc will be appended to the existing
values of external_id or will override them.

For example, if I send a document having the values

upc:131
external_id:423

for indexing in Solr with the above-mentioned schema, what will be the value
of the external_id field: 131 or 131,423?

Thanks
Vishal





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Copy-field-append-values-tp4095862.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Different document types in different collections OR same collection without sharing fields?

2013-10-16 Thread shrikanth k
Hi,

Please refer below link for clarification on fields having null value.

http://stackoverflow.com/questions/7332122/solr-what-are-the-default-values-for-fields-which-does-not-have-a-default-value

Logically it is better to have different collections for different domain
data. Having 2 collections will improve the overall performance.

Currently I am holding 2 collections for different domain data. It eases
importing data and re-indexing.


regards,
Shrikanth



On Wed, Oct 16, 2013 at 3:48 PM, user 01 user...@gmail.com wrote:

 Can some expert users please leave a comment on this?


 On Sun, Oct 6, 2013 at 2:54 AM, user 01 user...@gmail.com wrote:

   Using a single-node Solr instance, I need to search for, let's say,
  electronics items & grocery items. But I never want to search both of
  them together. When I search for electronics I don't expect a grocery
  item ever, & vice versa.
 
  Should I be defining both these document types within a single schema.xml,
  or should I use a different collection for each of these two (maintaining
  separate schema.xml & solrconfig.xml for each of the two)?
 
  I believe that if I add both to a single collection, without sharing
  fields among these two document types, I should be equally good as
  separating them into two collections (in terms of performance & all), as
  their indexes/filter caches would be totally independent of each other
  when they don't share fields?
 
 
  Also posted at SO: http://stackoverflow.com/q/19202882/530153
 




--


Re: Regarding Solr Cloud issue...

2013-10-16 Thread Chris
Hi,

Please find the clusterstate.json as below:

I have created a dev environment on one of my servers so that you can see
the issue live - http://64.251.14.47:1984/solr/

Also, there seems to be something wrong in ZooKeeper: when we try to add
documents using SolrJ, it works fine as long as the insert load is not high,
but once we start doing many inserts, it throws a lot of errors...

I am doing something like -

CloudSolrServer solrCoreCloud = new CloudSolrServer(cloudURL);
solrCoreCloud.setDefaultCollection("Image");
UpdateResponse up = solrCoreCloud.addBean(resultItem);
UpdateResponse upr = solrCoreCloud.commit();



clusterstate.json ---

{
  "collection1":{
    "shards":{
      "shard2":{
        "range":"b333-e665",
        "state":"active",
        "replicas":{"core_node4":{
            "state":"active",
            "core":"collection1",
            "node_name":"64.251.14.47:1984_solr",
            "base_url":"http://64.251.14.47:1984/solr",
            "leader":"true"}}},
      "shard3":{
        "range":"e666-1998",
        "state":"active",
        "replicas":{"core_node5":{
            "state":"active",
            "core":"collection1",
            "node_name":"64.251.14.47:1985_solr",
            "base_url":"http://64.251.14.47:1985/solr",
            "leader":"true"}}},
      "shard4":{
        "range":"1999-4ccb",
        "state":"active",
        "replicas":{
          "core_node2":{
            "state":"active",
            "core":"collection1",
            "node_name":"64.251.14.47:1982_solr",
            "base_url":"http://64.251.14.47:1982/solr"},
          "core_node6":{
            "state":"active",
            "core":"collection1",
            "node_name":"64.251.14.47:1981_solr",
            "base_url":"http://64.251.14.47:1981/solr",
            "leader":"true"}}},
      "shard5":{
        "range":"4ccc-7fff",
        "state":"active",
        "replicas":{"core_node3":{
            "state":"active",
            "core":"collection1",
            "node_name":"64.251.14.47:1983_solr",
            "base_url":"http://64.251.14.47:1983/solr",
            "leader":"true",
    "router":"compositeId"},
  "Web":{
    "shards":{
      "shard1":{
        "range":"8000-b332",
        "state":"active",
        "replicas":{"core_node2":{
            "state":"active",
            "core":"Web_shard1_replica1",
            "node_name":"64.251.14.47:1983_solr",
            "base_url":"http://64.251.14.47:1983/solr",
            "leader":"true"}}},
      "shard2":{
        "range":"b333-e665",
        "state":"active",
        "replicas":{"core_node3":{
            "state":"active",
            "core":"Web_shard2_replica1",
            "node_name":"64.251.14.47:1984_solr",
            "base_url":"http://64.251.14.47:1984/solr",
            "leader":"true"}}},
      "shard3":{
        "range":"e666-1998",
        "state":"active",
        "replicas":{"core_node4":{
            "state":"active",
            "core":"Web_shard3_replica1",
            "node_name":"64.251.14.47:1982_solr",
            "base_url":"http://64.251.14.47:1982/solr",
            "leader":"true"}}},
      "shard4":{
        "range":"1999-4ccb",
        "state":"active",
        "replicas":{"core_node5":{
            "state":"active",
            "core":"Web_shard4_replica1",
            "node_name":"64.251.14.47:1985_solr",
            "base_url":"http://64.251.14.47:1985/solr",
            "leader":"true"}}},
      "shard5":{
        "range":null,
        "state":"active",
        "replicas":{"core_node1":{
            "state":"active",
            "core":"Web_shard5_replica1",
            "node_name":"64.251.14.47:1981_solr",
            "base_url":"http://64.251.14.47:1981/solr",
            "leader":"true",
    "router":"compositeId"},
  "Image":{
    "shards":{
      "shard1":{
        "range":"8000-b332",
        "state":"active",
        "replicas":{"core_node1":{
            "state":"active",
            "core":"Image_shard1_replica1",
            "node_name":"64.251.14.47:1983_solr",
            "base_url":"http://64.251.14.47:1983/solr",
            "leader":"true"}}},
      "shard2":{
        "range":"b333-e665",
        "state":"active",
        "replicas":{"core_node2":{
            "state":"active",
            "core":"Image_shard2_replica1",
            "node_name":"64.251.14.47:1985_solr",
            "base_url":"http://64.251.14.47:1985/solr",
            "leader":"true"}}},
      "shard3":{
        "range":"e666-1998",
        "state":"active",
        "replicas":{"core_node3":{
            "state":"active",
            "core":"Image_shard3_replica1",
            "node_name":"64.251.14.47:1984_solr",
            "base_url":"http://64.251.14.47:1984/solr",
            "leader":"true"}}},
      "shard4":{
        "range":"1999-4ccb",
        "state":"active",
        "replicas":{"core_node5":{
            "state":"active",
            "core":"Image_shard4_replica1",
            "node_name":"64.251.14.47:1982_solr",
            "base_url":"http://64.251.14.47:1982/solr",
            "leader":"true"}}},
      "shard5":{
        "range":null,
        "state":"active",
        "replicas":{"core_node4":{
            "state":"active",
            "core":"Image_shard5_replica1",
            "node_name":"64.251.14.47:1981_solr",

Re: Local Solr and Webserver-Solr act differently (AND treated like OR)

2013-10-16 Thread Stavros Delisavas
My local Solr gives me:
http://pastebin.com/Q6d9dFmZ

and my webserver this:
http://pastebin.com/q87WEjVA

I copied only the first few hundred lines (of more than 8000) because
the webserver output was too big even for pastebin.



On 16.10.2013 12:27, Erik Hatcher wrote:
 What does the debug output from debugQuery=true say between the two?



 On Oct 16, 2013, at 5:16, Stavros Delisavas stav...@delisavas.de wrote:

 Hello Solr-Experts,

 I am currently having a strange issue with my Solr queries. I am running
 a small php/mysql website that uses Solr for faster text searches in
 name lists, movie titles, etc. Recently I noticed that the results in my
 local development environment differ from those on my webserver. Both
 use the exact same MySQL database with identical Solr queries for
 data import.
 This is a sample query:

 http://localhost:8080/solr/select/?q=title%3A%28into+AND+the+AND+wild*%29&version=2.2&start=0&rows=1000&indent=on&fl=titleid

 It is autogenerated by a php script and 100% identical locally and on
 my webserver. My local Solr gives me the expected results: all entries
 that have the words into AND the AND wild* in them.
 But my webserver acts as if I was looking for into OR the OR
 wild*, even though the query is the same (as shown above). That's why I
 get useless (too many) results on the webserver side.

 I don't know what could be the issue. I have tried to check the
 config files, but I don't really know what to look for, so it is
 overwhelming for me to search through this big file without knowing.

 What could be the problem, where can I check/find it, and how can I solve
 that problem?

 In case additional information is needed, let me know please.

 Thank you!

 (Excuse my poor English, please. It's not my mother language.)



Re: Concurrent indexing

2013-10-16 Thread Erick Erickson
Run jstack on the solr process (standard with Java) and
look for the word semaphore. You should see your
servers blocked on this in the Solr code. That'll pretty
much nail it.
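
For example, assuming you know Solr's JVM pid:

jstack <solr-pid> | grep -i semaphore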

There's an open JIRA to fix the underlying cause, see:
SOLR-5232, but that's currently slated for 4.6 which
won't be cut for a while.

Also, there's a patch that will fix this as a side effect,
assuming you're using SolrJ: see SOLR-4816,
which is available in 4.5.

Best,
Erick




On Tue, Oct 15, 2013 at 1:33 PM, michael.boom my_sky...@yahoo.com wrote:

 Here are some of Solr's last words (log content before it stopped
 accepting
 updates); maybe someone can help me interpret that.
 http://pastebin.com/mv7fH62H



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Concurent-indexing-tp4095409p4095642.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Regarding Solr Cloud issue...

2013-10-16 Thread Chris
oops, the actual URL is http://64.251.14.47:1981/solr/

Also, another issue that needs to be raised is the creation of cores from
the core admin section of the GUI: it doesn't really work well; it creates
files, but then they do not work (again, I am using 4.4).


On Wed, Oct 16, 2013 at 4:12 PM, Chris christu...@gmail.com wrote:

 Hi,

 Please find the clusterstate.json as below:

 I have created a dev environment on one of my servers so that you can see
 the issue live - http://64.251.14.47:1984/solr/

 Also, there seems to be something wrong in ZooKeeper: when we try to add
 documents using SolrJ, it works fine as long as the insert load is low,
 but once we start doing many inserts, it throws a lot of errors...

 I am doing something like -

 CloudSolrServer solrCoreCloud = new CloudSolrServer(cloudURL);
 solrCoreCloud.setDefaultCollection("Image");
 UpdateResponse up = solrCoreCloud.addBean(resultItem);
 UpdateResponse upr = solrCoreCloud.commit();



 clusterstate.json ---

  [clusterstate.json snipped; identical to the copy above]

Re: Regarding Solr Cloud issue...

2013-10-16 Thread Chris
Also, is there any easy way of upgrading to 4.5 without having to change most
of my plugins & configuration files?


On Wed, Oct 16, 2013 at 4:18 PM, Chris christu...@gmail.com wrote:

 oops, the actual URL is http://64.251.14.47:1981/solr/

 Also, another issue that needs to be raised is the creation of cores from
 the core admin section of the GUI: it doesn't really work well; it creates
 files, but then they do not work (again, I am using 4.4).


 On Wed, Oct 16, 2013 at 4:12 PM, Chris christu...@gmail.com wrote:

 Hi,

 Please find the clusterstate.json as below:

 I have created a dev environment on one of my servers so that you can see
 the issue live - http://64.251.14.47:1984/solr/

  Also, there seems to be something wrong in ZooKeeper: when we try to add
  documents using SolrJ, it works fine as long as the insert load is low,
  but once we start doing many inserts, it throws a lot of errors...

 I am doing something like -

 CloudSolrServer solrCoreCloud = new CloudSolrServer(cloudURL);
  solrCoreCloud.setDefaultCollection("Image");
 UpdateResponse up = solrCoreCloud.addBean(resultItem);
 UpdateResponse upr = solrCoreCloud.commit();



 clusterstate.json ---

   [clusterstate.json snipped; identical to the copy above]

Re: Concurrent indexing

2013-10-16 Thread Chris Geeringh
Hi Erick, here is a paste from the other thread (debugging update request) with
my input, as I am seeing errors too:

I ran an import last night, and this morning my cloud wouldn't accept
updates. I'm running the latest 4.6 snapshot. I was importing with the latest
solrj snapshot, and using the javabin transport with CloudSolrServer.
The cluster had indexed ~1.3 million docs before no further updates were
accepted, querying still working.

I'll run jstack shortly and provide the results.

Here is my jstack output... Lots of blocked threads.

http://pastebin.com/1ktjBYbf



On 16 October 2013 11:46, Erick Erickson erickerick...@gmail.com wrote:

 Run jstack on the solr process (standard with Java) and
 look for the word semaphore. You should see your
 servers blocked on this in the Solr code. That'll pretty
 much nail it.

 There's an open JIRA to fix the underlying cause, see:
 SOLR-5232, but that's currently slated for 4.6 which
 won't be cut for a while.

  Also, there's a patch that will fix this as a side effect,
  assuming you're using SolrJ: see SOLR-4816,
  which is available in 4.5.

 Best,
 Erick




 On Tue, Oct 15, 2013 at 1:33 PM, michael.boom my_sky...@yahoo.com wrote:

   Here are some of Solr's last words (log content before it stopped
   accepting
   updates); maybe someone can help me interpret that.
  http://pastebin.com/mv7fH62H
 
 
 
  --
  View this message in context:
 
 http://lucene.472066.n3.nabble.com/Concurent-indexing-tp4095409p4095642.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 



Re: Different document types in different collections OR same collection without sharing fields?

2013-10-16 Thread user 01
@Shrikanth: how do you manage multiple redundant configurations (isn't that
what they are)? I thought indexes would be separate when fields aren't shared.
I don't need to import any data or do re-indexing, if those are the only
benefits of separate collections. I just index when a request comes in / a new
item is added to the DB.


On Wed, Oct 16, 2013 at 4:12 PM, shrikanth k jconsult.s...@gmail.comwrote:

 Hi,

 Please refer to the link below for clarification on fields having a null value.


 http://stackoverflow.com/questions/7332122/solr-what-are-the-default-values-for-fields-which-does-not-have-a-default-value

  logically it is better to have different collections for different domain
  data. Having 2 collections will improve overall performance.

  Currently I am holding 2 collections for different domain data. It eases
  importing data and re-indexing.


 regards,
 Shrikanth



 On Wed, Oct 16, 2013 at 3:48 PM, user 01 user...@gmail.com wrote:

  Can some expert users please leave a comment on this ?
 
 
  On Sun, Oct 6, 2013 at 2:54 AM, user 01 user...@gmail.com wrote:
 
    Using a single-node Solr instance, I need to search for, let's say,
   electronics items & grocery items. But I never want to search both of
  them
   together. When I search for electronics I don't expect a grocery item
  ever
    & vice versa.
  
   Should I be defining both these document types within a single
 schema.xml
   or should I use different collections for each of the two (maintaining
   separate schema.xml & solrconfig.xml for each)?
  
   I believe that if I add both to a single collection, without sharing
   fields among these two document types, I should be equally good as
   separating them into two collections (in terms of performance & all), as
  their
   indexes/filter caches would be totally independent of each other when
  they
   don't share fields?
  
  
   Also posted at SO: http://stackoverflow.com/q/19202882/530153
  
 



 --



Re: Regarding Solr Cloud issue...

2013-10-16 Thread primoz . skale
 Also, another issue that needs to be raised is the creation of cores
from
 the core admin section of the GUI: it doesn't really work well; it
creates
 files, but then they do not work (again, I am using 4.4).

From my experience the core admin section of the GUI does not work well in 
the SolrCloud domain. If I am not mistaken this was somehow fixed in 4.5.0, 
which behaves much better.

I would use only HTTP requests (the cores and collections APIs) with 
SolrCloud, and would use the GUI only for viewing the state of the cluster 
and cores.
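
For example, creating a collection over HTTP instead of through the GUI
(host, port and names here are illustrative):

http://localhost:8983/solr/admin/collections?action=CREATE&name=Web&numShards=5&replicationFactor=1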

Primoz




Re: req info : SOLRJ and TermVector

2013-10-16 Thread Koji Sekiguchi

(13/10/16 17:47), elfu wrote:

hi,

can I access TermVector information using solrj?


There is TermVectorComponent to get termVector info:

http://wiki.apache.org/solr/TermVectorComponent

So yes, you can access it using solrj.
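
A minimal SolrJ sketch, assuming a /tvrh handler wired to the
TermVectorComponent as in the example solrconfig.xml (the handler name,
field name and server variable here are assumptions):

SolrQuery q = new SolrQuery("id:123");
q.setRequestHandler("/tvrh");  // handler with the TermVectorComponent
q.set("tv.fl", "text");        // field(s) to return term vectors for
q.set("tv.tf", true);          // include term frequencies
QueryResponse rsp = server.query(q);
NamedList tv = (NamedList) rsp.getResponse().get("termVectors");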

koji
--
http://soleami.com/blog/automatically-acquiring-synonym-knowledge-from-wikipedia.html


RE: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID

2013-10-16 Thread Akkinepalli, Bharat (ELS-CON)
Hi Otis,
Did you get a chance to look into the logs.  Please let me know if you need 
more information.  Thank you.

Regards,
Bharat Akkinepalli

-Original Message-
From: Akkinepalli, Bharat (ELS-CON) [mailto:b.akkinepa...@elsevier.com] 
Sent: Friday, October 11, 2013 2:16 PM
To: solr-user@lucene.apache.org
Subject: RE: Solr 4.4 - Master/Slave configuration - Replication Issue with 
Commits after deleting documents using Delete by ID

Hi Otis,
Thanks for the response.  The log files can be found here.  

MasterLog : http://pastebin.com/DPLKMPcF Slave Log:  
http://pastebin.com/DX9sV6Jx

One more point worth mentioning here is that when we issue the commit with 
expungeDeletes=true, then the delete-by-id replication is successful, i.e. 
http://localhost:8983/solr/annotation/update?commit=true&expungeDeletes=true

Regards,
Bharat Akkinepalli

-Original Message-
From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
Sent: Wednesday, October 09, 2013 6:35 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue with 
Commits after deleting documents using Delete by ID

Bharat,

Can you look at the logs on the Master when you issue the delete and the 
subsequent commits and share that?

Otis
--
Solr & ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- 
http://sematext.com/spm



On Tue, Oct 8, 2013 at 3:57 PM, Akkinepalli, Bharat (ELS-CON) 
b.akkinepa...@elsevier.com wrote:
 Hi,
 We have recently migrated from Solr 3.6 to Solr 4.4.  We are using the 
 Master/Slave configuration in Solr 4.4 (not Solr Cloud).  We have noticed the 
 following behavior/defect.

 Configuration:
 ===

 1.   The Hard Commit and Soft Commit are disabled in the configuration 
 (we control the commits from the application)

 2.   We have 1 Master and 2 Slaves configured and the pollInterval is 
 configured to 10 Minutes.

 3.   The Master is configured to have the replicateAfter as commit & 
 startup

 Steps to reproduce the problem:
 ==

 1.   Delete a document in Solr (using delete by id).  URL - 
 http://localhost:8983/solr/annotation/update with body as 
 <delete><id>change.me</id></delete>

 2.   Issue a commit in Master 
 (http://localhost:8983/solr/annotation/update?commit=true).

 3.   The replication of the DELETE WILL NOT happen.  The master and slave 
 has the same Index version.

 4.   If we try to issue another commit in Master, we see that it 
 replicates fine.

 Request you to please confirm if this is a known issue.  Thank you.

 Regards,
 Bharat Akkinepalli



Re: Find documents that are composed of % words

2013-10-16 Thread Aloke Ghoshal
Hi Shahzad,

Personally I am of the same opinion as others who have replied, that you
are better off going back to your clients at this stage itself, with all
the new found info/data points.

Further, to the questions that you put to me directly:

1) For option 1, as indicated earlier, you have to compute the
myfieldwordcount outside of Solr & push it in as any other field to Solr.
As far as I know, there is no filter that will do this for you out of the
box.
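
A rough client-side sketch (field names and the server variable are just for
illustration): count the tokens before indexing and send the count along with
the document.

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "doc1");
doc.addField("myfield", text);
// naive whitespace count; roughly matches a whitespace tokenizer
doc.addField("myfieldwordcount", text.trim().split("\\s+").length);
server.add(doc);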

2) For option 2, you had to take a look at:
http://wiki.apache.org/solr/SolrRelevancyFAQ#index-time_boosts
Related links:
Function Query: http://wiki.apache.org/solr/FunctionQuery#norm
Norms:
http://lucene.apache.org/core/3_5_0/api/all/org/apache/lucene/search/Similarity.html#computeNorm(java.lang.String,
org.apache.lucene.index.FieldInvertState)
Changes to schema:
http://wiki.apache.org/solr/SchemaXml#Common_field_options (omitNorms
option)

For a field with default boost (= 1), norm = lengthNorm (approximately
1/sqrt(numTerms)). Norm's been multiplied twice in the query to divide the
score (approx.) by numTerms.

Hope that helps.

Regards,
Aloke


On Fri, Oct 11, 2013 at 5:36 PM, shahzad73 shahzad...@yahoo.com wrote:

  Aloke Ghoshal, I'm trying to work out your equation. I am using the standard
  schema provided by Nutch for Solr and am not aware of how to calculate
  myfieldwordcount in the first query. No idea where this count will come
  from. Is there any filter that will store the number of tokens generated
  for a
  specific field and store it as another field? That way we can use it.
  Not sure what norm does in the second equation; I tried to find information
  on this online and did not find any yet. Please explain.


 Shahzad



 --
 View this message in context:
 http://lucene.472066.n3.nabble.com/Find-documents-that-are-composed-of-words-tp4094264p4094955.html
 Sent from the Solr - User mailing list archive at Nabble.com.



Re: Concurrent indexing

2013-10-16 Thread Chris Geeringh
Here's another jstack http://pastebin.com/8JiQc3rb


On 16 October 2013 11:53, Chris Geeringh geeri...@gmail.com wrote:

 Hi Erick, here is a paste from the other thread (debugging update request)
 with my input, as I am seeing errors too:

 I ran an import last night, and this morning my cloud wouldn't accept
 updates. I'm running the latest 4.6 snapshot. I was importing with the latest
 solrj snapshot, and using the javabin transport with CloudSolrServer.

 The cluster had indexed ~1.3 million docs before no further updates were
 accepted, querying still working.

 I'll run jstack shortly and provide the results.

 Here is my jstack output... Lots of blocked threads.

 http://pastebin.com/1ktjBYbf



 On 16 October 2013 11:46, Erick Erickson erickerick...@gmail.com wrote:

 Run jstack on the solr process (standard with Java) and
 look for the word semaphore. You should see your
 servers blocked on this in the Solr code. That'll pretty
 much nail it.

 There's an open JIRA to fix the underlying cause, see:
 SOLR-5232, but that's currently slated for 4.6 which
 won't be cut for a while.

  Also, there's a patch that will fix this as a side effect,
  assuming you're using SolrJ: see SOLR-4816,
  which is available in 4.5.

 Best,
 Erick




 On Tue, Oct 15, 2013 at 1:33 PM, michael.boom my_sky...@yahoo.com
 wrote:

   Here are some of Solr's last words (log content before it stopped
   accepting
   updates); maybe someone can help me interpret that.
  http://pastebin.com/mv7fH62H
 
 
 
  --
  View this message in context:
 
 http://lucene.472066.n3.nabble.com/Concurent-indexing-tp4095409p4095642.html
  Sent from the Solr - User mailing list archive at Nabble.com.
 





Re: Switching indexes

2013-10-16 Thread Christopher Gross
Shawn,

It all makes sense, I'm just dealing with production servers here, so I'm
trying to be very careful (shutting down one node at a time is OK, I just
don't want to do something catastrophic).

OK, so I should use that aliasing feature.

On index1 I have:
core1
core1new
core2

On index2 and index3 I have:
core1
core2

If I do the alias command on index1 and have core1 alias core1new:
1) Will that then get rid of the existing core1 and have core1new data be
used for queries?
2) Will that change make core1 instances on index2 and index3 update to
have core1new data?

Thanks again!



-- Chris


On Tue, Oct 15, 2013 at 7:30 PM, Shawn Heisey s...@elyograg.org wrote:

 On 10/15/2013 2:17 PM, Christopher Gross wrote:

 I have 3 Solr nodes (and 5 ZK nodes).

 For #1, would I have to do that on all of them?
  For #2, I'm not getting the auto-replication between node 1 and nodes 2 &
  3
  for my new index.

 I have 2 indexes -- just call them index and indexbk (bk being the
 backup containing the full data set) up and running on one node.
 If I were to do a swap (via the Core Admin page), would that push the
 changes for indexbk over to the other two nodes?  Would I need to do that
 switch on the leader, or could that be done on one of the other nodes?


 For #1, I don't know how you want to handle your sharding and/or
 replication.  I would assume that you probably have numShards=1 and
 replicationFactor=3, but I could be wrong. At any rate, where the
 collection lives is an implementation detail that's up to you.  SolrCloud
 keeps track of all your collections, whether they are on one server or all
 servers. Typically you can send requests (queries, API calls, etc) that
 deal with entire collections to any node in your cluster and they will be
 handled correctly.  If you need to deal with a specific core, that call
 needs to go to the correct node.

 For #2, when you create a core and want it to be a replica of something
 that already exists, you need to give it a name that's not in use on your
 cluster, such as index2_shard1_replica3.  You also tell it what collection
 it's part of, which for my example, would probably be index2.  Then you
 tell it what shard it will contain.  That will be shard1, shard2, etc.
  Here's an example of a CREATE call:

  http://server:port/solr/admin/cores?action=CREATE&name=index2_shard1_replica3&collection=index2&shard=shard1

 For the rest of your message: Core swapping and SolrCloud do NOT get
 along.  If you are using SolrCloud, CoreAdmin features like that need to
 disappear from your toolset. Attempting a core swap will make bad things
 (tm) happen.

 Collection aliasing is the way in SolrCloud that you can now do what used
 to be done with swapping.  You have collections named index1, index2,
 index3, etc ... and you keep an alias called just index that points to
 one of those other collections, so that you don't have to change your
 application - you just repoint the alias and all the application queries
 going to index will go to the correct place.

 I hope I haven't made things more confusing for you!

 Thanks,
 Shawn




Re: Regarding Solr Cloud issue...

2013-10-16 Thread Chris
oh great. Thanks Primoz.

is there any simple way to do the upgrade to 4.5 without having to change
my configurations? Update a few jar files, etc.?


On Wed, Oct 16, 2013 at 4:58 PM, primoz.sk...@policija.si wrote:

  Also, another issue that needs to be raised is the creation of cores
 from
  the core admin section of the GUI: it doesn't really work well; it
 creates
  files, but then they do not work (again, I am using 4.4).

 From my experience core admin section of the GUI does not work well in
 SolrCloud domain. If I am not mistaken this was somehow fixed in 4.5.0
 which acts much better.

 I would use only HTTP requests (cores and collections API) with
 SolrCloud and would use GUI only for viewing the state of cluster and
 cores.

 Primoz





Error when i want to create a CORE

2013-10-16 Thread raige
I installed Solr 4.5 on Windows. I launch the example with the Jetty web
server. I have no problem with the collection1 core. But when I want to
create my own core, the server sends me this error: 
*
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not load config file C:\Documents and
Settings\r.lucas\Bureau\Moteur\solr-4.5.0\example\solr\index1\solrconfig.xml*

could you help please



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Error-when-i-want-to-create-a-CORE-tp4095894.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Regarding Solr Cloud issue...

2013-10-16 Thread primoz . skale
Hm, good question. I haven't really done any upgrading yet, because I just 
reinstall and reindex everything. I would replace the jars with the new ones 
(if needed - check the release notes for versions 4.4.0 and 4.5.0, where the 
versions of all external tools [tika, maven, etc.] are stated) and deploy the 
updated WAR file to the servlet container.

Primoz




From:   Chris christu...@gmail.com
To: solr-user solr-user@lucene.apache.org
Date:   16.10.2013 14:30
Subject:Re: Regarding Solr Cloud issue...



oh great. Thanks Primoz.

is there any simple way to do the upgrade to 4.5 without having to change
my configurations? update a few jar files etc?


On Wed, Oct 16, 2013 at 4:58 PM, primoz.sk...@policija.si wrote:

  Also, another issue that needs to be raised is the creation of cores
 from
  the core admin section of the GUI: it doesn't really work well; it
 creates
  files, but then they do not work (again, I am using 4.4).

 From my experience core admin section of the GUI does not work well in
 SolrCloud domain. If I am not mistaken this was somehow fixed in 4.5.0
 which acts much better.

 I would use only HTTP requests (cores and collections API) with
 SolrCloud and would use GUI only for viewing the state of cluster and
 cores.

 Primoz






Re: Error when i want to create a CORE

2013-10-16 Thread primoz . skale
Can you try with a directory path that contains *no* spaces?

Primoz



From:   raige regis...@gmail.com
To: solr-user@lucene.apache.org
Date:   16.10.2013 14:46
Subject:Error when i want to create a CORE



I installed Solr 4.5 on Windows. I launch the example with the Jetty web 
server. I have no problem with the collection1 core. But when I want to
create my own core, the server sends me this error: 
*
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not load config file C:\Documents and
Settings\r.lucas\Bureau\Moteur\solr-4.5.0\example\solr\index1\solrconfig.xml*

could you help please



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Error-when-i-want-to-create-a-CORE-tp4095894.html

Sent from the Solr - User mailing list archive at Nabble.com.



Re: Boosting a field with defType:dismax -- No results at all

2013-10-16 Thread Jack Krupansky

Get rid of the newlines before and after the value of the qf parameter.
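
That is, the qf value collapsed onto a single line:

<str name="qf">SignalImpl.baureihe^1011 text^0.1</str>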

-- Jack Krupansky

-Original Message- 
From: uwe72

Sent: Wednesday, October 16, 2013 5:36 AM
To: solr-user@lucene.apache.org
Subject: Boosting a field with defType:dismax -- No results at all

Hi there,

I want to boost a field, see below.

If I add the defType:dismax I don't get results at all anymore.

What am I doing wrong?

Regards
Uwe

   <requestHandler name="/select" class="solr.SearchHandler">
     <lst name="defaults">
       <str name="omitHeader">true</str>
       <str name="df">text</str>
       <str name="q.op">AND</str>

       <str name="spellcheck.dictionary">default</str>
       <str name="spellcheck">true</str>
       <str name="spellcheck.extendedResults">true</str>
       <str name="spellcheck.count">1</str>

       <str name="spellcheck.maxResultsForSuggest">100</str>
       <str name="spellcheck.collate">true</str>
       <str name="spellcheck.collateExtendedResults">true</str>
       <str name="spellcheck.maxCollations">1</str>

       <str name="defType">dismax</str>
       <str name="qf">
         SignalImpl.baureihe^1011 text^0.1
       </str>
     </lst>
     <arr name="last-components">
       <str>spellcheck</str>
     </arr>
   </requestHandler>



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850.html
Sent from the Solr - User mailing list archive at Nabble.com. 



Re: Error when i want to create a CORE

2013-10-16 Thread michael.boom
Assuming that you are using the Admin UI: 
the instanceDir must already exist (in your case index1).
Inside it there should be a conf/ directory holding the configuration files.
In the config field, insert only the file name (like solrconfig.xml), which
should be found in the conf/ directory.
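
For example, a minimal layout (paths are illustrative):

  solr/index1/
      conf/
          solrconfig.xml
          schema.xml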



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Error-when-i-want-to-create-a-CORE-tp4095894p4095900.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr Copy field append values ?

2013-10-16 Thread Jack Krupansky

Appended.
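
(Assuming external_id is declared multiValued, as stated: with upc:131 and
external_id:423 in the same document, the indexed document ends up with
external_id holding both 423 and 131. With a single-valued destination the
add would instead fail with a "multiple values" error.)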

-- Jack Krupansky

-Original Message- 
From: vishgupt

Sent: Wednesday, October 16, 2013 6:25 AM
To: solr-user@lucene.apache.org
Subject: Solr Copy field append values ?

Hi,
The schema is like this:

external_id is a multivalued field.

<copyField source="upc" dest="external_id" />

I want to know: will the values of upc be appended to the existing values of
external_id, or override them?

For example, if I send a document having the values

upc:131
external_id:423

for indexing in Solr with the above-mentioned schema, what will be the value
of the external_id field: 131, or 131,423?

Thanks
Vishal





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Copy-field-append-values-tp4095862.html
Sent from the Solr - User mailing list archive at Nabble.com. 



AW: Boosting a field with defType:dismax -- No results at all

2013-10-16 Thread uwe72
Perfect!!! THANKS A LOT

 

That was the mistake.

 

From: Jack Krupansky-2 [via Lucene]
[mailto:ml-node+s472066n409590...@n3.nabble.com] 
Sent: Wednesday, October 16, 2013 14:55
To: uwe72
Subject: Re: Boosting a field with defType:dismax -- No results at all

 

Get rid of the newlines before and after the value of the qf parameter. 

-- Jack Krupansky 

[quoted original message snipped]









--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850p4095906.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Regarding Solr Cloud issue...

2013-10-16 Thread Chris
Very well, I will try the same. Maybe an auto-update tool should also be
put on the roadmap... just a thought...


On Wed, Oct 16, 2013 at 6:20 PM, primoz.sk...@policija.si wrote:

 Hm, good question. I haven't really done any upgrading yet, because I just
 reinstall and reindex everything. I would replace jars with the new ones
 (if needed - check release notes for version 4.4.0 and 4.5.0 where all the
 versions of external tools [tika, maven, etc.] are stated) and deploy the
 updated WAR file to servlet container.

 Primoz




 From:   Chris christu...@gmail.com
 To: solr-user solr-user@lucene.apache.org
 Date:   16.10.2013 14:30
 Subject:Re: Regarding Solr Cloud issue...



 oh great. Thanks Primoz.

 is there any simple way to do the upgrade to 4.5 without having to change
 my configurations? update a few jar files etc?


 On Wed, Oct 16, 2013 at 4:58 PM, primoz.sk...@policija.si wrote:

    Also, another issue that needs to be raised is the creation of cores
   from
    the core admin section of the GUI: it doesn't really work well; it
   creates
    files, but then they do not work (again, I am using 4.4).
 
  From my experience core admin section of the GUI does not work well in
  SolrCloud domain. If I am not mistaken this was somehow fixed in 4.5.0
  which acts much better.
 
  I would use only HTTP requests (cores and collections API) with
  SolrCloud and would use GUI only for viewing the state of cluster and
  cores.
 
  Primoz
 
 
 




Re: SolrCloud Query Balancing

2013-10-16 Thread michael.boom
Thanks!

Could you provide some examples or details of the configuration you use?
I think this solution would suit me as well.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SolrCloud-Query-Balancing-tp4095854p4095910.html
Sent from the Solr - User mailing list archive at Nabble.com.


How to retrieve the query for a boolean keyword?

2013-10-16 Thread Silvia Suárez
Dear all,

I am using solrj as a client for indexing and searching documents on the Solr
server.

My question:

How to retrieve the query for a boolean keyword?

For example:

I have this query:

text:(“vacuna” AND “esteve news”) OR text:(“vacuna”) OR text:(“esteve news”)

And searching in:

text-- Esteve news: Obtener una vacuna para frenar el...

Solr returns:

<em>Esteve news</em>: obtener una <em>vacuna</em> para frenar el ...

It is ok.

My question is:

Can I know with Solr that the results <em>Esteve news</em> <em>vacuna</em>
are provided by the query with the AND operator?

Is it possible to retrieve this with solrj?

Thanks a lot in advance,

Sil,



Technologies and SaaS for the analysis of commercial trademarks.




Re: SolrCloud Query Balancing

2013-10-16 Thread Shawn Heisey
On 10/16/2013 3:52 AM, michael.boom wrote:
 I have setup a SolrCloud system with: 3 shards, replicationFactor=3 on 3
 machines along with 3 Zookeeper instances.
 
 My web application makes queries to Solr specifying the hostname of one of
 the machines. So that machine will always get the request and the other ones
 will just serve as an aid.
 So I would like to setup a load balancer that would fix that, balancing the
 queries to all machines. 
 Maybe doing the same while indexing.

SolrCloud actually handles load balancing for you.  You'll find that
when you send requests to one server, they are actually being
re-directed across the entire cloud, unless you include a
distrib=false parameter on the request, but that would also limit the
search to one shard, which is probably not what you want.

The only thing that you don't get with a non-Java client is redundancy.
 If you can't build in failover capability yourself, which is a very
advanced programming technique, then you need a load balancer.
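
(With a Java client you get that failover from SolrJ itself; a minimal
sketch, with illustrative hostnames:)

// round-robins requests and skips dead servers until they recover
LBHttpSolrServer lb = new LBHttpSolrServer(
    "http://index1:8983/solr/collection1",
    "http://index2:8983/solr/collection1",
    "http://index3:8983/solr/collection1");
QueryResponse rsp = lb.query(new SolrQuery("*:*"));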

For my large non-Cloud Solr install, I use haproxy as a load balancer.
Most of the time, it doesn't actually balance the load, just makes sure
that Solr is always reachable even if part of it goes down.  The haproxy
program is simple and easy to use, but performs extremely well.  I've
got a pacemaker cluster making sure that the shared IP address, haproxy,
and other homegrown utility applications related to Solr are only
running on one machine.

Thanks,
Shawn



Re: SolrCloud Query Balancing

2013-10-16 Thread Henrik Ossipoff Hansen
I did not actually realize this, I apologize for my previous reply!

Haproxy would definitely be the right choice then for the poster's setup for 
redundancy.

On 16/10/2013 at 15.53, Shawn Heisey s...@elyograg.org wrote:

 On 10/16/2013 3:52 AM, michael.boom wrote:
 I have setup a SolrCloud system with: 3 shards, replicationFactor=3 on 3
 machines along with 3 Zookeeper instances.
 
 My web application makes queries to Solr specifying the hostname of one of
 the machines. So that machine will always get the request and the other ones
 will just serve as an aid.
 So I would like to setup a load balancer that would fix that, balancing the
 queries to all machines. 
 Maybe doing the same while indexing.
 
 SolrCloud actually handles load balancing for you.  You'll find that
 when you send requests to one server, they are actually being
 re-directed across the entire cloud, unless you include a
 distrib=false parameter on the request, but that would also limit the
 search to one shard, which is probably not what you want.
 
 The only thing that you don't get with a non-Java client is redundancy.
 If you can't build in failover capability yourself, which is a very
 advanced programming technique, then you need a load balancer.
 
 For my large non-Cloud Solr install, I use haproxy as a load balancer.
 Most of the time, it doesn't actually balance the load, just makes sure
 that Solr is always reachable even if part of it goes down.  The haproxy
 program is simple and easy to use, but performs extremely well.  I've
 got a pacemaker cluster making sure that the shared IP address, haproxy,
 and other homegrown utility applications related to Solr are only
 running on one machine.
 
 Thanks,
 Shawn
 



howto increase indexing speed?

2013-10-16 Thread Giovanni Bricconi
I have a small Solr setup, not even on a physical machine but a VMware
virtual machine with a single CPU, that reads data using DIH from a
database. The machine has no physical disks attached but stores data on a
NetApp NAS.

Currently this machine indexes 320 documents/sec, not bad, but we plan to
double the index and we would like to keep nearly the same rate.

Doing some basic checks during the indexing I have found with iostat that
disk usage is nearly 8% and the source database is running fine;
instead, the virtual CPU is 95% busy running Solr.

Now I can quite easily add another virtual CPU to the Solr box, but as far
as I know this won't help because DIH doesn't work in parallel. Am I wrong?

What would you do? Rewrite the feeding process, quitting DIH and using SolrJ
to feed data in parallel? Would you instead keep DIH and switch to a
sharded configuration?

Thank you for any hints

Giovanni


Re: howto increase indexing speed?

2013-10-16 Thread primoz . skale
I think DIH uses only one core per instance. IMHO 300 docs/sec is quite 
good. If you would like to use more cores you need to use SolrJ. Or maybe 
more than one DIH and more cores, of course.
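
A sketch of the SolrJ route (URL, sizing and the row reader are
illustrative); ConcurrentUpdateSolrServer buffers documents and feeds Solr
from several threads:

// queue up to 1000 docs, drained by 4 parallel indexing threads
ConcurrentUpdateSolrServer server = new ConcurrentUpdateSolrServer(
    "http://localhost:8983/solr/collection1", 1000, 4);
for (SolrInputDocument doc : readRowsFromDb()) {  // hypothetical DB reader
    server.add(doc);
}
server.blockUntilFinished();
server.commit();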

Primoz



From:   Giovanni Bricconi giovanni.bricc...@banzai.it
To: solr-user solr-user@lucene.apache.org
Date:   16.10.2013 16:25
Subject:howto increase indexing speed?



I have a small solr setup, not even on a physical machine but a vmware
virtual machine with a single cpu that reads data using DIH from a
database. The machine has no physical disks attached but stores data on a
NetApp NAS.

Currently this machine indexes 320 documents/sec, not bad but we plan to
double the index and we would like to keep nearly the same.

Doing some basic checks during the indexing I have found with iostat that
the usage of the disks is nearly 8% and the source database is running
fine, instead the  virtual cpu is 95% running on solr.

Now I can quite easily add another virtual cpu to the solr box, but as far
as I know this won't help because DIH doesn't work in parallel. Am I 
wrong?

What would you do? Rewrite the feeding process quitting dih and using 
solrj
to feed data in parallel? Would you instead keep DIH and switch to a
sharded configuration?

Thank you for any hints

Giovanni



Re: howto increase indexing speed?

2013-10-16 Thread Walter Underwood
You might consider local disks. I once ran Solr with the indexes on an 
NFS-mounted volume and the slowdown was severe.

wunder

On Oct 16, 2013, at 7:40 AM, primoz.sk...@policija.si wrote:

 I think DIH uses only one core per instance. IMHO 300 doc/sec is quite 
 good. If you would like to use more cores you need to use solrj. Or maybe 
 more than one DIH and more cores of course.
 
 Primoz
 
 
 
 From:   Giovanni Bricconi giovanni.bricc...@banzai.it
 To: solr-user solr-user@lucene.apache.org
 Date:   16.10.2013 16:25
 Subject:howto increase indexing speed?
 
 
 
 I have a small solr setup, not even on a physical machine but a vmware
 virtual machine with a single cpu that reads data using DIH from a
  database. The machine has no physical disks attached but stores data on a
  NetApp NAS.
 
 Currently this machine indexes 320 documents/sec, not bad but we plan to
 double the index and we would like to keep nearly the same.
 
 Doing some basic checks during the indexing I have found with iostat that
 the usage of the disks is nearly 8% and the source database is running
 fine, instead the  virtual cpu is 95% running on solr.
 
 Now I can quite easily add another virtual cpu to the solr box, but as far
 as I know this won't help because DIH doesn't work in parallel. Am I 
 wrong?
 
 What would you do? Rewrite the feeding process quitting dih and using 
 solrj
 to feed data in parallel? Would you instead keep DIH and switch to a
 sharded configuration?
 
 Thank you for any hints
 
 Giovanni
 

--
Walter Underwood
wun...@wunderwood.org





Re: prepareCommit vs Commit

2013-10-16 Thread Phani Chaitanya
Thanks, Shalin. Will post it there too.



-
Phani Chaitanya
--
View this message in context: 
http://lucene.472066.n3.nabble.com/prepareCommit-vs-Commit-tp4095545p4095916.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID

2013-10-16 Thread Shalin Shekhar Mangar
The only delete I see in the master logs is:

INFO  - 2013-10-11 14:06:54.793;
org.apache.solr.update.processor.LogUpdateProcessor; [annotation]
webapp=/solr path=/update params={}
{delete=[change.me(-1448623278425899008)]} 0 60

When you commit, we have the following:

INFO  - 2013-10-11 14:07:03.809;
org.apache.solr.update.DirectUpdateHandler2; start
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO  - 2013-10-11 14:07:03.813;
org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes.
Skipping IW.commit.

That suggests that the id you are trying to delete never existed in the
first place and hence there was nothing to commit. Hence replication was
not triggered. Am I missing something?


On Wed, Oct 16, 2013 at 5:06 PM, Akkinepalli, Bharat (ELS-CON) 
b.akkinepa...@elsevier.com wrote:

 Hi Otis,
 Did you get a chance to look into the logs.  Please let me know if you
 need more information.  Thank you.

 Regards,
 Bharat Akkinepalli

 -Original Message-
 From: Akkinepalli, Bharat (ELS-CON) [mailto:b.akkinepa...@elsevier.com]
 Sent: Friday, October 11, 2013 2:16 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Solr 4.4 - Master/Slave configuration - Replication Issue
 with Commits after deleting documents using Delete by ID

 Hi Otis,
 Thanks for the response.  The log files can be found here.

 MasterLog : http://pastebin.com/DPLKMPcF Slave Log:
 http://pastebin.com/DX9sV6Jx

 One more point worth mentioning here is that when we issue the commit with
 expungeDeletes=true, then the delete-by-id replication is successful, i.e.
 http://localhost:8983/solr/annotation/update?commit=true&expungeDeletes=true

 Regards,
 Bharat Akkinepalli

 -Original Message-
 From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
 Sent: Wednesday, October 09, 2013 6:35 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue
 with Commits after deleting documents using Delete by ID

 Bharat,

 Can you look at the logs on the Master when you issue the delete and the
 subsequent commits and share that?

 Otis
 --
 Solr & ElasticSearch Support -- http://sematext.com/ Performance
 Monitoring -- http://sematext.com/spm



 On Tue, Oct 8, 2013 at 3:57 PM, Akkinepalli, Bharat (ELS-CON) 
 b.akkinepa...@elsevier.com wrote:
  Hi,
  We have recently migrated from Solr 3.6 to Solr 4.4.  We are using the
 Master/Slave configuration in Solr 4.4 (not Solr Cloud).  We have noticed
 the following behavior/defect.
 
  Configuration:
  ===
 
  1.   The Hard Commit and Soft Commit are disabled in the
 configuration (we control the commits from the application)
 
  2.   We have 1 Master and 2 Slaves configured and the pollInterval
 is configured to 10 Minutes.
 
  3.   The Master is configured to have the replicateAfter as commit &
  startup
 
  Steps to reproduce the problem:
  ==
 
  1.   Delete a document in Solr (using delete by id).  URL -
 http://localhost:8983/solr/annotation/update with body as
 <delete><id>change.me</id></delete>
 
  2.   Issue a commit in Master (
 http://localhost:8983/solr/annotation/update?commit=true).
 
  3.   The replication of the DELETE WILL NOT happen.  The master and
 slave has the same Index version.
 
  4.   If we try to issue another commit in Master, we see that it
 replicates fine.
 
  Request you to please confirm if this is a known issue.  Thank you.
 
  Regards,
  Bharat Akkinepalli
 




-- 
Regards,
Shalin Shekhar Mangar.


Re: AW: Boosting a field with defType:dismax -- No results at all

2013-10-16 Thread uwe72
We have just one more problem:

When we search explicitly, like *:* or partNumber:A32783627, we still don't
get any results.

What are we doing wrong here? 




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Boosting-a-field-with-defType-dismax-No-results-at-all-tp4095850p4095918.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Switching indexes

2013-10-16 Thread Christopher Gross
Garth,

I think I get what you're saying, but I want to make sure.

I have 3 servers (index1, index2, index3), with Solr living on port 8080.

Each of those has 3 cores loaded with data:
core1 (old version)
core1new (new version)
core2 (unrelated to core1)

If I wanted to make it so that queries to core1 are really going to
core1new, I'd run:
http://index1:8080/solr/admin/cores?action=CREATEALIAS&name=core1&collections=core1new&shard=shard1

Correct?

-- Chris


On Wed, Oct 16, 2013 at 9:02 AM, Garth Grimm 
garthgr...@averyranchconsulting.com wrote:

 The alias applies to the entire cloud, not a single core.

 So you'd have your indexing application point to a collection alias
 named 'index'.  And that alias would point to core1.
 You'd have your query applications point to a collection alias named
 'query', and that would point to core1, as well.

 Then use the Collection API to create core1new across the entire cloud.
  Then update the 'index' alias to point to core1new.  Feed documents in,
 run warm-up scripts, run smoke tests, etc., etc.
 When you're ready, point the 'query' alias to core1new.

 You're now running completely on core1new, and can use the Collection API
 to delete core1 from the cloud.  Or keep it around as a backup to which you
 can restore simply by changing 'query' alias.

 -Original Message-
 From: Christopher Gross [mailto:cogr...@gmail.com]
 Sent: Wednesday, October 16, 2013 7:05 AM
 To: solr-user
 Subject: Re: Switching indexes

 Shawn,

 It all makes sense, I'm just dealing with production servers here so I'm
 trying to be very careful (shutting down one node at a time is OK, just
 don't want to do something catastrophic.)

 OK, so I should use that aliasing feature.

 On index1 I have:
 core1
 core1new
 core2

 On index2 and index3 I have:
 core1
 core2

 If I do the alias command on index1 and have core1 alias core1new:
 1) Will that then get rid of the existing core1 and have core1new data
 be used for queries?
 2) Will that change make core1 instances on index2 and index3 update to
 have core1new data?

 Thanks again!



 -- Chris


 On Tue, Oct 15, 2013 at 7:30 PM, Shawn Heisey s...@elyograg.org wrote:

  On 10/15/2013 2:17 PM, Christopher Gross wrote:
 
  I have 3 Solr nodes (and 5 ZK nodes).
 
  For #1, would I have to do that on all of them?
   For #2, I'm not getting the auto-replication between node 1 and nodes
   2 &
   3
  for my new index.
 
  I have 2 indexes -- just call them index and indexbk (bk being
  the backup containing the full data set) up and running on one node.
  If I were to do a swap (via the Core Admin page), would that push the
  changes for indexbk over to the other two nodes?  Would I need to do
  that switch on the leader, or could that be done on one of the other
 nodes?
 
 
  For #1, I don't know how you want to handle your sharding and/or
  replication.  I would assume that you probably have numShards=1 and
  replicationFactor=3, but I could be wrong. At any rate, where the
  collection lives is an implementation detail that's up to you.
  SolrCloud keeps track of all your collections, whether they are on one
  server or all servers. Typically you can send requests (queries, API
  calls, etc) that deal with entire collections to any node in your
  cluster and they will be handled correctly.  If you need to deal with
  a specific core, that call needs to go to the correct node.
 
  For #2, when you create a core and want it to be a replica of
  something that already exists, you need to give it a name that's not
  in use on your cluster, such as index2_shard1_replica3.  You also tell
  it what collection it's part of, which for my example, would probably
  be index2.  Then you tell it what shard it will contain.  That will be
 shard1, shard2, etc.
   Here's an example of a CREATE call:
 
   http://server:port/solr/admin/cores?action=CREATE&name=index2_shard1_replica3&collection=index2&shard=shard1
 
  For the rest of your message: Core swapping and SolrCloud do NOT get
  along.  If you are using SolrCloud, CoreAdmin features like that need
  to disappear from your toolset. Attempting a core swap will make bad
  things
  (tm) happen.
 
  Collection aliasing is the way in SolrCloud that you can now do what
  used to be done with swapping.  You have collections named index1,
  index2, index3, etc ... and you keep an alias called just index that
  points to one of those other collections, so that you don't have to
  change your application - you just repoint the alias and all the
  application queries going to index will go to the correct place.
 
  I hope I haven't made things more confusing for you!
 
  Thanks,
  Shawn
 
 



RE: Switching indexes

2013-10-16 Thread Garth Grimm
I'd suggest using the Collections API:
http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=alias&collections=collection1,collection2...

See the Collections Aliases section of http://wiki.apache.org/solr/SolrCloud.

BTW, once you make the aliases, Zookeeper will have entries in /aliases.json 
that will tell you what aliases are defined and what they point to.
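
If you'd rather issue that from SolrJ than over raw HTTP, a sketch along
these lines should work (alias, collection and host names are illustrative):

ModifiableSolrParams params = new ModifiableSolrParams();
params.set("action", "CREATEALIAS");
params.set("name", "query");            // the alias your apps hit
params.set("collections", "core1new");  // where it should point now
QueryRequest request = new QueryRequest(params);
request.setPath("/admin/collections");
new HttpSolrServer("http://index1:8080/solr").request(request);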

-Original Message-
From: Christopher Gross [mailto:cogr...@gmail.com] 
Sent: Wednesday, October 16, 2013 10:44 AM
To: solr-user
Subject: Re: Switching indexes

Garth,

I think I get what you're saying, but I want to make sure.

I have 3 servers (index1, index2, index3), with Solr living on port 8080.

Each of those has 3 cores loaded with data:
core1 (old version)
core1new (new version)
core2 (unrelated to core1)

If I wanted to make it so that queries to core1 are really going to core1new, 
I'd run:
http://index1:8080/solr/admin/cores?action=CREATEALIAS&name=core1&collections=core1new&shard=shard1

Correct?

-- Chris


On Wed, Oct 16, 2013 at 9:02 AM, Garth Grimm  
garthgr...@averyranchconsulting.com wrote:

 The alias applies to the entire cloud, not a single core.

 So you'd have your indexing application point to a collection alias
 named 'index'.  And that alias would point to core1.
 You'd have your query applications point to a collection alias named 
 'query', and that would point to core1, as well.

 Then use the Collection API to create core1new across the entire cloud.
  Then update the 'index' alias to point to core1new.  Feed documents 
 in, run warm-up scripts, run smoke tests, etc., etc.
 When you're ready, point the 'query' alias to core1new.

 You're now running completely on core1new, and can use the Collection 
 API to delete core1 from the cloud.  Or keep it around as a backup to 
 which you can restore simply by changing 'query' alias.

 -Original Message-
 From: Christopher Gross [mailto:cogr...@gmail.com]
 Sent: Wednesday, October 16, 2013 7:05 AM
 To: solr-user
 Subject: Re: Switching indexes

 Shawn,

 It all makes sense, I'm just dealing with production servers here so 
 I'm trying to be very careful (shutting down one node at a time is OK, 
 just don't want to do something catastrophic.)

 OK, so I should use that aliasing feature.

 On index1 I have:
 core1
 core1new
 core2

 On index2 and index3 I have:
 core1
 core2

 If I do the alias command on index1 and have core1 alias core1new:
 1) Will that then get rid of the existing core1 and have core1new 
 data be used for queries?
 2) Will that change make core1 instances on index2 and index3 update 
 to have core1new data?

 Thanks again!



 -- Chris


 On Tue, Oct 15, 2013 at 7:30 PM, Shawn Heisey s...@elyograg.org wrote:

  On 10/15/2013 2:17 PM, Christopher Gross wrote:
 
  I have 3 Solr nodes (and 5 ZK nodes).
 
  For #1, would I have to do that on all of them?
  For #2, I'm not getting the auto-replication between node 1 and 
  nodes
  2 
  3
  for my new index.
 
  I have 2 indexes -- just call them index and indexbk (bk being 
  the backup containing the full data set) up and running on one node.
  If I were to do a swap (via the Core Admin page), would that push 
  the changes for indexbk over to the other two nodes?  Would I need 
  to do that switch on the leader, or could that be done on one of 
  the other
 nodes?
 
 
  For #1, I don't know how you want to handle your sharding and/or 
  replication.  I would assume that you probably have numShards=1 and 
  replicationFactor=3, but I could be wrong. At any rate, where the 
  collection lives is an implementation detail that's up to you.
  SolrCloud keeps track of all your collections, whether they are on 
  one server or all servers. Typically you can send requests (queries, 
  API calls, etc) that deal with entire collections to any node in 
  your cluster and they will be handled correctly.  If you need to 
  deal with a specific core, that call needs to go to the correct node.
 
  For #2, when you create a core and want it to be a replica of 
  something that already exists, you need to give it a name that's not 
  in use on your cluster, such as index2_shard1_replica3.  You also 
  tell it what collection it's part of, which for my example, would 
  probably be index2.  Then you tell it what shard it will contain.  
  That will be
 shard1, shard2, etc.
   Here's an example of a CREATE call:
 
   http://server:port/solr/admin/cores?action=CREATE&name=index2_shard1_replica3&collection=index2&shard=shard1
 
  For the rest of your message: Core swapping and SolrCloud do NOT get 
  along.  If you are using SolrCloud, CoreAdmin features like that 
  need to disappear from your toolset. Attempting a core swap will 
  make bad things
  (tm) happen.
 
  Collection aliasing is the way in SolrCloud that you can now do what 
  used to be done with swapping.  You have collections named index1, 
  index2, index3, etc ... and you keep an alias called just index 
  that points to one of 

RE: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID

2013-10-16 Thread Akkinepalli, Bharat (ELS-CON)
Hi Shalin,
I am not sure why the log says No uncommitted changes. The data is
available in Solr at the time I perform the delete.

Please find below the steps I have performed (a SolrJ sketch of the same sequence follows):
1. Inserted a document in master (with id=change.me.1)
2. Issued a commit on master
3. Triggered replication on the slave
4. Ensured that the document was replicated successfully
5. Issued a delete by ID
6. Issued a commit on master
7. Replication did NOT happen
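
For reference, a minimal SolrJ sketch of that same sequence (SolrJ 4.x assumed; "id" is assumed to be the uniqueKey field):

import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class DeleteReplicationRepro {
    public static void main(String[] args) throws Exception {
        // URL assumes the "annotation" core from the steps above.
        HttpSolrServer master = new HttpSolrServer("http://localhost:8983/solr/annotation");
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", "change.me.1");
        master.add(doc);
        master.commit();                   // replicates to the slave as expected
        master.deleteById("change.me.1");
        master.commit();                   // after this commit, replication does not trigger
        master.shutdown();
    }
}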

The logs are as follows:
Master - http://pastebin.com/265CtCEp 
Slave - http://pastebin.com/Qx0xLwmK 

Regards,
Bharat Akkinepalli.

-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] 
Sent: Wednesday, October 16, 2013 11:28 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue with 
Commits after deleting documents using Delete by ID

The only delete I see in the master logs is:

INFO  - 2013-10-11 14:06:54.793;
org.apache.solr.update.processor.LogUpdateProcessor; [annotation] webapp=/solr 
path=/update params={} {delete=[change.me(-1448623278425899008)]} 0 60

When you commit, we have the following:

INFO  - 2013-10-11 14:07:03.809;
org.apache.solr.update.DirectUpdateHandler2; start 
commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
INFO  - 2013-10-11 14:07:03.813;
org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes.
Skipping IW.commit.

That suggests that the id you are trying to delete never existed in the first 
place and hence there was nothing to commit. Hence replication was not 
triggered. Am I missing something?


On Wed, Oct 16, 2013 at 5:06 PM, Akkinepalli, Bharat (ELS-CON)  
b.akkinepa...@elsevier.com wrote:

 Hi Otis,
 Did you get a chance to look into the logs.  Please let me know if you 
 need more information.  Thank you.

 Regards,
 Bharat Akkinepalli

 -Original Message-
 From: Akkinepalli, Bharat (ELS-CON) 
 [mailto:b.akkinepa...@elsevier.com]
 Sent: Friday, October 11, 2013 2:16 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Solr 4.4 - Master/Slave configuration - Replication Issue 
 with Commits after deleting documents using Delete by ID

 Hi Otis,
 Thanks for the response.  The log files can be found here.

 MasterLog : http://pastebin.com/DPLKMPcF Slave Log:
 http://pastebin.com/DX9sV6Jx

 One more point worth mentioning here is that when we issue the commit 
 with expungeDeletes=true, then the delete by id replication is successful. 
 i.e.
 http://localhost:8983/solr/annotation/update?commit=true&expungeDeletes=true

 Regards,
 Bharat Akkinepalli

 -Original Message-
 From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
 Sent: Wednesday, October 09, 2013 6:35 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue 
 with Commits after deleting documents using Delete by ID

 Bharat,

 Can you look at the logs on the Master when you issue the delete and 
 the subsequent commits and share that?

 Otis
 --
 Solr  ElasticSearch Support -- http://sematext.com/ Performance 
 Monitoring -- http://sematext.com/spm



 On Tue, Oct 8, 2013 at 3:57 PM, Akkinepalli, Bharat (ELS-CON)  
 b.akkinepa...@elsevier.com wrote:
  Hi,
  We have recently migrated from Solr 3.6 to Solr 4.4.  We are using 
  the
 Master/Slave configuration in Solr 4.4 (not Solr Cloud).  We have 
 noticed the following behavior/defect.
 
  Configuration:
  ===
 
  1.   The Hard Commit and Soft Commit are disabled in the
 configuration (we control the commits from the application)
 
  2.   We have 1 Master and 2 Slaves configured and the pollInterval
 is configured to 10 Minutes.
 
  3.   The Master is configured to have the replicateAfter as commit
  startup
 
  Steps to reproduce the problem:
  ==
 
  1.   Delete a document in Solr  (using delete by id).  URL -
 http://localhost:8983/solr/annotation/update with body as <delete><id>change.me</id></delete>
 
  2.   Issue a commit in Master (
 http://localhost:8983/solr/annotation/update?commit=true).
 
  3.   The replication of the DELETE WILL NOT happen.  The master and
 slave has the same Index version.
 
  4.   If we try to issue another commit in Master, we see that it
 replicates fine.
 
  Request you to please confirm if this is a known issue.  Thank you.
 
  Regards,
  Bharat Akkinepalli
 




--
Regards,
Shalin Shekhar Mangar.


Re: Regarding Solr Cloud issue...

2013-10-16 Thread Shawn Heisey
On 10/16/2013 4:51 AM, Chris wrote:
 Also, is there any easy way upgrading to 4.5 without having to change most
 of my plugins  configuration files?

Upgrading is something that should be done carefully.  If you can, it's
always recommended that you try it out on dev hardware with your real
index data beforehand, so you can deal with any problems that arise
without causing problems for your production cluster.  Upgrading
SolrCloud is particularly tricky, because for a while you will be
running different versions on different machines in your cluster.

If you're using your own custom software to go with Solr, or you're
using third-party plugins that aren't included in the Solr download,
upgrading might take more effort than usual.  Also, if you are doing
anything in your config/schema that changes the format of the Lucene
index, you may find that it can't be upgraded without completely
rebuilding the index.  Examples of this are changing the postings format
or docValues format.  This is a very nasty complication with SolrCloud,
because those configurations affect the entire cluster.  In that case,
the whole index may need to be rebuilt without custom formats before
upgrading is attempted.

If you don't have any of the complications mentioned in the preceding
paragraph, upgrading is usually a very simple process:

*) Shut down Solr.
*) Delete the extracted WAR file directory.
*) Replace solr.war with the new war from dist/ in the download.
**) Usually it must actually be named solr.war, which means renaming it.
*) Delete and replace other jars copied from the download.
*) Change luceneMatchVersion in all solrconfig.xml files. **
*) Start Solr back up.

** With SolrCloud, you can't actually change the luceneMatchVersion
until all of your servers have been upgraded.

A full reindex is strongly recommended.  With SolrCloud, it normally
needs to wait until all servers are upgraded.  In situations where it
won't work at all without a reindex, upgrading SolrCloud can be very
challenging.

It's strongly recommended that you look over CHANGES.txt and compare the
new example config/schema with the example from the old version, to see
if there are any changes that you might want to incorporate into your
own config.  As with luceneMatchVersion, if you're running SolrCloud,
those changes might need to wait until you're fully upgraded.

Side note: When upgrading to a new minor version, config changes aren't
normally required.  They will usually be required when upgrading major
versions, such as 3.x to 4.x.

If you *do* have custom plugins that aren't included in the Solr
download, you may have to recompile them for the new version, or wait
for the vendor to create a new version before you upgrade.

This is only the tip of the iceberg, but a lot of the rest of it depends
greatly on your configurations.

Thanks,
Shawn



AW: Boosting a field with defType:dismax -- No results at all

2013-10-16 Thread uwe72
We have just one more problem:

When we search explicitly, like *:* or partNumber:A32783627, we still don't
get any results.

What are we doing wrong here?






Re: SolrCloud Query Balancing

2013-10-16 Thread Shawn Heisey
On 10/16/2013 8:01 AM, Henrik Ossipoff Hansen wrote:
 I did not actually realize this, I apologize for my previous reply!
 
 Haproxy would definitely be the right choice then for the poster's setup for
 redundancy.

Any load balancer software, or even an appliance load balancer like
those made by F5, would probably work.  I don't think there's anything
wrong with nginx.  I've never used it, but I've heard it mentioned often
in a load balancer context, so it's probably great software.  The
original poster should use whatever they are comfortable with, and if
they have no experience with any particular solution, they can ask
advice from people who have used one or more of the possibilities.

Never be afraid to offer advice.  I've been wrong plenty of times in
what I've posted on this list, and I've learned a TON because of it.

Thanks,
Shawn



Re: Switching indexes

2013-10-16 Thread Shawn Heisey
On 10/16/2013 9:44 AM, Christopher Gross wrote:
 Garth,
 
 I think I get what you're saying, but I want to make sure.
 
 I have 3 servers (index1, index2, index3), with Solr living on port 8080.
 
 Each of those has 3 cores loaded with data:
 core1 (old version)
 core1new (new version)
 core2 (unrelated to core1)
 
 If I wanted to make it so that queries to core1 are really going to
 core1new, I'd run:
 http://index1:8080/solr/admin/cores?action=CREATEALIASname=core1collections=core1newshard=shard1

Alias is a *Collections* API concept, not a CoreAdmin API concept.

One question is this:  Do you have a *collection* named core1, or just a
*core* named core1?  I'm pretty sure that it's possible on a SolrCloud
system to have cores that are not participating in the cloud infrastructure.

Collections are made up of shards.  Shards have replicas.  Each replica
is a core.

I'd like to see whether you have configurations loaded into zookeeper.
In the admin UI, click on Cloud, then Tree.  Click the arrow to the left
of /configs to open it.  If you see folders underneath /configs, then
you do have at least one configuration in zookeeper, and you will see
the name(s) they are using.

You can also click the arrow next to /collections and see whether you
have any collections.
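
If you would rather check programmatically, a minimal sketch using SolrJ's SolrZkClient (the ZooKeeper address is a placeholder; SolrJ 4.x assumed):

import java.util.List;
import org.apache.solr.common.cloud.SolrZkClient;

public class ListZkCollections {
    public static void main(String[] args) throws Exception {
        SolrZkClient zk = new SolrZkClient("localhost:2181", 10000);
        // Same data the admin UI's Cloud->Tree view shows under /configs and /collections.
        List<String> configs = zk.getChildren("/configs", null, true);
        List<String> collections = zk.getChildren("/collections", null, true);
        System.out.println("configs: " + configs + ", collections: " + collections);
        zk.close();
    }
}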

The Cloud-Graph page shows you a visual representation of your cloud.

Let us know what you find.  If you have anything there, I can give you
some API URL calls that will hopefully fully illustrate what I'm saying.

Thanks,
Shawn



Re: AW: Boosting a field with defType:dismax -- No results at all

2013-10-16 Thread Jack Krupansky

dismax doesn't support wildcard, fuzzy, or fielded terms. edismax does.

My e-book details differences between the query parsers.
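
As a minimal sketch of the fix (core URL and qf fields below are placeholders; SolrJ 4.x assumed):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class EdismaxFieldedQuery {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        // Under dismax, "partNumber:A32783627" is treated as literal text; with
        // edismax, fielded terms and wildcards are parsed as query syntax again.
        SolrQuery q = new SolrQuery("partNumber:A32783627");
        q.set("defType", "edismax");
        q.set("qf", "partNumber^2 text^0.1");  // field boosts still apply to unfielded terms
        System.out.println(server.query(q).getResults().getNumFound());
        server.shutdown();
    }
}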

-- Jack Krupansky

-Original Message- 
From: uwe72

Sent: Wednesday, October 16, 2013 12:26 PM
To: solr-user@lucene.apache.org
Subject: AW: Boosting a field with defType:dismax -- No results at all

We have just one more problem:

When we search explicitly, like *:* or partNumber:A32783627, we still don't
get any results.

What are we doing wrong here?








Re: How to retrieve the query for a boolean keyword?

2013-10-16 Thread Stavros Delsiavas
I believe it is not possible. But you can easily split this into two query
statements.


First one:

text:(“vacuna” AND “esteve news”)

and the second:

(text:(“vacuna”) OR text:(“esteve news”)) AND -text:(“vacuna” AND 
“esteve news”)


The minus - excludes all entries of the first statement. This is
important to ensure that you don't get entries twice. So the first query will
contain all entries with both words, and the second all remaining
entries that contain exactly ONE of those words.
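
A minimal SolrJ sketch of this two-query approach (the core URL is a placeholder; SolrJ 4.x assumed):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class BooleanSplitQueries {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        // Query 1: documents that contain BOTH terms.
        QueryResponse both = server.query(
            new SolrQuery("text:(\"vacuna\" AND \"esteve news\")"));
        // Query 2: documents that contain exactly ONE of the terms.
        QueryResponse one = server.query(new SolrQuery(
            "(text:(\"vacuna\") OR text:(\"esteve news\")) AND -text:(\"vacuna\" AND \"esteve news\")"));
        // Because the queries are separate, the client knows which rule matched each hit.
        System.out.println("both: " + both.getResults().getNumFound()
            + ", exactly one: " + one.getResults().getNumFound());
        server.shutdown();
    }
}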


I hope this helps.


Am 16.10.2013 15:49, schrieb Silvia Suárez:

Dear all,

I am using solrj as client for indexing and searching documents on the solr
server

My question:

How to retrieve the query for a boolean keyword?

For example:

I have this query:

text:(“vacuna” AND “esteve news”) OR text:(“vacuna”) OR text:(“esteve news”)

And searching in:

text-- Esteve news: Obtener una vacuna para frenar el...

Solr returns:

<em>Esteve news</em>: obtener una <em>vacuna</em> para frenar el ...

It is ok.

My question is:

Can I know with solr that the results <em>Esteve news</em> <em>vacuna</em>
are provided by the query with the AND operator?

Is it possible to retrieve this with solrj?

Thanks a lot in advance,

Sil,








Re: field title_ngram was indexed without position data; cannot run PhraseQuery

2013-10-16 Thread MC

Hello,
Thank you all for your help. There was indeed a property that was not
set right in schema.xml:

omitTermFreqAndPositions="true"

After changing it to "false", phrase lookup started working OK.
Thanks,

M


On 10/15/13 12:01 PM, Jack Krupansky wrote:

Show us the field and field type from your schema.

Likely you are omitting position info for the field, and the field 
type has autoGeneratePhraseQueries=true - the ngram analyzer 
generates a sequence of terms for a single source term, and then the 
query parser generates a PhraseQuery for that sequence, which 
requires position info in the index that you have omitted. That's 
one theory.


So, if that theory is correct, either retain position info by getting 
rid of the omit, or remove the autoGeneratePhraseQueries.


-- Jack Krupansky

-Original Message- From: Jason Hellman
Sent: Tuesday, October 15, 2013 11:19 AM
To: solr-user@lucene.apache.org
Subject: Re: field title_ngram was indexed without position data; 
cannot run PhraseQuery


If you consider what n-grams do this should make sense to you. 
Consider the following piece of data:


White iPod

If the field is fed through a bigram filter (n-gram with size of 2) 
the resulting token stream would appear as such:


wh hi it te
ip po od

The usual use of n-grams is to match those partial tokens, essentially 
giving you a great deal of power in creating non-wildcard partial 
matches. How you use this is up to your imagination, but one easy use 
is in partial matches in autosuggest features.


I can't speak for the intent behind the way it's coded, but it makes a 
great deal of sense to me that positional data would be seen as 
unnecessary since the intent of n-grams typically doesn't collide with 
phrase searches.  If you need both behaviors it's far better to use 
copyField and have one field dedicated to standard tokenization and 
token filters, and another field for n-grams.


I hope that's useful to you.

On Oct 15, 2013, at 6:14 AM, MC videm...@gmail.com wrote:


Hello,

Could someone explain (or perhaps provide a documentation link) what 
does the following error mean:
field title_ngram was indexed without position data; cannot run 
PhraseQuery


I'll do some more searching online, I was just wondering if anyone 
has encountered this error before, and what the possible solution 
might be. I've recently upgraded my version of solr from 3.6.0 to 
4.5.0, I'm not sure if this has any bearing or not.

Thanks,

M







Re: Switching indexes

2013-10-16 Thread Christopher Gross
Ok, so I think I was confusing the terminology (still in a 3.X mindset I
guess.)

From the Cloud-Tree, I do see that I have collections for what I was
calling core1, core2, etc.

So, to redo the above,
Servers: index1, index2, index3
Collections: (on each) coll1, coll2
Collection (core?) on index1: coll1new

Each Collection has 1 shard (too small to make sharding worthwhile).

So should I run something like this:
http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=coll1new

Or will I need coll1new to be on each of the index1, index2 and index3
instances of Solr?


-- Chris


On Wed, Oct 16, 2013 at 12:40 PM, Shawn Heisey s...@elyograg.org wrote:

 On 10/16/2013 9:44 AM, Christopher Gross wrote:
  Garth,
 
  I think I get what you're saying, but I want to make sure.
 
  I have 3 servers (index1, index2, index3), with Solr living on port 8080.
 
  Each of those has 3 cores loaded with data:
  core1 (old version)
  core1new (new version)
  core2 (unrelated to core1)
 
  If I wanted to make it so that queries to core1 are really going to
  core1new, I'd run:
 
 http://index1:8080/solr/admin/cores?action=CREATEALIASname=core1collections=core1newshard=shard1

 Alias is a *Collections* API concept, not a CoreAdmin API concept.

 One question is this:  Do you have a *collection* named core1, or just a
 *core* named core1?  I'm pretty sure that it's possible on a SolrCloud
 system to have cores that are not participating in the cloud
 infrastructure.

 Collections are made up of shards.  Shards have replicas.  Each replica
 is a core.

 I'd like to see whether you have configurations loaded into zookeeper.
 In the admin UI, click on Cloud, then Tree.  Click the arrow to the left
 of /configs to open it.  If you see folders underneath /configs, then
 you do have at least one configurations in zookeeper, and you will have
 the name(s) they are using.

 You can also click the arrow next to /collections and see whether you
 have any collections.

 The Cloud-Graph page shows you a visual representation of your cloud.

 Let us know what you find.  If you have anything there, I can give you
 some API URL calls that will hopefully fully illustrate what I'm saying.

 Thanks,
 Shawn




Re: AW: Boosting a field with defType:dismax -- No results at all

2013-10-16 Thread uwe72
Works like this?

<str name="defType">edismax</str>
<str name="qf">SignalImpl.baureihe^1011 text^0.1</str>

Another option:

How about just applying a high boost factor to the desired fields while
adding them to the document, using Solr index-time boosting?

Can this work?






Re: Switching indexes

2013-10-16 Thread Shawn Heisey
On 10/16/2013 11:51 AM, Christopher Gross wrote:
 Ok, so I think I was confusing the terminology (still in a 3.X mindset I
 guess.)
 
 From the Cloud-Tree, I do see that I have collections for what I was
 calling core1, core2, etc.
 
 So, to redo the above,
 Servers: index1, index2, index3
 Collections: (on each) coll1, coll2
 Collection (core?) on index1: coll1new
 
 Each Collection has 1 shard (too small to make sharding worthwhile).
 
 So should I run something like this:
  http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=coll1new
 
 Or will I need coll1new to be on each of the index1, index2 and index3
 instances of Solr?

I don't think you can create an alias if a collection already exists
with that name - so having a collection named core1 means you wouldn't
want an alias named core1.  I could be wrong, but just to keep things
clean, I wouldn't recommend it, even if it's possible.

That CREATEALIAS command will only work if coll1new shows up in
/collections and shows green on the cloud graph.  If it does, and you're
using an alias name that doesn't already exist as a collection, then
you're good.

Whether coll1new is living on one server, two servers, or all three
servers doesn't matter for CREATEALIAS, or for most other
collection-related topics.  Any query or update can be sent to any
server in the cloud and it will be routed to the correct place according
to the clusterstate.

Where things live and how many replicas there are *does* matter for a
discussion about redundancy.  Generally speaking, you're going to want
your shards to have at least two replicas, so that if a Solr instance
goes down, or is taken down for maintenance, your cloud remains fully
operational.  In your situation, you probably want three replicas - so
each collection lives on all three servers.

So my general advice:

Decide what name you want your application to use, make sure none of
your existing collections are using that name, and set up an alias with
that name pointing to whichever collection is current.  Then change your
application configurations or code to point at the alias instead of
directly at the collection.

When you want to do your reindex, first create a new collection using
the collections API.  Index to that new collection.  When it's ready to
go, use CREATEALIAS to update the alias, and your application will start
using the new index.
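
To illustrate the client side, a minimal SolrJ sketch that queries through such an alias (alias name and ZooKeeper addresses are placeholders; SolrJ 4.x assumed). When CREATEALIAS repoints the alias, this code does not change:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CloudSolrServer;

public class QueryThroughAlias {
    public static void main(String[] args) throws Exception {
        CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
        server.setDefaultCollection("index");  // the alias, not a concrete collection
        System.out.println(server.query(new SolrQuery("*:*")).getResults().getNumFound());
        server.shutdown();
    }
}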

Thanks,
Shawn



SolrCloud Performance Issue

2013-10-16 Thread shamik
Hi,

  I'm in the process of transitioning to SolrCloud from a conventional
Master-Slave model. I'm using Solr 4.4 and have set up 2 shards with 1
replica each. I have a 3-node ZooKeeper ensemble. All the nodes are running
on AWS EC2 instances. Shards are on m1.xlarge and share a ZooKeeper instance
(mounted on a separate volume). 6 GB of memory is allocated to each Solr
instance.

I have around 10 million documents in the index. With the previous standalone
model, queries averaged around 100 ms. The SolrCloud query response has been
abysmal so far: response times are over 1000 ms, often reaching 2000 ms. I
expected some increase due to the additional servers, network latency, etc.,
but this difference is really baffling. The hardware is similar in both cases,
except that a couple of the SolrCloud nodes share a ZooKeeper instance as
well. m1.xlarge I/O is high, so that shouldn't be a bottleneck either.

The other difference from the old setup is that I'm using the new
CloudSolrServer class, which references the 3 ZooKeepers for load balancing.
But I don't think it has any major impact, as queries executed from the Solr
admin query panel confirm the slowness.

Here is some of my configuration setup:

<autoCommit>
  <maxTime>3</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

<autoSoftCommit>
  <maxTime>1000</maxTime>
</autoSoftCommit>

<maxBooleanClauses>1024</maxBooleanClauses>

<filterCache class="solr.FastLRUCache" size="16384" initialSize="4096" autowarmCount="4096"/>

<queryResultCache class="solr.LRUCache" size="16384" initialSize="8192" autowarmCount="4096"/>

<documentCache class="solr.LRUCache" size="32768" initialSize="16384" autowarmCount="0"/>

<fieldValueCache class="solr.FastLRUCache" size="16384" autowarmCount="8192" showItems="4096"/>

<enableLazyFieldLoading>true</enableLazyFieldLoading>

<queryResultWindowSize>200</queryResultWindowSize>

<queryResultMaxDocsCached>400</queryResultMaxDocsCached>

<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">line</str></lst>
    <lst><str name="q">xref</str></lst>
    <lst><str name="q">draw</str></lst>
  </arr>
</listener>
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">line</str></lst>
    <lst><str name="q">draw</str></lst>
    <lst><str name="q">line</str><str name="fq">language:english</str></lst>
    <lst><str name="q">line</str><str name="fq">Source2:documentation</str></lst>
    <lst><str name="q">line</str><str name="fq">Source2:CloudHelp</str></lst>
    <lst><str name="q">draw</str><str name="fq">language:english</str></lst>
    <lst><str name="q">draw</str><str name="fq">Source2:documentation</str></lst>
    <lst><str name="q">draw</str><str name="fq">Source2:CloudHelp</str></lst>
  </arr>
</listener>

<maxWarmingSearchers>2</maxWarmingSearchers>

The custom request handler:

<requestHandler name="/adskcloudhelp" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="wt">velocity</str>
    <str name="v.template">browse</str>
    <str name="v.contentType">text/html;charset=UTF-8</str>
    <str name="v.layout">layout</str>
    <str name="v.channel">cloudhelp</str>

    <str name="defType">edismax</str>
    <str name="q.alt">*:*</str>
    <str name="rows">15</str>
    <str name="fl">id,url,Description,Source2,text,filetype,title,LastUpdateDate,PublishDate,ViewCount,TotalMessageCount,Solution,LastPostAuthor,Author,Duration,AuthorUrl,ThumbnailUrl,TopicId,score</str>
    <str name="qf">text^1.5 title^2 IndexTerm^.9 keywords^1.2 ADSKCommandSrch^2 ADSKContextId^1</str>
    <str name="bq">Source2:CloudHelp^3 Source2:youtube^0.85</str>
    <str name="bf">recip(ms(NOW,PublishDate),3.16e-11,1,1)^2.0</str>
    <str name="df">text</str>

    <str name="facet">on</str>
    <str name="facet.mincount">1</str>
    <str name="facet.limit">100</str>
    <str name="facet.field">language</str>
    <str name="facet.field">Source2</str>
    <str name="facet.field">DocumentationBook</str>
    <str name="facet.field">ADSKProductDisplay</str>
    <str name="facet.field">audience</str>

    <str name="hl">true</str>
    <str name="hl.fl">text title</str>
    <str name="f.text.hl.fragsize">250</str>
    <str name="f.text.hl.alternateField">ShortDesc</str>

    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">default</str>

Re: Switching indexes

2013-10-16 Thread Christopher Gross
Thanks Shawn, the explanations help bring me forward to the SolrCloud
mentality.

So it sounds like going forward I should have a more complicated name
(ex: coll1-20131015) aliased to coll1, to make it easier to switch in the
future.

Now, if I already have an index (copied from one location to another), it
sounds like I should just remove my existing (bad/old data) coll1, create
the replicated one (calling it coll1-date), then alias coll1 to that
one.

This type of information would have been awesome to know before I got
started, but I can make do with what I've got going now.

Thanks again!


-- Chris


On Wed, Oct 16, 2013 at 2:40 PM, Shawn Heisey s...@elyograg.org wrote:

 On 10/16/2013 11:51 AM, Christopher Gross wrote:
  Ok, so I think I was confusing the terminology (still in a 3.X mindset I
  guess.)
 
  From the Cloud-Tree, I do see that I have collections for what I was
  calling core1, core2, etc.
 
  So, to redo the above,
  Servers: index1, index2, index3
  Collections: (on each) coll1, coll2
  Collection (core?) on index1: coll1new
 
  Each Collection has 1 shard (too small to make sharding worthwhile).
 
  So should I run something like this:
 
  http://index1:8080/solr/admin/collections?action=CREATEALIAS&name=coll1&collections=coll1new
 
  Or will I need coll1new to be on each of the index1, index2 and index3
  instances of Solr?

 I don't think you can create an alias if a collection already exists
 with that name - so having a collection named core1 means you wouldn't
 want an alias named core1.  I could be wrong, but just to keep things
 clean, I wouldn't recommend it, even if it's possible.

 That CREATEALIAS command will only work if coll1new shows up in
 /collections and shows green on the cloud graph.  If it does, and you're
 using an alias name that doesn't already exist as a collection, then
 you're good.

 Whether coll1new is living on one server, two servers, or all three
 servers doesn't matter for CREATEALIAS, or for most other
 collection-related topics.  Any query or update can be sent to any
 server in the cloud and it will be routed to the correct place according
 to the clusterstate.

 Where things live and how many replicas there are *does* matter for a
 discussion about redundancy.  Generally speaking, you're going to want
 your shards to have at least two replicas, so that if a Solr instance
 goes down, or is taken down for maintenance, your cloud remains fully
 operational.  In your situation, you probably want three replicas - so
 each collection lives on all three servers.

 So my general advice:

 Decide what name you want your application to use, make sure none of
 your existing collections are using that name, and set up an alias with
 that name pointing to whichever collection is current.  Then change your
 application configurations or code to point at the alias instead of
 directly at the collection.

 When you want to do your reindex, first create a new collection using
 the collections API.  Index to that new collection.  When it's ready to
 go, use CREATEALIAS to update the alias, and your application will start
 using the new index.

 Thanks,
 Shawn




Re: Local Solr and Webserver-Solr act differently (and treated like or)

2013-10-16 Thread Shawn Heisey
On 10/16/2013 4:46 AM, Stavros Delisavas wrote:
 My local solr gives me:
 http://pastebin.com/Q6d9dFmZ
 
 and my webserver this:
 http://pastebin.com/q87WEjVA
 
 I copied only the first few hundred lines (of more than 8000) because
 the webserver output was too big even for pastebin.
 
 
 
 On 16.10.2013 12:27, Erik Hatcher wrote:
 What does the debug output say from debugQuery=true say between the two?

What's really needed here is the first part of the debug section,
which has rawquerystring, querystring, parsedquery, and
parsedquery_toString.  The info from your local solr has this part, but
what you pasted from the webserver one didn't include those parts,
because it's further down than the first few hundred lines.

Thanks,
Shawn



Re: Local Solr and Webserver-Solr act differently (and treated like or)

2013-10-16 Thread Stavros Delsiavas

Okay I understand,

here's the rawquerystring. It was at about line 3000:

<lst name="debug">
  <str name="rawquerystring">title:(into AND the AND wild*)</str>
  <str name="querystring">title:(into AND the AND wild*)</str>
  <str name="parsedquery">+title:wild*</str>
  <str name="parsedquery_toString">+title:wild*</str>

At this place the debug output DOES differ from the one on my local 
system. But I don't understand why...

This is the local debug output:

<lst name="debug">
  <str name="rawquerystring">title:(into AND the AND wild*)</str>
  <str name="querystring">title:(into AND the AND wild*)</str>
  <str name="parsedquery">+title:into +title:the +title:wild*</str>
  <str name="parsedquery_toString">+title:into +title:the +title:wild*</str>


Why is that? Any ideas?




Am 16.10.2013 21:03, schrieb Shawn Heisey:

On 10/16/2013 4:46 AM, Stavros Delisavas wrote:

My local solr gives me:
http://pastebin.com/Q6d9dFmZ

and my webserver this:
http://pastebin.com/q87WEjVA

I copied only the first few hundred lines (of more than 8000) because
the webserver output was too big even for pastebin.



On 16.10.2013 12:27, Erik Hatcher wrote:

What does the debug output say from debugQuery=true say between the two?

What's really needed here is the first part of the debug section,
which has rawquerystring, querystring, parsedquery, and
parsedquery_toString.  The info from your local solr has this part, but
what you pasted from the webserver one didn't include those parts,
because it's further down than the first few hundred lines.

Thanks,
Shawn





Re: Local Solr and Webserver-Solr act differently (and treated like or)

2013-10-16 Thread Jack Krupansky
So, the stopwords.txt file is different between the two systems - the first 
has stop words but the second does not. Did you expect stop words to be 
removed, or not?


-- Jack Krupansky

-Original Message- 
From: Stavros Delsiavas

Sent: Wednesday, October 16, 2013 5:02 PM
To: solr-user@lucene.apache.org
Subject: Re: Local Solr and Webserver-Solr act differently (and treated 
like or)


Okay I understand,

here's the rawquerystring. It was at about line 3000:

<lst name="debug">
  <str name="rawquerystring">title:(into AND the AND wild*)</str>
  <str name="querystring">title:(into AND the AND wild*)</str>
  <str name="parsedquery">+title:wild*</str>
  <str name="parsedquery_toString">+title:wild*</str>

At this place the debug output DOES differ from the one on my local
system. But I don't understand why...

This is the local debug output:

<lst name="debug">
  <str name="rawquerystring">title:(into AND the AND wild*)</str>
  <str name="querystring">title:(into AND the AND wild*)</str>
  <str name="parsedquery">+title:into +title:the +title:wild*</str>
  <str name="parsedquery_toString">+title:into +title:the +title:wild*</str>

Why is that? Any ideas?




Am 16.10.2013 21:03, schrieb Shawn Heisey:

On 10/16/2013 4:46 AM, Stavros Delisavas wrote:

My local solr gives me:
http://pastebin.com/Q6d9dFmZ

and my webserver this:
http://pastebin.com/q87WEjVA

I copied only the first few hundred lines (of more than 8000) because
the webserver output was too big even for pastebin.



On 16.10.2013 12:27, Erik Hatcher wrote:

What does the debug output say from debugQuery=true say between the two?

What's really needed here is the first part of the debug section,
which has rawquerystring, querystring, parsedquery, and
parsedquery_toString.  The info from your local solr has this part, but
what you pasted from the webserver one didn't include those parts,
because it's further down than the first few hundred lines.

Thanks,
Shawn





Solr - Read sort data from external source

2013-10-16 Thread qrcde
Hello,

I am trying to write some code to read rank data from an external db. I saw
an example done using a database -
http://sujitpal.blogspot.com/2011/05/custom-sorting-in-solr-using-external.html,
where they fetch the whole database during index searcher creation and cache it.

But is there any way to pass a parameter or choose a different database in the
FieldComparator based on the query? Let's say I want to pass versions; the sort
order in version 1 will be different than the sort order in v2.

Or if I use ExternalFileField, is there a way to load a different file based on
a query parameter?

Regards





Skipping caches on a /select

2013-10-16 Thread Tim Vaillancourt
Hey guys,

I am debugging some /select queries on my Solr tier and would like to see
if there is a way to tell Solr to skip the caches on a given /select query
even if the result happens to ALREADY be in the cache. Live queries are being
inserted into and read from the caches, but I want my debug queries to bypass
the cache entirely.

I do know about the cache=false param (which causes the results of a
select not to be INSERTED into the cache), but what I am looking for
instead is a way to tell Solr not to read the cache at all, even if there
actually is a cached result for my query.

Is there a way to do this (without disabling my caches in solrconfig.xml),
or is this a feature request?

Thanks!

Tim Vaillancourt


Re: SolrCloud on SSL

2013-10-16 Thread Tim Vaillancourt
Not important, but I'm also curious why you would want SSL on Solr (it adds
overhead and complexity, and is harder to troubleshoot)?

To avoid the overhead, could you put Solr on a separate VLAN (with ACLs to
client servers)?

Cheers,

Tim


On 12 October 2013 17:30, Shawn Heisey s...@elyograg.org wrote:

 On 10/11/2013 9:38 AM, Christopher Gross wrote:
  On Fri, Oct 11, 2013 at 11:08 AM, Shawn Heisey s...@elyograg.org
 wrote:
 
  On 10/11/2013 8:17 AM, Christopher Gross wrote: 
  Is there a spot in a Solr configuration that I can set this up to use
  HTTPS?
 
  From what I can tell, not yet.
 
  https://issues.apache.org/jira/browse/SOLR-3854
  https://issues.apache.org/jira/browse/SOLR-4407
  https://issues.apache.org/jira/browse/SOLR-4470
 
 
  Dang.

 Christopher,

 I was just looking through Solr source code for a completely different
 issue, and it seems that there *IS* a way to do this in your configuration.

  If you were to use "https://hostname" or "https://ipaddress" as the
  host parameter in your solr.xml file on each machine, it should do
 what you want.  The parameter is described here, but not the behavior
 that I have discovered:

 http://wiki.apache.org/solr/SolrCloud#SolrCloud_Instance_Params

 Boring details: In the org.apache.solr.cloud package, there is a
 ZkController class.  The getHostAddress method is where I discovered
 that you can do this.

 If you could try this out and confirm that it works, I will get the wiki
 page updated and look into the Solr reference guide as well.

 Thanks,
 Shawn




Re: Skipping caches on a /select

2013-10-16 Thread Yonik Seeley
On Wed, Oct 16, 2013 at 6:18 PM, Tim Vaillancourt t...@elementspace.com wrote:
 I am debugging some /select queries on my Solr tier and would like to see
 if there is a way to tell Solr to skip the caches on a given /select query
 if it happens to ALREADY be in the cache. Live queries are being inserted
 and read from the caches, but I want my debug queries to bypass the cache
 entirely.

 I do know about the cache=false param (that causes the results of a
 select to not be INSERTED in to the cache), but what I am looking for
 instead is a way to tell Solr to not read the cache at all, even if there
 actually is a cached result for my query.

Yeah, cache=false for q or fq should already not use the cache at
all (read or write).
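
For example, a minimal SolrJ sketch of an uncached debug query (core URL and field names are placeholders borrowed from earlier in this list; SolrJ 4.x assumed):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class UncachedDebugQuery {
    public static void main(String[] args) throws Exception {
        HttpSolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        // {!cache=false} is a local param; it applies to the clause it prefixes.
        SolrQuery q = new SolrQuery("{!cache=false}text:line");
        q.addFilterQuery("{!cache=false}Source2:CloudHelp");  // uncached filter as well
        System.out.println(server.query(q).getResults().getNumFound());
        server.shutdown();
    }
}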

-Yonik


Re: Solr 4.4 - Master/Slave configuration - Replication Issue with Commits after deleting documents using Delete by ID

2013-10-16 Thread Shalin Shekhar Mangar
Thanks Bharat. This is a bug. I've opened LUCENE-5289.

https://issues.apache.org/jira/browse/LUCENE-5289


On Wed, Oct 16, 2013 at 9:35 PM, Akkinepalli, Bharat (ELS-CON) 
b.akkinepa...@elsevier.com wrote:

 Hi Shalin,
 I am not sure why the log says No uncommitted changes. The data is
 available in Solr at the time I perform the delete.

 Please find below the steps I have performed:
 1. Inserted a document in master (with id=change.me.1)
 2. Issued a commit on master
 3. Triggered replication on the slave
 4. Ensured that the document was replicated successfully
 5. Issued a delete by ID
 6. Issued a commit on master
 7. Replication did NOT happen

 The logs are as follows:
 Master - http://pastebin.com/265CtCEp
 Slave - http://pastebin.com/Qx0xLwmK

 Regards,
 Bharat Akkinepalli.

 -Original Message-
 From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
 Sent: Wednesday, October 16, 2013 11:28 AM
 To: solr-user@lucene.apache.org
 Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue
 with Commits after deleting documents using Delete by ID

 The only delete I see in the master logs is:

 INFO  - 2013-10-11 14:06:54.793;
 org.apache.solr.update.processor.LogUpdateProcessor; [annotation]
 webapp=/solr path=/update params={} {delete=[change.me(-1448623278425899008)]}
 0 60

 When you commit, we have the following:

 INFO  - 2013-10-11 14:07:03.809;
 org.apache.solr.update.DirectUpdateHandler2; start
 commit{,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
 INFO  - 2013-10-11 14:07:03.813;
 org.apache.solr.update.DirectUpdateHandler2; No uncommitted changes.
 Skipping IW.commit.

 That suggests that the id you are trying to delete never existed in the
 first place and hence there was nothing to commit. Hence replication was
 not triggered. Am I missing something?


 On Wed, Oct 16, 2013 at 5:06 PM, Akkinepalli, Bharat (ELS-CON) 
 b.akkinepa...@elsevier.com wrote:

  Hi Otis,
  Did you get a chance to look into the logs.  Please let me know if you
  need more information.  Thank you.
 
  Regards,
  Bharat Akkinepalli
 
  -Original Message-
  From: Akkinepalli, Bharat (ELS-CON)
  [mailto:b.akkinepa...@elsevier.com]
  Sent: Friday, October 11, 2013 2:16 PM
  To: solr-user@lucene.apache.org
  Subject: RE: Solr 4.4 - Master/Slave configuration - Replication Issue
  with Commits after deleting documents using Delete by ID
 
  Hi Otis,
  Thanks for the response.  The log files can be found here.
 
  MasterLog : http://pastebin.com/DPLKMPcF Slave Log:
  http://pastebin.com/DX9sV6Jx
 
  One more point worth mentioning here is that when we issue the commit
  with expungeDeletes=true, then the delete by id replication is
 successful. i.e.
  http://localhost:8983/solr/annotation/update?commit=true&expungeDeletes=true
 
  Regards,
  Bharat Akkinepalli
 
  -Original Message-
  From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
  Sent: Wednesday, October 09, 2013 6:35 PM
  To: solr-user@lucene.apache.org
  Subject: Re: Solr 4.4 - Master/Slave configuration - Replication Issue
  with Commits after deleting documents using Delete by ID
 
  Bharat,
 
  Can you look at the logs on the Master when you issue the delete and
  the subsequent commits and share that?
 
  Otis
  --
  Solr  ElasticSearch Support -- http://sematext.com/ Performance
  Monitoring -- http://sematext.com/spm
 
 
 
  On Tue, Oct 8, 2013 at 3:57 PM, Akkinepalli, Bharat (ELS-CON) 
  b.akkinepa...@elsevier.com wrote:
   Hi,
   We have recently migrated from Solr 3.6 to Solr 4.4.  We are using
   the
  Master/Slave configuration in Solr 4.4 (not Solr Cloud).  We have
  noticed the following behavior/defect.
  
   Configuration:
   ===
  
   1.   The Hard Commit and Soft Commit are disabled in the
  configuration (we control the commits from the application)
  
   2.   We have 1 Master and 2 Slaves configured and the pollInterval
  is configured to 10 Minutes.
  
   3.   The Master is configured to have the replicateAfter as
 commit
   startup
  
   Steps to reproduce the problem:
   ==
  
   1.   Delete a document in Solr  (using delete by id).  URL -
  http://localhost:8983/solr/annotation/update with body as <delete><id>change.me</id></delete>
  
   2.   Issue a commit in Master (
  http://localhost:8983/solr/annotation/update?commit=true).
  
   3.   The replication of the DELETE WILL NOT happen.  The master and
  slave has the same Index version.
  
   4.   If we try to issue another commit in Master, we see that it
  replicates fine.
  
   Request you to please confirm if this is a known issue.  Thank you.
  
   Regards,
   Bharat Akkinepalli
  
 



 --
 Regards,
 Shalin Shekhar Mangar.




-- 
Regards,
Shalin Shekhar Mangar.