Re: Solr Swap Function doesn't work when using Solr Cloud Beta

2012-09-24 Thread sam fang
Hi Mark,

If this can be supported in the future, I think that would be great. It's a
really useful feature. For example, a user can refresh search with a totally
new core: build the index on one core and, once the build is done, swap the
old core and the new core. Search is then served from the totally new core.

It can also be used for backup. If one core crashes, it can easily be swapped
with a backup core so that search requests are served again quickly.

Best Regards,
Sam

On Sun, Sep 23, 2012 at 2:51 PM, Mark Miller markrmil...@gmail.com wrote:

 FYI swap is def not supported in SolrCloud right now - even though it may
 work, it's not been thought about and there are no tests.

 If you would like to see support, I'd add a JIRA issue along with any
 pertinent info from this thread about what the behavior needs to be changed
 to.

 - Mark


Re: Solr Swap Function doesn't work when using Solr Cloud Beta

2012-09-23 Thread Mark Miller
FYI swap is def not supported in SolrCloud right now - even though it may work, 
it's not been thought about and there are no tests.

If you would like to see support, I'd add a JIRA issue along with any pertinent 
info from this thread about what the behavior needs to be changed to.

- Mark


Re: Solr Swap Function doesn't work when using Solr Cloud Beta

2012-09-21 Thread Chris Hostetter

: Below is my solr.xml configuration, and already set persistent to true.
...
: Then publish 1 record to test1, and query. it's ok now.

Ok, first off -- please provide more details on how exactly you are 
running Solr.  Your initial email said...

 In Solr 3.6, core swap function works good. After switch to use Solr 4.0
 Beta, and found it doesn't work well.

...but based on your solr.xml file and your logs, it appears you are now 
trying to use some of the ZooKeeper/SolrCloud features that didn't even 
exist in Solr 3.6, so it's kind of an apples to oranges comparison.  i'm 
pretty sure that for a simple multicore setup, SWAP still works exactly as 
it did in Solr 3.6.

Whether SWAP works with ZooKeeper/SolrCloud is something i'm not really 
clear on -- mainly because i'm not sure what it should mean conceptually.  
Should the two SolrCores swap which collections they are a part of? what 
happens if the doc-shard assignment for the two collections means the 
same docs wouldn't wind up in those SolrCores? what if the SolrCores are 
two different shards of the same collection before the SWAP?

FWIW: It wasn't clear from your message *how* you had your SolrCloud 
system set up, but it appears from your pre-swap log messages that you are 
running a single node, so I tried to reproduce the behavior you were 
seeing by setting up a solr home dir like you described, and then 
running...

java -Dsolr.solr.home=swap-test/ -DzkRun -Dbootstrap_conf=true -DnumShards=1 
-jar start.jar

...because that was my best guess as to what you were running. But even 
then i couldn't get the behavior you described after the swap...

: And found the shardurl is different with the log which search before swap.
: It’s shard.url=host1:18000/solr/test1-ondeck/| host1:18000/solr/test1/.

...what i observed after the swap is that it appeared as if the SWAP had 
no effect on client requests, because a client request to 
/solr/test1/select?q=*:* was distributed under the covers to 
/solr/test1-ondeck/... which is the new name for the core where the doc 
had been indexed...

Sep 21, 2012 12:03:32 PM org.apache.solr.core.SolrCore execute
INFO: [test1-ondeck] webapp=/solr path=/select params={distrib=false&wt=javabin&rows=10&version=2&df=text&fl=id,score&shard.url=frisbee:8983/solr/test1-ondeck/&NOW=1348254212677&start=0&q=*:*&isShard=true&fsv=true} hits=1 status=0 QTime=0
Sep 21, 2012 12:03:32 PM org.apache.solr.core.SolrCore execute
INFO: [test1-ondeck] webapp=/solr path=/select params={df=text&shard.url=frisbee:8983/solr/test1-ondeck/&NOW=1348254212677&q=*:*&ids=SOLR1000&distrib=false&isShard=true&wt=javabin&rows=10&version=2} status=0 QTime=0
Sep 21, 2012 12:03:32 PM org.apache.solr.core.SolrCore execute
INFO: [test1] webapp=/solr path=/select params={q=*:*} status=0 QTime=10

...I'm guessing this behavior is because nothing in the SWAP call bothered 
to tell ZK that these two SolrCores swapped names, so when asking to query 
the test1 collection, ZK says "ok, well the name of the core that's part 
of that collection is test1-ondeck" and that's where the query was routed.

I only saw behavior similar to what you described (of the shard.url 
referring to both SolrCores) after i restarted solr -- presumably because 
as the SolrCores started up, they both notified ZK about their existence, 
and said what collection they (thought they) are a part of, but ZK also 
already thinks they are each a part of a different collection as well 
(because nothing bothered to tell ZK otherwise).
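
To make that guess concrete, here is a toy model in Python. It is only a sketch under stated assumptions: plain dicts stand in for the ZK cluster state and for solr.xml, registration is assumed to be add-only, and the helper name register is made up for illustration. None of it is Solr code.

```python
# Toy model only: plain dicts stand in for ZooKeeper's cluster state and for
# solr.xml. None of this is Solr code.

def register(cluster_state, collection, core_name):
    """Add-only registration: a core announces itself, nothing is removed."""
    cluster_state.setdefault(collection, set()).add(core_name)

# solr.xml before the swap: core name -> collection it believes it belongs to.
cores = {"test1": "test1", "test1-ondeck": "test1-ondeck"}

cluster_state = {}
for name, collection in cores.items():
    register(cluster_state, collection, name)

# SWAP exchanges the names locally but never tells the cluster state.
cores = {"test1": "test1-ondeck", "test1-ondeck": "test1"}

# Restart: every core registers again under what it now thinks is its collection.
for name, collection in cores.items():
    register(cluster_state, collection, name)

for collection in sorted(cluster_state):
    print(collection, "->", "|".join(sorted(cluster_state[collection])))
```

In this model each collection ends up listing both cores after the restart, which is the same shape as a shard.url that refers to both SolrCores.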

So the long and short of it seems to be...

* CoreAdminHandler's SWAP is poorly defined if you are using SolrCloud 
(most likely: so is RENAME and ALIAS) - i've opened SOLR-3866.

* This doesn't seem like a regression bug from Solr 3.6, because as 
far as i can tell SWAP still works as well as it did in 3.6.



-Hoss

Re: Solr Swap Function doesn't work when using Solr Cloud Beta

2012-09-21 Thread sam fang
Hi Chris,

Thanks for your help. Today I tried again to figure out the reason.

1. Set up an external ZooKeeper server.

2. Change persistent to true in
/opt/solr/apache-solr-4.0.0-BETA/example/solr/solr.xml, then run the commands
below to upload the configs to ZK. (I renamed multicore to solr, and needed
to put the jar packages that zkcli.sh depends on in place.)
/opt/solr/apache-solr-4.0.0-BETA/example/cloud-scripts/zkcli.sh -cmd upconfig \
  -confdir /opt/solr/apache-solr-4.0.0-BETA/example/solr/core0/conf/ \
  -confname core0 -z localhost:2181
/opt/solr/apache-solr-4.0.0-BETA/example/cloud-scripts/zkcli.sh -cmd upconfig \
  -confdir /opt/solr/apache-solr-4.0.0-BETA/example/solr/core1/conf/ \
  -confname core1 -z localhost:2181

3. Start the Jetty server:
cd /opt/solr/apache-solr-4.0.0-BETA/example
java -DzkHost=localhost:2181 -jar start.jar

4. Post a document to core0:
cd /opt/solr/apache-solr-4.0.0-BETA/example/solr/exampledocs
cp ../../exampledocs/post.jar ./
java -Durl=http://localhost:8983/solr/core0/update -jar post.jar ipod_video.xml

5. Queries to core0 and core1 work fine.

6. Click swap in the admin page; the query results for core0 and core1 keep
changing. Previously I saw a query sometimes return 0 results and sometimes
1 result. Today core0 still seems to return 1 result and core1 returns
0 results.

7. Then click reload in the admin page and query core0 and core1 again.
Sometimes a query returns 1 result and sometimes nothing. I can also see that
the ZK configuration has changed.

8. Restart the Jetty server. Queries behave the same as in step 7.

9. Stop the Jetty server, log into zkCli.sh, and run the command set
/clusterstate.json {}. Then start Jetty again. Everything is back to normal,
i.e. swap behaves as it did in Solr 3.6 or Solr 4.0 without cloud.


From my observation, after the swap it seems to put the shard information
into actualShards; when a user sends a search request, it uses all of that
shard information to do the search. But the user can't see the ZK update
until clicking the reload button in the admin page. After the web server is
restarted, this shard information eventually reaches ZK, and searches go to
all shards.

I found there is an option, distrib, and used a URL like
http://host1:18000/solr/core0/select?distrib=false&q=*%3A*&wt=xml; then I
only get the data on core0. I dug into the code (the handleRequestBody method
in the SearchHandler class), and this seems to make sense.
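
A minimal Python sketch of building such a non-distributed query URL follows. The helper name local_select_url is made up for illustration, and the host and core are the placeholders used in this message; it only builds the URL and sends nothing.

```python
from urllib.parse import urlencode

def local_select_url(core_url, query="*:*"):
    """Build a select URL that asks the core to answer from its own index only."""
    # distrib=false keeps the request from being fanned out to other shards.
    params = urlencode({"distrib": "false", "q": query, "wt": "xml"})
    return f"{core_url.rstrip('/')}/select?{params}"

print(local_select_url("http://host1:18000/solr/core0"))
```

Comparing the result counts with and without distrib=false is a quick way to tell whether a query is being spread over more cores than expected.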

I tried stopping the Tomcat server, using the command set /clusterstate.json {}
to clean all cluster state, using cloud-scripts/zkcli.sh -cmd upconfig to
upload the config to the ZK server, and then starting the Tomcat server. It
rebuilt the right shard information in ZK, and search went back to normal,
like what we saw in 3.6 or 4.0 without cloud.

It seems Solr only ever adds shard information to ZK.

I tested cloud swap on a single machine. If each core has one shard in ZK,
then after the swap ZK eventually has 2 slices (shards) for that core, because
right now it only does the add. So the search goes to both shards.

I also tested cloud swap with 2 machines, where each core has 1 shard and 2
slices. Below is the configuration in ZK. After the swap, ZK eventually has 4
for that core, and search gets messed up.

  "core0":{"shard1":{
      "host1:18000_solr_core0":{
        "shard":"shard1",
        "roles":null,
        "leader":"true",
        "state":"active",
        "core":"core0",
        "collection":"core0",
        "node_name":"host1:18000_solr",
        "base_url":"http://host1:18000/solr"},
      "host2:18000_solr_core0":{
        "shard":"shard1",
        "roles":null,
        "state":"active",
        "core":"core0",
        "collection":"core0",
        "node_name":"host2:18000_solr",
        "base_url":"http://host2:18000/solr"}}},

In both of the previous cases, if I stopped the Tomcat/Jetty server, manually
uploaded the configuration to ZK, and then started the server again, ZK and
search became normal.
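
One way to spot stale entries after a swap is to list which cores ZK has registered under each shard. The Python sketch below assumes the collection -> shard -> core layout of the fragment above; the sample data is trimmed from that fragment, the helper name shard_urls is made up for illustration, and this is not how Solr itself builds shard.url.

```python
import json

# Sample cluster state shaped like the fragment above (hosts are the thread's
# placeholders; wrapping braces added so it parses as a full JSON document).
CLUSTER_STATE = json.loads("""
{"core0": {"shard1": {
  "host1:18000_solr_core0": {"core": "core0", "base_url": "http://host1:18000/solr", "state": "active"},
  "host2:18000_solr_core0": {"core": "core0", "base_url": "http://host2:18000/solr", "state": "active"}
}}}
""")

def shard_urls(cluster_state, collection):
    """Return {shard: 'url1|url2'} listing every core registered per shard."""
    result = {}
    for shard, replicas in cluster_state[collection].items():
        urls = sorted(f"{r['base_url']}/{r['core']}/" for r in replicas.values())
        result[shard] = "|".join(urls)
    return result

print(shard_urls(CLUSTER_STATE, "core0"))
```

If a shard lists more cores than were configured for it, the extra entries are candidates for the leftover registrations described above.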


Re: Solr Swap Function doesn't work when using Solr Cloud Beta

2012-09-20 Thread Chris Hostetter
: In Solr 3.6, core swap function works good. After switch to use Solr 4.0
: Beta, and found it doesn't work well.

can you elaborate on what exactly you mean by "doesn't work well"? ... 
what does your solr.xml file look like? what command did you run to do the 
swap? what results did you get from those commands?  what exactly did you 
observe after the swap and how did you observe it?

: I tried to swap two cores, but it still return old core data when do the
: search. After restart Tomcat which contain Solr, it will mess up when do the
: search, seems it will use like oldcoreshard|newcoreshard to do the search.
: Anyone hit this issue?

how did you do the search? is it possible you were just seeing your 
browser cache the results?  Do you have persistent="true" in your solr.xml 
file? w/o that, changes made via the CoreAdmin commands won't be saved to 
disk.

I just tested using both 4.0-BETA and the HEAD of the 4x branch and 
couldn't see any problems using SWAP  (i tested using 'java 
-Dsolr.solr.home=multicore/ -jar start.jar' and indexing some trivial 
docs, and then tested again after modifying the solr.xml to use 
persistent=true)
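
For anyone reproducing this, the SWAP tested here is a CoreAdmin request with action=SWAP and the two core names in the core and other parameters. A minimal Python sketch of building that URL, assuming the default adminPath of /admin/cores; the host, port, and core names are placeholders, and the helper name swap_url is made up for illustration:

```python
from urllib.parse import urlencode

def swap_url(base_url, core, other):
    """Build a CoreAdmin SWAP request URL (hypothetical helper).

    base_url is assumed to be the Solr webapp root, e.g.
    http://localhost:8983/solr, with adminPath set to /admin/cores.
    """
    params = urlencode({"action": "SWAP", "core": core, "other": other})
    return f"{base_url.rstrip('/')}/admin/cores?{params}"

# Swap the two example cores of a plain multicore setup.
print(swap_url("http://localhost:8983/solr", "core0", "core1"))
```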


-Hoss


Re: Solr Swap Function doesn't work when using Solr Cloud Beta

2012-09-20 Thread sam fang
Hi Hoss,

Thanks for your quick reply.

Below is my solr.xml configuration; persistent is already set to true.

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
  <cores adminPath="/admin/cores"
         zkClientTimeout="${zkClientTimeout:15000}" hostPort="18000">
    <core schema="schema.xml" shard="shard1" instanceDir="test1-ondeck/"
          name="test1" config="solrconfig.xml" collection="test1-ondeck"/>
    <core schema="schema.xml" shard="shard1" instanceDir="test1/"
          name="test1-ondeck" config="solrconfig.xml" collection="test1"/>
  </cores>
</solr>

For the test1 and test1-ondeck content, I just copied
example/solr/collection1.

Then I published 1 record to test1 and queried it. It's OK at this point.

INFO: [test1] webapp=/solr path=/select params={distrib=false&wt=javabin&rows=10&version=2&fl=id,score&df=text&NOW=1348195088691&shard.url=host1:18000/solr/test1/&start=0&q=*:*&isShard=true&fsv=true} hits=1 status=0 QTime=1
Sep 20, 2012 10:38:08 PM org.apache.solr.core.SolrCore execute
INFO: [test1] webapp=/solr path=/select params={ids=SOLR1000&distrib=false&wt=javabin&rows=10&version=2&df=text&NOW=1348195088691&shard.url=host1:18000/solr/test1/&q=*:*&isShard=true} status=0 QTime=1
Sep 20, 2012 10:38:08 PM org.apache.solr.core.SolrCore execute
INFO: [test1] webapp=/solr path=/select params={q=*:*&wt=python} status=0 QTime=20


Then I used the core admin console page to swap, and clicked reload for test1
and test1-ondeck. If I keep refreshing the query page, it sometimes gives
1 record and sometimes 0 records.
I also found that the shard.url differs from the one logged for the search
before the swap.
It’s shard.url=host1:18000/solr/test1-ondeck/|host1:18000/solr/test1/.

Below, the query returns 0:
Sep 20, 2012 10:41:32 PM org.apache.solr.core.SolrCore execute
INFO: [test1] webapp=/solr path=/select params={fl=id,score&df=text&NOW=1348195292608&shard.url=host1:18000/solr/test1-ondeck/|host1:18000/solr/test1/&start=0&q=*:*&distrib=false&isShard=true&wt=javabin&fsv=true&rows=10&version=2} hits=0 status=0 QTime=0
Sep 20, 2012 10:41:32 PM org.apache.solr.core.SolrCore execute
INFO: [test1] webapp=/solr path=/select params={q=*:*&wt=python} status=0 QTime=14

Below, the query returns 1:
Sep 20, 2012 10:42:31 PM org.apache.solr.core.SolrCore execute
INFO: [test1-ondeck] webapp=/solr path=/select params={fl=id,score&df=text&NOW=1348195351293&shard.url=host1:18000/solr/test1-ondeck/|host1:18000/solr/test1/&start=0&q=*:*&distrib=false&isShard=true&wt=javabin&fsv=true&rows=10&version=2} hits=1 status=0 QTime=1
Sep 20, 2012 10:42:31 PM org.apache.solr.core.SolrCore execute
INFO: [test1-ondeck] webapp=/solr path=/select params={df=text&NOW=1348195351293&shard.url=host1:18000/solr/test1-ondeck/|host1:18000/solr/test1/&q=*:*&ids=SOLR1000&distrib=false&isShard=true&wt=javabin&rows=10&version=2} status=0 QTime=1
Sep 20, 2012 10:42:31 PM org.apache.solr.core.SolrCore execute
INFO: [test1] webapp=/solr path=/select params={q=*:*&wt=python} status=0 QTime=9

Thanks a lot,
Sam
