Re: 8 Shards of Cloud with 4.10.3.

2015-02-25 Thread Shawn Heisey
On 2/25/2015 5:50 AM, Benson Margulies wrote:
 So, found the following line in the guide:
 
java -DzkRun -DnumShards=2
 -Dbootstrap_confdir=./solr/collection1/conf
 -Dcollection.configName=myconf -jar start.jar
 
 using a completely clean, new, solr_home.
 
 In my own bootstrap dir, I have my own solrconfig.xml and schema.xml,
 and I modified to have:
 
  -DnumShards=8 -DmaxShardsPerNode=8
 
 When I went to start loading data into this, I failed:
 
 Caused by: 
 org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
 No registered leader was found after waiting for 4000ms , collection:
 rni slice: shard4
 at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:554)
 at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
 at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
 at 
 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
 at 
 org.apache.solr.client.solrj.SolrServer.deleteByQuery(SolrServer.java:285)
 at 
 org.apache.solr.client.solrj.SolrServer.deleteByQuery(SolrServer.java:271)
 at 
 com.basistech.rni.index.internal.SolrCloudEvaluationNameIndex.init(SolrCloudEvaluationNameIndex.java:53)
 
 with corresponding log traffic in the solr log.
 
 The cloud page in the Solr admin app shows the IP address in green.
 It's a bit hard to read in general, it's all squished up to the top.

The way I would do it would be to start Solr *only* with the zkHost
parameter.  If you're going to use embedded zookeeper, I guess you would
use zkRun instead.

Once I had Solr running in cloud mode, I would upload the config to
zookeeper using zkcli, and create the collection using the Collections
API, including things like numShards and maxShardsPerNode on that CREATE
call, not as startup properties.  Then I would completely reindex my
data into the new collection.  It's a whole lot cleaner than trying to
convert non-cloud to cloud and split shards.
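As a rough sketch of that first step: the start command carries only the ZooKeeper setting, with no numShards or bootstrap_confdir properties on it. The helper name solr_start_cmd and the host zk1:2181 are made up for illustration, and the function only prints the command line rather than running it.

```shell
# Print (not run) a Solr 4.x start command for cloud mode.
# With no argument, use embedded ZooKeeper (-DzkRun);
# with an argument, point at an external ZooKeeper (-DzkHost=...).
solr_start_cmd() {
  if [ -n "${1:-}" ]; then
    printf 'java -DzkHost=%s -jar start.jar\n' "$1"
  else
    printf 'java -DzkRun -jar start.jar\n'
  fi
}

solr_start_cmd            # embedded ZooKeeper
solr_start_cmd zk1:2181   # hypothetical external ZooKeeper host
```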

Thanks,
Shawn



Re: 8 Shards of Cloud with 4.10.3.

2015-02-25 Thread Benson Margulies
On Wed, Feb 25, 2015 at 8:04 AM, Shawn Heisey apa...@elyograg.org wrote:
 On 2/25/2015 5:50 AM, Benson Margulies wrote:
 So, found the following line in the guide:

java -DzkRun -DnumShards=2
 -Dbootstrap_confdir=./solr/collection1/conf
 -Dcollection.configName=myconf -jar start.jar

 using a completely clean, new, solr_home.

 In my own bootstrap dir, I have my own solrconfig.xml and schema.xml,
 and I modified to have:

  -DnumShards=8 -DmaxShardsPerNode=8

 When I went to start loading data into this, I failed:

 Caused by: 
 org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
 No registered leader was found after waiting for 4000ms , collection:
 rni slice: shard4
 at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:554)
 at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
 at 
 org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
 at 
 org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
 at 
 org.apache.solr.client.solrj.SolrServer.deleteByQuery(SolrServer.java:285)
 at 
 org.apache.solr.client.solrj.SolrServer.deleteByQuery(SolrServer.java:271)
 at 
 com.basistech.rni.index.internal.SolrCloudEvaluationNameIndex.init(SolrCloudEvaluationNameIndex.java:53)

 with corresponding log traffic in the solr log.

 The cloud page in the Solr admin app shows the IP address in green.
 It's a bit hard to read in general, it's all squished up to the top.

 The way I would do it would be to start Solr *only* with the zkHost
 parameter.  If you're going to use embedded zookeeper, I guess you would
 use zkRun instead.

 Once I had Solr running in cloud mode, I would upload the config to
 zookeeper using zkcli, and create the collection using the Collections
 API, including things like numShards and maxShardsPerNode on that CREATE
 call, not as startup properties.  Then I would completely reindex my
 data into the new collection.  It's a whole lot cleaner than trying to
 convert non-cloud to cloud and split shards.

Shawn, I _am_ starting from clean. However, I didn't find a recipe for
what you suggest as a process, and (following Hoss' suggestion) I
found the recipe above with the bootstrap_confdir scheme.

I am mostly confused as to how I supply my solrconfig.xml and
schema.xml when I follow the process you are suggesting. I know I'm
verging on vampirism here, but if you could possibly find the time to
turn your paragraph into either a pointer to a recipe or the command
lines in a bit more detail, I'd be exceedingly grateful.

Thanks,
benson







Re: 8 Shards of Cloud with 4.10.3.

2015-02-25 Thread Benson Margulies
So, found the following line in the guide:

   java -DzkRun -DnumShards=2
-Dbootstrap_confdir=./solr/collection1/conf
-Dcollection.configName=myconf -jar start.jar

using a completely clean, new, solr_home.

In my own bootstrap dir, I have my own solrconfig.xml and schema.xml,
and I modified to have:

 -DnumShards=8 -DmaxShardsPerNode=8

When I went to start loading data into this, I failed:

Caused by: org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
No registered leader was found after waiting for 4000ms , collection:
rni slice: shard4
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:554)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
at 
org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
at 
org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
at 
org.apache.solr.client.solrj.SolrServer.deleteByQuery(SolrServer.java:285)
at 
org.apache.solr.client.solrj.SolrServer.deleteByQuery(SolrServer.java:271)
at 
com.basistech.rni.index.internal.SolrCloudEvaluationNameIndex.init(SolrCloudEvaluationNameIndex.java:53)

with corresponding log traffic in the solr log.

The cloud page in the Solr admin app shows the IP address in green.
It's a bit hard to read in general, it's all squished up to the top.






Re: 8 Shards of Cloud with 4.10.3.

2015-02-25 Thread Benson Margulies
A little more data. Note that the cloud status shows the black bubble
for a leader. See http://i.imgur.com/k2MhGPM.png.

org.apache.solr.common.SolrException: No registered leader was found
after waiting for 4000ms , collection: rni slice: shard4
at 
org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:568)
at 
org.apache.solr.common.cloud.ZkStateReader.getLeaderRetry(ZkStateReader.java:551)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doDeleteByQuery(DistributedUpdateProcessor.java:1358)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processDelete(DistributedUpdateProcessor.java:1226)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processDelete(UpdateRequestProcessor.java:55)
at 
org.apache.solr.update.processor.LogUpdateProcessor.processDelete(LogUpdateProcessorFactory.java:121)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processDelete(UpdateRequestProcessor.java:55)





Re: 8 Shards of Cloud with 4.10.3.

2015-02-25 Thread Benson Margulies
It's the zkcli options on my mind. zkcli's usage shows me 'bootstrap',
'upconfig', and uploading a solr.xml.

When I use upconfig, it might work, but it sure is noise:

benson@ip-10-111-1-103:/data/solr+rni$ 554331
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:9983] WARN
org.apache.zookeeper.server.NIOServerCnxn  – caught end of stream
exception
EndOfStreamException: Unable to read additional data from client
sessionid 0x14bc16c5e660003, likely client has closed socket
at 
org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
at 
org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)




Re: 8 Shards of Cloud with 4.10.3.

2015-02-25 Thread Benson Margulies
Do I need a zkcli bootstrap or do I start with upconfig? What port does
zkRun put zookeeper on?




Re: 8 Shards of Cloud with 4.10.3.

2015-02-25 Thread Shawn Heisey
On 2/25/2015 7:44 AM, Benson Margulies wrote:
 Shawn, I _am_ starting from clean. However, I didn't find a recipe for
 what you suggest as a process, and (following Hoss' suggestion) I
 found the recipe above with the bootstrap_confdir scheme.

 I am mostly confused as to how I supply my solrconfig.xml and
 schema.xml when I follow the process you are suggesting. I know I'm
 verging on vampirism here, but if you could possibly find the time to
 turn your paragraph into either a pointer to a recipe or the command
 lines in a bit more detail, I'd be exceedingly grateful.

I'm willing to help in any way that I can.

Normally in the conf directory for a non-cloud core you have
solrconfig.xml and schema.xml, plus any other configs referenced by
those files, like synonyms.txt, dih-config.xml, etc.  In cloud terms,
the directory containing these files is a confdir.  It's best to keep
the on-disk copy of your configs completely outside of the solr home so
there's no confusion about what configurations are active.  On-disk
cores for solrcloud do not need or use a conf directory.

The cloud-scripts/zkcli.sh (or zkcli.bat) script has an upconfig
command with -confdir and -confname options.

When doing upconfig, the zkHost value goes on the -z option to zkcli,
and you only need to list one of your zookeeper hosts, although it is
perfectly happy if you list them all.  You would point -confdir at a
directory containing the config files mentioned earlier, and -confname
is the name that the config has in zookeeper, which you would then use
on the collection.configName parameter for the Collections API call. 
Once the config is uploaded, here's an example call to that API for
creating a collection:

http://server:port/solr/admin/collections?action=CREATE&name=test&numShards=8&replicationFactor=1&collection.configName=testcfg&maxShardsPerNode=8
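To make the upconfig step concrete, here is a sketch under a few assumptions: ZooKeeper is reachable at localhost:9983, the config files live in a hypothetical /data/myconf directory, and the config name is testcfg as in the URL above. The function only prints the zkcli command line so it can be checked before running; -zkhost is, as far as I know, the long form of the -z option.

```shell
# Print the zkcli invocation that uploads a config directory to ZooKeeper.
# Arguments: ZooKeeper host:port, local confdir, config name in ZooKeeper.
upconfig_cmd() {
  printf 'cloud-scripts/zkcli.sh -zkhost %s -cmd upconfig -confdir %s -confname %s\n' \
    "$1" "$2" "$3"
}

# /data/myconf is a hypothetical directory holding solrconfig.xml, schema.xml, etc.
upconfig_cmd localhost:9983 /data/myconf testcfg
```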

If this is not enough detail, please let me know which part you need
help with.

Thanks,
Shawn



Re: 8 Shards of Cloud with 4.10.3.

2015-02-25 Thread Shawn Heisey
On 2/25/2015 8:35 AM, Benson Margulies wrote:
 Do I need a zkcli bootstrap or do I start with upconfig? What port does
 zkRun put zookeeper on?

I personally would not use bootstrap options.  They are only meant to be
used once, when converting from non-cloud, but many people who use them
do NOT use them only once -- they include them in their startup scripts
and use them on every startup.  The whole thing becomes extremely
confusing.  I would just use zkcli and the Collections API, so nothing
ever happens that you don't explicitly request.

I believe that the port for embedded zookeeper (zkRun) is the jetty
listen port plus 1000, so 9983 if jetty.port is 8983 or not set.
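The port rule as stated can be written as a one-line calculation. This is only a sketch of the "jetty port plus 1000" rule described above, with 8983 assumed as the default; the function name zk_run_port and the second port are illustrative.

```shell
# Embedded ZooKeeper port = Jetty port + 1000 (Jetty port assumed to default to 8983).
zk_run_port() {
  echo $(( ${1:-8983} + 1000 ))
}

zk_run_port        # jetty.port not set
zk_run_port 7574   # a hypothetical node started with -Djetty.port=7574
```

The result is what goes after the colon in the zkHost value, for example when pointing zkcli at the embedded ZooKeeper.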

Thanks,
Shawn



Re: 8 Shards of Cloud with 4.10.3.

2015-02-25 Thread Benson Margulies
Bingo!

Here's the recipe for the record:

 gcopts has the ton of gc options.

First, set up shop:

DIR=$PWD
cd ../solr-4.10.3/example
java -Xmx200g $gcopts -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks \
  -Djetty.port=8983 -Dsolr.solr.home=/data/solr+rni/cloud_solr_home \
  -Dsolr.install.dir=/data/solr-4.10.3 -Duser.timezone=UTC \
  -Djava.net.preferIPv4Stack=true -DzkRun -jar start.jar

and then:

curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=rni&numShards=8&replicationFactor=1&collection.configName=rni&maxShardsPerNode=8'






Re: 8 Shards of Cloud with 4.10.3.

2015-02-25 Thread Shawn Heisey
On 2/25/2015 9:03 AM, Benson Margulies wrote:
 It's the zkcli options on my mind. zkcli's usage shows me 'bootstrap',
 'upconfig', and uploading a solr.xml.

 When I use upconfig, it might work, but it sure is noise:

 benson@ip-10-111-1-103:/data/solr+rni$ 554331
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:9983] WARN
 org.apache.zookeeper.server.NIOServerCnxn  – caught end of stream
 exception
 EndOfStreamException: Unable to read additional data from client
 sessionid 0x14bc16c5e660003, likely client has closed socket
 at 
 org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
 at 
 org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
 at java.lang.Thread.run(Thread.java:745)

The upconfig command is VERY noisy.  A LOT of data is printed whether
it's successful or not, and exceptions on a successful upload would
actually not surprise me.  An issue to reduce the zkcli output to short
informational/error messages rather than the full zookeeper client
logging is something I'll do soon if someone else doesn't get to it.

I had never noticed the bootstrap option to zkcli before ... based on
the options shown, I think it's meant to convert an entire non-cloud
(and probably non-redundant) Solr installation (all cores currently
present in the solr home) to SolrCloud.  It's a conversion that would
work, but I think it would be very ugly.  There's also a bootstrap
option for Solr that does this.

Thanks,
Shawn



Re: 8 Shards of Cloud with 4.10.3.

2015-02-24 Thread Michael Della Bitta
I guess the place to start is the Reference Guide:

https://cwiki.apache.org/confluence/display/solr/SolrCloud

Generally speaking, when you start Solr with any sort of Zookeeper, you've
entered cloud mode, which essentially means that Solr is now capable of
organizing cores into groups that represent shards, and groups of shards
are coordinated into collections. Additionally, Zookeeper allows multiple
Solr installations to be coordinated together to serve these collections
with high availability.

If you're just trying to gain parallelism on a single machine by using multiple
cores, you don't specifically need cloud mode or collections. You can
create multiple cores, distribute your documents manually to each core, and
then do a distributed search ala
https://wiki.apache.org/solr/DistributedSearch. The downside here is that
you're on your own in terms of distributing the documents at write time,
but on the other hand, you don't have to maintain a Zookeeper ensemble or
devote brain cells to understanding collections/shards/etc.
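For illustration, a manual distributed query of that kind lists the cores in the shards parameter. The sketch below assumes cores named core0, core1, ... on a single host; the core names, host and port are placeholders, and the function only builds the URL.

```shell
# Build a non-cloud distributed search URL over cores core0..core(N-1).
# Core names, host and port are placeholders for illustration.
distrib_query_url() {
  host=$1; n=$2; shards=""; i=0
  while [ "$i" -lt "$n" ]; do
    shards="${shards:+$shards,}$host/solr/core$i"
    i=$((i + 1))
  done
  printf 'http://%s/solr/core0/select?q=*:*&shards=%s\n' "$host" "$shards"
}

distrib_query_url localhost:8983 2
```

The core that receives the request fans the query out to every entry in shards and merges the results; deciding which core each document is indexed into remains up to you.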


Michael Della Bitta

Senior Software Engineer

o: +1 646 532 3062

appinions inc.

“The Science of Influence Marketing”

18 East 41st Street

New York, NY 10017

t: @appinions https://twitter.com/Appinions | g+:
plus.google.com/appinions
https://plus.google.com/u/0/b/112002776285509593336/112002776285509593336/posts
w: appinions.com http://www.appinions.com/

 



Re: 8 Shards of Cloud with 4.10.3.

2015-02-24 Thread Shawn Heisey
On 2/24/2015 1:21 PM, Benson Margulies wrote:
 On Tue, Feb 24, 2015 at 1:30 PM, Michael Della Bitta
 michael.della.bi...@appinions.com wrote:
 Benson:

 Are you trying to run independent invocations of Solr for every node?
 Otherwise, you'd just want to create a 8 shard collection with
 maxShardsPerNode set to 8 (or more I guess).
 Michael Della Bitta,

 I don't want to run multiple invocations. I just want to exploit
 hardware cores with shards. Can you point me at doc for the process
 you are referencing here? I confess to some ongoing confusion between
 cores and collections.

SolrCloud is designed around the idea that each machine runs one copy of
Solr.  Running multiple instances of Solr on one machine is usually a
waste of resources, and can lead to problems with SolrCloud high
availability (redundancy).

Here's a simple way of thinking about the terminology in SolrCloud: 
Collections are made up of one or more shards.  Shards have one or more
replicas.  Each replica is a core.

An important detail:  For each shard, one of the replicas is elected
leader.  SolrCloud gets rid of the master and slave concepts.
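To tie the terminology to numbers: a collection with 8 shards and one replica per shard needs 8 cores, and if they all live on one node, maxShardsPerNode has to be at least 8. The sketch below counts the cores and lists names using the collection_shardN_replicaM pattern, which is an assumption about what 4.x typically generates, not something to rely on; rni is the collection name from this thread.

```shell
# Count the cores a collection needs: shards x replicas per shard.
cores_needed() {
  echo $(( $1 * $2 ))
}

# List core names, assuming the <collection>_shard<N>_replica<M> pattern.
list_cores() {
  coll=$1; s=1
  while [ "$s" -le "$2" ]; do
    r=1
    while [ "$r" -le "$3" ]; do
      echo "${coll}_shard${s}_replica${r}"
      r=$((r + 1))
    done
    s=$((s + 1))
  done
}

cores_needed 8 1
list_cores rni 8 1
```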

Thanks,
Shawn



Re: 8 Shards of Cloud with 4.10.3.

2015-02-24 Thread Benson Margulies
On Tue, Feb 24, 2015 at 4:27 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : Unfortunately, this is all 5.1 and instructs me to run the 'start from
 : scratch' process.

 a) check out the left nav of any ref guide page, which has a link to
 Older Versions of this Guide (PDF)

 b) i'm not entirely sure i understand what you're asking, but i'm guessing
 you mean...

 * you have a fully functional individual instance of Solr, with a single
 core
 * you only want to run that one single instance of the Solr process
 * you want that single solr process to be a SolrCloud of one node, but
 replace your single core with a collection that is divided into 8
 shards.
 * presumably: you don't care about replication since you are only trying
 to run one node.

 what you want to look into (in the 4.10 ref guide) is how to bootstrap a
 SolrCloud instance from a non-SolrCloud node -- ie: startup zk, tell solr
 to take the configs from your single core and upload them to zk as a
 configset, and register that single core as a collection.

 That should give you a single instance of solrcloud, with a single
 collection, consisting of one shard (your original core)

 Then you should be able to use the SPLITSHARD command to split your
 single shard into 2 shards, and then split them again, etc... (i don't
 think you can split directly to 8-sub shards with a single command)



 FWIW: unless you no longer have access to the original data, it would
 almost certainly be a lot easier to just start with a clean install of
 Solr in cloud mode, then create a collection with 8 shards, then re-index
 your data.

OK, now I'm good to go. Thanks.




 -Hoss
 http://www.lucidworks.com/


Re: 8 Shards of Cloud with 4.10.3.

2015-02-24 Thread Chris Hostetter

: Unfortunately, this is all 5.1 and instructs me to run the 'start from
: scratch' process.

a) check out the left nav of any ref guide page, which has a link to
Older Versions of this Guide (PDF)

b) i'm not entirely sure i understand what you're asking, but i'm guessing
you mean...

* you have a fully functional individual instance of Solr, with a single
core
* you only want to run that one single instance of the Solr process
* you want that single solr process to be a SolrCloud of one node, but
replace your single core with a collection that is divided into 8
shards.
* presumably: you don't care about replication since you are only trying
to run one node.

what you want to look into (in the 4.10 ref guide) is how to bootstrap a
SolrCloud instance from a non-SolrCloud node -- ie: startup zk, tell solr
to take the configs from your single core and upload them to zk as a
configset, and register that single core as a collection.

That should give you a single instance of solrcloud, with a single 
collection, consisting of one shard (your original core)

Then you should be able to use the SPLITSHARD command to split your 
single shard into 2 shards, and then split them again, etc... (i don't 
think you can split directly to 8-sub shards with a single command)
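To give a feel for what repeated splitting involves, here is a sketch that prints the SPLITSHARD calls for going from one shard to eight. It assumes each call splits one shard into two sub-shards named with _0 and _1 suffixes, and it uses the collection name rni and the localhost URL that appear elsewhere in this thread; the function name is made up.

```shell
# Print the SPLITSHARD URLs needed to go from one shard to 2^rounds shards,
# assuming each call splits one shard into sub-shards <shard>_0 and <shard>_1.
split_urls() {
  base=$1; coll=$2; rounds=$3
  current="shard1"; round=0
  while [ "$round" -lt "$rounds" ]; do
    next=""
    for s in $current; do
      echo "$base/admin/collections?action=SPLITSHARD&collection=$coll&shard=$s"
      next="$next ${s}_0 ${s}_1"
    done
    current=$next
    round=$((round + 1))
  done
}

split_urls http://localhost:8983/solr rni 3
```

Three rounds means 1 + 2 + 4 = 7 calls, and each round has to finish before its sub-shards can be split in the next one.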



FWIW: unless you no longer have access to the original data, it would 
almost certainly be a lot easier to just start with a clean install of 
Solr in cloud mode, then create a collection with 8 shards, then re-index 
your data.



-Hoss
http://www.lucidworks.com/


Re: 8 Shards of Cloud with 4.10.3.

2015-02-24 Thread Benson Margulies
On Tue, Feb 24, 2015 at 1:30 PM, Michael Della Bitta
michael.della.bi...@appinions.com wrote:
 Benson:

 Are you trying to run independent invocations of Solr for every node?
 Otherwise, you'd just want to create a 8 shard collection with
 maxShardsPerNode set to 8 (or more I guess).

Michael Della Bitta,

I don't want to run multiple invocations. I just want to exploit
hardware cores with shards. Can you point me at doc for the process
you are referencing here? I confess to some ongoing confusion between
cores and collections.

--benson






Re: 8 Shards of Cloud with 4.10.3.

2015-02-24 Thread Benson Margulies
On Tue, Feb 24, 2015 at 3:32 PM, Michael Della Bitta
michael.della.bi...@appinions.com wrote:
 https://cwiki.apache.org/confluence/display/solr/SolrCloud

Unfortunately, this is all 5.1 and instructs me to run the 'start from
scratch' process.

I wish that I could take my existing one-core no-cloud config and
convert it into a cloud, 8-shard config.


Re: 8 Shards of Cloud with 4.10.3.

2015-02-24 Thread Michael Della Bitta
Benson:

Are you trying to run independent invocations of Solr for every node?
Otherwise, you'd just want to create a 8 shard collection with
maxShardsPerNode set to 8 (or more I guess).

Michael Della Bitta




8 Shards of Cloud with 4.10.3.

2015-02-24 Thread Benson Margulies
With so much of the site shifted to 5.0, I'm having a bit of trouble
finding what I need, and so I'm hoping that someone can give me a push
in the right direction.

On a big multi-core machine, I want to set up a configuration with 8
(or perhaps more) nodes treated as shards. I have some very particular
solrconfig.xml and schema.xml that I need to use.

Could some kind person point me at a relatively step-by-step layout?
This is all on Linux, I'm happy to explicitly run Zookeeper.