Again, thank you all for the suggestions.

My ZK ensemble is talking to each other and the outside world:

solr@fe0ad5b40b42:/etc/default# echo srvr | nc zk1.zookeeper.internal 2181
Zookeeper version: 3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built on 05/03/2019 12:07 GMT
Latency min/avg/max: 0/0/0
Received: 53
Sent: 33
Connections: 1
Outstanding: 19
Zxid: 0x0
Mode: follower
Node count: 5

solr@fe0ad5b40b42:/etc/default# echo srvr | nc zk2.zookeeper.internal 2181
Zookeeper version: 3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built on 05/03/2019 12:07 GMT
Latency min/avg/max: 0/0/0
Received: 37
Sent: 17
Connections: 1
Outstanding: 19
Zxid: 0x200000000
Mode: leader
Node count: 5
Proposal sizes last/min/max: 32/32/36

solr@fe0ad5b40b42:/etc/default# echo srvr | nc zk3.zookeeper.internal 2181
Zookeeper version: 3.5.5-390fe37ea45dee01bf87dc1c042b5e3dcce88653, built on 05/03/2019 12:07 GMT
Latency min/avg/max: 0/0/0
Received: 7
Sent: 3
Connections: 1
Outstanding: 3
Zxid: 0x200000000
Mode: follower
Node count: 5

All of these commands can be run from the solr container as either the
root user or the solr user (see the prompt in each command). Note
that zk2 is the leader and zk1 and zk3 are followers. The configuration
(including the ZOO_MY_ID and ZOO_SERVERS environment variables) is
set up correctly and, for all intents and purposes, ZK appears to be
functioning.
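
For anyone who wants to repeat this check without eyeballing three outputs, here is a minimal Python sketch that parses `srvr` text and confirms there is exactly one leader. The function names (`parse_srvr`, `summarize_ensemble`) are made up for illustration, and the sample text is abbreviated from the output above. It only parses text that has already been captured; it does not open any connections.

```python
def parse_srvr(text):
    """Parse the "key: value" lines returned by ZooKeeper's `srvr` command."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(": ")
        if sep:  # skip lines that are not in "key: value" form
            info[key.strip()] = value.strip()
    return info


def summarize_ensemble(outputs):
    """Return ({host: mode}, True if exactly one host reports Mode: leader).

    `outputs` maps a host name to the raw `srvr` text captured from it.
    """
    modes = {host: parse_srvr(text).get("Mode", "unknown")
             for host, text in outputs.items()}
    leaders = [host for host, mode in modes.items() if mode == "leader"]
    return modes, len(leaders) == 1


# Abbreviated sample data taken from the outputs in this message.
SAMPLE = {
    "zk1.zookeeper.internal": "Zxid: 0x0\nMode: follower\nNode count: 5\n",
    "zk2.zookeeper.internal": "Zxid: 0x200000000\nMode: leader\nNode count: 5\n",
    "zk3.zookeeper.internal": "Zxid: 0x200000000\nMode: follower\nNode count: 5\n",
}

if __name__ == "__main__":
    modes, one_leader = summarize_ensemble(SAMPLE)
    for host, mode in sorted(modes.items()):
        print(host, mode)
    print("exactly one leader:", one_leader)
```

The raw text could come from the same `echo srvr | nc <host> 2181` calls shown above, saved to a file or captured by a script.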

Jorne Franke: I tried your suggestion of providing "/" as the root node by
appending "/" to the end of the ZK_HOST connection string (e.g. ENV ZK_HOST
zk1.zookeeper.internal:2181,zk2.zookeeper.internal:2181,zk3.zookeeper.internal:2181/
in the Dockerfile), and it still did not work. Was this what you meant? Or were
you suggesting that I set ZK_ROOT in the Solr configs/environment instead?
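
To make the question concrete, here is a small Python sketch of how a ZooKeeper-style connection string breaks down into a server list and an optional chroot suffix. The helper name `split_zk_host` is made up for illustration. Treating a bare trailing "/" the same as "no chroot" is my assumption about how the client handles it, not something I have confirmed for this version, and the sketch does not handle IPv6 literals.

```python
DEFAULT_ZK_PORT = 2181  # ZooKeeper's conventional client port


def split_zk_host(zk_host):
    """Split "h1:p1,h2:p2/chroot" into ([(host, port), ...], chroot or None)."""
    slash = zk_host.find("/")
    if slash == -1:
        servers_part, chroot = zk_host, None
    else:
        servers_part, chroot = zk_host[:slash], zk_host[slash:]
        if chroot == "/":
            # Assumption: a bare "/" is equivalent to having no chroot at all.
            chroot = None
    servers = []
    for entry in servers_part.split(","):
        host, sep, port = entry.strip().partition(":")
        servers.append((host, int(port) if sep else DEFAULT_ZK_PORT))
    return servers, chroot


if __name__ == "__main__":
    base = ("zk1.zookeeper.internal:2181,"
            "zk2.zookeeper.internal:2181,"
            "zk3.zookeeper.internal:2181")
    print(split_zk_host(base))
    print(split_zk_host(base + "/"))
    print(split_zk_host(base + "/solr"))  # "/solr" is a hypothetical chroot
```

If that assumption holds, the first two calls describe the same connection, which would explain why appending "/" changed nothing.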

--
Drew(i...@gmail.com)
http://wyntermute.dyndns.org/blog/

-- I Drive Way Too Fast To Worry About Cholesterol.


On Fri, Oct 18, 2019 at 12:11 PM Ahmed Adel <aa.0...@gmail.com> wrote:

> This could be because the ZooKeeper ensemble is not properly configured. Using
> a very similar setup, consisting of a ZK cluster of three hosts and one
> Solr Cloud node (all containers), I got the system running. Each ZK host
> has the ZOO_MY_ID and ZOO_SERVERS environment variables set before ZK starts.
> In this case, the former is a value from 1 to 3, different on each host,
> and the latter is "server.1=z1:2888:3888;2181
> server.2=z2:2888:3888;2181 server.3=z3:2888:3888;2181", the same on all
> hosts (the double quotes may be needed for proper parsing). This
> ZOO_SERVERS syntax is for ZK version 3.5; the 3.4 syntax is slightly different.
>
> http://aadel.io
>
> On Fri, Oct 18, 2019 at 5:28 PM Drew Kidder <dre...@gmail.com> wrote:
>
> > Thank you all for your suggestions! I appreciate the fast turnaround.
> >
> > My setup uses Amazon ECS for our solr cloud installation. Each ZK is in
> > its own container, using Route53 Service Discovery to provide the DNS name.
> > The ZK nodes can all talk to each other, and I can communicate with each one
> > of those nodes from my local machine and from within the solr container.
> > Solr is one node per container, as Martijn correctly assumed. I am not
> > using a zkRoot at present because my intention is to use ZK solely for Solr
> > Cloud and nothing else.
> >
> > I have tried removing the "-z" option from the Dockerfile CMD and using the
> > ZK_HOST environment variable (see below). I have also modified
> > solr.in.sh and set the ZK_HOST variable there, all to no avail. I have
> > tried the Dockerfile CMD route, and I have also logged into the solr
> > container and run the command manually to see if there was a problem
> > with the way I was using the CMD entry. All of those methods give me the
> > same output, captured in the gist below.
> >
> > The gist for my solr.log output is here:
> > https://gist.github.com/dkidder/2db9a6d393dedb97a39ed32e2be0c087
> >
> > My Dockerfile for the solr container looks like this:
> >
> >
> > FROM    solr:8.2
> >
> > EXPOSE    8983 8999 2181
> >
> > VOLUME    /app/logs
> > VOLUME    /app/data
> > VOLUME    /app/conf
> >
> > ## add our jetty configuration (increased request size!)
> > COPY   jetty.xml /opt/solr/server/etc/
> >
> > ## SolrCloud configuration
> > ENV     ZK_HOST zk1:2181,zk2:2181,zk3:2181
> > ENV     ZK_CLIENT_TIMEOUT 30000
> >
> > USER   root
> > RUN    apt-get update
> > RUN    apt-get install -y netcat net-tools vim procps
> > USER   solr
> >
> > # Copy over custom solr plugins
> > COPY    myplugins/src/resources/* /opt/solr/server/solr/my-resources/
> > COPY    lib/*.jar /opt/solr/my-lib/
> >
> > # Copy over my configs
> > COPY    conf/ /app/conf
> >
> > #Start solr in cloud mode, connecting to zookeeper
> > CMD       ["solr","start","-f","-c"]
> >
> > The docker command I use to execute this Dockerfile is `docker run -p
> > 8983:8983 -p 2181:2181 --name $(APP_NAME) $(APP_NAME):latest`
> >
> > Output of `ps -eflww` from within the solr container (as root):
> >
> > root@fe0ad5b40b42:/opt/solr-8.2.0# ps -eflww
> > F S UID        PID  PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
> > 4 S solr         1     0  9  80   0 - 1043842 -    14:36 ?        00:00:07
> > /usr/local/openjdk-11/bin/java -server -Xms512m -Xmx512m -XX:+UseG1GC
> > -XX:+PerfDisableSharedMem -XX:+ParallelRefProcEnabled
> > -XX:MaxGCPauseMillis=250 -XX:+UseLargePages -XX:+AlwaysPreTouch
> > -Xlog:gc*:file=/var/solr/logs/solr_gc.log:time,uptime:filecount=9,filesize=20M
> > -Dcom.sun.management.jmxremote
> > -Dcom.sun.management.jmxremote.local.only=false
> > -Dcom.sun.management.jmxremote.ssl=false
> > -Dcom.sun.management.jmxremote.authenticate=false
> > -Dcom.sun.management.jmxremote.port=18983
> > -Dcom.sun.management.jmxremote.rmi.port=18983 -DzkClientTimeout=30000
> > -DzkHost=zk1:2181,zk2:2181,zk3:2181 -Dsolr.log.dir=/var/solr/logs
> > -Djetty.port=8983 -DSTOP.PORT=7983 -DSTOP.KEY=solrrocks -Duser.timezone=UTC
> > -Djetty.home=/opt/solr/server -Dsolr.solr.home=/var/solr/data
> > -Dsolr.data.home= -Dsolr.install.dir=/opt/solr
> > -Dsolr.default.confdir=/opt/solr/server/solr/configsets/_default/conf
> > -Dlog4j.configurationFile=file:/var/solr/log4j2.xml -Xss256k
> > -Dsolr.jetty.https.port=8983 -jar start.jar --module=http
> > 4 S root        90     0  0  80   0 -  4988 -      14:37 pts/0    00:00:00 /bin/bash
> > 0 R root        95    90  0  80   0 -  9595 -      14:37 pts/0    00:00:00 ps -eflww
> >
> > Output of netstat from within the solr container (as root):
> >
> > root@fe0ad5b40b42:/opt/solr-8.2.0# netstat
> > Active Internet connections (w/o servers)
> > Proto Recv-Q Send-Q Local Address           Foreign Address         State
> > tcp        0      0 fe0ad5b40b42:43678      172.20.28.179:2181      TIME_WAIT
> > tcp        0      0 fe0ad5b40b42:60164      172.20.155.241:2181     TIME_WAIT
> > tcp        0      0 fe0ad5b40b42:60500      172.20.60.138:2181      TIME_WAIT
> > Active UNIX domain sockets (w/o servers)
> > Proto RefCnt Flags       Type       State         I-Node   Path
> > unix  2      [ ]         STREAM     CONNECTED     129252
> > unix  2      [ ]         STREAM     CONNECTED     129270
> >
> > I'm beginning to think that ZK is not set up correctly. I haven't uploaded
> > any configuration files to ZK yet; my understanding was that I could start
> > up a solr cloud node with no collections and upload the configuration from
> > there. I was under the impression that it would try to connect to ZK and, if
> > it couldn't get config files from there, it would use local config files. Do
> > I need to upload the solr cloud configuration files to ZK before starting
> > up the cluster?  The netstat output makes it look like the solr container
> > is indeed connected to the ZK containers, but I can see no indication of
> > why it cannot connect to Zookeeper.
> >
> > --
> > Drew(i...@gmail.com)
> > http://wyntermute.dyndns.org/blog/
> >
> > -- I Drive Way Too Fast To Worry About Cholesterol.
> >
> >
> > On Fri, Oct 18, 2019 at 3:11 AM Martijn Koster <mak-luc...@greenhills.co.uk> wrote:
> >
> > >
> > >
> > > > On 18 Oct 2019, at 00:25, Drew Kidder <dre...@gmail.com> wrote:
> > >
> > > > * I'm using the following command line to start a basic solr cloud instance
> > > > as per the documentation: `bin/solr start -c -z zk1:2181,zk2:2181,zk3:2181`
> > >
> > > I assume you’re just looking to run a single Solr node in a single
> > > container, right?
> > >
> > > Just set the ZK_HOST environment variable, and remove the command-line
> > > arguments.
> > > And you don’t need to specify the port number unless you deviate from the
> > > default.
> > > Have a look at this example:
> > > https://github.com/docker-solr/docker-solr-examples/blob/master/swarm/docker-compose.yml#L61
> > >
> > > The “start” command starts Solr in the background, which is typically not
> > > what you want when running Solr under docker.
> > >
> > >
> > > It is not clear why your command isn’t working as is. When you say you’re
> > > using that command line, how do you actually do that? In a full docker
> > > command line, a compose file, a “docker exec”, or some orchestrator?
> > > Share the exact thing you’re doing; perhaps there is a mistake there.
> > > Also, run `ps -eflww` in the container to see what command-line arguments
> > > the JVM actually got started with.
> > > And share the full startup log somewhere (in a GitHub gist perhaps); there
> > > might be something of interest earlier on.
> > >
> > > >> (running `echo ruok | nc zk1 2181` returns the expected "imok" response
> > > >> from ZK within the docker container where Solr is located)
> > > >> * The netcat command mentioned above shows up in the ZK logs, but the Solr
> > > >> attempts to connect do not (it's like the request isn't even getting to ZK)
> > >
> > > Then it doesn’t sound like an environmental firewall/security-group/routing
> > > issue.
> > > The next step to debug could be to check whether you actually see Solr make
> > > TCP connections to port 2181, in the Solr container, using
> > > tcpdump/sysdig/netstat or some such.
> > > If that gives a negative result, then you know it’s an issue in your Solr
> > > invocation config, or name resolution.
> > > If that gives a positive result, then it’s environmental after all, and
> > > you can dig further.
> > >
> > >
> > > But try the ZK_HOST thing first; it may just fix it.
> > >
> > > — Martijn
> >
>
