Hi,

I have setup 4 Solr (4.9.0) Nodes into a single shard for a given
collection, meaning I should have 4 replicated nodes. I have 3 Zookeepers
in ensemble managing the configs for this collection. I have a load
balancer in front of the 4 nodes to split traffic between them.

I start this collection with an empty data/index directory.

When I send /update requests to the load balancers I see these going to all
4 nodes. Also, I can see that all FOLLOWERs distribute the requests they
receive to the LEADER as is expected. But for some reason the FOLLOWERS are
not getting /replication requests from the LEADER.  So the collection for
the leader contains many thousand of documents and is on the 8th
generation. I see that it's replicable in the admin interface, yet all
FOLLOWER nodes have an empty index.

Hence, I need your insights please.

Thanks,

Marc

To Note:

When I startup my nodes I see the following error in solr.log:
1) When Zookeeper does a clusterstate update, all nodes have their starte
"DOWN", why? This I means that in the Solr Admin interface they show up has
down. This never updates to active.

2) I have a warning :  org.apache.solr.rest.ManagedResource; No registered
observers for /rest/managed, which I need to update solrconfig.xml to fix

3) I have the following error:
ERROR - 2014-07-16 19:49:25.336; org.apache.solr.cloud.SyncStrategy; No
UpdateLog found - cannot sync

SOLR.LOG
-------------------------
[....]
INFO  - 2014-07-16 19:47:30.870;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
numShards=null message={
  "operation":"state",
  "state":"down",
  "base_url":"http://192.168.150.90:8983/solr";,
  "core":"collection_name",
  "roles":null,
  "node_name":"192.168.150.90:8983_solr",
  "shard":null,
  "collection":"collection_name",
  "numShards":null,
  "core_node_name":null}
INFO  - 2014-07-16 19:47:30.871;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node1 is
already registered
[....]
WARN  - 2014-07-16 19:47:34.535; org.apache.solr.rest.ManagedResource; No
registered observers for /rest/managed
[....]
INFO  - 2014-07-16 19:48:25.135;
org.apache.solr.common.cloud.ZkStateReader$3; Updating live nodes... (2)
INFO  - 2014-07-16 19:48:25.287;
org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
INFO  - 2014-07-16 19:48:25.291;
org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
ZooKeeper...
INFO  - 2014-07-16 19:48:25.293;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
numShards=null message={
  "operation":"state",
  "state":"down",
  "base_url":"http://192.168.200.90:8983/solr";,
  "core":"collection_name",
  "roles":null,
  "node_name":"192.168.200.90:8983_solr",
  "shard":null,
  "collection":"collection_name",
  "numShards":null,
  "core_node_name":null}
INFO  - 2014-07-16 19:48:25.293;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node2 is
already registered
INFO  - 2014-07-16 19:48:25.293;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; shard=shard1 is already
registered
[....]
INFO  - 2014-07-16 19:49:00.188;
org.apache.solr.common.cloud.ZkStateReader$3; Updating live nodes... (3)
INFO  - 2014-07-16 19:49:00.322;
org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
INFO  - 2014-07-16 19:49:00.335;
org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
ZooKeeper...
INFO  - 2014-07-16 19:49:00.337;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
numShards=null message={
  "operation":"state",
  "state":"down",
  "base_url":"http://192.168.200.91:8983/solr";,
  "core":"collection_name",
  "roles":null,
  "node_name":"192.168.200.91:8983_solr",
  "shard":null,
  "collection":"collection_name",
  "numShards":null,
  "core_node_name":null}
INFO  - 2014-07-16 19:49:00.337;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node3 is
already registered
INFO  - 2014-07-16 19:49:00.337;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; shard=shard1 is already
registered
[....]
INFO  - 2014-07-16 19:49:21.220;
org.apache.solr.common.cloud.ZkStateReader$3; Updating live nodes... (4)
INFO  - 2014-07-16 19:49:21.350;
org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
INFO  - 2014-07-16 19:49:21.357;
org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
ZooKeeper...
INFO  - 2014-07-16 19:49:21.359;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; Update state
numShards=null message={
  "operation":"state",
  "state":"down",
  "base_url":"http://192.168.150.91:8983/solr";,
  "core":"collection_name",
  "roles":null,
  "node_name":"192.168.150.91:8983_solr",
  "shard":null,
  "collection":"collection_name",
  "numShards":null,
  "core_node_name":null}
INFO  - 2014-07-16 19:49:21.359;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; node=core_node4 is
already registered
INFO  - 2014-07-16 19:49:21.359;
org.apache.solr.cloud.Overseer$ClusterStateUpdater; shard=shard1 is already
registered
[....]
INFO  - 2014-07-16 19:49:25.336;
org.apache.solr.cloud.ShardLeaderElectionContext; Enough replicas found to
continue.
INFO  - 2014-07-16 19:49:25.336;
org.apache.solr.cloud.ShardLeaderElectionContext; I may be the new leader -
try and sync
INFO  - 2014-07-16 19:49:25.336; org.apache.solr.cloud.SyncStrategy; Sync
replicas to http://192.168.150.90:8983/solr/pde_frontend/
ERROR - 2014-07-16 19:49:25.336; org.apache.solr.cloud.SyncStrategy; No
UpdateLog found - cannot sync
INFO  - 2014-07-16 19:49:25.337;
org.apache.solr.cloud.ShardLeaderElectionContext; We failed sync, but we
have no versions - we can't sync in that case - we were active before, so
become leader anyway
INFO  - 2014-07-16 19:49:25.337;
org.apache.solr.cloud.ShardLeaderElectionContext; I am the new leader:
http://192.168.150.90:8983/solr/pde_frontend/ shard1
INFO  - 2014-07-16 19:49:25.338; org.apache.solr.common.cloud.SolrZkClient;
makePath: /collections/pde_frontend/leaders/shard1
INFO  - 2014-07-16 19:49:25.360;
org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
INFO  - 2014-07-16 19:49:25.364;
org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
ZooKeeper...
INFO  - 2014-07-16 19:49:25.376;
org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
fired on path: /overseer/queue state: SyncConnected type NodeChildrenChanged
INFO  - 2014-07-16 19:49:25.381;
org.apache.solr.common.cloud.ZkStateReader$2; A cluster state change:
WatchedEvent state:SyncConnected type:NodeDataChanged
path:/clusterstate.json, has occurred - updating... (live nodes size: 4)
INFO  - 2014-07-16 19:49:25.413; org.apache.solr.cloud.ZkController; We are
http://192.168.150.90:8983/solr/pde_frontend/ and leader is
http://192.168.150.90:8983/solr/pde_frontend/
INFO  - 2014-07-16 19:49:25.414;
org.apache.solr.common.cloud.ZkStateReader; Updating cloud state from
ZooKeeper...
INFO  - 2014-07-16 19:49:28.848; org.apache.solr.core.SolrCore;
[pde_frontend] webapp=/solr path=/admin/ping params={} hits=0 status=0
QTime=1
[....]
INFO  - 2014-07-16 19:53:18.177;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.200.91:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3339682]} 0 1
INFO  - 2014-07-16 19:53:18.213;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update/ params={indent=on&version=2.2&wt=json}
{add=[Listing_3335358]} 0 4
INFO  - 2014-07-16 19:53:18.242;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.200.90:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3338996]} 0 1
INFO  - 2014-07-16 19:53:18.258; org.apache.solr.core.SolrDeletionPolicy;
SolrDeletionPolicy.onCommit: commits: num=2
commit{dir=NRTCachingDirectory(MMapDirectory@/opt/solr_collections/collection_name/data/index
lockFactory=NativeFSLockFactory@/opt/solr_collections/collection_name/data/index;
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_1,generation=1}
commit{dir=NRTCachingDirectory(MMapDirectory@/opt/solr_collections/collection_name/data/index
lockFactory=NativeFSLockFactory@/opt/solr_collections/collection_name/data/index;
maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_2,generation=2}
INFO  - 2014-07-16 19:53:18.258; org.apache.solr.core.SolrDeletionPolicy;
newest commit generation = 2
INFO  - 2014-07-16 19:53:18.296;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.150.91:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3332883]} 0 6
INFO  - 2014-07-16 19:53:18.372;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.200.91:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3328261]} 0 1
INFO  - 2014-07-16 19:53:18.389; org.apache.solr.search.SolrIndexSearcher;
Opening Searcher@7daec974[collection_name] realtime
INFO  - 2014-07-16 19:53:18.390;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update/ params={indent=on&version=2.2&wt=json}
{add=[Listing_3328952]} 0 3
INFO  - 2014-07-16 19:53:18.392;
org.apache.solr.update.DirectUpdateHandler2; end_commit_flush
INFO  - 2014-07-16 19:53:18.407;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.200.90:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3121200]} 0 0
INFO  - 2014-07-16 19:53:18.437;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.150.91:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3502558]} 0 1
INFO  - 2014-07-16 19:53:18.459;
org.apache.solr.update.processor.LogUpdateProcessor; [collection_name]
webapp=/solr path=/update params={update.distrib=TOLEADER&distrib.from=
http://192.168.200.91:8983/solr/collection_name/&wt=javabin&version=2}
{add=[Listing_3502564]} 0 1

Reply via email to