RE: SolrCloud :: Distributed query processing

2013-01-18 Thread Mishkin, Ernest
Thanks, Yonik. That issue is exactly what I observed. Glad it's 
fixed in 4.1.


-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Friday, January 18, 2013 3:10 PM
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud :: Distributed query processing

Hopefully the explanation here will shed some light on this:
https://issues.apache.org/jira/browse/SOLR-3912

-Yonik
http://lucidworks.com


The information contained in this message is intended only for the recipient, 
and may be a confidential attorney-client communication or may otherwise be 
privileged and confidential and protected from disclosure. If the reader of 
this message is not the intended recipient, or an employee or agent responsible 
for delivering this message to the intended recipient, please be aware that any 
dissemination or copying of this communication is strictly prohibited. If you 
have received this communication in error, please immediately notify us by 
replying to the message and deleting it from your computer. The McGraw-Hill 
Companies, Inc. reserves the right, subject to applicable local law, to 
monitor, review and process the content of any electronic message or 
information sent to or from McGraw-Hill e-mail addresses without informing the 
sender or recipient of the message. By sending electronic message or 
information to McGraw-Hill e-mail addresses you, as the sender, are consenting 
to McGraw-Hill  processing any of your personal data therein.


SolrCloud :: Distributed query processing

2013-01-18 Thread Mishkin, Ernest
Hello,

I'm trying to reconcile my understanding of how SolrCloud handles distributed 
queries with what I see in the server logs (Tomcat running Solr).

The setup: Solr 4.0 GA, single collection, one shard, two nodes (master and 
replica), standalone zookeeper ensemble.

Client uses SolrJ CloudSolrServer to issue queries.

Looking at the log of one Solr instance (the pattern is the same on both master 
and replica) while repeatedly running a query, I sometimes see just one line:
INFO  [org.apache.solr.core.SolrCore   ] webapp=/solr path=/select params={start=0&q=&wt=javabin&rows=10&version=2} hits=2 status=0 QTime=28

And sometimes I see the following 3 lines (for a single client request):
INFO  [org.apache.solr.core.SolrCore   ] webapp=/solr path=/select params={fl=user_id,score&shard.url=|/&NOW=1358538200298&start=0&q=&distrib=false&isShard=true&wt=javabin&fsv=true&rows=10&version=2} hits=2 status=0 QTime=14
INFO  [org.apache.solr.core.SolrCore   ] webapp=/solr path=/select params={shard.url=|/&NOW=1358538200298&start=0&q=&ids=229,118671&distrib=false&isShard=true&wt=javabin&rows=10&version=2} status=0 QTime=9
INFO  [org.apache.solr.core.SolrCore   ] webapp=/solr path=/select params={start=0&q=&wt=javabin&rows=10&version=2} hits=2 status=0 QTime=107

I thought that the client simply picks a node (master or replica in this case) 
and that node will fully service the request given that it's a single shard 
setup. But apparently I'm missing something - please help me understand what.
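
For anyone comparing their own logs, the requests above can be told apart by 
their parameters. The following is only a rough sketch, not anything that ships 
with Solr: classify_request and its labels are hypothetical names, and it 
assumes the SolrCore log format shown above. It treats isShard=true as marking 
an internal per-shard request, fsv=true as the step that collects document ids 
and sort values, and ids=... as the step that fetches the selected documents.

```python
import re
from urllib.parse import parse_qs

# Matches the params={...} section of a SolrCore request log line.
PARAMS_RE = re.compile(r"params=\{(.*?)\}")


def classify_request(log_line: str) -> str:
    """Label a request log line by the role its parameters suggest.

    The labels are illustrative only; they are not Solr terminology.
    """
    match = PARAMS_RE.search(log_line)
    if match is None:
        return "unknown"

    # keep_blank_values=True so an empty value such as "q=" is still parsed.
    params = parse_qs(match.group(1), keep_blank_values=True)

    # Requests without isShard=true came from outside (e.g. the SolrJ client).
    if params.get("isShard") != ["true"]:
        return "top-level"
    # An ids=... list means documents are being fetched by id.
    if "ids" in params:
        return "shard: fetch documents by id"
    # fsv=true asks the shard for field sort values along with ids and scores.
    if params.get("fsv") == ["true"]:
        return "shard: collect ids and sort values"
    return "shard: other"


if __name__ == "__main__":
    line = ("INFO  [org.apache.solr.core.SolrCore   ] webapp=/solr path=/select "
            "params={start=0&q=&wt=javabin&rows=10&version=2} hits=2 status=0 "
            "QTime=28")
    print(classify_request(line))
```

Run against the three-line case, this labels the first line as the id and sort 
value step, the second as the fetch-by-id step, and the third as the top-level 
request that wraps them both.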

Thanks,
Ernest






SolrCloud :: Adding replica :: Sync-up issue

2013-01-14 Thread Mishkin, Ernest
Hello,

I observed a rather weird issue with SolrCloud.

Using Solr 4.0 GA code.

Started with a 3-node ZooKeeper ensemble (standalone) and a single Solr 
instance running a single collection. numShards was set to 1 during collection 
creation (I don't want sharding, just replication).
Everything worked fine.

Started another Solr instance for the same collection. It properly went through 
the steps, realizing it needed to sync up (actual URL values replaced with 
):

12:50:59.152 INFO  [o.apache.solr.cloud.RecoveryStrategy] Starting recovery 
process.  core=users recoveringAfterStartup=true [RecoveryThread]
12:50:59.152 INFO  [o.a.solr.servlet.SolrDispatchFilter ] user.dir=/home/seg 
[localhost-startStop-1]
12:50:59.153 INFO  [o.a.solr.servlet.SolrDispatchFilter ] 
SolrDispatchFilter.init() done [localhost-startStop-1]
12:50:59.189 INFO  [o.apache.solr.cloud.RecoveryStrategy] ## 
startupVersions=[] [RecoveryThread]
12:50:59.198 INFO  [o.apache.solr.cloud.RecoveryStrategy] Attempting to 
PeerSync from  core=users - recoveringAfterStartup=true [RecoveryThread]
12:50:59.201 INFO  [o.a.s.c.solrj.impl.HttpClientUtil   ] Creating new http 
client, 
config:maxConnectionsPerHost=20&maxConnections=1&connTimeout=3&socketTimeout=3&retry=false
 [RecoveryThread]
12:50:59.377 INFO  [org.apache.solr.update.PeerSync ] PeerSync: core=users 
url= START replicas=[] nUpdates=100 [RecoveryThread]
12:50:59.377 DEBUG [org.apache.solr.update.PeerSync ] PeerSync: core=users 
url=solr startingVersions=0 [] [RecoveryThread]
12:50:59.390 WARN  [org.apache.solr.update.PeerSync ] no frame of reference 
to tell of we've missed updates [RecoveryThread]
12:50:59.390 INFO  [o.apache.solr.cloud.RecoveryStrategy] PeerSync Recovery was 
not successful - trying replication. core=users [RecoveryThread]
12:50:59.390 INFO  [o.apache.solr.cloud.RecoveryStrategy] Starting Replication 
Recovery. core=users [RecoveryThread]
12:50:59.422 INFO  [o.a.solr.common.cloud.ZkStateReader ] A cluster state 
change has occurred - updating... [localhost-startStop-1-EventThread]
12:50:59.575 INFO  [o.a.s.c.solrj.impl.HttpClientUtil   ] Creating new http 
client, 
config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false 
[RecoveryThread]
12:51:02.742 INFO  [o.apache.solr.cloud.RecoveryStrategy] Begin buffering 
updates. core=users [RecoveryThread]
12:51:02.742 INFO  [org.apache.solr.update.UpdateLog] Starting to buffer 
updates. FSUpdateLog{state=ACTIVE, tlog=null} [RecoveryThread]
12:51:02.743 INFO  [o.apache.solr.cloud.RecoveryStrategy] Attempting to 
replicate from . core=users [RecoveryThread]
12:51:02.743 INFO  [o.a.s.c.solrj.impl.HttpClientUtil   ] Creating new http 
client, 
config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false 
[RecoveryThread]
12:51:02.762 INFO  [o.a.s.c.solrj.impl.HttpClientUtil   ] Creating new http 
client, 
config:connTimeout=5000&socketTimeout=2&allowCompression=false&maxConnections=1&maxConnectionsPerHost=1
 [RecoveryThread]
12:51:02.774 INFO  [org.apache.solr.handler.SnapPuller  ]  No value set for 
'pollInterval'. Timer Task not started. [RecoveryThread]
12:51:02.781 INFO  [org.apache.solr.core.SolrCore   ] 
SolrDeletionPolicy.onInit: commits:num=1

commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/solr/users/data/index
 lockFactory=org.apache.lucene.store.NativeFSLockFactory@6e28575; 
maxCacheMB=48.0 
maxMergeSizeMB=4.0),segFN=segments_1,generation=1,filenames=[segments_1] 
[RecoveryThread]
12:51:02.782 INFO  [org.apache.solr.core.SolrCore   ] newest commit = 1 
[RecoveryThread]
12:51:02.782 DEBUG [o.apache.solr.update.SolrIndexWriter] Opened Writer 
DirectUpdateHandler2 [RecoveryThread]
12:51:02.784 INFO  [org.apache.solr.update.UpdateHandler] start 
commit{flags=0,_version_=0,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}
 [RecoveryThread]
12:51:02.785 DEBUG [org.apache.solr.update.UpdateLog] TLOG: preCommit 
[RecoveryThread]
12:51:02.823 INFO  [org.apache.solr.core.SolrCore   ] 
SolrDeletionPolicy.onCommit: commits:num=2

commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/solr/users/data/index
 lockFactory=org.apache.lucene.store.NativeFSLockFactory@6e28575; 
maxCacheMB=48.0 
maxMergeSizeMB=4.0),segFN=segments_1,generation=1,filenames=[segments_1]

commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/solr/users/data/index
 lockFactory=org.apache.lucene.store.NativeFSLockFactory@6e28575; 
maxCacheMB=48.0 
maxMergeSizeMB=4.0),segFN=segments_2,generation=2,filenames=[segments_2] 
[RecoveryThread]
12:51:02.824 INFO  [org.apache.solr.core.SolrCore   ] newest commit = 2 
[RecoveryThread]
12:51:02.828 INFO  [o.a.solr.search.SolrIndexSearcher   ] Opening 
Searcher@5947fe65 main [RecoveryThread]

12:51:02.837 DEBUG [org.apache.solr.update.UpdateLog] TLOG: postCommit 
[RecoveryThread]
12:51:02.837 INFO  [org.apach