custom hashing across cloud shards
Hey all, If you don't specify numShards at the start, you can do custom hashing, because Solr will just write the document to whatever shard you send it to. However, when I don't specify numShards, I'm having trouble creating more than one shard. It makes one shard, and the others I add are simply replicas. Here are the params I'm using to start: http://pastebin.com/818SguiA . Am I missing something? I only want custom hashing because before, when I had it do the automatic hashing, I was posting all documents to one shard, and it would then forward them to the right place. I want to add them to different shards to distribute the load, so that it's not just one shard handling all the forwarding. It seems silly to add documents across all the shards and then have each shard forward them to a potentially different shard based on the hash. Thanks!
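For context on what automatic routing does: when numShards is fixed, each document's uniqueKey is hashed into a shard's hash range, which is why any node can accept a document and forward it to the right shard. Here is a minimal illustrative sketch of that idea in Python. This is NOT Solr's actual algorithm (Solr's compositeId router uses MurmurHash3 over a hash ring); the function name and the md5 choice are purely for illustration.

```python
# Illustrative sketch of hash-based shard routing. NOT Solr's actual
# implementation (which uses MurmurHash3 and per-shard hash ranges);
# this only shows why the shard count must be fixed up front.
import hashlib

def pick_shard(doc_id: str, num_shards: int) -> int:
    """Deterministically map a document id to a shard index."""
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Same id always lands on the same shard, so any node can route it.
print(pick_shard("doc-42", 3))
```

With custom hashing (no numShards at startup) this step is skipped entirely: you pick the shard yourself by choosing which shard's URL you post to.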
Re: SolrCloud shard down
I am using Solr 4.3.1. I did a hard commit after indexing. I think you're right that the node was still recovering. I didn't think so since it didn't show up as yellow (recovering) on the visual display, but after quite a while it went from Down to Active. Thanks!

On Fri, Jul 26, 2013 at 7:59 PM, Anshum Gupta ans...@anshumgupta.net wrote: Can you also let me know what version of Solr you are on?

On Sat, Jul 27, 2013 at 8:26 AM, Anshum Gupta ans...@anshumgupta.net wrote: Hi Katie, 1. First things first, I would strongly advise against manually updating/removing zk or any other info when you're running in SolrCloud mode, unless you are sure of what you're doing. 2. Also, your node could be currently recovering from the transaction log (did you issue a hard commit after indexing?). The mailing list doesn't allow long texts inline, so it'd be good if you could use something like http://pastebin.com/ to share the log in detail. 3. If you had replicas, you wouldn't need to switch manually; it gets taken care of automatically.

On Sat, Jul 27, 2013 at 4:16 AM, Katie McCorkell katiemccork...@gmail.com wrote: [quoted original message and console log trimmed; see the "SolrCloud shard down" post below] -- Anshum Gupta http://www.anshumgupta.net
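On the hard-commit point above: an explicit hard commit can be sent to a core's update handler with a commit=true parameter. A small sketch that just constructs that request URL (the host, port, and core name here are hypothetical examples, not taken from the thread):

```python
# Build the URL for an explicit hard commit against a core's update
# handler. Host, port, and core name below are hypothetical examples.
from urllib.parse import urlencode

def commit_url(base: str, core: str) -> str:
    params = urlencode({"commit": "true"})
    return f"{base}/solr/{core}/update?{params}"

print(commit_url("http://localhost:8983", "collection1"))
# http://localhost:8983/solr/collection1/update?commit=true
```

A hard commit flushes and fsyncs index segments, which shortens the transaction-log replay a node has to do on restart.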
Re: solr with java service wrapper
I was using Linux. I used the Java Service Wrapper and found what I needed! It provides a way to wrap start.jar so that Solr can be started and stopped as a Linux daemon, which was helpful in my case for connecting Solr to a Chef recipe. I may write up an explanation of this soon.
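For anyone trying the same thing, a minimal sketch of what a wrapper.conf for the (Tanuki) Java Service Wrapper might look like when wrapping start.jar. All paths and memory settings here are assumptions for illustration; check the wrapper's own documentation for the exact property names your version supports.

```properties
# Minimal wrapper.conf sketch for running Solr's start.jar as a daemon.
# Paths and values below are illustrative assumptions, not a tested config.
wrapper.java.command=java
wrapper.java.mainclass=org.tanukisoftware.wrapper.WrapperJarApp
wrapper.app.parameter.1=/opt/solr-4.3.1/example/start.jar
wrapper.working.dir=/opt/solr-4.3.1/example
wrapper.java.maxmemory=1024
wrapper.logfile=/var/log/solr/wrapper.log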
SolrCloud shard down
Hello, I am using SolrCloud with a ZooKeeper ensemble, like in Example C from the wiki, except with a total of 3 shards and no replicas (oops). After indexing a whole bunch of documents, shard 2 went down and I'm not sure why. I tried restarting it with the jar command, and I tried deleting shard1's zoo_data folder and then restarting, but it is still down, and I'm not sure what to do. 1) Is there any way to avoid reindexing all the data? It's no good to proceed without shard 2, because I don't know which documents are there vs. on the other shards, and indexing and querying don't work when one shard is down. I can't tell exactly why restarting is failing; all I can see on the admin tool webpage is that the shard is yellow in the little cloud diagram. The console messages are copied below. 2) How can I tell the exact problem? 3) If I had had replicas, I could have just switched to shard 2's replica at this point, correct? Thanks! Katie

Console message from start.jar ---
2325 [coreLoadExecutor-4-thread-1] INFO org.apache.solr.cloud.ZkController – We are http://172.16.2.182:/solr/collection1/ and leader is http://172.16.2.182:/solr/collection1/
12329 [recoveryExecutor-6-thread-1] WARN org.apache.solr.update.UpdateLog – Starting log replay tlog{file=/opt/solr-4.3.1/example/solr/collection1/data/tlog/tlog.0005179 refcount=2} active=false starting pos=0
12534 [recoveryExecutor-6-thread-1] INFO org.apache.solr.core.SolrCore – SolrDeletionPolicy.onInit: commits:num=1 commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/opt/solr-4.3.1/example/solr/collection1/data/index lockFactory=org.apache.lucene.store.NativeFSLockFactory@5f99ea3c; maxCacheMB=48.0 maxMergeSizeMB=4.0),segFN=segments_404,generation=5188,filenames=[_1gqo.fdx, _1h1q.nvm, _1h8x.fdt, _1gmi_Lucene41_0.pos, _1gqo.fdt, _1h8s.nvd, _1gmi.si, _1h1q.nvd, _1h6l.fnm, _1h8q.nvm, _1h6l_Lucene41_0.tim, _1h6l_Lucene41_0.tip, _1h8o_Lucene41_0.tim, _1h8o_Lucene41_0.tip, _1aq9_67.del, _1gqo.nvm, _1aq9_Lucene41_0.pos, _1h8q.fdx, _1h1q.fdt, _1h8r.fdt, _1h8q.fdt, _1h8p_Lucene41_0.pos, _1h8s_Lucene41_0.pos, _1h8r.fdx, _1gqo.nvd, _1h8s.fdx, _1h8s.fdt, _1h8x_Lucene41_.
solr with java service wrapper
Hello, I was wondering if people have experience using Solr with Jetty and a Java service wrapper for automatic deployment? I thought a service wrapper might be included in the Solr download, but I didn't see one. How does one search the mailing list archive? Are there any previous topics about this you could point me to? (I don't have specific questions yet.) Thanks!!
Deleted Docs
Hello, I am curious about the "Deleted Docs" statistic on the solr/#/collection1 Overview page. Does Solr remove docs while indexing? I thought it only did that when optimizing, but my instance had 726 deleted docs, and then after adding some documents that number decreased, eventually to 18. I understand these deleted docs come from situations where two docs have the same uniqueKey, but my data had way more deleted docs than I expected. I was using a data-generated uniqueKey; when I changed to using the UUID generator there were 0 deleted docs. I just wanted to double check: are there any other cases which would create a deleted doc? Thanks so much!! :) Katie
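The behavior described above is consistent with how Lucene-backed updates work: re-adding a document with an existing uniqueKey does not overwrite it in place; the old version is only flagged as deleted, and the flagged docs are purged when segments get merged during normal indexing (not just on optimize), which is why the count can drop while adding documents. A toy simulation of that bookkeeping (illustrative Python, not Solr code; class and method names are made up):

```python
# Toy model of how updates by uniqueKey create "deleted" docs that are
# later purged by segment merges. Purely illustrative; not Solr code.
class ToyIndex:
    def __init__(self):
        self.live = {}     # uniqueKey -> latest doc version
        self.deleted = 0   # docs flagged deleted but not yet purged

    def add(self, key, doc):
        if key in self.live:
            self.deleted += 1   # old version is only marked deleted
        self.live[key] = doc

    def merge_segments(self):
        # Merging rewrites segments without the flagged docs.
        purged, self.deleted = self.deleted, 0
        return purged

idx = ToyIndex()
idx.add("a", 1)
idx.add("a", 2)              # same uniqueKey -> one deleted doc
print(idx.deleted)           # 1
print(idx.merge_segments())  # 1 doc purged
print(idx.deleted)           # 0
```

This also explains the UUID observation: randomly generated keys never collide, so no update ever flags an older version as deleted.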
are fields stored or unstored by default xml
In schema.xml I know you can mark a field as stored="false" or stored="true", but if you specify neither, which is it by default? Thank you, Katie
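As I understand it, when the attribute is omitted the effective value comes from the field's fieldType, and in stock Solr the underlying default for stored (and indexed) is true; declaring the attributes explicitly avoids any ambiguity. A small schema.xml fragment for illustration (field and type names here are just examples):

```xml
<!-- Field and type names below are examples. If stored/indexed are
     omitted, the defaults come from the fieldType; in stock Solr both
     default to true. Declaring them explicitly makes intent clear. -->
<field name="id"   type="string"       indexed="true" stored="true"/>
<field name="body" type="text_general" indexed="true" stored="false"/>
```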