custom hashing across cloud shards

2013-08-19 Thread Katie McCorkell
Hey All,

If you don't specify numShards at the start, then you can do custom
hashing, because Solr will just write the document to whatever shard you
send it to.

However, when I don't specify numShards, I'm having trouble creating
more than one shard. It makes one shard, and the others I add are simply
replicas. Here are the params I'm using to start:
http://pastebin.com/818SguiA . Am I missing something?

I only want custom hashing because before, when I had it do the
automatic hashing, I was posting all documents to one shard, which would
then forward them to the right place. I want to add them to different
shards to distribute the load, so that it's not just one node handling
the forwarding. It seems silly to send them across all the shards and
then have each shard forward them to a potentially different shard based
on the hash.
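The routing being described can be sketched as follows. Solr 4.x's default (compositeId) router actually hashes the uniqueKey with MurmurHash3 over a 32-bit hash ring and assigns each shard a contiguous range; the sketch below uses `crc32` purely as an illustrative stand-in for that hash, so the bucket boundaries match the idea but not Solr's real assignments.

```python
import zlib

def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a uniqueKey onto one of num_shards contiguous hash ranges.

    Illustrative only: Solr's compositeId router uses MurmurHash3, not
    crc32, but the range-splitting idea is the same.
    """
    h = zlib.crc32(doc_id.encode("utf-8")) & 0xFFFFFFFF
    # Split the 32-bit hash space into num_shards equal ranges,
    # the way a hash ring assigns each shard a slice.
    range_size = (0xFFFFFFFF // num_shards) + 1
    return min(h // range_size, num_shards - 1)
```

Sending each document directly to the shard that owns its hash range is what avoids the extra hop where the receiving shard forwards it onward.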

Thanks!


Re: SolrCloud shard down

2013-07-29 Thread Katie McCorkell
I am using Solr 4.3.1. I did a hard commit after indexing.

I think you're right that the node was still recovering. I didn't think
so, since it didn't show up as yellow (recovering) on the visual
display, but after quite a while it went from Down to Active. Thanks!
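The hard commit mentioned above is just a request to the core's `/update` handler with `commit=true`; it flushes in-memory changes so a restarted node has less transaction log to replay. A minimal sketch, assuming a core at `http://localhost:8983/solr/collection1` (the host, port, and core name are placeholders for your own setup):

```python
from urllib.parse import urlencode
from urllib.request import Request

def hard_commit_request(base_url: str) -> Request:
    """Build (but do not send) the update request that forces a hard
    commit on a Solr core. Pass the result to urllib.request.urlopen
    against a running Solr to actually issue it."""
    params = urlencode({"commit": "true", "openSearcher": "true", "wt": "json"})
    return Request(base_url + "/update?" + params, method="GET")

req = hard_commit_request("http://localhost:8983/solr/collection1")
print(req.full_url)
```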


On Fri, Jul 26, 2013 at 7:59 PM, Anshum Gupta ans...@anshumgupta.net wrote:

 Can you also let me know what version of Solr you're on?


 On Sat, Jul 27, 2013 at 8:26 AM, Anshum Gupta ans...@anshumgupta.net
 wrote:

  Hi Katie,
 
  1. First things first, I would strongly advise against manually
  updating/removing zk or any other info when you're running things in
  SolrCloud mode, unless you are sure of what you're doing.

  2. Also, your node could currently be recovering from the transaction
  log (did you issue a hard commit after indexing?).
  The mailing list doesn't allow long texts inline, so it'd be good if
  you could use something like http://pastebin.com/ to share the log in
  detail.

  3. If you had replicas, you wouldn't need to switch manually. It gets
  taken care of automatically.
 
 
  On Sat, Jul 27, 2013 at 4:16 AM, Katie McCorkell
  katiemccork...@gmail.com wrote:

   Hello,

   I am using SolrCloud with a zookeeper ensemble, like example C from
   the wiki, except with a total of 3 shards and no replicas (oops).
   After indexing a whole bunch of documents, shard 2 went down and I'm
   not sure why. I tried restarting it with the jar command, and I tried
   deleting shard1's zoo_data folder and then restarting, but it is
   still down, and I'm not sure what to do.

   1) Is there any way to avoid reindexing all the data? It's no good to
   proceed without shard 2 because I don't know which documents are
   there vs. the other shards, and indexing and querying don't work when
   one shard is down.

   I can't exactly tell why restarting is failing; all I can see is that
   on the admin tool webpage the shard is yellow in the little cloud
   diagram. The console prints the messages I will copy and paste below.
   2) How can I tell the exact problem?

   3) If I had had replicas, I could have just switched to shard 2's
   replica at this point, correct?

   Thanks!
   Katie

   Console message from start.jar
   ---
   2325 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.cloud.ZkController
    – We are http://172.16.2.182:/solr/collection1/ and leader is
   http://172.16.2.182:/solr/collection1/
   12329 [recoveryExecutor-6-thread-1] WARN  org.apache.solr.update.UpdateLog
    – Starting log replay
   tlog{file=/opt/solr-4.3.1/example/solr/collection1/data/tlog/tlog.0005179
   refcount=2} active=false starting pos=0
   12534 [recoveryExecutor-6-thread-1] INFO  org.apache.solr.core.SolrCore  –
   SolrDeletionPolicy.onInit: commits:num=1

   commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/opt/solr-4.3.1/example/solr/collection1/data/index
   lockFactory=org.apache.lucene.store.NativeFSLockFactory@5f99ea3c;
   maxCacheMB=48.0
   maxMergeSizeMB=4.0),segFN=segments_404,generation=5188,filenames=[_1gqo.fdx,
   _1h1q.nvm, _1h8x.fdt, _1gmi_Lucene41_0.pos, _1gqo.fdt, _1h8s.nvd, _1gmi.si,
   _1h1q.nvd, _1h6l.fnm, _1h8q.nvm, _1h6l_Lucene41_0.tim,
   _1h6l_Lucene41_0.tip, _1h8o_Lucene41_0.tim, _1h8o_Lucene41_0.tip,
   _1aq9_67.del, _1gqo.nvm, _1aq9_Lucene41_0.pos, _1h8q.fdx, _1h1q.fdt,
   _1h8r.fdt, _1h8q.fdt, _1h8p_Lucene41_0.pos, _1h8s_Lucene41_0.pos,
   _1h8r.fdx, _1gqo.nvd, _1h8s.fdx, _1h8s.fdt, _1h8x_Lucene41_.
  --
 
  Anshum Gupta
  http://www.anshumgupta.net
 



 --

 Anshum Gupta
 http://www.anshumgupta.net



Re: solr with java service wrapper

2013-07-26 Thread Katie McCorkell
I was using Linux. I used the Java Service Wrapper and found what I
needed! It provides a way to wrap start.jar so that it can be started
and stopped as a Linux daemon, which was helpful for my case of
connecting Solr to a chef recipe. I may write an explanation of this
soon.


SolrCloud shard down

2013-07-26 Thread Katie McCorkell
Hello,

I am using SolrCloud with a zookeeper ensemble, like example C from the
wiki, except with a total of 3 shards and no replicas (oops). After
indexing a whole bunch of documents, shard 2 went down and I'm not sure
why. I tried restarting it with the jar command, and I tried deleting
shard1's zoo_data folder and then restarting, but it is still down, and
I'm not sure what to do.

1) Is there any way to avoid reindexing all the data? It's no good to
proceed without shard 2 because I don't know which documents are there
vs. the other shards, and indexing and querying don't work when one
shard is down.

I can't exactly tell why restarting is failing; all I can see is that on
the admin tool webpage the shard is yellow in the little cloud diagram.
The console prints the messages I will copy and paste below. 2) How can
I tell the exact problem?

3) If I had had replicas, I could have just switched to shard 2's replica
at this point, correct?

Thanks!
Katie

Console message from start.jar
---
2325 [coreLoadExecutor-4-thread-1] INFO  org.apache.solr.cloud.ZkController
 – We are http://172.16.2.182:/solr/collection1/ and leader is
http://172.16.2.182:/solr/collection1/
12329 [recoveryExecutor-6-thread-1] WARN  org.apache.solr.update.UpdateLog
 – Starting log replay
tlog{file=/opt/solr-4.3.1/example/solr/collection1/data/tlog/tlog.0005179
refcount=2} active=false starting pos=0
12534 [recoveryExecutor-6-thread-1] INFO  org.apache.solr.core.SolrCore  –
SolrDeletionPolicy.onInit: commits:num=1

commit{dir=NRTCachingDirectory(org.apache.lucene.store.MMapDirectory@/opt/solr-4.3.1/example/solr/collection1/data/index
lockFactory=org.apache.lucene.store.NativeFSLockFactory@5f99ea3c;
maxCacheMB=48.0
maxMergeSizeMB=4.0),segFN=segments_404,generation=5188,filenames=[_1gqo.fdx,
_1h1q.nvm, _1h8x.fdt, _1gmi_Lucene41_0.pos, _1gqo.fdt, _1h8s.nvd, _1gmi.si,
_1h1q.nvd, _1h6l.fnm, _1h8q.nvm, _1h6l_Lucene41_0.tim,
_1h6l_Lucene41_0.tip, _1h8o_Lucene41_0.tim, _1h8o_Lucene41_0.tip,
_1aq9_67.del, _1gqo.nvm, _1aq9_Lucene41_0.pos, _1h8q.fdx, _1h1q.fdt,
_1h8r.fdt, _1h8q.fdt, _1h8p_Lucene41_0.pos, _1h8s_Lucene41_0.pos,
_1h8r.fdx, _1gqo.nvd, _1h8s.fdx, _1h8s.fdt, _1h8x_Lucene41_.


solr with java service wrapper

2013-07-17 Thread Katie McCorkell
Hello,

I was wondering if people had experience using Solr with Jetty and a
Java service wrapper for automatic deployment? I thought a service
wrapper might be included in the Solr download, but I didn't see one.

How does one search the mailing list archive? Are there any previous
topics about this you could lead me to? (I don't have specific questions
yet)

Thanks!!


Deleted Docs

2013-07-09 Thread Katie McCorkell
Hello,

I am curious about the Deleted Docs statistic on the solr/#/collection1
Overview page. Does Solr remove docs while indexing? I thought it only
did that when optimizing; however, my instance had 726 Deleted Docs, and
then after adding some documents that number decreased, eventually to 18
Deleted Docs.

I understood these Deleted Docs come from situations where two docs have
the same uniqueKey. However, my data had way more deleted docs than I
expected. I was using a data-generated uniqueKey; when I changed to
using the UUID generator there were 0 deleted docs. But I just wanted to
double check: are there any other cases which would create a Deleted
Doc?
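The accounting behind that statistic can be modeled in a few lines. When a document is added with a uniqueKey that already exists, the old copy is only marked deleted in its segment, not physically removed, until a segment merge (which background merging or an explicit optimize triggers) reclaims it; that is why the count can drop while you are adding documents. A toy model of this, purely illustrative and not Solr's actual implementation:

```python
class TinyIndex:
    """Toy model of deleted-doc accounting in a Lucene/Solr index:
    overwriting a uniqueKey leaves a tombstone that only a merge
    purges. Illustrative only."""

    def __init__(self):
        self.live = {}    # uniqueKey -> current document
        self.deleted = 0  # superseded docs not yet merged away

    def add(self, key, doc):
        if key in self.live:
            # Same uniqueKey: the old doc is marked deleted, not erased.
            self.deleted += 1
        self.live[key] = doc

    def merge(self):
        # Merging segments drops the tombstones, so Deleted Docs falls.
        self.deleted = 0

idx = TinyIndex()
idx.add("a", {"v": 1})
idx.add("a", {"v": 2})   # duplicate uniqueKey -> one Deleted Doc
idx.add("b", {"v": 3})
print(idx.deleted)       # -> 1
```

Under this model, a data-generated uniqueKey that occasionally collides produces deleted docs, while random UUIDs (which essentially never collide) produce none, matching the observation above.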

Thanks so much!! :)
Katie


are fields stored or unstored by default xml

2013-07-01 Thread Katie McCorkell
In schema.xml I know you can label a field as stored="false" or
stored="true", but if you say neither, which is it by default?
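As I understand it from the Solr reference documentation, when stored is set on neither the field nor its fieldType, it defaults to true, so the two declarations below should behave the same (the field names here are made-up examples):

```xml
<!-- Explicit: -->
<field name="title" type="text_general" stored="true"/>
<!-- Implicit: stored defaults to "true" when omitted everywhere. -->
<field name="title_implicit" type="text_general"/>
```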

Thank you
Katie