Re: SolrCloud increase replication factor

2016-05-23 Thread Jeff Wartes
https://github.com/whitepages/solrcloud_manager was designed to make common kinds of cluster operations easier. It hasn’t been tested with 6.0 though, so if you try it, please let me know your experience. On 5/23/16, 6:28 AM, "Tom Evans"

Re: How to stop searches to solr while full data import is going in SOLR

2016-05-23 Thread Jeff Wartes
The PingRequestHandler contains support for a file check, which allows you to control whether the ping request succeeds based on the presence/absence of a file on disk on the node. http://lucene.apache.org/solr/6_0_0/solr-core/org/apache/solr/handler/PingRequestHandler.html I suppose you could
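A minimal sketch of how that file check might be toggled from a script, assuming the ping handler is mounted at the usual /admin/ping path and a healthcheckFile is configured for the core. The base URL and core name are placeholders, and `ping_url` is a hypothetical helper, not something Solr ships.

```python
from urllib.parse import urlencode

# Actions the PingRequestHandler accepts when a healthcheckFile is configured.
PING_ACTIONS = {"enable", "disable", "status", "ping"}


def ping_url(base_url, core, action=None):
    """Build the ping URL for one core (hypothetical helper)."""
    url = f"{base_url.rstrip('/')}/{core}/admin/ping"
    if action is None:
        return url
    if action not in PING_ACTIONS:
        raise ValueError(f"unsupported ping action: {action}")
    return url + "?" + urlencode({"action": action})


base = "http://localhost:8983/solr"  # assumed address, adjust for your node
print(ping_url(base, "mycore", "disable"))  # request this before the import
print(ping_url(base, "mycore", "enable"))   # request this once it finishes
```

With action=disable the health-check file is removed and plain pings start failing, so a load balancer that polls the ping URL should stop sending searches to that node until it is enabled again.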

Re: Solr cloud with Grouping query gives inconsistent results

2016-05-23 Thread Jeff Wartes
My first thought is that you haven’t indexed such that all values of the field you’re grouping on are found in the same cores. See the end of the article here: (Distributed Result Grouping Caveats) https://cwiki.apache.org/confluence/display/solr/Result+Grouping And the “Document Routing”
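For illustration, here is a rough sketch of the kind of ID scheme the Document Routing docs describe for the default compositeId router: prefixing the document ID with the grouping value and a "!" so that all documents sharing that value hash to the same shard. The field values and the `routed_id` helper are made up for the example.

```python
def routed_id(group_value, doc_id):
    """Return a compositeId-style ID of the form '<group_value>!<doc_id>'.

    Hypothetical helper: the prefix before '!' is what the compositeId
    router hashes on, so documents sharing it are co-located.
    """
    if "!" in group_value:
        raise ValueError("group value must not contain '!'")
    return f"{group_value}!{doc_id}"


docs = [
    {"id": routed_id("acme", "1001"), "company_s": "acme"},
    {"id": routed_id("acme", "1002"), "company_s": "acme"},
    {"id": routed_id("globex", "2001"), "company_s": "globex"},
]
for doc in docs:
    print(doc["id"])
```

This only helps if the collection is indexed (or re-indexed) with such IDs; it does not move documents that are already in place.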

Re: SolrCloud replicas consistently out of sync

2016-05-19 Thread Jeff Wartes
That case related to consistency after a ZK outage or network connectivity issue. Your case is standard operation, so I’m not sure that’s really the same thing. I’m aware of a few issues that can happen if ZK connectivity goes wonky, that I hope are fixed in SOLR-8697. This one might be a

state.json being downloaded every 10 seconds

2016-05-16 Thread Jeff Wartes
I have a solr 5.4 cluster with three collections, A, B, C. Nodes either host replicas for collection A, or B and C. Collections B and C are not currently used - no inserts or queries. Collection A is getting significant query traffic, but no insert traffic, and queries are only directed to

Re: state.json being downloaded every 10 seconds

2016-05-16 Thread Jeff Wartes
have replicas B and C.
>
>What the "something" is that sends requests I'm not quite sure, but that's a place to start.
>
>Best,
>Erick
>
>On Mon, May 16, 2016 at 11:08 AM, Jeff Wartes <jwar...@whitepages.com> wrote:
>>
>> I have a solr 5.4 clus

Re: Passing Ids in query takes more time

2016-05-05 Thread Jeff Wartes
An ID lookup is a very simple and fast query, for one ID. Or’ing a lookup for 80k ids though is basically 80k searches as far as Solr is concerned, so it’s not altogether surprising that it takes a while. Your complaint seems to be that the query planner doesn’t know in advance that should be

Re: Indexing 700 docs per second

2016-04-19 Thread Jeff Wartes
I have no numbers to back this up, but I’d expect Atomic Updates to be slightly slower than a full update, since the atomic approach has to retrieve the fields you didn't specify before it can write the new (updated) document. On 4/19/16, 11:54 AM, "Tim Robertson"
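To make the comparison concrete, this is roughly what an atomic update body looks like next to a full document, as a sketch. The field names are invented, and `atomic_update_payload` is a hypothetical helper; the "set" modifier is the standard atomic-update syntax for the JSON update handler.

```python
import json


def atomic_update_payload(doc_id, changes):
    """Build a JSON body that sets only the given fields on one document.

    Solr has to read the stored values of every other field to rebuild
    the document, which is the extra work mentioned above.
    """
    doc = {"id": doc_id}
    for field, value in changes.items():
        doc[field] = {"set": value}
    return json.dumps([doc])


# Full update: the client supplies every field itself.
full_doc = json.dumps([{"id": "doc1", "title_s": "hello", "price_i": 10}])

# Atomic update: the client supplies only the changed field.
partial = atomic_update_payload("doc1", {"price_i": 12})
print(partial)
```

Either body would be POSTed to the collection's /update handler with a JSON content type.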

Re: HTTP Client Only

2016-04-14 Thread Jeff Wartes
If you’re already using java, just use the CloudSolrClient. If you’re using the default router, (CompositeId) it’ll figure out the leaders and send documents to the right place for you. If you’re not using java, then I’d still look there for hints on how to duplicate the functionality. On

Re: Adding replica on solr - 5.50

2016-04-14 Thread Jeff Wartes
I’m all for finding another way to make something work, but I feel like this is the wrong advice. There are two options: 1) You are doing something wrong. In which case, you should probably invest in figuring out what. 2) Solr is doing something wrong. In which case, you should probably invest

Re: Replicas for same shard not in sync

2016-04-26 Thread Jeff Wartes
At the risk of thread hijacking, this is an area I’m not sure I fully understand, so I want to make sure. I understand the case where a node is marked “down” in the clusterstate, but what if it’s down for less than the ZK heartbeat? That’s not unreasonable, I’ve seen some

Re: What if adding 3rd node exceeds replication Factor? [scottchu]

2016-05-25 Thread Jeff Wartes
SolrCloud never creates replicas automatically, unless perhaps you’re using the HDFS-only autoAddReplicas option. Start the new node using the same ZK, and then use the Collections API (https://cwiki.apache.org/confluence/display/solr/Collections+API) to ADDREPLICA. The replicationFactor you
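As a sketch of the ADDREPLICA step, the call is just an HTTP request against the Collections API. The host, collection, shard and node names below are placeholders, and `add_replica_url` is a hypothetical helper that only builds the URL.

```python
from urllib.parse import urlencode


def add_replica_url(base_url, collection, shard, node=None):
    """Build a Collections API ADDREPLICA URL (hypothetical helper).

    `node` is optional; when given it should be the node name as it
    appears in live_nodes, e.g. 'host3:8983_solr'.
    """
    params = {"action": "ADDREPLICA", "collection": collection, "shard": shard}
    if node:
        params["node"] = node
    return f"{base_url.rstrip('/')}/admin/collections?" + urlencode(params)


print(add_replica_url("http://localhost:8983/solr", "mycoll", "shard1",
                      node="host3:8983_solr"))
```

You would issue one such request per shard that should get a replica on the new node.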

Re: solrcloud consumes more time than solr when write index

2016-07-13 Thread Jeff Wartes
data?
>
>Thanks!
>Kent
>
>2016-07-12 23:02 GMT+08:00 Jeff Wartes <jwar...@whitepages.com>:
>
>> Well, two thoughts:
>>
>>
>> 1. If you’re not using solrcloud, presumably you don’t have any replicas.
>> If you are, presumably you do. This makes fo

Effects of insert order on query performance

2016-08-11 Thread Jeff Wartes
This isn’t really a question, although some validation would be nice. It’s more of a warning. The tl;dr is that the insert order of documents in my collection appears to have had a huge effect on my query speed. I have a very large (sharded) SolrCloud 5.4 index. One aspect of this index is a

Re: Effects of insert order on query performance

2016-08-12 Thread Jeff Wartes
h routing: https://sematext.com/blog/2015/09/29/solrcloud-large-tenants-and-routing/ Regards, Emir -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support * http://sematext.com/ On 11.08.2016 19:39, Je

Re: Node not recovering, leader elections not occurring

2016-07-19 Thread Jeff Wartes
It sounds like the node-local version of the ZK clusterstate has diverged from the ZK cluster state. You should check the contents of zookeeper and verify the state there looks sane. I’ve had issues (v5.4) on a few occasions where leader election got screwed up to the point where I had to

Re: solrcloud consumes more time than solr when write index

2016-07-12 Thread Jeff Wartes
Well, two thoughts: 1. If you’re not using solrcloud, presumably you don’t have any replicas. If you are, presumably you do. This makes for a biased comparison, because SolrCloud won’t acknowledge a write until it’s been safely written to all replicas. In short, solrcloud write time is

Re: Help with recovering shard range after zookeeper disaster

2016-06-28 Thread Jeff Wartes
This might come a little late to be helpful, but I had a similar situation with Solr 5.4 once. We ended up finding a ZK snapshot we could restore, but we did also get the cluster back up for most of the interim by taking the now-empty ZK cluster, re-uploading the configs that the collections

Re: Full re-index without downtime

2016-07-06 Thread Jeff Wartes
A variation on #1 here - Use the same cluster, create a new collection, but use the createNodeSet option to logically partition your cluster so no node has both the old and new collection. If your clients all reference a collection alias, instead of a collection name, then all you need to do
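A rough sketch of the createNodeSet part, assuming the new collection should live only on nodes that don't host the old one. Collection name, shard count and node names are placeholders, and `create_collection_url` is a hypothetical helper that only builds the Collections API URL.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor, nodes):
    """Build a CREATE URL that restricts placement to `nodes`.

    `nodes` is a list of node names (as seen in live_nodes); they are
    passed comma-separated in the createNodeSet parameter.
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "createNodeSet": ",".join(nodes),
    }
    return f"{base_url.rstrip('/')}/admin/collections?" + urlencode(params)


new_nodes = ["host4:8983_solr", "host5:8983_solr"]  # nodes without the old collection
print(create_collection_url("http://localhost:8983/solr", "products_v2", 2, 1, new_nodes))
```

Depending on how your configs are stored, you may also need to pass collection.configName, which is left out here to keep the sketch short.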

Re: Collection will not replicate

2017-02-01 Thread Jeff Wartes
Sounds similar to a thread last year: http://lucene.472066.n3.nabble.com/Node-not-recovering-leader-elections-not-occuring-tp4287819p4287866.html On 2/1/17, 7:49 AM, "tedsolr" wrote: I have version 5.2.1. Short of an upgrade, are there any remedies?

Re: Latest advice on G1 collector?

2017-01-26 Thread Jeff Wartes
Adding my anecdotes: I’m using heavily tuned ParNew/CMS. This is a SolrCloud collection, but per-node I’ve got a 28G heap and a 200G index. The large heap turned out to be necessary because certain operations in Lucene allocate memory based on things other than result size, (index size

Re: Latest advice on G1 collector?

2017-01-25 Thread Jeff Wartes
Hah, interesting. The fact that the CMS collector falls back to a *single-threaded* collection on concurrent-mode-failure had me seriously considering trying the Parallel collector a year or two ago. I figured out (and stopped) the queries that were doing the sudden massive allocations that

Re: Facets based on sampling

2016-11-04 Thread Jeff Wartes
https://issues.apache.org/jira/browse/SOLR-5894 had some pretty interesting looking work on heuristic counts for facets, among other things. Unfortunately, it didn’t get picked up, but if you don’t mind using Solr 4.10, there’s a jar. On 11/4/16, 12:02 PM, "John Davis"

Re: CodaHale metrics for Solr 6?

2016-11-04 Thread Jeff Wartes
Expanding on my comment on the ticket, I’m really quite happy with using codahale/dropwizard metrics with Solr. I don’t know if I’m comfortable just sharing a screenshot of the resulting grafana dashboard, but I’ve got, per-host: - Percentile latencies and rates for GET vs POST (which in

Re: Result Grouping vs. Collapsing Query Parser -- Can one be deprecated?

2016-10-20 Thread Jeff Wartes
I’ll also mention the choice to improve processing speed by allocating more memory, which increases the importance of GC tuning. This bit me when I tried using it on a larger index. https://issues.apache.org/jira/browse/SOLR-9125 I don’t know if the result grouping feature shares the same

Re: CREATEALIAS to non-existing collections

2016-12-09 Thread Jeff Wartes
I’d prefer it if the alias was required to be removed, or pointed elsewhere, before the collection could be deleted. As a best practice, I encourage all SolrCloud users to configure an alias to each collection, and use only the alias in their clients. This allows atomic switching between
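A small sketch of the alias pattern, assuming clients only ever query the alias name. The alias and collection names are placeholders, and `create_alias_url` is a hypothetical helper that builds the Collections API URL.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collection):
    """Build a CREATEALIAS URL pointing `alias` at `collection`.

    Issuing CREATEALIAS again with the same alias name repoints it,
    which is what makes the switch a single step for clients.
    """
    params = {"action": "CREATEALIAS", "name": alias, "collections": collection}
    return f"{base_url.rstrip('/')}/admin/collections?" + urlencode(params)


base = "http://localhost:8983/solr"  # assumed address
print(create_alias_url(base, "products", "products_v1"))  # initial setup
print(create_alias_url(base, "products", "products_v2"))  # later switch
```

Clients keep querying "products" throughout; only the alias target changes.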

Re: Queries regarding solr cache

2016-12-01 Thread Jeff Wartes
I found this, which intends to explore the usage of RoaringDocIdSet for solr: https://issues.apache.org/jira/browse/SOLR-9008 This suggests Lucene’s filter cache already uses it, or did at one point: https://issues.apache.org/jira/browse/LUCENE-6077 I was playing with id set implementations

Re: Memory leak in Solr

2016-12-04 Thread Jeff Wartes
Here’s an earlier post where I mentioned some GC investigation tools: https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201604.mbox/%3c8f8fa32d-ec0e-4352-86f7-4b2d8a906...@whitepages.com%3E In my experience, there are many aspects of the Solr/Lucene memory allocation model that scale

Solr performance on EC2 linux

2017-04-28 Thread Jeff Wartes
tldr: Recently, I tried moving an existing solrcloud configuration from a local datacenter to EC2. Performance was roughly 1/10th what I’d expected, until I applied a bunch of linux tweaks. This should’ve been a straight port: one datacenter server -> one EC2 node. Solr 5.4, Solrcloud, Ubuntu

Re: Solr performance on EC2 linux

2017-05-03 Thread Jeff Wartes
It’s presumably not a small degradation - this guy very recently suggested it’s 77% slower: https://blog.packagecloud.io/eng/2017/03/08/system-calls-are-much-slower-on-ec2/ The other reason that blog post is interesting to me is that his benchmark utility showed the work of entering the kernel

Re: Solr performance on EC2 linux

2017-05-01 Thread Jeff Wartes
I started with the same three-node 15-shard configuration I’d been used to, in an RF1 cluster. (the index is almost 700G so this takes three r4.8xlarge’s if I want to be entirely memory-resident) I eventually dropped down to a 1/3rd size index on a single node (so 5 shards, 100M docs each) so I

Re: Solr performance on EC2 linux

2017-05-01 Thread Jeff Wartes
Yes, that’s the Xenial I tried. Ubuntu 16.04.2 LTS. On 5/1/17, 7:22 PM, "Will Martin" <wmartin...@outlook.com> wrote: Ubuntu 16.04 LTS - Xenial (HVM) Is this your Xenial version? On 5/1/2017 6:37 PM, Jeff Wartes wrote: > I tri

Re: Solr performance on EC2 linux

2017-04-30 Thread Jeff Wartes
with you having such different performance between local and EC2 But thanks for telling us about this! It's totally baffling Erick On Fri, Apr 28, 2017 at 9:09 AM, Jeff Wartes <jwar...@whitepages.com> wrote: > > tldr: Recently, I tried moving an existing

Re: Solr performance on EC2 linux

2017-05-01 Thread Jeff Wartes
We settled on the R4.2XL... The R series is labeled "High-Memory" Which instance type did you end up using? On Mon, May 1, 2017 at 8:22 AM, Shawn Heisey <apa...@elyograg.org> wrote: > On 4/28/2017 10:09 AM, Jeff Wartes wrote: > > tldr: Recen

Solr Autoscaling multi-AZ rules

2018-02-07 Thread Jeff Wartes
I’ve been messing around with the Solr 7.2 autoscaling framework this week. Some things seem trivial, but I’m also running into questions and issues. If anyone else has experience with this stuff, I’d be glad to hear it. Specifically: Context: -One collection, consisting of 42 shards, where

Re: Solr Autoscaling multi-AZ rules

2018-02-22 Thread Jeff Wartes
lica": "<7", "node":"#ANY"} , means don't put more than 7 replicas of the collection (irrespective of the shards) in a given node what do you mean by distinct 'RF' ? I think we are screwing up the terminologies a bit here On Wed, Feb 7, 2018

Routing a subquery directly to the shard a document came from

2018-03-27 Thread Jeff Wartes
I have a large 7.2 index with nested documents and many shards. For each result (parent doc) in a query, I want to gather a relevance-ranked subset of the child documents. It seemed like the subquery transformer would be ideal:

Re: Copying a SolrCloud collection to other hosts

2018-03-28 Thread Jeff Wartes
for the duration of the restore But the former isn't tenable if you're sharding due to space constraints, and the latter can't be easily predicted. On 3/28/18, 11:30 AM, "Shawn Heisey" <apa...@elyograg.org> wrote: On 3/28/2018 10:34 AM, Jeff Wartes wrote: > The backup/res

Re: Copying a SolrCloud collection to other hosts

2018-03-28 Thread Jeff Wartes
ere is a shared filesystem requirement. It would be nice if this > Solr feature could be enhanced to have more options like backing up > directly to another SolrCloud using replication/fetchIndex like your cool > solrcloud_manager thing. > > On Wed, Mar 28, 2018 at

Re: Routing a subquery directly to the shard a document came from

2018-03-29 Thread Jeff Wartes
't a query so it isn't parsed. So I have no way to dereference the "$row.[shard]". On 3/27/18, 4:00 PM, "Jeff Wartes" <jwar...@whitepages.com> wrote: I have a large 7.2 index with nested documents and many shards. For each result (parent doc) in a query,

Re: Determining replication status

2018-04-01 Thread Jeff Wartes
There're some edge cases around the response based on the timing. In case it's useful: Here's the bit from solrcloud-haft: (java)

Re: Copying a SolrCloud collection to other hosts

2018-03-28 Thread Jeff Wartes
The backup/restore still requires setting up a shared filesystem on all your nodes though, right? I've been using the fetchindex trick in my solrcloud_manager tool for ages now: https://github.com/whitepages/solrcloud_manager#cluster-commands Some of the original features in that tool have been
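For anyone unfamiliar with the fetchindex trick, here is a sketch of the underlying request: asking one core's replication handler to pull the index from another core. This is not the tool's actual code; the core URLs are placeholders, `fetchindex_url` is a hypothetical helper, and passing the source core URL as masterUrl is my assumption about a typical setup.

```python
from urllib.parse import urlencode


def fetchindex_url(target_core_url, source_core_url):
    """Build a replication-handler fetchindex URL (hypothetical helper).

    The request goes to the core that should receive the index;
    masterUrl names the core to copy from.
    """
    params = {"command": "fetchindex", "masterUrl": source_core_url}
    return f"{target_core_url.rstrip('/')}/replication?" + urlencode(params)


print(fetchindex_url(
    "http://newhost:8983/solr/mycoll_shard1_replica1",
    "http://oldhost:8983/solr/mycoll_shard1_replica1",
))
```

This has to be repeated per shard, with each target core paired with a source core holding the same shard's data, and it needs direct network access between the clusters rather than a shared filesystem.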
