If I have an arbitrarily complex query that uses ORs, something like:
q=(simple_fieldtype:foo OR complex_fieldtype:foo) AND
(another_simple_fieldtype:bar OR another_complex_fieldtype:bar)
I want to know which fields actually contributed to the match for each document
returned. Something like:
Get the Lucene Explanation object for the query and then traverse it to get
your matched field list without all the text. No parsing would be required, but
the Explanation structure could get messy.
-- Jack Krupansky
-Original Message-
From: Jeff Wartes
Sent: Friday, December 07, 2012 11:59 AM
-1999 sounds almost the
same, but I never looked into the source.
On Fri, Dec 7, 2012 at 11:00 PM, Jeff Wartes jwar...@whitepages.com wrote:
Thanks, I did start to dig into how DebugComponent does its thing a
little, and I'm not all the way down the rabbit hole yet, but the
lucene
For what it's worth, Google has done some pretty interesting research into
coping with the idea that particular shards might very well be busy doing
something else when your query comes in.
Check out this slide deck: http://research.google.com/people/jeff/latency.html
Lots of interesting
on the original tokens,
as they do if I remove the ShingleFilterFactory.
I'm using Solr 3.3, any clarification would be appreciated.
Thanks,
-Jeff Wartes
InternationalCorporation.
If this is the form you want to use for synonym matching, it must exist in
your synonym file. Does it?
Steve
-Original Message-
From: Jeff Wartes [mailto:jwar...@whitepages.com]
Sent: Wednesday, August 10, 2011 3:43 PM
To: solr-user@lucene.apache.org
Subject: Can't mix
filter.
-Original Message-
From: Jeff Wartes [mailto:jwar...@whitepages.com]
Sent: Wednesday, August 10, 2011 1:27 PM
To: solr-user@lucene.apache.org
Subject: RE: Can't mix Synonyms with Shingles?
Hi Steven,
The token separator was certainly a deliberate choice, are you saying
of shortcuts difficult before I dig in.
Thanks,
-Jeff Wartes
For what it's worth, I had the same question last year, and I never really
got a good solution:
http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201212.mbox/%3C81e9a7879c550b42a767f0b86b2b81591a15b...@ex4.corp.w3data.com%3E
I dug into the highlight component for a while, but it turned
This might end up being more of a Lucene question, but anyway...
For a multivalued field, it appears that term frequency is calculated as
something a little like:
sum(tf(value1), ..., tf(valueN))
I'd rather my score not give preference based on how *many* of the values
in the multivalued field
A multivalued text field is directly equivalent to concatenating the
values,
with a possible position gap between the last and first terms of adjacent
values.
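The summed behavior described here can be sketched like this (a hypothetical illustration of the scoring input, not actual Lucene code; tokenization is simplified to whitespace splitting):

```python
# Hypothetical sketch: tf over a multivalued field behaves as if the values
# were concatenated, so per-value frequencies simply add up.
def tf(value, term):
    """Term frequency of `term` within a single field value."""
    return value.lower().split().count(term)

def multivalued_tf(values, term):
    """sum(tf(value1), ..., tf(valueN)) over a multivalued field."""
    return sum(tf(v, term) for v in values)

values = ["acme corp", "acme holdings", "acme acme intl"]
print(multivalued_tf(values, "acme"))  # 4: more values containing the term, higher tf
```

So a document whose multivalued field repeats the term across many values scores as if one long field contained all those occurrences.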
That, in a nutshell, would be the problem. Maybe the discussion is over at
this point.
It could be I dumbed down the problem a bit
I'm still pondering aggregate-type operations for scoring multi-valued
fields (original thread: http://goo.gl/zOX53f ), and it occurred to me
that distance-sort with SpatialRecursivePrefixTreeFieldType must be doing
something like that.
Somewhat surprisingly I don't see this in the documentation
. You're right, that doesn't look like
something I can easily use for more general aggregate scoring control. Ah
well.
On 8/14/13 12:35 PM, Smiley, David W. dsmi...@mitre.org wrote:
On 8/14/13 2:26 PM, Jeff Wartes jwar...@whitepages.com wrote:
I'm still pondering aggregate-type operations
, Jeff Wartes <jwartes@...> wrote:
Hm, "Give me all the stores that only have branches in this area" might be
a plausible use case for farthest distance.
That's essentially a "contains" question though, so maybe that's already
supported? I guess it depends on how contains/intersects/etc
It was my hope that storing solr.xml would mean I could spin up a Solr node
pointing it to a properly configured zookeeper ensemble, and that no further
local configuration or knowledge would be necessary.
However, I’m beginning to wonder if that’s sufficient. It’s looking like I may
also
...the difference between that example and what you are doing here is that
in that example, because both of the nodes already had collection1 instance
dirs, they expected to be part of collection1 when they joined the
cluster.
And that, I think, is my misunderstanding. I had assumed that the link
Work is underway towards a new mode where zookeeper is the ultimate
source of truth, and each node will behave accordingly to implement and
maintain that truth. I can't seem to locate a Jira issue for it,
unfortunately. It's possible that one doesn't exist yet, or that it has
an obscure title.
Found it. In case anyone else cares, this appears to be the root issue:
https://issues.apache.org/jira/browse/SOLR-5128
Thanks again.
On 1/30/14, 9:01 AM, Jeff Wartes jwar...@whitepages.com wrote:
Work is underway towards a new mode where zookeeper is the ultimate
source of truth, and each
If you're only concerned with moving your shards, (rather than changing
the number of shards), I'd:
1. Add a new server and fire up Solr pointed to the same ZooKeeper with
the same config
At this point the new server won't be indexing anything, but will still
technically be part of the
I’m working on a port of a Solr service to SolrCloud. (Targeting v4.6.0 at
present.) The old query style relied on using /solr/select?qt=foo to select the
proper requestHandler. I know handleSelect=true is deprecated now, but it’d be
pretty handy for testing to be able to be backwards
Got it in one. Thanks!
On 2/11/14, 9:50 AM, Shawn Heisey s...@elyograg.org wrote:
On 2/11/2014 10:21 AM, Jeff Wartes wrote:
I'm working on a port of a Solr service to SolrCloud. (Targeting v4.6.0
at present.) The old query style relied on using /solr/select?qt=foo to
select the proper
I’ve been experimenting with SolrCloud configurations in AWS. One issue I’ve
been plagued with is that during indexing, occasionally a node decides it can’t
talk to ZK, and this disables updates in the pool. The node usually recovers
within a second or two. It’s possible this happens when I’m
I'll second that thank-you, this is awesome.
I asked about this issue in 2010, but when I didn't hear anything (and
disappointingly didn't find SOLR-1880), we ended up rolling our own
version of this functionality. I've been laboriously migrating it every
time we bump our Solr version ever
There is a RELOAD collection command you might try:
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api2
I think you'll find this a lot faster than restarting your whole JVM.
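For reference, RELOAD is just an HTTP call to the Collections API; the host, port, and collection name below are placeholders:

```shell
# Reload all cores of a collection across the cluster without a JVM restart.
curl "http://localhost:8983/solr/admin/collections?action=RELOAD&name=mycollection"
```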
On 2/24/14, 4:12 PM, KNitin nitin.t...@gmail.com wrote:
Hi
I have a 4 node
It's worth mentioning that scores should not be considered comparable
across queries, so equating "confidence" and "score" is a tricky
proposition.
That is, the maxScore for the search field1:foo may be 10.0, and the
maxScore for "field1:bar" may be 1.0, but that doesn't mean the top result
for
This is highly anecdotal, but I tried SOLR-1880 with 4.7 for some tests I
was running, and saw almost a 30% improvement in latency. If you're only
doing document selection, it's definitely worth having.
I'm reasonably certain that the patch would work in 4.6 too, but the test
file relies on some
Please note that although the article talks about the ADDREPLICA command,
that feature is coming in Solr 4.8, so don't be confused if you can't find
it yet. See https://issues.apache.org/jira/browse/SOLR-5130
On 3/20/14, 7:45 AM, Erick Erickson erickerick...@gmail.com wrote:
You might find
You could always just pass the username as part of the GET params for the
query. Solr will faithfully ignore and log any parameters it doesn't
recognize, so it'd show up in your {lot of params}.
That means your log parser would need more intelligence, and your client
would have to pass in the
I vastly prefer git, but last I checked, (admittedly, some time ago) you
couldn't build the project from the git clone. Some of the build scripts
assumed some svn commands would work.
On 4/12/14, 3:56 PM, Furkan KAMACI furkankam...@gmail.com wrote:
Hi Amon;
There has been a conversation about
. Aiyengar wrote:
ant compile / ant -f solr dist / ant test certainly work, I use them
with a
git working copy. You trying something else?
On 14 Apr 2014 19:36, Jeff Wartes jwar...@whitepages.com wrote:
I vastly prefer git, but last I checked, (admittedly, some time ago)
you
couldn't build
It's not just FacetComponent, here's the original feature ticket for
timeAllowed:
https://issues.apache.org/jira/browse/SOLR-502
As I read it, timeAllowed only limits the time spent actually getting
documents, not the time spent figuring out what data to get or how. I
think that means the
On 4/19/14, 6:51 AM, Ken Krugler kkrugler_li...@transpac.com wrote:
The code I see seems to be using an FSDirectory, or is there another
layer of wrapping going on here?
return new NRTCachingDirectory(FSDirectory.open(new File(path)),
maxMergeSizeMB, maxCachedMB);
I was also curious
To expand on that, the Collections API DELETEREPLICA command is available
in Solr >= 4.6, but will not have the ability to wipe the disk until Solr
4.10.
Note that whether or not it deletes anything from disk, DELETEREPLICA will
remove that replica from your cluster state in ZK, so even in 4.10,
If you're using SolrJ, CloudSolrServer exposes the information you need
directly, although you'd have to poll it for changes.
Specifically, this code path will get you a snapshot of the clusterstate:
http://lucene.apache.org/solr/4_5_0/solr-solrj/org/apache/solr/client/solrj
I’d like to ensure an extended warmup is done on each SolrCloud node prior to
that node serving traffic.
I can do certain things prior to starting Solr, such as pump the index dir
through /dev/null to pre-warm the filesystem cache, and post-start I can use
the ping handler with a health check
On 7/21/14, 4:50 PM, Shawn Heisey s...@elyograg.org wrote:
On 7/21/2014 5:37 PM, Jeff Wartes wrote:
I'd like to ensure an extended warmup is done on each SolrCloud node
prior to that node serving traffic.
I can do certain things prior to starting Solr, such as pump the index
dir through /dev
the primary, secondary etc. sorts will fill those caches.
Best,
Erick
On Mon, Jul 21, 2014 at 5:07 PM, Jeff Wartes jwar...@whitepages.com
wrote:
On 7/21/14, 4:50 PM, Shawn Heisey s...@elyograg.org wrote:
On 7/21/2014 5:37 PM, Jeff Wartes wrote:
I'd like to ensure an extended warmup is done
It's a command like this just prior to jetty startup:
find -L <solr home dir> -type f -exec cat {} > /dev/null \;
On 7/24/14, 2:11 PM, Toke Eskildsen t...@statsbiblioteket.dk wrote:
Jeff Wartes [jwar...@whitepages.com] wrote:
Well, I'm not sure what to say. I've been observing a noticeable
Looks to me like you are, or were, hitting the replication handler's
backup function:
http://wiki.apache.org/solr/SolrReplication#HTTP_API
ie, http://master_host:port/solr/replication?command=backup
You might not have been doing it explicitly, there's some support for a
backup being triggered
I'm able to do cross-solrcloud-cluster index copy using nothing more than
careful use of the "fetchindex" replication handler command.
I'm using this as a build/deployment tool, so I manually create a
collection in two clusters, index into one, test, and then ask the other
cluster to fetchindex
I've been working on this tool, which wraps the collections API to do more
advanced cluster-management operations:
https://github.com/whitepages/solrcloud_manager
One of the operations I've added (copy) is a deployment mechanism that
uses the replication handler's snap puller to hot-load a
Message -
From: Jeff Wartes jwar...@whitepages.com
To: solr-user@lucene.apache.org
Sent: Monday, August 18, 2014 9:49:28 PM
Subject: Re: How to restore an index from a backup over HTTP
I'm able to do cross-solrcloud-cluster index copy using nothing more than
careful use of the "fetchindex
I had a similar need. The resulting tool is in scala, but it still might
be useful to look at. I had to work through some of those same issues:
https://github.com/whitepages/solrcloud_manager
From a clusterstate perspective, I mostly cared about active vs
non-active, so here's a sample output
You need to specify a replication factor of 2 if you want two copies of
each shard. Solr doesn't "auto fill" available capacity, contrary to the
misleading examples on the http://wiki.apache.org/solr/SolrCloud page.
Those examples only have that behavior because they ask you to copy the
examples
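Spelled out, replication factor is an explicit argument at collection-creation time; the host, names, and shard count below are placeholders:

```shell
# Ask for 2 shards x 2 replicas = 4 cores.
# Solr will not add further copies on its own to fill spare nodes.
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2&collection.configName=myconf"
```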
On the face of it, your scenario seems plausible. I can offer two pieces
of info that may or may not help you:
1. A write request to Solr will not be acknowledged until an attempt has
been made to write to all relevant replicas. So, B won’t ever be missing
updates that were applied to A, unless
FWIW, since it seemed like there was at least one bug here (and possibly
more), I filed
https://issues.apache.org/jira/browse/SOLR-8171
On 10/6/15, 3:58 PM, "Jeff Wartes" <jwar...@whitepages.com> wrote:
>
>I dug far enough yesterday to find the GET_DOCSET, but not f
The “copy” command in this tool automatically does what Upayavira
describes, including bringing the replicas up to date. (if any)
https://github.com/whitepages/solrcloud_manager
I’ve been using it as a mechanism for copying a collection into a new
cluster (different ZK), but it should work
I’m aware of two public administration tools:
This was announced to the list just recently:
https://github.com/bloomreach/solrcloud-haft
And I’ve been working in this:
https://github.com/whitepages/solrcloud_manager
Both of these hook the Solrcloud client’s ZK access to inspect the cluster
state
If you’re using AWS, there’s this:
https://github.com/LucidWorks/solr-scale-tk
If you’re using chef, there’s this:
https://github.com/vkhatri/chef-solrcloud
(There are several other chef cookbooks for Solr out there, but this is
the only one I’m aware of that supports Solr 5.3.)
For ZK, I’m
I dug far enough yesterday to find the GET_DOCSET, but not far enough to
find why. Thanks, a little context is really helpful sometimes.
So, starting with an empty filterCache...
http://localhost:8983/solr/techproducts/select?q=name:foo=1=true
=popularity
New values: lookups: 0,
https://github.com/whitepages/solrcloud_manager supports 5.x, and I added
some backup/restore functionality similar to SOLR-5750 in the last
release.
Like SOLR-5750, this backup strategy requires a shared filesystem, but
note that unlike SOLR-5750, I haven’t yet added any backup functionality
On 9/4/15, 7:06 AM, "Yonik Seeley" wrote:
>
>Lucene seems to always be changing its execution model, so it can be
>difficult to keep up. What version of Solr are you using?
>Lucene also changed how filters work, so now, a filter is
>incorporated with the query like so:
>
Tokenizers, Filters, URPs and even a newsletter:
>http://www.solr-start.com/
>
>
>On 3 September 2015 at 16:45, Jeff Wartes <jwar...@whitepages.com> wrote:
>>
>> I have a query like:
>>
>> q==enabled:true
>>
>> For purposes of this conversation
I have a query like:
q==enabled:true
For purposes of this conversation, "fq=enabled:true" is set for every query, I
never open a new searcher, and this is the only fq I ever use, so the filter
cache size is 1, and the hit ratio is 1.
The fq=enabled:true clause matches about 15% of my
I’m doing some fairly simple facet queries in a two-shard 5.3 SolrCloud
index on fields like this:
; wrote:
>what if you set f.city.facet.limit=-1 ?
>
>On Thu, Oct 1, 2015 at 7:43 PM, Jeff Wartes <jwar...@whitepages.com>
>wrote:
>
>>
>> I’m doing some fairly simple facet queries in a two-shard 5.3 SolrCloud
>> index on fields like this:
>>
>> > docValue
stributed requests, it's explained here
>https://cwiki.apache.org/confluence/display/solr/Faceting#Faceting-Over-RequestParameters
>eg does it happen if you run with distrib=false?
>
>On Fri, Oct 2, 2015 at 12:27 AM, Jeff Wartes <jwar...@whitepages.com>
>wrote:
>
&
ert, but
not a lookup, so the cache hit ratio is always exactly 1.
On 10/2/15, 4:18 AM, "Toke Eskildsen" <t...@statsbiblioteket.dk> wrote:
>On Thu, 2015-10-01 at 22:31 +, Jeff Wartes wrote:
>> It still inserts if I address the core directly and use distrib=f
ibute it. We’ve been running it in production for a year,
>but the config is pretty manual.
>
>wunder
>Walter Underwood
>wun...@wunderwood.org
>http://observer.wunderwood.org/ (my blog)
>
>
>> On Sep 28, 2015, at 4:41 PM, Jeff Wartes <jwar...@whitepages.com> wrote:
>
One would hope that https://issues.apache.org/jira/browse/SOLR-4735 will
be done by then.
On 9/28/15, 11:39 AM, "Walter Underwood" wrote:
>We did the same thing, but reporting performance metrics to Graphite.
>
>But we won’t be able to add servlet filters in 6.x,
If I configure my filterCache like this:
and I have <= 10 distinct filter queries I ever use, does that mean I’ve
effectively disabled cache invalidation? So my cached filter query results
will never change? (short of JVM restart)
I’m unclear on whether autowarm simply copies the value into the new
cache, regardless of whether it was populated via autowarm.
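The cache definition itself was stripped from the archived message; it was presumably a solrconfig.xml entry along these lines (only the autowarmCount="10" visible in the quoted reply is from the original, the other values are placeholders):

```xml
<!-- Illustrative filterCache config; size/initialSize are placeholders. -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="10"/>
```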
On 9/24/15, 11:28 AM, "Jeff Wartes" <jwar...@whitepages.com> wrote:
>
>If I configure my filterCache like this:
>autowarmCount="10"/>
>
>and I have <= 10 distinct filter queries I ever use, does that mean I’ve
I’ve been relying on this:
https://code.google.com/archive/p/linux-ftools/
fincore will tell you what percentage of a given file is in cache, and
fadvise can suggest to the OS that a file be cached.
All of the solr start scripts at my company first call fadvise
(FADV_WILLNEED) on all the
If you want two different collections to have two different schemas, those
collections need to reference two different configsets.
So you need another copy of your config available using a different name,
and to reference that other name when you create the second collection.
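As a sketch of those two steps (names, paths, and the ZK address are placeholders; this assumes the zkcli.sh script that ships with Solr):

```shell
# 1. Upload the modified config under a different configset name.
server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 \
  -cmd upconfig -confname otherconf -confdir /path/to/modified/conf
# 2. Reference that configset name when creating the second collection.
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=collection2&numShards=1&collection.configName=otherconf"
```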
On 12/4/15, 6:26
I’ve never used the managed schema, so I’m probably biased, but I’ve never
seen much of a point to the Schema API.
I need to make changes sometimes to solrconfig.xml, in addition to
schema.xml and other config files, and there’s no API for those, so my
process has been like:
1. Put the entire
Looks like LIST was added in 4.8, so I guess you’re stuck looking at ZK,
or finding some tool that looks in ZK for you.
The zkCli.sh that ships with zookeeper would probably suffice for a
one-off manual inspection:
https://zookeeper.apache.org/doc/trunk/zookeeperStarted.html#sc_ConnectingT
It’s a pretty common misperception that since solr scales, you can just
spin up new nodes and be done. Amazon ElasticSearch and older solrcloud
getting-started docs encourage this misperception, as does the HDFS-only
autoAddReplicas flag.
I agree that auto-scaling should be approached carefully,
Don’t set solr.data.dir. Instead, set the install dir. Something like:
-Dsolr.solr.home=/data/solr
-Dsolr.install.dir=/opt/solr
I have many solrcloud collections, and separate data/install dirs, and
I’ve never had to do anything with manual per-collection or per-replica
data dirs.
That said,
be...
>
>=xxx
>
>btw, for your app, isn't "slice" old notation?
>
>
>
>
>On 08/01/16 22:05, Jeff Wartes wrote:
>>
>> I’m pretty sure you could change the name when you ADDREPLICA using a
>> core.name property. I don’t know if you can when you
I’m pretty sure you could change the name when you ADDREPLICA using a core.name
property. I don’t know if you can when you initially create the collection
though.
The CLUSTERSTATUS command will tell you the core names:
Looks like it’ll set partialResults=true on your results if you hit the
timeout.
https://issues.apache.org/jira/browse/SOLR-502
https://issues.apache.org/jira/browse/SOLR-5986
On 12/22/15, 5:43 PM, "Vincenzo D'Amore" wrote:
>Well... I can write everything, but
he
>limit on each server but it isn't clear to me how high it should be or if
>raising the limit will cause new problems.
>
>Any advice you could provide in this situation would be awesome!
>
>Cheers,
>Brian
>
>
>
>> On Oct 27, 2015, at 20:50, Jeff Wartes <jwar
dentally and the DIH cannot be run
>because the database is unavailable.
>
>Our collection is simple: 2 nodes - 1 collection - 2 shards with 2
>replicas
>each
>
>So a simple copy (cp command) for both the nodes/shards might work for us?
>How do I restore the data back?
For what it’s worth, I’d suggest you go into a conversation with Azul with a
more explicit “I’m looking to buy” approach. I reached out to them with a more
“I’m exploring my options” attitude, and never even got a trial. I get the
impression their business model involves a fairly expensive (to
> > >> https://github.com/LucidWorks/auto-phrase-tokenfilter
>> > > > >> >
>> > > > >> > Is there anything else out there that you would recommend I look
>> > at?
>> > > > >> >
>> > > > >>
Oh, interesting. I’ve certainly encountered issues with multi-word synonyms,
but I hadn’t come across this. If you end up using it with a recent Solr
version, I’d be glad to hear your experience.
I haven’t used it, but I am aware of one other project in this vein that you
might be interested
r on the linux command line I get:
>
>/opt/solr-5.4.0/server/solr-webapp/webapp/WEB-INF/lib/hon-lucene-synonyms-2.0.0.jar
>
>But the log file is still carrying class not found exceptions when I
>restart...
>
>Are you in "Cloud" mode? What version of Solr are you using?
Any distributed query falls into the two-phase process. Actually, I think some
components may require a third phase. (faceting?)
However, there are also cases where only a single pass is required. A
fl=id,score will only be a single pass, for example, since it doesn’t need to
get the field
Check your gc log for CMS “concurrent mode failure” messages.
If a concurrent CMS collection fails, it does a stop-the-world pause while it
cleans up using a *single thread*. This means the stop-the-world CMS collection
in the failure case is typically several times slower than a concurrent
to promotion failures. I suspect there's a lot of garbage building up.
>We're going to run tests with field collapsing disabled and see if that
>makes a difference.
>
>Cas
>
>
>On Thu, Jun 16, 2016 at 1:08 PM, Jeff Wartes <jwar...@whitepages.com> wrote:
>
>> Check y
There’s no official way of doing #1, but there are some less official ways:
1. The Backup/Restore API provides some hooks into loading pre-existing data
dirs into an existing collection. Lots of caveats.
2. If you don’t have many shards, there’s always rsync/reload.
3. There are some third-party
I enjoy using collection aliases in all client references, because that allows
me to change the collection all clients use without updating the clients. I
just move the alias.
This is particularly useful if I’m doing a full index rebuild and want an
atomic, zero-downtime switchover.
On
On 1/27/16, 8:28 AM, "Shawn Heisey" wrote:
>
>I don't think any documentation states this, but it seems like a good
>idea to me use an alias from day one, so that you always have the option
>of swapping the "real" collection that you are using without needing to
>change
If you can identify the problem documents, you can just re-index those after
forcing a sync. Might save a full rebuild and downtime.
You might describe your cluster setup, including ZK. it sounds like you’ve done
your research, but improper ZK node distribution could certainly invalidate
some
You could write your own snitch:
https://cwiki.apache.org/confluence/display/solr/Rule-based+Replica+Placement
Or, it would be more annoying, but you can always add/remove replicas manually
and juggle things yourself after you create the initial collection.
On 2/1/16, 8:42 AM, "Tom Evans"
Aliases work when indexing too.
Create collection: collection1
Create alias: this_week -> collection1
Index to: this_week
Next week...
Create collection: collection2
Create (Move) alias: this_week -> collection2
Index to: this_week
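Those steps translate to Collections API calls like these (host, port, and shard counts are placeholders); re-issuing CREATEALIAS with the same alias name atomically re-points it:

```shell
# Week 1: create the collection and point the alias at it; index via the alias.
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=collection1&numShards=1"
curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=this_week&collections=collection1"
# Next week: build collection2, then move the alias.
curl "http://localhost:8983/solr/admin/collections?action=CREATE&name=collection2&numShards=1"
curl "http://localhost:8983/solr/admin/collections?action=CREATEALIAS&name=this_week&collections=collection2"
```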
On 2/1/16, 2:14 AM, "vidya" wrote:
;> of
>> > SOLR as the field which is the basis of the sort is not included in the
>> > schema for example the price. The customer wants the list in descending
>> > order of the price.
>> >
>> > So I have to get all the 1000 docids from solr an
My suggestion would be to split your problem domain. Use Solr exclusively for
search - index the id and only those fields you need to search on. Then use
some other data store for retrieval. Get the id’s from the solr results, and
look them up in the data store to get the rest of your fields.
I believe the shard state is a reflection of whether that shard is still in use
by the collection, and has nothing to do with the state of the replicas. I
think doing a split-shard operation would create two new shards, and mark the
old one as inactive, for example.
On 2/26/16, 8:50 AM,
My understanding is that the "version" represents the timestamp the searcher
was opened, so it doesn’t really offer any assurances about your data.
Although you could probably bounce a node and get your document counts back in
sync (by provoking a check), it’s interesting that you’re in this
t;
>>>
>>> You might watch the achieved replication factor of your updates and see if
>>> it ever changes
>>>
>
>This is a good tip. I’m not sure I like the implication that any failure to
>write all 3 of our replicas must be retried at the app layer. Is t
Solrcloud does not come with any autoscaling functionality. If you want such a
thing, you’ll need to write it yourself.
https://github.com/whitepages/solrcloud_manager might be a useful head start
though, particularly the “fill” and “cleancollection” commands. I don’t do
*auto* scaling, but I
I’ve been running SolrCloud clusters in various versions for a few years here,
and I can only think of two or three cases that the ZK-stored cluster state was
broken in a way that I had to manually intervene by hand-editing the contents
of ZK. I think I’ve seen Solr fixes go by for those
There is some automation around this process in the backup commands here:
https://github.com/whitepages/solrcloud_manager
It’s been tested with 5.4, and will restore arbitrary replication factors.
Ever assuming the shared filesystem for backups, of course.
On 4/5/16, 3:18 AM, "Reth RM"
I recall I had some luck fixing a leader-less shard (after a ZK quorum failure)
by forcably removing the records for the down-state replicas from the leader
election list, and then forcing an election.
The ZK path looks like collections/<collection>/leader_elect/shardX/election.
Usually you’ll find the
n zookeeper?
>
>
>
>Your tool is very interesting, I just thought about writing such a tool
>myself.
>From the sources I understand that you represent each node as a path in the
>git repository.
>So, I guess that for restore purposes I will have to do
>the opposite direction a
It’s a bit backwards feeling, but I’ve had luck setting the install dir and
solr home, instead of the data dir.
Something like:
-Dsolr.solr.home=/data/solr
-Dsolr.install.dir=/opt/solr
So all of the Solr files are in in /opt/solr and all of the index/core related
files end up in /data/solr.
I've experimented with that a bit, and Shawn added my comments in IRC to his
Solr/GC page here: https://wiki.apache.org/solr/ShawnHeisey
The relevant bit:
"With values of 4096 and 32768, the IRC user was able to achieve 15% and 19%
reductions in average pause time, respectively, with the
some retry logic in the code that distributes the updates from
>the leader as well.
>
>Best,
>Erick
>
>On Tue, Apr 26, 2016 at 12:51 PM, Jeff Wartes <jwar...@whitepages.com> wrote:
>>
>> At the risk of thread hijacking, this is an area where I don’t know I full
Shawn Heisey’s page is the usual reference guide for GC settings:
https://wiki.apache.org/solr/ShawnHeisey
Most of the learnings from that are in the Solr 5.x startup scripts already,
but your heap is bigger, so your mileage may vary.
Some tools I’ve used while doing GC tuning:
* VisualVM -