Also curious why such a large heap is required... If it's due to field
caches being loaded, I'd highly recommend MMapDirectory (if you're not
using it already) and turning on DocValues for all fields you plan to
sort/facet/run analytics on.
steve
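For reference, enabling docValues is a schema change; here is a minimal sketch (field name and type are assumptions, not from the thread) of the JSON body you would POST to Solr's Schema API to add such a field:

```python
import json

# Hypothetical sort/facet field; with docValues=true its values live in a
# column-oriented on-disk structure instead of the fieldCache on the heap.
add_field = {
    "add-field": {
        "name": "price",       # hypothetical field name
        "type": "plong",
        "stored": True,
        "docValues": True,
    }
}

# POST this body to /solr/<collection>/schema to register the field.
payload = json.dumps(add_field)
print(payload)
```

Note that flipping docValues on an existing field requires a re-index of the affected documents.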
On Wed, Dec 21, 2016 at 9:25 AM Pushkar Raste
That sounds like some SAN vendor BS if you ask me. Breaking up 300GB into
smaller chunks would only be relevant if they were caching entire files, not
blocks, and I find that hard to believe. Would be interested to know more
about the specifics of the problem as the vendor sees it.
As Shawn said
There are two options as I see it..
1. Do something like you describe and create a secondary index, index into
it, then switch... I personally would create a completely separate SolrCloud
alongside my existing one vs a new core in the same cloud, as you might
see some negative impacts on GC caused
Looking deeper into zookeeper-as-truth mode, I was wrong about existing
replicas being recreated once storage is gone.. Seems there is intent for
this type of behavior based upon existing tickets.. We'll look at creating a
patch for this too..
Steve
On Tue, Jul 5, 2016 at 6:00 PM Tomás Fernández
You shouldn't "need" to move the storage as SolrCloud will replicate all
data to the new node and anything in the transaction log will already be
distributed through the rest of the machines..
One option to keep all your data attached to nodes might be to use Amazon
EFS (pretty new) to store your
The ticket in question is https://issues.apache.org/jira/browse/SOLR-9265
We are working on a patch now... will update when we have a working patch /
tests..
Shawn is correct that when adding a new node to a SolrCloud cluster it will
not automatically add replicas/etc..
The idea behind this
the API for you.
>
> Upayavira
>
> On Mon, 4 Jul 2016, at 08:53 PM, Steven Bower wrote:
> > My main issue is having to make any solr collection api calls during a
> > transition.. It makes integrating with orchestration engines way more
> > complex..
> > On Mon, Jul 4, 20
My main issue is having to make any solr collection api calls during a
transition.. It makes integrating with orchestration engines way more
complex..
On Mon, Jul 4, 2016 at 3:40 PM Upayavira wrote:
> Are you using SolrCloud? With SolrCloud this stuff is easy. You just add
> a
We have been working on some changes that should help with this.. 1st
challenge is having the node name remain static regardless of where the
node runs (right now it uses host and port, so this won't work unless you
are using some sort of tunneled or dynamic networking).. We have a patch we
are
commenting so this ends up in Dennis' inbox..
On Tue, Oct 13, 2015 at 7:17 PM Yonik Seeley wrote:
> On Wed, Oct 7, 2015 at 9:42 AM, Ryan Josal wrote:
> > I developed a join transformer plugin that did that (although it didn't
> > flatten the results like
Is it possible to run in cloud mode with zookeeper managing
collections/state/etc.. but to read all config files (solrconfig, schema,
etc..) from local disk?
Obviously this implies that you'd have to keep them in sync..
My thought here is of running Solr in a docker container, but instead of
...@apache.org wrote:
Good question Steve,
You'll have to re-index right off.
~ David
p.s. Sorry I didn't reply sooner; I just switched jobs and reconfigured my
mailing list subscriptions
Steven Bower wrote
If I am only indexing point shapes and I want to change the maxDistErr from
0.09 (1m
If I am only indexing point shapes and I want to change the maxDistErr from
0.09 (1m res) to 0.00045, will this break, as in searches stop working,
or will search work but any performance gain won't be seen until all docs
are reindexed? Or will I have to reindex right off?
thanks,
steve
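As a rough sanity check on those maxDistErr values, plain degree-to-meter arithmetic (this is not Lucene's internal grid-level computation, just the usual ~111 km per degree of latitude):

```python
# Rough conversion of a maxDistErr given in degrees to meters at the
# equator (ignores longitude shrinkage at higher latitudes).
METERS_PER_DEGREE = 111_320.0

def deg_to_meters(deg: float) -> float:
    return deg * METERS_PER_DEGREE

print(deg_to_meters(0.09))      # roughly 10 km
print(deg_to_meters(0.00045))   # roughly 50 m
```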
I am noticing the maxDocs between replicas is consistently different, and
since it is used in the idf calculation, idf scores for the same query/doc
differ between replicas. Obviously an optimize can normalize the maxDocs
values, but that is only temporary.. is there a way
to
My problem is that both maxDoc() and docCount() report documents that
have been deleted in their values. Because of merging/etc.. those numbers
can be different per replica (or at least that is what I'm seeing). I need
a value that is consistent across replicas... I see in the comment it
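For anyone following along, the score drift is easy to reproduce with Lucene's classic TF-IDF idf formula, 1 + ln(numDocs / (docFreq + 1)); the numbers below are made up for illustration:

```python
import math

# Lucene's classic (TF-IDF) idf. maxDoc counts deleted-but-unmerged docs,
# so two replicas holding the same live documents can still disagree on
# numDocs, and therefore on idf, until merges happen to line up.
def idf(doc_freq: int, num_docs: int) -> float:
    return 1.0 + math.log(num_docs / (doc_freq + 1))

replica_a = idf(doc_freq=1_000, num_docs=1_000_000)  # deletes merged away
replica_b = idf(doc_freq=1_000, num_docs=1_050_000)  # deletes still counted
print(replica_a, replica_b)  # same query/doc, different scores
```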
this option in your query
after the end of the last parenthesis, as in this example from the wiki:
fq=geo:IsWithin(POLYGON((-10 30, -40 40, -10 -20, 40 20, 0 0, -10 30)))
distErrPct=0
~ David
Steven Bower wrote
Only points in the index.. Am I correct this won't require a reindex
I am seeing an error when doing a spatial search where a particular point
is showing up within a polygon, but by all methods I've tried that point is
not within the polygon..
First the point is: 41.2299,29.1345 (lat/lon)
The polygon is:
31.2719,32.283
31.2179,32.3681
31.1333,32.3407
Minor edit to the KML to adjust color of polygon
On Mon, Mar 10, 2014 at 4:21 PM, Steven Bower smb-apa...@alcyon.net wrote:
I am seeing an error when doing a spatial search where a particular point
is showing up within a polygon, but by all methods I've tried that point is
not within
42.1284,42.2141
40.0919,47.8482
30.4169,47.5783
26.9892,43.6459
27.2095,41.5676
29.0454,41.2198
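The posted point and vertices can be checked with a plain even-odd (ray casting) test; the vertex order below is reconstructed from the thread, so treat it as an assumption:

```python
# Even-odd point-in-polygon test; every pair is read consistently as
# (lat, lon), matching how the coordinates were posted.
def point_in_polygon(point, polygon):
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Toggle when a horizontal ray from the point crosses this edge.
        if (y1 > y) != (y2 > y) and x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
            inside = not inside
    return inside

point = (41.2299, 29.1345)  # lat, lon from the thread
polygon = [
    (31.2719, 32.2830), (31.2179, 32.3681), (31.1333, 32.3407),
    (26.9892, 43.6459), (27.2095, 41.5676), (29.0454, 41.2198),
    (42.1284, 42.2141), (40.0919, 47.8482), (30.4169, 47.5783),
]
print(point_in_polygon(point, polygon))  # False
```

Every vertex longitude here exceeds the point's longitude (29.13 vs a minimum of ~32.28), so the point is outside under any vertex ordering. When an external check and Solr disagree like this, WKT axis order is worth ruling out: WKT coordinates are x y, i.e. lon lat, so pasting lat lon into `POLYGON((...))` silently tests a different shape.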
On Mon, Mar 10, 2014 at 4:23 PM, Steven Bower smb-apa...@alcyon.net wrote:
Minor edit to the KML to adjust color of polygon
On Mon
different story if you are indexing non-point shapes.
~ David
From: Steven Bower smb-apa...@alcyon.net
Reply-To: solr-user@lucene.apache.org
www.flax.co.uk
On 15 Jan 2014, at 15:39, Steven Bower wrote:
I will open up a JIRA... I'm more concerned over the core locator stuff vs
the solr.xml.. Should the specification of the core locator go into the
solr.xml or via some other method?
steve
On Tue, Jan 14, 2014 at 5
On Tue, Jan 14, 2014 at 1:41 PM, Steven Bower smb-apa...@alcyon.net
wrote:
Are there any plans/tickets to allow for pluggable SolrConf and
CoreLocator? In my use case my solr.xml is totally static, I have a
separate dataDir and my core.properties are derived from a separate
configuration
Are there any plans/tickets to allow for pluggable SolrConf and
CoreLocator? In my use case my solr.xml is totally static, I have a
separate dataDir and my core.properties are derived from a separate
configuration (living in ZK) but totally outside of the SolrCloud..
I'd like to be able to not
I was looking at the code for getIndexSize() on the ReplicationHandler to
get at the size of the index on disk. From what I can tell, because this
does directory.listAll() to get all the files in the directory, the size on
disk includes not only what is searchable at the moment but potentially
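What the snippet describes amounts to summing every file in the index directory, which is why the figure can exceed what is currently searchable (old segments awaiting deletion still count). A sketch with stand-in files:

```python
import os
import tempfile

# Sum of every file in a directory -- effectively what
# ReplicationHandler.getIndexSize() does via directory.listAll().
def dir_size_bytes(path: str) -> int:
    return sum(e.stat().st_size for e in os.scandir(path) if e.is_file())

# Tiny demo with stand-in "segment" files (names are made up).
with tempfile.TemporaryDirectory() as d:
    for name, size in [("_0.cfs", 1024), ("_1.cfs", 2048)]:
        with open(os.path.join(d, name), "wb") as f:
            f.write(b"\0" * size)
    print(dir_size_bytes(d))  # 3072
```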
Under what circumstances will a handler that implements SolrCoreAware have
its inform() method called?
thanks,
steve
the handler is created, either at SolrCore construction
time (solr startup or core reload) or the first time the handler is
requested if it's a lazy-loading handler.
Alan Woodward
www.flax.co.uk
On 15 Nov 2013, at 15:40, Steven Bower wrote:
Under what circumstances will a handler
it should be called only once during the lifetime of a given plugin,
usually not long after construction -- but it could be called many, many
times in the lifetime of the solr process.
So for a given instance of a handler it will only be called once during the
lifetime of that handler?
Also,
And the close hook will basically only be fired once during shutdown?
On Fri, Nov 15, 2013 at 1:07 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:
: So for a given instance of a handler it will only be called once during
the
: lifetime of that handler?
correct (unless there is a bug
Check out: https://issues.apache.org/jira/browse/SOLR-5302 can do this
using query facets
On Fri, Jul 12, 2013 at 11:35 AM, Jack Krupansky j...@basetechnology.com wrote:
sum(x, y, z) = x + y + z (sums those specific fields values for the
current document)
sum(x, y) = x + y (sum of those two
Check out: https://issues.apache.org/jira/browse/SOLR-5302 it supports
median value
On Wed, Jul 3, 2013 at 12:11 PM, William Bell billnb...@gmail.com wrote:
If you are a programmer, you can modify it and attach a patch in Jira...
On Tue, Jun 4, 2013 at 4:25 AM, Marcin Rzewucki
See https://cwiki.apache.org/confluence/display/solr/The+Stats+Component
On Fri, Sep 6, 2013 at 12:28 AM, Steven Bower smb-apa...@alcyon.net
wrote:
Is there a way to get the count of buckets (i.e. unique values) for a field
facet? The rudimentary approach of course is to get back all buckets
Is there a way to get the count of buckets (i.e. unique values) for a field
facet? The rudimentary approach of course is to get back all buckets, but
in some cases this is a huge amount of data.
thanks,
steve
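The "rudimentary approach" looks like this client-side (toy data; this is exactly the pull-everything-back cost that server-side support such as SOLR-5302 avoids):

```python
from collections import Counter

# Pull back every bucket, then count them. Fine for small cardinality,
# painful when the field has millions of distinct values.
facet_buckets = Counter(["red", "blue", "red", "green", "blue", "red"])
bucket_count = len(facet_buckets)  # number of unique values
print(bucket_count)  # 3
```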
I have a query like:
q=foo AND bar
defType=edismax
qf=field1
qf=field2
qf=field3
with debug on I see it parsing to this:
(+(DisjunctionMaxQuery((field1:foo | field2:foo | field3:foo))
DisjunctionMaxQuery((field1:and | field2:and | field3:and))
DisjunctionMaxQuery((field1:bar | field2:bar |
On Thu, Aug 15, 2013 at 5:19 PM, Steven Bower smb-apa...@alcyon.net
wrote:
I have a query like:
q=foo AND bar
defType=edismax
qf=field1
qf=field2
qf=field3
with debug on I see it parsing to this:
(+(DisjunctionMaxQuery((field1:foo | field2:foo | field3:foo
https://issues.apache.org/jira/browse/SOLR-5163
On Thu, Aug 15, 2013 at 6:04 PM, Steven Bower smb-apa...@alcyon.net wrote:
@Yonik that was exactly the issue... I'll file a ticket... there def
should be an exception thrown for something like this..
It would seem to me that eating any sort
Is there an easy way in code / command line to lint a solr config (or even
just a solr schema)?
Steve
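Absent a built-in linter, a first pass is just an XML well-formedness check; a minimal sketch (this is not a real schema validator, only the first gate a config has to pass):

```python
import xml.etree.ElementTree as ET

# Returns a list of parse errors (empty list means well-formed XML).
def lint_xml(text: str):
    try:
        ET.fromstring(text)
        return []
    except ET.ParseError as e:
        return [str(e)]

good = "<schema name='test'><field name='id' type='string'/></schema>"
bad = "<schema><field name='id'></schema>"  # unclosed <field>
print(lint_xml(good))  # []
print(lint_xml(bad))   # one parse error
```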
) that are
used for sorting.. I suspect that docValues will greatly help this
load performance?
thanks,
steve
On Wed, Jul 31, 2013 at 4:32 PM, Steven Bower smb-apa...@alcyon.net wrote:
the list of IDs does change relatively frequently, but this doesn't seem
to have very much impact
, 2013 at 1:10 AM, Steven Bower sbo...@alcyon.net wrote:
not sure what you mean by good hit ratio?
I mean such queries are really expensive (even on cache hit), so if the
list of ids changes every time, it never hit cache and hence executes these
heavy queries every time. It's well known
be down in that sub 100ms range..
steve
On Tue, Jul 30, 2013 at 12:02 PM, Steven Bower sbo...@alcyon.net wrote:
Will give the boolean thing a shot... makes sense...
On Tue, Jul 30, 2013 at 11:53 AM, Smiley, David W. dsmi...@mitre.org wrote:
I see the problem: it's +pp:*. It may look innocent
I am curious why the field:* walks the entire terms list.. could this be
discovered from a field cache / docvalues?
steve
On Tue, Jul 30, 2013 at 2:00 PM, Steven Bower sbo...@alcyon.net wrote:
Until I get the data re-fed... there was another field (a date field) that
was there and not when
at 4:18 PM, Mikhail Khludnev
mkhlud...@griddynamics.com wrote:
On Tue, Jul 30, 2013 at 12:45 AM, Steven Bower smb-apa...@alcyon.net
wrote:
- Most of my time (98%) is being spent in
java.nio.Bits.copyToByteArray(long,Object,long,long) which is being
Steven, please
http://blog.thetaphi.de
On Tue, Jul 30, 2013 at 5:10 PM, Steven Bower sbo...@alcyon.net wrote:
Very good read... Already using MMap... verified using pmap and vsz from
top..
not sure what you mean by good hit ratio?
Here are the stacks...
Name Time (ms) Own Time (ms
may be the new leader - try and sync);
How reproducible is this bug for you? It would be great to know if
the patch in the issue fixes things.
-Yonik
http://lucidworks.com
On Wed, May 15, 2013 at 6:04 PM, Steven Bower sbo...@alcyon.net wrote:
They are visible to ls...
On Wed, May
Created https://issues.apache.org/jira/browse/SOLR-4831 to capture this
issue
On Thu, May 16, 2013 at 10:10 AM, Steven Bower sbo...@alcyon.net wrote:
Looking at the timestamps on the tlog files they seem to have all been
created around the same time (04:55).. starting around this time I start
We have a system in which a client is sending 1 record at a time (via REST)
followed by a commit. This has produced ~65k tlog files and the JVM has run
out of file descriptors... I grabbed a heap dump from the JVM and I can see
~52k unreachable FileDescriptors... This leads me to believe that the
if I can create a test case to reproduce this.
Separately, you'll get a lot better performance if you don't commit
per update of course (or at least use something like commitWithin).
-Yonik
http://lucidworks.com
On Wed, May 15, 2013 at 5:06 PM, Steven Bower sbo...@alcyon.net wrote:
We have
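When the client's commit behavior can't be changed, commitWithin on the update request is the usual middle ground Yonik mentions; a sketch of such a request URL (collection name and window are assumptions):

```python
from urllib.parse import urlencode

# Instead of an explicit commit per document (which churns out tlog files
# and searchers), ask Solr to guarantee visibility within a time window.
base = "http://localhost:8983/solr/mycollection/update"
params = urlencode({"commitWithin": 10000, "wt": "json"})  # 10s window
url = f"{base}?{params}"
print(url)
```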
map commit requests to hard commit
(default), soft commit, or none.
wunder
On May 15, 2013, at 2:20 PM, Steven Bower wrote:
Most definitely understand the don't commit after each record...
unfortunately the data is being fed by another team which I cannot
control...
Limiting the number
They are visible to ls...
On Wed, May 15, 2013 at 5:49 PM, Yonik Seeley yo...@lucidworks.com wrote:
On Wed, May 15, 2013 at 5:20 PM, Steven Bower sbo...@alcyon.net wrote:
when the TransactionLog objects are dereferenced
their RandomAccessFile object is not closed..
Have the files been
Solr ElasticSearch Support
http://sematext.com/
On May 9, 2013 1:43 AM, Steven Bower smb-apa...@alcyon.net wrote:
Is it currently possible to have per-shard replication factor?
A bit of background on the use case...
If you are hashing content to shards by a known factor (let's say
Is it currently possible to have per-shard replication factor?
A bit of background on the use case...
If you are hashing content to shards by a known factor (let's say date
ranges, 12 shards, 1 per month) it might be the case that most of your
search traffic would be directed to one particular
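A sketch of the routing scheme described above (shard names are hypothetical; Solr's implicit router would do the equivalent server-side):

```python
import datetime

# Route each document to one of 12 monthly shards by its date field.
def shard_for(date: datetime.date) -> str:
    return f"shard_{date.strftime('%Y_%m')}"

print(shard_for(datetime.date(2013, 5, 9)))   # shard_2013_05
print(shard_for(datetime.date(2013, 12, 1)))  # shard_2013_12
```

With this layout most queries land on the newest month's shard, which is why a higher replication factor on just the hot shard would be attractive.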