Re: Very long young generation stop the world GC pause

2016-12-21 Thread Steven Bower
Also curious why such a large heap is required... If it's due to field caches being loaded I'd highly recommend MMapDirectory (if not using already) and turning on DocValues for all fields you plan to perform sort/facet/analytics on. steve On Wed, Dec 21, 2016 at 9:25 AM Pushkar Raste

Re: SOLR Disk Access Latency Problem

2016-09-21 Thread Steven Bower
That sounds like some SAN vendor BS if you ask me. Breaking up 300gb into smaller chunks would only be relevant if they were caching entire files not blocks and I find that hard to believe. Would be interested to know more about the specifics of the problem as the vendor sees it. As Shawn said

Re: Full re-index without downtime

2016-07-06 Thread Steven Bower
There are two options as I see it.. 1. Do something like you describe and create a secondary index, index into it, then switch... I personally would create a completely separate solr cloud alongside my existing one vs new core in the same cloud as you might see some negative impacts on GC caused

Re: deploy solr on cloud providers

2016-07-05 Thread Steven Bower
Looking deeper into zookeeper as truth mode I was wrong about existing replicas being recreated once storage is gone.. Seems there is intent for the type of behavior based upon existing tickets.. We'll look at creating a patch for this too.. Steve On Tue, Jul 5, 2016 at 6:00 PM Tomás Fernández

Re: deploy solr on cloud providers

2016-07-05 Thread Steven Bower
You shouldn't "need" to move the storage as SolrCloud will replicate all data to the new node and anything in the transaction log will already be distributed through the rest of the machines.. One option to keep all your data attached to nodes might be to use Amazon EFS (pretty new) to store your

Re: stateless solr ?

2016-07-05 Thread Steven Bower
The ticket in question is https://issues.apache.org/jira/browse/SOLR-9265 We are working on a patch now... will update when we have a working patch / tests.. Shawn is correct that when adding a new node to a SolrCloud cluster it will not automatically add replicas/etc.. The idea behind this

Re: stateless solr ?

2016-07-04 Thread Steven Bower
the API for you. > > Upayavira > > On Mon, 4 Jul 2016, at 08:53 PM, Steven Bower wrote: > > My main issue is having to make any solr collection api calls during a > > transition.. It makes integrating with orchestration engines way more > > complex.. > > On Mon, Jul 4, 20

Re: stateless solr ?

2016-07-04 Thread Steven Bower
My main issue is having to make any solr collection api calls during a transition.. It makes integrating with orchestration engines way more complex.. On Mon, Jul 4, 2016 at 3:40 PM Upayavira wrote: > Are you using Solrcloud? With Solrcloud this stuff is easy. You just add > a

Re: stateless solr ?

2016-07-04 Thread Steven Bower
We have been working on some changes that should help with this.. 1st challenge is having the node name remain static regardless of where the node runs (right now it uses host and port, so this won't work unless you are using some sort of tunneled or dynamic networking).. We have a patch we are

Re: Solr cross core join special condition

2015-11-11 Thread Steven Bower
commenting so this ends up in Dennis' inbox.. On Tue, Oct 13, 2015 at 7:17 PM Yonik Seeley wrote: > On Wed, Oct 7, 2015 at 9:42 AM, Ryan Josal wrote: > > I developed a join transformer plugin that did that (although it didn't > > flatten the results like

SolrCloud with local configs

2015-05-21 Thread Steven Bower
Is it possible to run in cloud mode with zookeeper managing collections/state/etc.. but to read all config files (solrconfig, schema, etc..) from local disk? Obviously this implies that you'd have to keep them in sync.. My thought here is of running Solr in a docker container, but instead of

Re: Spatial maxDistErr changes

2014-04-03 Thread Steven Bower
...@apache.org wrote: Good question Steve, You'll have to re-index right off. ~ David p.s. Sorry I didn't reply sooner; I just switched jobs and reconfigured my mailing list subscriptions Steven Bower wrote If I am only indexing point shapes and I want to change the maxDistErr from 0.09 (1m

Spatial maxDistErr changes

2014-03-17 Thread Steven Bower
If I am only indexing point shapes and I want to change the maxDistErr from 0.09 (1m res) to 0.00045, will this break, as in searches stop working, or will search work but any performance gain won't be seen until all docs are reindexed? Or will I have to reindex right off? thanks, steve

IDF maxDocs / numDocs

2014-03-12 Thread Steven Bower
I am noticing the maxDocs between replicas is consistently different and that in the idf calculation it is used which causes idf scores for the same query/doc between replicas to be different. obviously an optimize can normalize the maxDocs scores, but that is only temporary.. is there a way to

Re: IDF maxDocs / numDocs

2014-03-12 Thread Steven Bower
My problem is that maxDoc() and docCount() both include deleted documents in their values. Because of merging/etc.. those numbers can be different per replica (or at least that is what I'm seeing). I need a value that is consistent across replicas... I see in the comment it
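To illustrate why differing maxDoc values shift scores between replicas, here is a minimal sketch. It assumes the classic Lucene TF-IDF formula, idf = 1 + ln(maxDoc / (docFreq + 1)); the function name and the document counts are hypothetical, and docFreq is held equal across replicas purely for simplicity.

```python
import math


def classic_idf(doc_freq: int, max_doc: int) -> float:
    """Classic Lucene-style idf (assumed formula): 1 + ln(maxDoc / (docFreq + 1))."""
    return 1.0 + math.log(max_doc / (doc_freq + 1))


# Hypothetical replicas holding the same live documents but a different
# number of deleted-but-not-yet-merged documents, so maxDoc differs.
replica_a = classic_idf(doc_freq=100, max_doc=1_000_000)
replica_b = classic_idf(doc_freq=100, max_doc=1_050_000)

# The same term gets a slightly different idf on each replica.
print(replica_a, replica_b, replica_b - replica_a)
```

Under this assumption the replica with more unmerged deletes reports a higher idf for the same term, which is consistent with the score differences described above.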

Re: Issue with spatial search

2014-03-11 Thread Steven Bower
this option in your query after the end of the last parenthesis, as in this example from the wiki: fq=geo:IsWithin(POLYGON((-10 30, -40 40, -10 -20, 40 20, 0 0, -10 30))) distErrPct=0 ~ David Steven Bower wrote Only points in the index.. Am I correct this won't require a reindex
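As a sketch of how such a filter could be assembled programmatically, the helper below builds the fq value from a list of points. The function name is hypothetical, and wrapping the value in double quotes is an assumption (the archived snippet shows none, but a field value containing spaces normally needs quoting).

```python
def within_polygon_fq(field, points, dist_err_pct=0):
    """Build an fq value like field:"IsWithin(POLYGON((...))) distErrPct=N".

    points is a sequence of (x, y) pairs in WKT order (x first, then y).
    """
    ring = list(points)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # a WKT polygon ring must end where it starts
    coords = ", ".join(f"{x:g} {y:g}" for x, y in ring)
    return f'{field}:"IsWithin(POLYGON(({coords}))) distErrPct={dist_err_pct}"'


# The polygon from the wiki example quoted above, given without the closing point.
points = [(-10, 30), (-40, 40), (-10, -20), (40, 20), (0, 0)]
print(within_polygon_fq("geo", points))
```

The helper closes the ring automatically, so callers can pass the vertices once each.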

Issue with spatial search

2014-03-10 Thread Steven Bower
I am seeing an error when doing a spatial search where a particular point is showing up within a polygon, but by all methods I've tried that point is not within the polygon.. First the point is: 41.2299,29.1345 (lat/lon) The polygon is: 31.2719,32.283 31.2179,32.3681 31.1333,32.3407

Re: Issue with spatial search

2014-03-10 Thread Steven Bower
Minor edit to the KML to adjust color of polygon On Mon, Mar 10, 2014 at 4:21 PM, Steven Bower smb-apa...@alcyon.net wrote: I am seeing an error when doing a spatial search where a particular point is showing up within a polygon, but by all methods I've tried that point is not within

Re: Issue with spatial search

2014-03-10 Thread Steven Bower
42.1284,42.2141 40.0919,47.8482 30.4169,47.5783 26.9892,43.6459 27.2095,41.5676 29.0454,41.2198 On Mon, Mar 10, 2014 at 4:23 PM, Steven Bower smb-apa...@alcyon.net wrote: Minor edit to the KML to adjust color of polygon On Mon

Re: Issue with spatial search

2014-03-10 Thread Steven Bower
different story if you are indexing non-point shapes. ~ David From: Steven Bower smb-apa...@alcyon.net Reply-To: solr-user@lucene.apache.org

Re: core.properties and solr.xml

2014-01-23 Thread Steven Bower
www.flax.co.uk On 15 Jan 2014, at 15:39, Steven Bower wrote: I will open up a JIRA... I'm more concerned over the core locator stuff vs the solr.xml.. Should the specification of the core locator go into the solr.xml or via some other method? steve On Tue, Jan 14, 2014 at 5

Re: core.properties and solr.xml

2014-01-15 Thread Steven Bower
On Tue, Jan 14, 2014 at 1:41 PM, Steven Bower smb-apa...@alcyon.net wrote: Are there any plans/tickets to allow for pluggable SolrConf and CoreLocator? In my use case my solr.xml is totally static, I have a separate dataDir and my core.properties are derived from a separate configuration

core.properties and solr.xml

2014-01-14 Thread Steven Bower
Are there any plans/tickets to allow for pluggable SolrConf and CoreLocator? In my use case my solr.xml is totally static, I have a separate dataDir and my core.properties are derived from a separate configuration (living in ZK) but totally outside of the SolrCloud.. I'd like to be able to not

Index Sizes

2014-01-07 Thread Steven Bower
I was looking at the code for getIndexSize() on the ReplicationHandler to get at the size of the index on disk. From what I can tell, because this does directory.listAll() to get all the files in the directory, the size on disk includes not only what is searchable at the moment but potentially

SolrCoreAware

2013-11-15 Thread Steven Bower
Under what circumstances will a handler that implements SolrCoreAware have its inform() method called? thanks, steve

Re: SolrCoreAware

2013-11-15 Thread Steven Bower
the handler is created, either at SolrCore construction time (solr startup or core reload) or the first time the handler is requested if it's a lazy-loading handler. Alan Woodward www.flax.co.uk On 15 Nov 2013, at 15:40, Steven Bower wrote: Under what circumstances will a handler

Re: SolrCoreAware

2013-11-15 Thread Steven Bower
it should be called only once during the lifetime of a given plugin, usually not long after construction -- but it could be called many, many times in the lifetime of the solr process. So for a given instance of a handler it will only be called once during the lifetime of that handler? Also,

Re: SolrCoreAware

2013-11-15 Thread Steven Bower
And the close hook will basically only be fired once during shutdown? On Fri, Nov 15, 2013 at 1:07 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : So for a given instance of a handler it will only be called once during the : lifetime of that handler? correct (unless there is a bug

Re: How to set a condition over stats result

2013-10-04 Thread Steven Bower
Check out: https://issues.apache.org/jira/browse/SOLR-5302 can do this using query facets On Fri, Jul 12, 2013 at 11:35 AM, Jack Krupansky j...@basetechnology.com wrote: sum(x, y, z) = x + y + z (sums those specific fields values for the current document) sum(x, y) = x + y (sum of those two

Re: StatsComponent with median

2013-10-04 Thread Steven Bower
Check out: https://issues.apache.org/jira/browse/SOLR-5302 it supports median value On Wed, Jul 3, 2013 at 12:11 PM, William Bell billnb...@gmail.com wrote: If you are a programmer, you can modify it and attach a patch in Jira... On Tue, Jun 4, 2013 at 4:25 AM, Marcin Rzewucki

Re: bucket count for facets

2013-09-06 Thread Steven Bower
. See https://cwiki.apache.org/confluence/display/solr/The+Stats+Component On Fri, Sep 6, 2013 at 12:28 AM, Steven Bower smb-apa...@alcyon.net wrote: Is there a way to get the count of buckets (ie unique values) for a field facet? the rudimentary approach of course is to get back all buckets

bucket count for facets

2013-09-05 Thread Steven Bower
Is there a way to get the count of buckets (ie unique values) for a field facet? the rudimentary approach of course is to get back all buckets, but in some cases this is a huge amount of data. thanks, steve
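The rudimentary approach mentioned above can be sketched as follows. It assumes Solr's default JSON facet layout, a flat list alternating value and count; the helper names, the field name, and the sample response are hypothetical.

```python
from urllib.parse import urlencode


def bucket_count_params(field):
    """Request parameters that return every facet bucket and no documents."""
    return urlencode([
        ("q", "*:*"),
        ("rows", "0"),             # only the facet data is needed
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", "-1"),     # no cap on the number of buckets returned
        ("facet.mincount", "1"),   # skip values with no matching documents
        ("wt", "json"),
    ])


def count_buckets(response, field):
    """Count buckets in a flat [value, count, value, count, ...] facet list."""
    flat = response["facet_counts"]["facet_fields"][field]
    return len(flat) // 2


# Hypothetical response fragment in the assumed flat layout.
sample = {"facet_counts": {"facet_fields": {"color": ["red", 12, "blue", 7, "green", 3]}}}
print(bucket_count_params("color"))
print(count_buckets(sample, "color"))
```

This still transfers every bucket over the wire, which is exactly the cost the question is trying to avoid on high-cardinality fields.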

AND not working

2013-08-15 Thread Steven Bower
I have a query like: q=foo AND bar defType=edismax qf=field1 qf=field2 qf=field3 with debug on I see it parsing to this: (+(DisjunctionMaxQuery((field1:foo | field2:foo | field3:foo)) DisjunctionMaxQuery((field1:and | field2:and | field3:and)) DisjunctionMaxQuery((field1:bar | field2:bar |
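For anyone trying to reproduce the parse, here is a minimal sketch that assembles the request parameters described above, with repeated qf entries and debugging enabled via debugQuery=true. The helper name is hypothetical and no request is actually sent.

```python
from urllib.parse import urlencode


def build_debug_query(q, fields):
    """Encode an edismax query with one qf parameter per field and debug output on."""
    params = [("q", q), ("defType", "edismax")]
    params += [("qf", f) for f in fields]   # qf is repeated, once per field
    params.append(("debugQuery", "true"))   # ask Solr to return the parsed query
    return urlencode(params)


query_string = build_debug_query("foo AND bar", ["field1", "field2", "field3"])
print(query_string)
```

Appending this string to a select URL and inspecting the parsed query in the debug section of the response shows whether AND was treated as an operator or as a plain term.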

Re: AND not working

2013-08-15 Thread Steven Bower
On Thu, Aug 15, 2013 at 5:19 PM, Steven Bower smb-apa...@alcyon.net wrote: I have query like: q=foo AND bar defType=edismax qf=field1 qf=field2 qf=field3 with debug on I see it parsing to this: (+(DisjunctionMaxQuery((field1:foo | field2:foo | field3:foo

Re: AND not working

2013-08-15 Thread Steven Bower
https://issues.apache.org/jira/browse/SOLR-5163 On Thu, Aug 15, 2013 at 6:04 PM, Steven Bower smb-apa...@alcyon.net wrote: @Yonik that was exactly the issue... I'll file a ticket... there def should be an exception thrown for something like this.. It would seem to me that eating any sort

Schema Lint

2013-08-06 Thread Steven Bower
Is there an easy way in code / command line to lint a solr config (or even just a solr schema)? Steve

Re: Performance question on Spatial Search

2013-08-05 Thread Steven Bower
) that are used for sorting.. I'm suspecting that docvalues will greatly help this load performance? thanks, steve On Wed, Jul 31, 2013 at 4:32 PM, Steven Bower smb-apa...@alcyon.net wrote: the list of IDs does change relatively frequently, but this doesn't seem to have very much impact

Re: Performance question on Spatial Search

2013-07-31 Thread Steven Bower
, 2013 at 1:10 AM, Steven Bower sbo...@alcyon.net wrote: not sure what you mean by good hit ratio? I mean such queries are really expensive (even on cache hit), so if the list of ids changes every time, it never hits the cache and hence executes these heavy queries every time. It's well known

Re: Performance question on Spatial Search

2013-07-30 Thread Steven Bower
be down in that sub 100ms range.. steve On Tue, Jul 30, 2013 at 12:02 PM, Steven Bower sbo...@alcyon.net wrote: Will give the boolean thing a shot... makes sense... On Tue, Jul 30, 2013 at 11:53 AM, Smiley, David W. dsmi...@mitre.org wrote: I see the problem: it's +pp:*. It may look innocent

Re: Performance question on Spatial Search

2013-07-30 Thread Steven Bower
I am curious why the field:* walks the entire terms list.. could this be discovered from a field cache / docvalues? steve On Tue, Jul 30, 2013 at 2:00 PM, Steven Bower sbo...@alcyon.net wrote: Until I get the data refed I there was another field (a date field) that was there and not when

Re: Performance question on Spatial Search

2013-07-30 Thread Steven Bower
at 4:18 PM, Mikhail Khludnev mkhlud...@griddynamics.com wrote: On Tue, Jul 30, 2013 at 12:45 AM, Steven Bower smb-apa...@alcyon.net wrote: - Most of my time (98%) is being spent in java.nio.Bits.copyToByteArray(long,Object,long,long) which is being Steven, please http://blog.thetaphi.de

Re: Performance question on Spatial Search

2013-07-30 Thread Steven Bower
On Tue, Jul 30, 2013 at 5:10 PM, Steven Bower sbo...@alcyon.net wrote: Very good read... Already using MMap... verified using pmap and vsz from top.. not sure what you mean by good hit ratio? Here are the stacks... Name Time (ms) Own Time (ms

Re: Transaction Logs Leaking FileDescriptors

2013-05-16 Thread Steven Bower
may be the new leader - try and sync); How reproducible is this bug for you? It would be great to know if the patch in the issue fixes things. -Yonik http://lucidworks.com On Wed, May 15, 2013 at 6:04 PM, Steven Bower sbo...@alcyon.net wrote: They are visible to ls... On Wed, May

Re: Transaction Logs Leaking FileDescriptors

2013-05-16 Thread Steven Bower
Created https://issues.apache.org/jira/browse/SOLR-4831 to capture this issue On Thu, May 16, 2013 at 10:10 AM, Steven Bower sbo...@alcyon.net wrote: Looking at the timestamps on the tlog files they seem to have all been created around the same time (04:55).. starting around this time I start

Transaction Logs Leaking FileDescriptors

2013-05-15 Thread Steven Bower
We have a system in which a client is sending 1 record at a time (via REST) followed by a commit. This has produced ~65k tlog files and the JVM has run out of file descriptors... I grabbed a heap dump from the JVM and I can see ~52k unreachable FileDescriptors... This leads me to believe that the

Re: Transaction Logs Leaking FileDescriptors

2013-05-15 Thread Steven Bower
if I can create a test case to reproduce this. Separately, you'll get a lot better performance if you don't commit per update of course (or at least use something like commitWithin). -Yonik http://lucidworks.com On Wed, May 15, 2013 at 5:06 PM, Steven Bower sbo...@alcyon.net wrote: We have
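A minimal sketch of the batching-plus-commitWithin idea follows. It assumes the update handler accepts a JSON array of documents and a commitWithin request parameter in milliseconds; the base URL, collection name, and helper names are hypothetical, and nothing is sent over the network.

```python
import json
from urllib.parse import urlencode


def batches(records, size):
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def build_update_request(base_url, docs, commit_within_ms):
    """Return (url, body) for one update request with no explicit commit."""
    url = f"{base_url}/update?{urlencode({'commitWithin': commit_within_ms})}"
    body = json.dumps(docs)  # would be sent with Content-Type: application/json
    return url, body


# Hypothetical collection URL and documents.
base_url = "http://localhost:8983/solr/collection1"
records = [{"id": str(i)} for i in range(5)]

for batch in batches(records, 2):
    url, body = build_update_request(base_url, batch, 10000)
    print(url, body)
```

Leaving the commit to commitWithin lets Solr fold many updates into one commit instead of opening a new transaction log per record.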

Re: Transaction Logs Leaking FileDescriptors

2013-05-15 Thread Steven Bower
map commit requests to hard commit (default), soft commit, or none. wunder On May 15, 2013, at 2:20 PM, Steven Bower wrote: Most definitely understand the don't commit after each record... unfortunately the data is being fed by another team which I cannot control... Limiting the number

Re: Transaction Logs Leaking FileDescriptors

2013-05-15 Thread Steven Bower
They are visible to ls... On Wed, May 15, 2013 at 5:49 PM, Yonik Seeley yo...@lucidworks.com wrote: On Wed, May 15, 2013 at 5:20 PM, Steven Bower sbo...@alcyon.net wrote: when the TransactionLog objects are dereferenced their RandomAccessFile object is not closed.. Have the files been

Re: Per Shard Replication Factor

2013-05-10 Thread Steven Bower
Solr ElasticSearch Support http://sematext.com/ On May 9, 2013 1:43 AM, Steven Bower smb-apa...@alcyon.net wrote: Is it currently possible to have per-shard replication factor? A bit of background on the use case... If you are hashing content to shards by a known factor (let's say

Per Shard Replication Factor

2013-05-08 Thread Steven Bower
Is it currently possible to have per-shard replication factor? A bit of background on the use case... If you are hashing content to shards by a known factor (let's say date ranges, 12 shards, 1 per month) it might be the case that most of your search traffic would be directed to one particular