Re: Congratulations to the new Apache Solr PMC Chair, Jan Høydahl!

2021-02-20 Thread Yonik Seeley
Congrats Jan! Go Solr! -Yonik On Thu, Feb 18, 2021 at 1:56 PM Anshum Gupta wrote: > Hi everyone, > > I’d like to inform everyone that the newly formed Apache Solr PMC nominated > and elected Jan Høydahl for the position of the Solr PMC Chair and Vice > President. This decision was approved by

Re: Help using Noggit for streaming JSON data

2020-09-17 Thread Yonik Seeley
See this method: /** Reads a JSON string into the output, decoding any escaped characters. */ public void getString(CharArr output) throws IOException And then the idea is to create a subclass of CharArr to incrementally handle the string that is written to it. You could overload write

Re: Solr admin interface freezes on Chrome

2019-10-02 Thread Yonik Seeley
Can someone open a JIRA to track this problem? -Yonik On Wed, Oct 2, 2019 at 7:04 PM Solr User wrote: > > Works fine on Firefox, and I > > haven't made any changes to our Solr instance (v8.1.1) in a while. > > Had a co-worker with a similar issue. He had a pop-blocker enabled in > chrome that

Re: Optimizing fq query performance

2019-04-13 Thread Yonik Seeley
More constrained but matching the same set of documents just guarantees that there is more information to evaluate per document matched. For your specific case, you can optimize fq = 'field1:* AND field2:value' to &fq=field1:*&fq=field2:value This will at least cause field1:* to be cached and reused if
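A minimal sketch of the idea in this message: one conjunctive filter versus the same constraint sent as two separate fq parameters, so that the field1:* part can be cached and reused on its own. Only the request parameters are built here; nothing is sent to Solr, and the q=*:* main query is an assumption added for illustration.

```python
from urllib.parse import parse_qs, urlencode

# One combined filter: the whole conjunction is a single filter.
combined = [("q", "*:*"), ("fq", "field1:* AND field2:value")]

# Two separate filters: field1:* becomes its own filter, which is the
# part the message says can then be cached and reused.
split = [("q", "*:*"), ("fq", "field1:*"), ("fq", "field2:value")]

combined_qs = urlencode(combined)
split_qs = urlencode(split)

# Decode again to confirm that two distinct fq values are sent.
split_filters = parse_qs(split_qs)["fq"]

print(combined_qs)
print(split_qs)
print(split_filters)
```

Passing a list of tuples to urlencode keeps the repeated fq key, which a plain dict would not allow.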

Re: Problem with white space or special characters in function queries

2019-03-29 Thread Yonik Seeley
On Thu, Mar 28, 2019 at 6:05 PM Jan Høydahl wrote: > Functions can never contain spaces. Spaces work fine in functions in general. The issue is the "bf" parameter as it uses whitespace to delimit multiple functions IIRC. -Yonik > Try to substitute the term with a variable, i.e. a request

Re: Solr 7.X negative filter not working

2018-09-20 Thread Yonik Seeley
I just tried the master branch quickly, and I can't reproduce this. "params":{ "q":"*:*", "debug":"true", "fq":"title_t:(NOT Kings)"}}, [...] "QParser":"LuceneQParser", "filter_queries":["title_t:(NOT Kings)"], "parsed_filter_queries":["-title_t:kings"],

Re: CACHE -> fieldValueCache usage

2018-09-20 Thread Yonik Seeley
On Wed, Sep 19, 2018 at 9:44 AM Vincenzo D'Amore wrote: > Looking at Solr Admin Panel I've found the CACHE -> fieldValueCache tab > where all the values are 0. > > [...] > > what do you think, is that normal? Yep, that's completely normal. That cache is only used by certain operations on

Re: 7.3 appears to leak

2018-06-28 Thread Yonik Seeley
> * SortedIntDocSet instances and ConcurrentLRUCache$CacheEntry instances are > both leaked on commit; If these are actually filterCache entries being leaked, it stands to reason that a whole searcher is being leaked somewhere. -Yonik

Re: Retrieving json.facet from a search

2018-06-28 Thread Yonik Seeley
There isn't typed support, but you can use the generic support like so: .getResponse().get("facets") -Yonik On Thu, Jun 28, 2018 at 2:31 PM, Webster Homer wrote: > I have a fairly large existing code base for querying Solr. It is > architected where common code calls solr and returns a solrj

Re: Solr 7.3, FunctionScoreQuery no longer displays debug output

2018-05-17 Thread Yonik Seeley
If this used to work, I wonder if it's something to do with changes to boost: https://issues.apache.org/jira/browse/LUCENE-8099 -Yonik On Thu, May 17, 2018 at 5:48 PM, Markus Jelsma wrote: > Hello, > > Sorry to disturb. Is there anyone here able to reproduce and

Re: Error using multiple terms in function query

2018-05-15 Thread Yonik Seeley
Problems like this are usually caused by the whole query not even making it to Solr due to bad HTTP param encoding. For example, if you're using curl with request parameters in the URL, you need to manually encode spaces as either "+" or "%20" -Yonik On Tue, May 15, 2018 at 7:41 PM, Shamik
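A short sketch of the two encodings mentioned here, using Python's standard library rather than curl. The parameter value is hypothetical; the point is only that a raw space must not reach the URL.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

# Hypothetical parameter value containing a space.
raw_value = 'title_t:"two words"'

# Form encoding: space becomes "+".
plus_form = quote_plus(raw_value)

# Percent encoding: space becomes "%20".
percent_form = quote(raw_value, safe="")

print(plus_form)
print(percent_form)

# Both forms decode back to the original value.
decoded_plus = unquote_plus(plus_form)
decoded_percent = unquote(percent_form)
```

Either form is acceptable in a query string; what breaks the request is sending the space unencoded.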

Re: Solr Json Facet

2018-05-08 Thread Yonik Seeley
d without escaping. > > This is the HTTP response: > > response.content > > ' 2.0//EN">\n\n400 Bad > Request\n\nBad Request\nYour browser sent > a request that this server could not understand. />\n\n\nApache/2.2.15 (Oracle) Server at leydenh Port > 80\n\n' > >

Re: Solr Json Facet

2018-05-08 Thread Yonik Seeley
On Tue, May 8, 2018 at 1:36 PM, Kojo wrote: > If I tag the fq query and I query for a simple word it works fine too. But > if query a multi word with space in the middle it breaks: Most likely the full query is not getting to Solr because of an HTTP protocol error (i.e. the

Re: Error in indexing JSON with space in value

2018-03-22 Thread Yonik Seeley
> 0120: 69 6f 6e 3d 32 34 20 41 46 54 45 52 3d 27 27 22 ion=24 AFTER=''" > 0130: 2c 0a 20 20 20 20 22 63 6f 64 65 22 3a 34 30 30 ,."code":400 > 0140: 7d 7d 0a}}. > { > "responseHeader":{ > "status":400,

Re: Error in indexing JSON with space in value

2018-03-22 Thread Yonik Seeley
It looks like a curl globbing issue from the curl error message you included: "curl: (3) [globbing] bad range specification in column 39" You can try turning off curl globbing with the -g param. That may not be the only issue though, as the command shown shouldn't have triggered curl globbing.

Re: Issue Using JSON Facet API Buckets in Solr 6.6

2018-02-22 Thread Yonik Seeley
I've reproduced the issue and opened https://issues.apache.org/jira/browse/SOLR-12020 -Yonik On Thu, Feb 22, 2018 at 11:03 AM, Yonik Seeley <ysee...@gmail.com> wrote: > Thanks Antelmo, I'm trying to reproduce this now. > -Yonik > > > On Mon, Feb 19, 2018 at 10:13 AM, A

Re: Issue Using JSON Facet API Buckets in Solr 6.6

2018-02-22 Thread Yonik Seeley
ace from the logs: >> >> https://pastebin.com/rsHvKK63 >> >> https://pastebin.com/8amxacAj >> >> I am not using any custom code or plugins with the Solr instance. >> >> Please let me know if you need anything else and thanks for looking into >> this.

Re: facet.method=uif not working in solr cloud?

2018-02-15 Thread Yonik Seeley
ber of replicas? > If we are doing frequent auto commits, fieldvaluecache will be invalidated > and uif will have to pay the upfront cost again after each commit? Right. It's not good for frequently changing indexes. -Yonik > > > On Wed, Feb 14, 2018 at 11:51 AM, Yonik Seeley <ysee

Re: facet.method=uif not working in solr cloud?

2018-02-14 Thread Yonik Seeley
aster once that cost has been paid. -Yonik > On Tue, Feb 13, 2018 at 7:41 AM, Yonik Seeley <ysee...@gmail.com> wrote: > >> Great, thanks for tracking that down! >> It's interesting that a mincount of 0 disables uif processing in the >> first place. IIRC, it's on

Re: Issue Using JSON Facet API Buckets in Solr 6.6

2018-02-14 Thread Yonik Seeley
Could you provide the full stack trace containing "Invalid Date String" and the full request that causes it? Are you using any custom code/plugins in Solr? -Yonik On Mon, Feb 12, 2018 at 4:55 PM, Antelmo Aguilar wrote: > Hi, > > I was using the following part of a query to get

Re: facet.method=uif not working in solr cloud?

2018-02-13 Thread Yonik Seeley
Great, thanks for tracking that down! It's interesting that a mincount of 0 disables uif processing in the first place. IIRC, it's only the hash-based method (as opposed to array-based) that can't return zero counts. -Yonik On Tue, Feb 13, 2018 at 6:17 AM, Alessandro Benedetti
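A toy model of why a hash-based counter cannot report zero counts while an array-based one can. This is not Solr's code; the term list and values are invented, and the two functions only mimic the general shape of the two approaches.

```python
from collections import Counter

# Hypothetical: every term that exists in the field, in index order.
all_terms = ["apple", "banana", "cherry"]

# Hypothetical: field values seen in the documents matching the query.
matched_values = ["apple", "apple", "cherry"]


def array_counts(terms, values):
    """One counter slot per known term, so unseen terms stay at 0."""
    slot = {term: i for i, term in enumerate(terms)}
    counts = [0] * len(terms)
    for value in values:
        counts[slot[value]] += 1
    return dict(zip(terms, counts))


def hash_counts(values):
    """Counters are created only for values actually encountered."""
    return dict(Counter(values))


by_array = array_counts(all_terms, matched_values)
by_hash = hash_counts(matched_values)

print(by_array)
print(by_hash)
```

The array version knows about "banana" because it allocates a slot for every term up front; the hash version never sees it and so has nothing to return for it.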

Re: facet.method=uif not working in solr cloud?

2018-02-12 Thread Yonik Seeley
Feels like we should open an issue for this (that facet.method=uif is only respected if you specify another esoteric parameter...) -Yonik On Mon, Feb 12, 2018 at 8:34 PM, Wei wrote: > Adding facet.distrib.mco=true did the trick. Thanks Toke and Alessandro! > > Cheers, >

Re: Solr4 To Solr6 CPU load issues

2018-02-12 Thread Yonik Seeley
On Sun, Feb 11, 2018 at 8:47 AM, ~$alpha` wrote: > I have upgraded Solr4.0 Beta to Solr6.6. The Cache results look Awesome but > overall the CPU load on solr6.6 is double the load on solr4.0 and hence I am > not able to roll solr6.6 to 100% of my traffic. > > *Some Key

Re: Solr 7.2.1 - cursorMark and elevateIds

2018-01-25 Thread Yonik Seeley
Yes, please open a JIRA issue. The elevate component modifies the sort parameter, and it looks like that doesn't play well with cursorMark, which needs to serialize/deserialize sort values. We can either fix the issue, or at a minimum provide a better error message if cursorMark is limited to

Re: Json Facet Query Stripping Field Name with Hyphen

2018-01-04 Thread Yonik Seeley
The JSON Facet API uses the function query parser for something like sum(week_-91) so you'll probably have problems with any function that uses these fields as well. As Erick says, you're better off renaming the fields. There is a workaround for wonky field names via the "field" function:

Re: Solr Aggregation queries are way slower than Elastic Search

2017-12-12 Thread Yonik Seeley
imple use-case, so if we're slower than ES for some reason, it should be very easy to fix. -Yonik > On Tue, Dec 12, 2017 at 7:27 PM, Yonik Seeley <ysee...@gmail.com> wrote: > >> OK great, so it's definitely not the main query (which is just a >> single term query in

Re: Solr Aggregation queries are way slower than Elastic Search

2017-12-12 Thread Yonik Seeley
!sum=true }metric_97' stats.field='{!sum=true > }metric_98' stats.field='{!sum=true }metric_99' > stats.field='{!sum=true }metric_100' stats.field='{!sum=true > }metric_101' stats.field='{!sum=true }metric_102' > stats.field='{!sum=true }metric_103' stats.field='{!sum=true > }metric_104'}

Re: Solr Aggregation queries are way slower than Elastic Search

2017-12-11 Thread Yonik Seeley
I think the SolrJ below uses the old stats component. Hopefully the JSON Facet API would be faster for this, but it's not completely clear what the main query here looks like, and if it's the source of any bottleneck rather than the aggregations. What does the generated query string actually look

Re: Skewed IDF in multi lingual index, again

2017-12-05 Thread Yonik Seeley
On Tue, Dec 5, 2017 at 5:15 AM, alessandro.benedetti wrote: > "Lucene/Solr doesn't actually delete documents when you delete them, it > just marks them as deleted. I'm pretty sure that the difference between > docCount and maxDoc is deleted documents. Maybe I don't

Re: Skewed IDF in multi lingual index, again

2017-12-04 Thread Yonik Seeley
On Mon, Dec 4, 2017 at 1:35 PM, Shawn Heisey wrote: > I'm pretty sure that the difference between docCount and maxDoc is deleted > documents. docCount (not the best name) here is the number of documents with the field being searched. docFreq (df) is the number of documents
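To make the three statistics concrete, here is a toy multilingual index in plain Python. The documents and field names are made up, there are no deletions in this model, and docFreq is taken in its usual sense of "number of documents containing the term in that field".

```python
# Hypothetical documents: each has a title in exactly one language field.
docs = [
    {"title_en": ["solr", "search"]},
    {"title_en": ["lucene"]},
    {"title_de": ["solr", "suche"]},
    {"title_de": ["index"]},
    {"title_de": ["suche"]},
]


def max_doc(docs):
    """All documents in the index (this toy model has no deletions)."""
    return len(docs)


def doc_count(docs, field):
    """Documents that have any value for the given field."""
    return sum(1 for doc in docs if field in doc)


def doc_freq(docs, field, term):
    """Documents whose given field contains the term."""
    return sum(1 for doc in docs if term in doc.get(field, []))


total = max_doc(docs)
en_docs = doc_count(docs, "title_en")
en_solr = doc_freq(docs, "title_en", "solr")

print(total, en_docs, en_solr)
```

In this model the gap between the total and the per-field count comes from documents in the other language, not from deleted documents.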

Re: JVM GC Issue

2017-12-03 Thread Yonik Seeley
On Sat, Dec 2, 2017 at 8:59 PM, S G wrote: > I am a bit curious on the docValues implementation. > I understand that docValues do not use JVM memory and > they make use of OS cache - that is why they are more performant. > > But to return any response from the

Re: Solr 7.x: Issues with unique()/hll() function on a string field nested in a range facet

2017-11-21 Thread Yonik Seeley
I opened https://issues.apache.org/jira/browse/SOLR-11664 to track this. I should be able to look into this shortly if no one else does. -Yonik On Tue, Nov 21, 2017 at 6:02 PM, Yonik Seeley <ysee...@gmail.com> wrote: > Thanks for the complete info that allowed me to easily

Re: Solr 7.x: Issues with unique()/hll() function on a string field nested in a range facet

2017-11-21 Thread Yonik Seeley
Thanks for the complete info that allowed me to easily reproduce this! The bug seems to extend beyond hll/unique... I tried min(string_s) and got wonky results as well. -Yonik On Tue, Nov 21, 2017 at 7:47 AM, Volodymyr Rudniev wrote: > Hello, > > I've encountered 2 issues

Re: Nested facet complete wrong counts

2017-11-11 Thread Yonik Seeley
Also, if you're looking at all constraints, you shouldn't need refine:true. But if you do need it, it was only added in Solr 7.0 (and I see you're using 6.6) -Yonik On Sat, Nov 11, 2017 at 9:48 AM, Yonik Seeley <ysee...@gmail.com> wrote: > On Sat, Nov 11, 2017 at 9:18 AM, Kenny K

Re: Nested facet complete wrong counts

2017-11-11 Thread Yonik Seeley
On Sat, Nov 11, 2017 at 9:18 AM, Kenny Knecht wrote: > Hi Yonik, > > I am aware of the estimate on the hll. But we don't use the hll as a > baseline for comparison. We ask the values for one facet (for example > Gender). We store these counts for each bucket. Next we do

Re: Nested facet complete wrong counts

2017-11-10 Thread Yonik Seeley
I do notice you are using hll (hyper-log-log) which is a distributed cardinality *estimate* : https://en.wikipedia.org/wiki/HyperLogLog -Yonik On Fri, Nov 10, 2017 at 11:32 AM, kenny wrote: > Hi all, > > We are doing some tests in solr 6.6 with json facet api and we get >

Re: Upgrade path from 5.4.1

2017-11-01 Thread Yonik Seeley
On Wed, Nov 1, 2017 at 2:36 PM, Erick Erickson wrote: > I _always_ prefer to reindex if possible. Additionally, as of Solr 7 > all the numeric types are deprecated in favor of points-based types > which are faster on all fronts and use less memory. They are a good step

Re: Really slow facet performance in 6.6

2017-10-25 Thread Yonik Seeley
On Mon, Oct 23, 2017 at 3:06 PM, John Davis wrote: > Hello, > > We are seeing really slow facet performance with new solr release. This is > on an index of 2M documents. A few things we've tried: What happens when you run this facet request again? The first time a UIF

Re: Jetty maxThreads

2017-10-20 Thread Yonik Seeley
The high number of maxThreads is to avoid distributed deadlock. The fix is multiple thread pools, depending on request type: https://issues.apache.org/jira/browse/SOLR-7344 -Yonik On Wed, Oct 18, 2017 at 4:41 PM, Walter Underwood wrote: > Jetty maxThreads is set to

Re: Solr facets counts deep paged returns inconsistent counts

2017-10-20 Thread Yonik Seeley
(deeper paging "discovers" a new constraint which ranks higher). Regular faceting does more overrequest by default, and does refinement by default. So adding refine:true and a deeper overrequest for json facets should perform equivalently. -Yonik Kenny > > On 20-10-17 17:12,

Re: Solr facets counts deep paged returns inconsistent counts

2017-10-20 Thread Yonik Seeley
Facet refinement in Solr guarantees that counts for returned constraints are correct, but does not guarantee that the top N returned isn't missing a constraint. Consider the following shard counts (3 shards) for the following constraints (aka facet values): constraintA: 2 0 0 constraintB: 0 2 0
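The guarantee described here can be sketched with a toy simulation. The shard counts and constraint names below are hypothetical (they are not the ones from the truncated message), and the model deliberately uses no overrequest, which real Solr would normally add.

```python
from collections import Counter

# Hypothetical per-shard counts for four constraints.
shards = [
    {"red": 2, "gray": 1},
    {"green": 2, "gray": 1},
    {"blue": 2, "gray": 1},
]
limit = 1  # each shard proposes only its local top 1


def top_n(counts, n):
    """Local top-n terms, highest count first, ties broken by name."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


# Phase 1: collect the candidates proposed by each shard.
candidates = set()
for shard in shards:
    candidates.update(top_n(shard, limit))

# Refinement: ask every shard for exact counts of the candidates only.
refined = {
    term: sum(shard.get(term, 0) for shard in shards)
    for term in candidates
}

# Ground truth across all shards, for comparison.
true_totals = Counter()
for shard in shards:
    true_totals.update(shard)

print(refined)
print(dict(true_totals))
```

Every refined count is exact, yet "gray" has the highest true total and is never returned, because no shard ranked it first locally.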

Re: Trying to fix Too Many Boolean Clauses Exception

2017-10-18 Thread Yonik Seeley
On Wed, Oct 18, 2017 at 12:23 PM, Erick Erickson wrote: > What have you tried? And what is the current setting? > > This usually occurs when you are assembling very large OR clauses, > sometimes for ACL calculations. > > So if you have a query of the form > q=field:(A OR

Re: Concern on solr commit

2017-10-18 Thread Yonik Seeley
On Wed, Oct 18, 2017 at 5:09 AM, Leo Prince wrote: > Is there any known negative impacts in setting up autoSoftCommit as 1 > second other than RAM usage..? Briefly: Don't use autowarming (but keep caches enabled!) Use docValues for fields you will facet and sort

Re: [ANNOUNCE] Apache Solr 7.1.0 released

2017-10-17 Thread Yonik Seeley
It pointed to 7.1.0 for me... perhaps a browser cache issue? Anyway, you can go directly as well: http://www.apache.org/dyn/closer.lua/lucene/solr/7.1.0 -Yonik On Tue, Oct 17, 2017 at 11:25 AM, Susheel Kumar wrote: > Thanks, Shalin. > > But the download mirror still has

Re: Concern on solr commit

2017-10-17 Thread Yonik Seeley
Related: maxWarmingSearchers behavior was fixed (block for another commit to succeed first rather than fail) in Solr 6.4 and later. https://issues.apache.org/jira/browse/SOLR-9712 Also, if any of your "realtime" search requests only involve retrieving certain documents by ID, then you can use

Re: FieldValueCache in solr 6.6

2017-10-06 Thread Yonik Seeley
On Fri, Oct 6, 2017 at 12:45 PM, sile wrote: > Hi Yonik, > > Thanks for your answer :). > > It works. > > Another question: > > What is recommended to be used in solr 6.6 for faceting (docValues or > UnInvertedField), because UnInvertedField performs better for subsequent >

Re: FieldValueCache in solr 6.6

2017-10-06 Thread Yonik Seeley
If you're using regular faceting (as opposed to the JSON Facet API), you can try facet.method=uif https://issues.apache.org/jira/browse/SOLR-8466 Background: UIF (UnInvertedField which are the entries in the FieldValueCache) was completely removed from use at some point in the 5.x timeframe. It

Re: FilterCache size should reduce as index grows?

2017-10-06 Thread Yonik Seeley
On Fri, Oct 6, 2017 at 6:50 AM, Toke Eskildsen wrote: > Letting the default use maxSizeMB would be better IMO. But I assume > that FastLRUCache is used for a reason, so that would have to be > extended to support that parameter first. FastLRUCache is the default on the filter cache

Re: FilterCache size should reduce as index grows?

2017-10-05 Thread Yonik Seeley
On Thu, Oct 5, 2017 at 3:20 AM, Toke Eskildsen wrote: > On Wed, 2017-10-04 at 21:42 -0700, S G wrote: > > It seems that the memory limit option maxSizeMB was added in Solr 5.2: > https://issues.apache.org/jira/browse/SOLR-7372 > I am not sure if it works with all caches in Solr, but

Re: FilterCache size should reduce as index grows?

2017-10-05 Thread Yonik Seeley
On Thu, Oct 5, 2017 at 10:07 AM, Erick Erickson wrote: > The other thing I'd point out is that if your hit ratio is low, you > might as well disable it entirely. I'd normally recommend against turning it off entirely, except in *very* custom cases. Even if the user

Re: SOLR 6.1 | Continuous hits coming for unwanted URL pattern

2017-09-26 Thread Yonik Seeley
Looks like it's some sort of ping (liveness) query, probably from a load balancer? Actually, it looks like it's a SolrJ client... here's the code that sets up that exact query:

Re: When will be solr 7.1 released?

2017-09-26 Thread Yonik Seeley
On Tue, Sep 26, 2017 at 2:02 PM, Nawab Zada Asad Iqbal wrote: > Thanks Yonik and Erick. > > That is helpful. > I am slightly confused about the branch name conventions. I expected 7x to > be named as branch_7_0 branch_7x is the main branch for all 7.x releases. When it's time

Re: When will be solr 7.1 released?

2017-09-26 Thread Yonik Seeley
One can also use a nightly snapshot build to try out the latest stuff: 7.x: https://builds.apache.org/job/Solr-Artifacts-7.x/lastSuccessfulBuild/artifact/solr/package/ 8.0: https://builds.apache.org/job/Solr-Artifacts-master/lastSuccessfulBuild/artifact/solr/package/ -Yonik On Tue, Sep 26,

Re: Consecutive calls to a query give different results

2017-09-07 Thread Yonik Seeley
x structure that we don't (and can't easily) update statistics when a document is marked as deleted. -Yonik > Erick > > On Wed, Sep 6, 2017 at 7:48 PM, Yonik Seeley <ysee...@gmail.com> wrote: >> Different replicas of the same shard can have different numbers of >>

Re: Consecutive calls to a query give different results

2017-09-06 Thread Yonik Seeley
Different replicas of the same shard can have different numbers of deleted documents (really just marked as deleted), and deleted documents are irrelevant to term statistics (like the number of documents a term appears in). Documents marked for deletion stop contributing to corpus statistics when

Re: NumberFormatException for multvalue, pint

2017-09-06 Thread Yonik Seeley
On Wed, Sep 6, 2017 at 4:09 PM, Steve Pruitt wrote: > Can't get a multi-valued pint field to update. > > The schema defines the field: multiValued="true" required="false" docValues="true" stored="true"/> > > I get the exception on this input: 7780386,7313483 > > Caused

Re: slow solr facet processing

2017-09-05 Thread Yonik Seeley
:03 -0400, Yonik Seeley wrote: >> It's due to this (see comments in UnInvertedField): > > I have read that. What I don't understand is the difference between 4.x > and 6.x. But as you say, Ere seems to be in the process of verifying > whether this is simply due to more segments in

Re: slow solr facet processing

2017-09-04 Thread Yonik Seeley
On Mon, Sep 4, 2017 at 6:38 AM, Toke Eskildsen wrote: > On Mon, 2017-09-04 at 13:21 +0300, Ere Maijala wrote: >> Thanks for the insight, Yonik. I can confirm that #2 is true. I ran >> >> >> >> and after it completed I was able to retrieve 2000 values in 17ms. > > Very interesting. Is

Re: slow solr facet processing

2017-09-01 Thread Yonik Seeley
ve effect >> though, which is unfortunate. Otherwise it reports that applied method is >> UIF, but the performance is actually much worse than with FC. I'll do just >> another round of testing to verify all this. I can report to SOLR-8096 when >> I have something conclusive. >>

Re: slow solr facet processing

2017-09-01 Thread Yonik Seeley
>> still not nowhere near as fast as 4.10.2, but a whole lot better. It seems >> that docValues needs to be disabled for facet.method=uif to have effect >> though, which is unfortunate. Otherwise it reports that applied method is >> UIF, but the performance is actually muc

Re: slow solr facet processing

2017-08-31 Thread Yonik Seeley
A possible improvement for some multiValued fields might be to use the "uif" facet method (UnInvertedField was the default method for multiValued fields in 4.x) I'm not sure if you would need to reindex without docValues on that field to try it though. Example: to enable on the "union" field, add

Re: Huge Facets and Streaming

2017-08-21 Thread Yonik Seeley
On Mon, Aug 21, 2017 at 6:01 AM, Mikhail Khludnev wrote: > Hello! > > I need to count really wide facet on 30 shards index with roughly 100M > docs, the facet response is about 100M values takes 0.5G in text file. > > So, far I experimented with old facets. It calculates per

Re: QueryParser changes query by itself

2017-08-16 Thread Yonik Seeley
The queryCache shouldn't be involved, this is somehow an issue in parsing (and Solr doesn't currently cache parsing). Perhaps there is something shared in your SynonymQParser instances that isn't quite thread safe? It could also be something in the text analysis in lucene as well (related to the

Re: JSON facet SUM precision and accuracy is incorrect

2017-08-08 Thread Yonik Seeley
This is due to function queries currently lacking type information (this problem will occur anywhere function queries are used and is not unique to JSON Facet). Function queries were originally only used in lucene scoring (which only uses float). The inner sum(amount1_d,amount2_d) uses
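A small illustration of the kind of precision loss being discussed, assuming an intermediate result is held as a 32-bit float instead of a 64-bit double. It uses only Python's struct module to round a value to float precision and says nothing about where exactly Solr does the narrowing.

```python
import struct


def to_float32(value):
    """Round a Python float (a 64-bit double) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]


a = 16777216.0  # 2**24, the last point where float32 can count by ones
b = 1.0

double_sum = a + b             # exact in double precision
float_sum = to_float32(a + b)  # the added 1.0 is lost in float precision

print(double_sum)
print(float_sum)
```

Above 2**24 a 32-bit float can no longer represent every integer, so sums of large amounts lose their low-order digits.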

Re: _version_ as LongPointField returns error

2017-06-12 Thread Yonik Seeley
e FieldCache). That's not yet supported for Point* fields. -Yonik > On Mon, Jun 12, 2017 at 10:13 AM Yonik Seeley <ysee...@gmail.com> wrote: > >> I think the _version_ field should be >> - indexed="false" >> - stored="false" >> - docValues=

Re: _version_ as LongPointField returns error

2017-06-12 Thread Yonik Seeley
I think the _version_ field should be - indexed="false" - stored="false" - docValues="true" -Yonik On Mon, Jun 12, 2017 at 12:08 PM, Shawn Feldman wrote: > I changed all my TrieLong Fields to Point fields. _version_ always returns > an error unless i turn on

Re: JSON facet performance for aggregations

2017-05-24 Thread Yonik Seeley
On Mon, May 8, 2017 at 11:27 AM, Yonik Seeley <ysee...@gmail.com> wrote: > I opened https://issues.apache.org/jira/browse/SOLR-10634 to address > this performance issue. OK, this has been committed. A quick test shows about a 30x speedup when faceting on a string/numeric docvalues fie

Re: JSON facet performance for aggregations

2017-05-08 Thread Yonik Seeley
Do you recommend streaming at that case? > > Please advise. > > Thanks > Mikhail > > -Original Message- > From: Yonik Seeley [mailto:ysee...@gmail.com] > Sent: Sunday, May 07, 2017 6:25 PM > To: solr-user@lucene.apache.org > Subject: Re: JSON facet perform

Re: JSON facet performance for aggregations

2017-05-07 Thread Yonik Seeley
, Mikhail Ibraheem <mikhail.ibrah...@oracle.com> wrote: > Hi Yonik, > We are using Solr 6.5 > Both studentId and grades are double: >stored="true" docValues="true" multiValued="false" required="false"/> > > We have 1.5 million records

Re: Poll: Master-Slave or SolrCloud?

2017-04-30 Thread Yonik Seeley
On Tue, Apr 25, 2017 at 1:33 PM, Otis Gospodnetić wrote: > I think I saw mentions (maybe on user or dev MLs or JIRA) about > potentially, in the future, there only being SolrCloud mode (and dropping > SolrCloud name in favour of Solr). I personally never saw this

Re: JSON facet performance for aggregations

2017-04-30 Thread Yonik Seeley
It is odd there would be quite such a big performance delta. What version of solr are you using? What is the fieldType of "grades"? -Yonik On Sun, Apr 30, 2017 at 5:15 AM, Mikhail Ibraheem wrote: > 1- > studentId has docValue = true . it is of type double which is

Re: prefix facet performance

2017-04-24 Thread Yonik Seeley
In SimpleFacets.getFacetTermEnumCounts, we seek to the first term matching the prefix using the index and then for each term after compare the prefix until it no longer matches. -Yonik On Mon, Apr 24, 2017 at 5:04 AM, alessandro.benedetti wrote: > Thanks Yonik and Maria.
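The seek-then-scan pattern described here can be sketched over a sorted list of terms. This is an illustration in Python, not the SimpleFacets code; the terms are hypothetical and bisect stands in for the index seek.

```python
from bisect import bisect_left


def terms_with_prefix(sorted_terms, prefix):
    """Seek to the first term >= prefix, then scan while the prefix matches."""
    position = bisect_left(sorted_terms, prefix)
    matches = []
    while position < len(sorted_terms):
        term = sorted_terms[position]
        if not term.startswith(prefix):
            break  # terms are sorted, so no later term can match
        matches.append(term)
        position += 1
    return matches


# Hypothetical terms of a facet field, kept in sorted order like a term index.
terms = sorted(["A/apple", "A/avocado", "B/banana", "A", "C/cherry"])

found = terms_with_prefix(terms, "A/")
print(found)
```

The work is proportional to the number of matching terms plus one seek, not to the total number of unique terms in the field.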

Re: prefix facet performance

2017-04-21 Thread Yonik Seeley
On Fri, Apr 21, 2017 at 4:25 PM, Maria Muslea wrote: > The field is: > > > > and using unique() I found that it has 700K+ unique values. > > The query before (that takes ~10s): > > wt=json=true=*:*=0=true=concept=A/ > > the query after (that is almost instant): > >

Re: prefix facet performance

2017-04-18 Thread Yonik Seeley
How many unique values in the index? You could try facet.method=enum -Yonik On Tue, Apr 18, 2017 at 8:16 PM, Maria Muslea wrote: > Hi, > > I have ~40K documents in SOLR (not many) and a multivalued facet field that > contains at least 2K values per document. > > The

Re: Disable All kind of caching in Solr/Lucene

2017-03-31 Thread Yonik Seeley
On Fri, Mar 31, 2017 at 1:53 PM, Nilesh Kamani wrote: > @Alexandre - Could you please point me to reference doc to remove default > cache settings ? > > @Yonik - The code change is in Solr Indexer to sort the results. OK, so to test indexing performance, there are no

Re: Disable All kind of caching in Solr/Lucene

2017-03-31 Thread Yonik Seeley
On Fri, Mar 31, 2017 at 9:44 AM, Nilesh Kamani wrote: > I am planning to do load testing for some of my code changes and I need to > disable all kind of caching. Perhaps you should be aiming to either: 1) seek a config + query load that maximizes time spent in your code

Re: JSON Facet API Virtual Field Support

2017-03-24 Thread Yonik Seeley
On Fri, Mar 24, 2017 at 7:52 PM, Furkan KAMACI wrote: > Hi, > > I test JSON Facet API of Solr. Is it possible to create a virtual field > which is generated by using existing fields at response and supports > elementary arithmetic operations? > > Example: > > Schema

Re: fq performance

2017-03-17 Thread Yonik Seeley
On Fri, Mar 17, 2017 at 2:17 PM, Shawn Heisey <apa...@elyograg.org> wrote: > On 3/17/2017 8:11 AM, Yonik Seeley wrote: >> For Solr 6.4, we've managed to circumvent this for filter queries and >> other contexts where scoring isn't needed. >> http://yonik.com/solr-6

Re: fq performance

2017-03-17 Thread Yonik Seeley
On Fri, Mar 17, 2017 at 9:09 AM, Shawn Heisey wrote: [...] > Lucene has a global configuration called "maxBooleanClauses" which > defaults to 1024. For Solr 6.4, we've managed to circumvent this for filter queries and other contexts where scoring isn't needed.

Re: Get handler not working

2017-03-16 Thread Yonik Seeley
the > documents appropriately from our basic testing. > > On Thu, Mar 16, 2017 at 9:42 AM David Hastings <hastings.recurs...@gmail.com> > wrote: > > i still would like to see an experiment where you change the field to id > instead of iqdocid, > > On Thu, Mar 1

Re: Get handler not working

2017-03-16 Thread Yonik Seeley
Something to do with routing perhaps? (the mapping of ids to shards, by default is based on hashes of the id) -Yonik On Thu, Mar 16, 2017 at 9:16 AM, Chris Ulicny wrote: > iqdocid is already set to be the uniqueKey value. > > I tried reindexing a few documents back into the

Re: Simulating group.facet for JSON facets, high mem usage w/ sorting on aggregation...

2017-02-10 Thread Yonik Seeley
FYI, I just opened https://issues.apache.org/jira/browse/SOLR-10122 for this -Yonik On Fri, Feb 10, 2017 at 4:32 PM, Yonik Seeley <ysee...@gmail.com> wrote: > On Thu, Feb 9, 2017 at 6:58 AM, Bryant, Michael > <michael.bry...@kcl.ac.uk> wrote: >> Hi all, >> >

Re: Simulating group.facet for JSON facets, high mem usage w/ sorting on aggregation...

2017-02-10 Thread Yonik Seeley
On Thu, Feb 9, 2017 at 6:58 AM, Bryant, Michael wrote: > Hi all, > > I'm converting my legacy facets to JSON facets and am seeing much better > performance, especially with high cardinality facet fields. However, the one > issue I can't seem to resolve is excessive

Re: ClassCastException: BasicResultContext cannot be cast to SolrDocumentList

2016-12-20 Thread Yonik Seeley
This is a bug (that code should no longer be expecting a SolrDocumentList) Can you open a JIRA issue? -Yonik On Tue, Dec 20, 2016 at 12:02 PM, Yago Riveiro wrote: > I'm hitting this exception in 6.3.0, any ideas? > > null:java.lang.ClassCastException: >

Re: Nested JSON Facets (Subfacets)

2016-12-15 Thread Yonik Seeley
Interesting... I don't recall a bug like that being fixed. Anyway, glad it works for you now! -Yonik On Thu, Dec 15, 2016 at 11:01 AM, Chantal Ackermann wrote: > Hi Yonik, > > after upgrading to Solr 6.3.0, the nested function works as expected! (Both > with and

Re: Nested JSON Facets (Subfacets)

2016-12-14 Thread Yonik Seeley
That should work... what version of Solr are you using? Did you change the type of the popularity field w/o completely reindexing? You can try to verify the number of documents in each bucket that have the popularity field by adding another sub-facet next to cat_pop:

Re: Rollback w/ Atomic Update

2016-12-13 Thread Yonik Seeley
On Tue, Dec 13, 2016 at 10:36 AM, Todd Long wrote: > We've noticed that partial updates are not rolling back with subsequent > commits based on the same document id. Our only success in mitigating this > issue has been to issue an empty commit immediately following the rollback.

Re: empty result set for a sort query

2016-12-12 Thread Yonik Seeley
Ah, 2-phase distributed search is the most likely answer (and currently classified as more of a limitation than a bug)... Phase 1 collects the top N ids from each shard (and merges them to find the global top N) Phase 2 retrieves the stored fields for the global top N If any of the ids have been
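A toy model of the two phases, under the assumption (the message is cut off at this point) that a document found in phase 1 is no longer there when phase 2 runs. The index, ids and field name are hypothetical and a single dict stands in for all shards.

```python
# Hypothetical index: id -> stored fields.
index = {
    "doc1": {"creationTime": 1},
    "doc2": {"creationTime": 2},
}


def phase1_top_ids(index, n):
    """Phase 1: return only the ids of the top n documents by sort value."""
    ranked = sorted(index, key=lambda doc_id: index[doc_id]["creationTime"],
                    reverse=True)
    return ranked[:n]


def phase2_fetch(index, ids):
    """Phase 2: fetch stored fields for those ids, skipping missing ones."""
    return [dict(index[doc_id], id=doc_id) for doc_id in ids if doc_id in index]


top_ids = phase1_top_ids(index, 1)

# Assumed for illustration: the document is removed between the two phases.
del index["doc2"]

results = phase2_fetch(index, top_ids)
print(top_ids, results)
```

With rows=1 a single document vanishing between the phases is enough to produce an empty result, which matches the symptom in the thread title.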

Re: empty result set for a sort query

2016-12-11 Thread Yonik Seeley
On Sun, Dec 11, 2016 at 11:22 AM, moscovig wrote: > Hi > In solr 6.2.1 as server and solr 6.2.0 for client > It's a 2 shards index, 3 replicas for each shard. > > We are fetching the latest document with sorting over creationTime desc and > rows=1. > > At the same time we are

Re: "on deck" searcher vs warming searcher

2016-12-09 Thread Yonik Seeley
We've got a patch to prevent the exceptions: https://issues.apache.org/jira/browse/SOLR-9712 -Yonik On Fri, Dec 9, 2016 at 7:45 PM, Joel Bernstein wrote: > The question about allowing more the one on-deck searcher is a good one. > The current behavior with

Re: Solr 6 Performance Suggestions

2016-11-22 Thread Yonik Seeley
It depends highly on what your requests look like, and which ones are slower. If your request mix is heterogeneous, find the types of requests that seem to have the largest slowdown and let us know what they look like. -Yonik On Tue, Nov 22, 2016 at 8:54 AM, Max Bridgewater

Re: How to get "max(date)" from a facet field? (Solr 6.3)

2016-11-21 Thread Yonik Seeley
On Mon, Nov 21, 2016 at 3:42 PM, Michael Joyner wrote: > Help, > > (Solr 6.3) > > Trying to do a "sub-facet" using the new json faceting API, but can't seem > to figure out how to get the "max" date in the subfacet? > > I've tried a couple of different ways: > > == query == >

Re: SolrJ optimize method -- not returning immediately when the "wait" options are false

2016-11-08 Thread Yonik Seeley
https://issues.apache.org/jira/browse/SOLR-2018 There used to be a waitFlush parameter (wait until the IndexWriter has written all the changes) as well as a waitSearcher parameter (wait until a new searcher has been registered... i.e. whatever changes you made will be guaranteed to be visible).

Re: Parallelize Cursor approach

2016-11-04 Thread Yonik Seeley
No, you can't get cursor-marks ahead of time. They are the serialized representation of the last sort values encountered (hence not known ahead of time). -Yonik On Fri, Nov 4, 2016 at 8:48 PM, Chetas Joshi wrote: > Hi, > > I am using the cursor approach to fetch results
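A toy cursor in plain Python shows why the marks cannot be computed in advance: each one is just the sort values of the last document on the previous page. The documents and the (price, id) sort are hypothetical, and a tuple stands in for Solr's serialized cursorMark string.

```python
# Hypothetical documents, sorted by price with id as the tie-breaker.
docs = [
    {"id": "d1", "price": 5},
    {"id": "d2", "price": 3},
    {"id": "d3", "price": 5},
    {"id": "d4", "price": 1},
    {"id": "d5", "price": 3},
]


def sort_key(doc):
    return (doc["price"], doc["id"])


def fetch_page(docs, cursor, rows):
    """Return the next page after `cursor` plus the cursor for the page after."""
    ordered = sorted(docs, key=sort_key)
    if cursor is not None:
        ordered = [doc for doc in ordered if sort_key(doc) > cursor]
    page = ordered[:rows]
    next_cursor = sort_key(page[-1]) if page else cursor
    return page, next_cursor


cursor = None
seen_ids = []
while True:
    page, next_cursor = fetch_page(docs, cursor, rows=2)
    if not page:
        break
    seen_ids.extend(doc["id"] for doc in page)
    cursor = next_cursor  # only known after this page has been read

print(seen_ids)
```

Because every cursor depends on the contents of the page before it, the pages have to be fetched one after another rather than in parallel.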

Re: Facets based on sampling

2016-11-04 Thread Yonik Seeley
Sampling has been on my TODO list for the JSON Facet API. How much it would help depends on where the bottlenecks are, but that in conjunction with a hashing approach to collection (assuming field cardinality is high) should definitely help. -Yonik On Fri, Nov 4, 2016 at 3:02 PM, John Davis

Re: Aggregate Values Inside a Facet Range

2016-11-04 Thread Yonik Seeley
On Fri, Nov 4, 2016 at 2:25 PM, Furkan KAMACI wrote: > I mean, I have to facet by dates and aggregate values inside that facet > range. Is it possible to do that without multiple queries at Solr? This (old) blog shows a percentiles calculation under a range facet:

Re: Merge policy

2016-10-27 Thread Yonik Seeley
On Thu, Oct 27, 2016 at 9:56 AM, Arkadi Colson wrote: > Thanks for the answer! > Do you know if there is a way to trigger an optimize for only 1 shard and > not the whole collection at once? > Adding a "distrib=false" parameter should work I think. -Yonik

Re: JSON Facet Syntax Sorting

2016-10-26 Thread Yonik Seeley
On Wed, Oct 26, 2016 at 3:16 AM, Zheng Lin Edwin Yeo wrote: > Hi, > > I'm using Solr 6.2.1. > > For the JSON Facet Syntax, are we able to sort on multiple values at one go? > > Like for example, if I want to sort by count, follow by the average price. > is this the correct

Re: Graph Traversal Question

2016-10-26 Thread Yonik Seeley
On Wed, Oct 26, 2016 at 7:13 AM, Grant Ingersoll <gsing...@apache.org> wrote: > On Tue, Oct 25, 2016 at 6:26 PM Yonik Seeley <ysee...@gmail.com> wrote: > > In your example below it would be akin to injecting the rating onto those > responses as well, not just in the

Re: Does _version_ field in schema need to be indexed and/or stored?

2016-10-25 Thread Yonik Seeley
On Tue, Oct 25, 2016 at 6:41 PM, Brent wrote: > I know that in the sample config sets, the _version_ field is indexed and not > stored, like so: > > > > Is there any reason it needs to be indexed? It may depend on your solr version, but the starting configsets currently
