Re: Solr 5.5.2

2016-05-26 Thread Nick Vasilyev
4> agitate for a 5.5.2 that includes this fix (after the fix has been > vetted). > > Best, > Erick > > On Thu, May 26, 2016 at 11:08 AM, Nick Vasilyev > <nick.vasily...@gmail.com> wrote: > > Is there an anticipated release date for 5.5.2? I know 5.5.1 was just

Solr 5.5.2

2016-05-26 Thread Nick Vasilyev
Is there an anticipated release date for 5.5.2? I know 5.5.1 was just released a while ago and although it fixes the faceting performance (SOLR-8096), distributed grouping is broken (SOLR-8940). I just need a solid 5.x release that is stable and with all core functionality working. Thanks

Re: Boosts for relevancy (shopping products)

2016-03-18 Thread Nick Vasilyev
ded input. > > I'll certainly look into the machine learning aspect, will be good to put > some basic knowledge I have into practice. > > I'd been led to believe the tie parameter didn't actually do a lot. :-/ > > > > On 03/18/2016 12:07 PM, Nick Vasilyev wrote: >

Inconsistent Shard Usage for Distributed Queries

2016-03-15 Thread Nick Vasilyev
Hello, I have a brand new installation of Solr 5.4.1 and I am running into a strange problem with one of my collections. Collection *products* has 5 shards and replication factor of two. Both replicas are up and show green status on the Cloud page in the UI. When I run a default search on the

Re: Inconsistent Shard Usage for Distributed Queries

2016-03-15 Thread Nick Vasilyev
e > side-effect. Simply issuing a commit on the url to the _collection_ will > cause > commits to happen on all replicas, as: > > blah/solr/collection/update?commit=true > > Best, > Erick > > On Tue, Mar 15, 2016 at 9:11 AM, Nick Vasilyev <nick.vasily...@gmai
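As a rough illustration of the collection-level commit mentioned in the snippet above, the sketch below only builds the URL; it does not contact Solr. The host, port, and the helper name commit_url are assumptions for illustration, and "products" is the collection named earlier in this thread.

```python
from urllib.parse import urlencode


def commit_url(base_url, collection):
    """Build the update URL that asks Solr to commit the given collection.

    base_url is assumed to point at the Solr root, e.g. http://host:8983/solr.
    """
    query = urlencode({"commit": "true"})
    return f"{base_url.rstrip('/')}/{collection}/update?{query}"


# Hypothetical host and port; the collection name comes from the thread.
url = commit_url("http://localhost:8983/solr", "products")
print(url)
```

Requesting that URL (for example with curl or urllib.request) is what would actually trigger the commit.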

Re: Inconsistent Shard Usage for Distributed Queries

2016-03-15 Thread Nick Vasilyev
parameters, > something like: > > blah blah blah/products/query/shard1_core3/query?q=*:*. That > addresses the specific core rather than rely on any internal query > routing logic. > > Best, > Erick > > On Tue, Mar 15, 2016 at 8:43 AM, Nick Vasilyev <nick.vasily...

Solr Managed Schema by Default in 5.5

2016-03-11 Thread Nick Vasilyev
Hi, I started playing around with Solr 5.5 and created a collection using the following: ./solr create_collection -c test -p 9000 -replicationFactor 2 -d basic_configs -shards 2 The collection created fine, however I see that although I specified basic_configs, it was deployed in managed schema

Re: Solr Managed Schema by Default in 5.5

2016-03-11 Thread Nick Vasilyev
rote: > On 3/11/2016 7:01 AM, Nick Vasilyev wrote: > > Is this now the default behavior for basic_configs? I would really like > to > > maintain an option to easily create collection with classic schema > settings > > without jumping through all of these hoops. > >

Re: Inconsistent Shard Usage for Distributed Queries

2016-03-15 Thread Nick Vasilyev
, it will not be propagated to other shards and the same shard on the other replica. Full commit update?commit=true=true works fine. I know that the reload button was not intended to issue commits, but it's quicker than typing out the command. On Tue, Mar 15, 2016 at 12:24 PM, Nick Vasilyev <nick.vas

Re: Solr Managed Schema by Default in 5.5

2016-03-11 Thread Nick Vasilyev
Got it. Thank you for clarifying this, I was under the impression that I would only be able to make changes via the API. I will look into this some more. On Fri, Mar 11, 2016 at 11:51 AM, Shawn Heisey <apa...@elyograg.org> wrote: > On 3/11/2016 9:28 AM, Nick Vasilyev wrote: > > May

Re: Boosts for relevancy (shopping products)

2016-03-19 Thread Nick Vasilyev
I work with a similar catalog; except our data is especially bad. We've found that several things helped: - Item level grouping (group same item sold by multiple vendors). Rank items with more vendors a bit higher. - Include a boost function for other attributes, such as an original image of the

Re: How fast indexing?

2016-03-20 Thread Nick Vasilyev
There can be a lot of factors; can you provide a bit of additional information to get started? - How many items are you indexing per second? - What does the indexing process look like? - How large is each item? - What hardware are you using? - How is your Solr set up? JVM memory, collection

Solr 5.2.1 on Java 8 GC

2016-04-28 Thread Nick Vasilyev
Hello, We recently upgraded to Solr 5.2.1 with jre1.8.0_74 and are seeing long GC pauses when running jobs that do some hairy faceting. The same jobs worked fine with our previous 4.6 Solr. The JVM is configured with 32GB heap with default GC settings, however I've been tweaking the GC settings

Re: Solr 5.2.1 on Java 8 GC

2016-04-29 Thread Nick Vasilyev
* Eclipse Memory Analyzer - I used this to analyze heap dumps before I got > a YourKit license: http://www.eclipse.org/mat/ > > Good luck! > > > > > > > On 4/28/16, 9:27 AM, "Yonik Seeley" <ysee...@gmail.com> wrote: > > >On Thu, Apr 28, 2016

Re: Solr5.5:DocValues/CopyField does not work with Atomic updates

2016-04-30 Thread Nick Vasilyev
I am also running into this problem on Solr 6. On Sun, Apr 24, 2016 at 6:10 PM, Karthik Ramachandran < kramachand...@commvault.com> wrote: > I have opened JIRA > > https://issues.apache.org/jira/browse/SOLR-9034 > > I will upload the patch soon. > > With Thanks & Regards > Karthik Ramachandran >

Re: Solr 5.2.1 on Java 8 GC

2016-04-28 Thread Nick Vasilyev
at 12:06 PM, Yonik Seeley <ysee...@gmail.com> wrote: > On Thu, Apr 28, 2016 at 11:50 AM, Nick Vasilyev > <nick.vasily...@gmail.com> wrote: > > mmfr_exact is a string field. key_phrases is a multivalued string field. > > One guess is that top-level field caches (and UnInv

Re: Solr 5.2.1 on Java 8 GC

2016-04-28 Thread Nick Vasilyev
can be re-worked some, especially considering there are thousands of similar requests going out. However we didn't have this issue before and I am worried that it may be a symptom of a larger underlying problem. On Thu, Apr 28, 2016 at 11:34 AM, Yonik Seeley <ysee...@gmail.com> wrote:

Re: Solr 5.2.1 on Java 8 GC

2016-04-28 Thread Nick Vasilyev
On Thu, Apr 28, 2016 at 11:43 AM, Nick Vasilyev > <nick.vasily...@gmail.com> wrote: > > Hi Yonik, > > > > I forgot to mention that the index is approximately 50 million docs split > > across 4 shards (replication factor 2) on 2 solr replicas. > > > > This

Re: Solr 5.2.1 on Java 8 GC

2016-04-28 Thread Nick Vasilyev
aybe 25% more or 2 GB more. > > wunder > Walter Underwood > wun...@wunderwood.org > http://observer.wunderwood.org/ (my blog) > > > > On Apr 28, 2016, at 8:50 AM, Nick Vasilyev <nick.vasily...@gmail.com> > wrote: > > > > mmfr_exact is a string field. ke

Re: Solr 5.2.1 on Java 8 GC

2016-04-28 Thread Nick Vasilyev
Correction, the key_phrases is set up as follows: On Thu, Apr 28, 2016 at 12:03 PM, Nick Vasilyev <nick.vasily...@gmail.com> wrote: > The working set is larger than the heap. This is our largest collection > and all sha

Re: API call for optimising a collection

2016-05-17 Thread Nick Vasilyev
As far as I know, you have to run it on each core. On May 18, 2016 1:04 AM, "Binoy Dalal" wrote: > Is there no api call that can optimize an entire collection? > > I tried the collections api page on the confluence wiki but couldn't find > anything, and a Google search

json.facet streaming

2016-05-17 Thread Nick Vasilyev
I am on the nightly build of 6.1 and I am experimenting with json.facet streaming; however, the response I am getting back looks like a regular query response. I was expecting something like the streaming API. Is this right or am I missing something? Here is the json.facet string.

Re: json.facet streaming

2016-05-17 Thread Nick Vasilyev
":8928379}, { "processor":"FacetFieldProcessorStream", "elapse":0, "field":"group", "limit":10, "domainSize":8576804}]}, "json":{"facet":{

Re: json.facet streaming

2016-05-17 Thread Nick Vasilyev
wrote: > So it looks like facets are being computed... do you not see them in > the response? > -Yonik > > > On Tue, May 17, 2016 at 9:12 AM, Nick Vasilyev <nick.vasily...@gmail.com> > wrote: > > I enabled query debugging, here is the facet-trace snippet. > >

Re: json.facet streaming

2016-05-17 Thread Nick Vasilyev
Got it. Thanks for clarifying. On Tue, May 17, 2016 at 9:58 AM, Yonik Seeley <ysee...@gmail.com> wrote: > On Tue, May 17, 2016 at 9:41 AM, Nick Vasilyev <nick.vasily...@gmail.com> > wrote: > > Hi Yonik, I do see them in the response, but the JSON format is like >

Re: Re-indexing in SolRCloud while keeping the collection online -- Best practice?

2016-05-11 Thread Nick Vasilyev
Aliasing works great, I implemented it after upgrading to Solr 5 and it allows us to do this exact thing. The only thing you have to watch out for is indexing new items (if they overwrite old ones) while you are re-indexing. I took it a step further for another collection that stores a lot of
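A minimal sketch of the alias switch described above, assuming the standard Collections API CREATEALIAS action. The alias name "products", the new collection name "products_v2", and the host are hypothetical. Pointing an existing alias at a new collection is done by issuing CREATEALIAS again with the same alias name.

```python
from urllib.parse import urlencode


def create_alias_url(base_url, alias, collection):
    """Build a Collections API URL that points `alias` at `collection`."""
    params = urlencode({
        "action": "CREATEALIAS",
        "name": alias,
        "collections": collection,
    })
    return f"{base_url.rstrip('/')}/admin/collections?{params}"


# Hypothetical names: clients query "products" while "products_v2" is rebuilt,
# and the alias is switched once the re-index has finished.
alias_url = create_alias_url("http://localhost:8983/solr", "products", "products_v2")
print(alias_url)
```

The sketch only constructs the request; documents indexed into the old collection during the rebuild still have to be reconciled separately, which is the caveat raised in the message.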

Re: Solr cloud 6.0.0 with ZooKeeper 3.4.8 Errors

2016-05-05 Thread Nick Vasilyev
are using the "same OS version" on a diff machine, that > could > >> explain the discrepancy if you (or someone else) increased the file > >> descriptor limit on the "old machine" but that never happened on the > 'new > >> machine"

Re: Filtering on nGroups

2016-05-06 Thread Nick Vasilyev
(probably). pre-processing isn't very dynamic though so there are lots of situations where that's just not viable. Best, Erick On Thu, May 5, 2016 at 6:05 PM, Nick Vasilyev <nick.vasily...@gmail.com> wrote: > I am grouping documents on a field and would like to retrieve documents

Re: Filtering on nGroups

2016-05-06 Thread Nick Vasilyev
I guess it would also work if I could facet on the group counts. I just need to know how many groups of different sizes there are. On Fri, May 6, 2016 at 2:10 PM, Nick Vasilyev <nick.vasily...@gmail.com> wrote: > I am on 6.1 preview, I just need this to gather some one time m

Re: Solr 5.2.1 on Java 8 GC

2016-05-01 Thread Nick Vasilyev
How do you log GC frequency and time to compare it with other GC configurations? Also, do you tweak parameters automatically or is there a set of configurations that get tested? Lastly, I was under the impression that G1 is not recommended to be used based on some issues with Lucene, so I haven't

Filtering on nGroups

2016-05-05 Thread Nick Vasilyev
I am grouping documents on a field and would like to retrieve documents where the number of items in a group matches a specific value or a range. I haven't been able to experiment with all new functionality, but I wanted to see if this is possible without having to calculate the count and add it

Re: Solr cloud 6.0.0 with ZooKeeper 3.4.8 Errors

2016-05-04 Thread Nick Vasilyev
<susheel2...@gmail.com> wrote: > Thanks, Nick. Do we know any suggested # for file descriptor limit with > Solr6? Also wondering why i haven't seen this problem before with Solr > 5.x? > > On Wed, May 4, 2016 at 4:54 PM, Nick Vasilyev <nick.vasily...@gmail.com> > wrote:

Re: Solr cloud 6.0.0 with ZooKeeper 3.4.8 Errors

2016-05-04 Thread Nick Vasilyev
It looks like you have too many open files, try increasing the file descriptor limit. On Wed, May 4, 2016 at 3:48 PM, Susheel Kumar wrote: > Hello, > > I am trying to setup 2 node Solr cloud 6 cluster with ZK 3.4.8 and used the > install service to setup solr. > > After
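One way to check the limit in question is sketched below, assuming a Unix-like host. It reports the open-file limits of the process that runs the script, so it only reflects Solr's limits if it is run as the same user and under the same service configuration as Solr. The helper name open_file_limits is made up for illustration.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


soft, hard = open_file_limits()
print(f"open files: soft limit={soft}, hard limit={hard}")
```

The soft limit is the one a process hits in practice; raising it permanently is done in the operating system or service configuration rather than from a script like this.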

Re: block join rollups

2016-04-18 Thread Nick Vasilyev
Hi Yonik, Well, no one replied to this yet, so I thought I'd chime in with some of the use cases that I am working with. Please note that I am lagging a bit behind the last few releases, so I haven't had time to experiment with Solr 5.3+, I am sure that some of this is included in there already

JSON Facet Stats Mincount

2016-04-14 Thread Nick Vasilyev
Hello, I am trying to get a list of items that have more than one manufacturer using the following json facet query. This works fine without mincount, but errors out as soon as I add it. Is this possible or am I doing something wrong? json.facet={ groupID: { type: terms, field:

Solr Rounding Issue On Float fields.

2016-07-21 Thread Nick Vasilyev
Hi, I am running into a weird rounding issue on Solr 5.2.1. I have a float field (also tried tfloat), I am indexing 154035.26 into it (confirmed in the data), but at query time, I get back 154035.27 (.01 more). Additionally when I query for the document and include this number in the q parameter,
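The numbers in this report are consistent with ordinary 32-bit floating-point precision. The sketch below assumes the field is stored as an IEEE 754 single-precision value (a Java float) and round-trips the reported number through that format using only the standard library. Whether this is the full explanation for the behaviour in the thread is an assumption.

```python
import struct


def as_float32(value):
    """Round-trip a Python float (64-bit) through a 32-bit IEEE 754 float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


stored = as_float32(154035.26)
print(stored)                            # 154035.265625
print(round(stored, 2))                  # 154035.27
print(as_float32(154035.27) == stored)   # True
```

At this magnitude adjacent 32-bit floats are 0.015625 apart, so 154035.26 and 154035.27 both map to the same stored value. A double or an integer count of cents would avoid the ambiguity.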

Re: Solr Rounding Issue On Float fields.

2016-07-21 Thread Nick Vasilyev
, Nick Vasilyev <nick.vasily...@gmail.com> wrote: > Hi, I am running into a weird rounding issue on Solr 5.2.1. I have a float > field (also tried tfloat), I am indexing 154035.26 into it (confirmed in > the data), but at query time, I get back 154035.27 (.01 more). > Additio

Re: Solr Rounding Issue On Float fields.

2016-07-21 Thread Nick Vasilyev
Thanks Chris. Searching for both values and retrieving the documents would be alright as long as the data was correct. In this case, the data that I am indexing into Solr is not the same data that I am pulling out at query time. That is the real impact here. On Thu, Jul 21, 2016 at 6:12 PM,

Re: Use of solr + banana for faceted search

2016-07-21 Thread Nick Vasilyev
repository of custom panels > for Banana which we can benefit from ? > > Sincerely, > Darshan > > On Wed, Jul 20, 2016 at 11:55 AM, Darshan Pandya <darshanpan...@gmail.com> > wrote: > > > Nick, Thanks for your help. I'll test it out and respond back. > > > >

Re: Use of solr + banana for faceted search

2016-07-20 Thread Nick Vasilyev
Banana has a facet panel that allows you to configure several fields to facet on, you can have multiple fields and they will show up as an accordion. However, keep in mind that the field needs to be untokenized for faceting (i.e. string) and upon selection the filter is added to the fq parameter in

Re: How to re-index SOLR data

2016-08-09 Thread Nick Vasilyev
Hi, I work on a Python Solr client library and there is a reindexing helper module that you can use if you are on Solr 4.9+. I use it all the time and I think it works pretty well. You can re-index all documents from a collection into another

Re: Find groups where at least one item matches a query

2017-02-05 Thread Nick Vasilyev
Check out the group.limit argument. On Feb 5, 2017 12:10 PM, "Cristian Popovici" wrote: > Erick, thanks for your answer. > > Sorry - I forgot to mention that I do not know the group id when I perform > the query. > Grouping - I think - does not help for me as it

Discrepancy in json.facet unique and group.ngroups

2016-09-05 Thread Nick Vasilyev
Hi, I need to get the number of distinct values of a field and I am getting different counts between the json.facet interface and group.ngroups. Here are the two queries: {'q': '*:*', 'rows': 0, 'json.facet': '{'mfr': "unique('mfr')"}' }) This brings up around 6,000 in the mfr field. However,

Re: Discrepancy in json.facet unique and group.ngroups

2016-09-06 Thread Nick Vasilyev
, Alexandre Rafalovitch <arafa...@gmail.com> wrote: > Perhaps https://issues.apache.org/jira/browse/SOLR-7452 ? > > Newsletter and resources for Solr beginners and intermediates: > http://www.solr-start.com/ > > > On 5 September 2016 at 23:07, Nick Vasilyev <nic

Re: Miserable Experience Using Solr. Again.

2016-09-15 Thread Nick Vasilyev
Just wanted to chime in on the technical set-up of the Solr "petting zoo"; I think I can help here, just let me know what you need. Here is the idea: just have a Vagrant box with Ansible provisioning ZooKeeper and Solr, creating collections, etc. That way anyone starting out can just

Re: Best python 3 client for solrcloud

2016-11-24 Thread Nick Vasilyev
I am a committer for https://github.com/moonlitesolutions/SolrClient. I think it's pretty good; my aim with it is to provide several reusable modules for working with Solr in Python. Not just querying, but working with collections, indexing, reindexing, etc. Check it out and let me know what you

Re: How to retrieve 200K documents from Solr 4.10.2

2016-10-12 Thread Nick Vasilyev
Check out cursorMark, it should be available in your release. There is some good information on this page: https://cwiki.apache.org/confluence/display/solr/Pagination+of+Results On Wed, Oct 12, 2016 at 5:46 PM, Salikeen, Obaid < obaid.salik...@iacpublishinglabs.com> wrote: > Hi, > > I am using
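The cursorMark loop suggested above can be sketched as follows. The request parameters (cursorMark, a sort that includes the uniqueKey field, and the nextCursorMark value in the response) are the standard ones. The `search` callable, the uniqueKey name "id", and the in-memory fake_search stand-in are assumptions so that the example runs without a Solr server. Real cursor marks are opaque strings, not the integers the fake uses.

```python
def fetch_all(search, query, unique_key="id", rows=1000):
    """Yield every document matching `query` using cursorMark paging.

    `search` is any callable that takes a dict of Solr request parameters
    and returns the parsed JSON response as a dict.
    """
    cursor = "*"  # "*" asks Solr for the first page
    while True:
        response = search({
            "q": query,
            "rows": rows,
            "sort": f"{unique_key} asc",  # the sort must include the uniqueKey
            "cursorMark": cursor,
        })
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means no more results
            break
        cursor = next_cursor


def fake_search(params):
    """Stand-in for a Solr request, backed by five in-memory documents."""
    docs = [{"id": str(i)} for i in range(5)]
    start = 0 if params["cursorMark"] == "*" else int(params["cursorMark"])
    page = docs[start:start + params["rows"]]
    next_cursor = str(start + len(page)) if page else params["cursorMark"]
    return {"response": {"docs": page}, "nextCursorMark": next_cursor}


ids = [doc["id"] for doc in fetch_all(fake_search, "*:*", rows=2)]
print(ids)
```

For a real server, `search` would send the parameters to the collection's select handler and return the decoded JSON. Unlike large start offsets, the cursor keeps the cost per page roughly constant, which matters for a result set of this size.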