Re: query parsing

2015-09-24 Thread Upayavira
Typically, the index dir is inside the data dir. Delete the index dir and you should be good. If there is a tlog next to it, you might want to delete that also. If you don't have a data dir, I wonder whether you set the data dir when creating your core or collection. Typically the instance dir and

Re: faceting is unusable slow since upgrade to 5.3.0

2015-09-24 Thread Uwe Reh
On 23.09.2015 at 10:02, Mikhail Khludnev wrote: ... Accelerating non-DV facets is not so clear so far. Please show a profiler snapshot for non-DV facets if you wish to go this way. Hi, attached is a VisualVM profile of a simplified query (just one facet) run several times:

Re: faceting is unusable slow since upgrade to 5.3.0

2015-09-24 Thread Mikhail Khludnev
Uwe, unfortunately fieldValueCache was dropped there: https://github.com/apache/lucene-solr/commit/fca4c22da81447867533fb28c0f06150cdc2eb9d#diff-5ac9dc7b128b4dd99b764060759222b2R428 However, I see that it's still available in the new JSON facets (thus, you need to amend your app). Otherwise, you can

RE: Solr Join between two indexes taking too long.

2015-09-24 Thread Russell Taylor
Hi Mikhail, The initial join query is slow-ish but okay. Paging through the results is very fast, which is what we need, and 5.3 gives us that. In 4.10 it was still slow. So yes, we are happy with the query time. The longValue is not multivalued, I tried

Re: Solr Log Analysis

2015-09-24 Thread Ahmet Arslan
Hi Tarala, I have never used it myself, but please see: https://soleami.com/blog/soleami-start_en.html Ahmet On Thursday, September 24, 2015 3:16 AM, "Tarala, Magesh" wrote: I'm using Solr 4.10.4 in a 3 node cloud setup. I have 3 shards and 3 replicas for the collection. I want to

Re: query parsing

2015-09-24 Thread Alessandro Benedetti
I would focus on this : " > 5> now kick off the DIH job and look again. > Now it shows a histogram, but most of the "terms" are long -- the full texts of (the table.column) eventlogtext.logtext, including the whitespace (with %0A used for newline characters)... So, it appears it is not being

Re: Weird Exception

2015-09-24 Thread Upayavira
What were you trying to do when this happened? Bear in mind that a tdate field *is* by definition multivalued. It is indexed at multiple levels of precision. I bet if you reindexed with this field as a date field type, you wouldn't hit this issue. The date field type is still a TrieDateField, but

Re: Taking Solr to production with docker

2015-09-24 Thread Ugo Matrangolo
Hi, still don't get it :) With Solr 5 it auto-installs itself as a supervised service and works really nice in an AWS CloudFormation template. Best Ugo On Wed, Sep 23, 2015 at 10:01 PM, Joe Lawson < jlaw...@opensourceconnections.com> wrote: > we get to run commands like, docker run solr and

Re: faceting is unusable slow since upgrade to 5.3.0

2015-09-24 Thread Uwe Reh
On 22.09.2015 at 18:10, Walter Underwood wrote: Faceting on an author field is almost always a bad idea. Or at least a slow, expensive idea. Hi Wunder, in a technical context, the 'author' facet may be suboptimal. In our business (library services) it's a core feature. Yes the facet is

Help on autocomplete / suggester

2015-09-24 Thread Andrea Gazzarini
Hi guys, as part of a customer requirement, I need to provide an autocomplete / suggester feature. For that reason I started looking at the Suggester Component. The target Solr version is not yet determined: I mean, there's another project in production, of the same customer, which is using

Re: Is docValues required in Solr 5.x for distributed result grouping?

2015-09-24 Thread Alessandro Benedetti
I didn't know DocValues were used for anything apart from sorting and faceting (fc algorithm) on fields. Of course the DocValues data structure can be used by anything that wants to retrieve the column-based view of documents per field, but is it documented anywhere in which ways it is used in Solr? Cheers

RE: query parsing

2015-09-24 Thread Duck Geraint (ext) GBJH
Okay, so maybe I'm missing something here (I'm still relatively new to Solr myself), but am I right in thinking the following is still in your solrconfig.xml file: true managed-schema If so, wouldn't using a managed schema make several of your field definitions inside the

Re: Taking Solr to production with docker

2015-09-24 Thread Joe Lawson
I think this sums up "what is docker": https://youtu.be/F44GtxHO2MI On Sep 24, 2015 4:37 AM, "Ugo Matrangolo" wrote: > Hi, > > still don't get it :) > > With Solr 5 it auto-installs itself as a supervised service and works > really nice in an AWS CloudFormation

Re: Dismax and StandardTokenizer: OR queries despite mm=100%

2015-09-24 Thread Andreas Hubold
Thank you, autoGeneratePhraseQueries did the job. I assume that this setting just affects query generation and I don't need to reindex after changing the field type accordingly. Is this correct? BTW, I just found SOLR-3589 where the same issue was reported and fixed for the edismax parser.

Re: Dismax and StandardTokenizer: OR queries despite mm=100%

2015-09-24 Thread Ahmet Arslan
Hi Andreas, You are correct, no re-indexing required for autoGeneratePhraseQueries. Ahmet On Thursday, September 24, 2015 3:52 PM, Andreas Hubold wrote: Thank you, autoGeneratePhraseQueries did the job. I assume that this setting just affects query generation

Re: Taking Solr to production with docker

2015-09-24 Thread Epo Jemba
Ugo, don't get me wrong, I know Solr already scales by itself. But in some cases Solr, in order to be fully usable, has to be integrated/extended with a bunch of other apps: your own, load balancers, frontends, etc. In order for ALL of those to work together the right way, you come up with a higher

Re: Taking Solr to production with docker

2015-09-24 Thread Martijn Koster
> On 23 Sep 2015, at 15:13, Upayavira wrote: > > I'm wondering if there's anything specific that is needed to run Solr > inside Docker? Is there something you have in mind? There isn't really. See https://hub.docker.com/r/makuk66/docker-solr/

Re: faceting is unusable slow since upgrade to 5.3.0

2015-09-24 Thread Yonik Seeley
On Mon, Sep 21, 2015 at 8:09 AM, Uwe Reh wrote: > our bibliographic index (~20M entries) runs fine with Solr 4.10.3 > With Solr 5.3 faceted searching is constantly incredibly slow (~ 20 seconds) [...] > > The 'fieldValueCache' seems to be unused (no inserts nor

Re: Is docValues required in Solr 5.x for distributed result grouping?

2015-09-24 Thread Tomoko Uchida
Hi, > Of course the Doc Values data structure can be used by anything who wants > to retrieve the column base view of documents per field, but is anywhere > documented all the ways it's used in Solr ? According to the Solr guide about docvalues, it is used in faceting, sorting, and grouping.

Solr Cloud: Indexing in a Map reduce Job with Kerberos

2015-09-24 Thread Bertrand Venzal
Hi all, As a bit of background, we're trying to run a map-reduce job on a Hadoop cluster (CDH version 5.4.5) which involved writing from Solr during both the Map phase. To accomplish this, we are using the Solrj library with version 4.10.3-cdh5.4.5. In the driver class which launches the MR Job,

Re: faceting is unusable slow since upgrade to 5.3.0

2015-09-24 Thread Alessandro Benedetti
Yonik, I am really excited about the JSON faceting module. I find it really interesting. Are there any pros/cons in using it, or is it definitely the "approach of the future"? I saw your benchmarks and they seem impressive. I have not read the whole thread in detail, just briefly, but is JSON faceting

Re: query parsing

2015-09-24 Thread Erick Erickson
Geraint: Good Catch! I totally missed that. So all of our focus on schema.xml has been... totally irrelevant. Now that you pointed that out, there's also the addition: add-unknown-fields-to-the-schema, which indicates you started this up in "schemaless" mode. In short, solr is trying to guess

Re: Cloud Deployment Strategy... In the Cloud

2015-09-24 Thread Dan Davis
ant is very good at this sort of thing, and easier for Java devs to learn than Make. Python has a module called fabric that is also very fine, but for my dev. ops. it is another thing to learn. I tend to divide things into three categories: - Things that have to do with system setup, and need

Re: Is docValues required in Solr 5.x for distributed result grouping?

2015-09-24 Thread Oliver Schrenk
The error message looks a lot like this bug https://issues.apache.org/jira/browse/SOLR-7495 group.faceting is broken for numeric values. Does it mean that I have to enable docvalues for every field that I want to facet on? On 24 Sep 2015, at 17:02, Tomoko Uchida

Re: Solr Log Analysis

2015-09-24 Thread Will Hayes
Hi - If you use Logstash for the log ingestion the config below will parse what you need for search analytics including: terms, 0 results, response times and more. Happy to assist off-list if you have any questions

[ANNOUNCE] Apache Lucene 5.3.1 released

2015-09-24 Thread Noble Paul
24 September 2015, Apache Solr™ 5.3.1 available The Lucene PMC is pleased to announce the release of Apache Solr 5.3.1 Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting,

[ANNOUNCE] Apache Solr 5.3.1 released

2015-09-24 Thread Noble Paul
24 September 2015, Apache Solr™ 5.3.1 available The Lucene PMC is pleased to announce the release of Apache Solr 5.3.1 Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting,

Re: [ANNOUNCE] Apache Lucene 5.3.1 released

2015-09-24 Thread Noble Paul
Wrong title On Thu, Sep 24, 2015 at 10:55 PM, Noble Paul wrote: > 24 September 2015, Apache Solr™ 5.3.1 available > > > The Lucene PMC is pleased to announce the release of Apache Solr 5.3.1 > > > Solr is the popular, blazing fast, open source NoSQL search platform > > from

Different ports for search and upload request

2015-09-24 Thread Siddhartha Singh Sandhu
Hi, I wanted to know if we can configure different ports as end points for uploading and searching API. Also, if someone could point me in the right direction. Regards, Sid.

Re: Solr Log Analysis

2015-09-24 Thread Otis Gospodnetić
Hi Magesh, Here are 2 more solutions you could use: 1) Site Search Analytics -- this basically integrates into your search results via JS like Google Analytics and automatically captures a bunch of search and click data and gives you a number of

Autowarm and filtercache invalidation

2015-09-24 Thread Jeff Wartes
If I configure my filterCache like this: and I have <= 10 distinct filter queries I ever use, does that mean I’ve effectively disabled cache invalidation? So my cached filter query results will never change? (short of JVM restart) I’m unclear on whether autowarm simply copies the value into

Re: Different ports for search and upload request

2015-09-24 Thread Siddhartha Singh Sandhu
Thank you so much. Safe to ignore the following(not a query):- *Never did this. *But how about this crazy idea: Take an Amazon EFS and share it between 2 EC2. Use one EC2 endpt to update the index on EFS while the other reads from it. This way each EC2 can use its own compute and not share its

Re: Different ports for search and upload request

2015-09-24 Thread Siddhartha Singh Sandhu
Hey, Thank you for your reply. The use case would be that I can concurrently load data into my index via one port and then make that(*data) available(NRT search) to user through another high availability search endpoint without the fear of my requests clogging one port. Regards, Sid. On Thu,

Re: Different ports for search and upload request

2015-09-24 Thread Yonik Seeley
On Thu, Sep 24, 2015 at 5:00 PM, Siddhartha Singh Sandhu wrote: > Hey, > > Thank you for your reply. > > The use case would be that I can concurrently load data into my index via > one port and then make that(*data) available(NRT search) to user through > another high

Re: Different ports for search and upload request

2015-09-24 Thread Alexandre Rafalovitch
But they would still compete for the servlet engine's threads. Putting them on different ports will not change anything. Now, if you wanted to put them on different network interfaces, that could be something. But I do not think it is possible, as the select and update are both just configuration

Re: Different ports for search and upload request

2015-09-24 Thread Shawn Heisey
On 9/24/2015 2:01 PM, Siddhartha Singh Sandhu wrote: > I wanted to know if we can configure different ports as end points for > uploading and searching API. Also, if someone could point me in the right > direction. From our perspective, no. I have no idea whether it is possible at all ... it

Re: Autowarm and filtercache invalidation

2015-09-24 Thread Jeff Wartes
Answering my own question: Looks like the default filterCache regenerator uses the old cache to re-execute queries in the context of the new searcher and does nothing with the old cache value. So, the new searcher’s cache contents will be consistent with that searcher’s view, regardless of

Re: Different ports for search and upload request

2015-09-24 Thread billnbell
Scary stuff. If you did that, you had better reload the core. Bill Bell Sent from mobile > On Sep 24, 2015, at 5:05 PM, Siddhartha Singh Sandhu > wrote: > > Thank you so much. > > Safe to ignore the following(not a query):- > > *Never did this. *But how about this crazy

Re: Autowarm and filtercache invalidation

2015-09-24 Thread Erick Erickson
Jeff: Yes, exactly. Otherwise the autowarming would be quite useless since what's stored in the cache is the _lucene_ doc ID (either as a bitmap or as a list of IDs). And the lucene doc ID can change when merging, so the old IDs are useless. Best, Erick On Thu, Sep 24, 2015 at 2:11 PM, Jeff
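
Editor's illustration for the autowarm thread above: a minimal sketch of the behaviour Jeff and Erick describe, where autowarming reuses only the cached filter keys and re-executes them against the new searcher, so stale doc IDs never carry over. This is a toy model, not Solr code; `ToySearcher`, `run_filter`, and `autowarm` are hypothetical names invented for the example, and the reordered document list merely stands in for doc IDs shifting after a merge.

```python
# Toy model only: these names are hypothetical and are NOT Solr's API.
class ToySearcher:
    """Stands in for one view of the index; a doc ID is a position in docs."""

    def __init__(self, docs):
        self.docs = docs  # list of dicts, one per document

    def run_filter(self, field, value):
        # "Execute" a filter query and return the matching internal doc IDs.
        return {i for i, d in enumerate(self.docs) if d.get(field) == value}


def autowarm(old_cache, new_searcher, autowarm_count):
    """Re-execute up to autowarm_count cached keys against the new searcher.

    Only the keys are reused; the old values (doc ID sets) are discarded.
    """
    new_cache = {}
    for key in list(old_cache)[:autowarm_count]:
        new_cache[key] = new_searcher.run_filter(*key)
    return new_cache


old = ToySearcher([{"type": "book"}, {"type": "dvd"}, {"type": "book"}])
old_cache = {("type", "book"): old.run_filter("type", "book")}

# After a commit or merge, documents may sit at different doc IDs.
new = ToySearcher(
    [{"type": "dvd"}, {"type": "book"}, {"type": "book"}, {"type": "book"}]
)
new_cache = autowarm(old_cache, new, autowarm_count=10)
```

In this sketch the warmed entry reflects the new searcher's view rather than a copy of the old value, which is the point made in the thread.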

Re: faceting is unusable slow since upgrade to 5.3.0

2015-09-24 Thread billnbell
Can we add it back with a parameter at least ? Bill Bell Sent from mobile > On Sep 24, 2015, at 8:58 AM, Yonik Seeley wrote: > >> On Mon, Sep 21, 2015 at 8:09 AM, Uwe Reh wrote: >> our bibliographic index (~20M entries) runs fine with Solr

Re: faceting is unusable slow since upgrade to 5.3.0

2015-09-24 Thread Yonik Seeley
On Thu, Sep 24, 2015 at 9:58 AM, Yonik Seeley wrote: > Indeed. Use of the fieldValueCache (UnInvertedField) was secretly > removed as part of LUCENE-5666, causing these performance regressions. I did some performance benchmarks and opened an issue. It's bad.

Re: Different ports for search and upload request

2015-09-24 Thread Susheel Kumar
I am not aware of such a feature in Solr but do want to know your use case / logic behind coming up with different ports. If it is for security / exposing to user, usually Solr shouldn't be exposed to user directly but via application / service / api. Thanks, Susheel On Thu, Sep 24, 2015 at

Re: faceting is unusable slow since upgrade to 5.3.0

2015-09-24 Thread Yonik Seeley
On Thu, Sep 24, 2015 at 10:16 AM, Alessandro Benedetti wrote: > Yonik, I am really excited about the Json faceting module. > I find it really interesting. > Is there any pros/cons in using them, or it's definitely the "approach of > the future" ? Thanks! The cons to

How to know index file in OS Cache

2015-09-24 Thread Aman Tandon
Hi, Is there any way to know whether the index files are present in the OS cache or RAM? I want to check if the index is present in the RAM or in OS cache and which files are not in either of them. With Regards Aman Tandon