Re: SPLITSHARD throwing OutOfMemory Error

2018-10-04 Thread Zheng Lin Edwin Yeo
Hi Atita, It would be good to consider upgrading to benefit from newer features such as improved memory consumption and better authentication. On a side note, it is also a good idea to upgrade to Solr 7 now, as Solr indexes can only be upgraded from the previous major release version (Solr 6) to the

Connecting Solr to Nutch

2018-10-04 Thread Timeka Cobb
Hello out there! I'm trying to create a small search engine and have installed Nutch 1.15 and Solr 7.5.0. The issue now is connecting the two, primarily because the files required to create the Nutch core in Solr don't exist, i.e. basicconfig. How do I go about connecting the two so I can begin crawling

Re: checksum failed (hardware problem?)

2018-10-04 Thread Stephen Bianamara
To be more concrete: Is the definitive test of whether or not a core's index is corrupt to copy it onto a new set of hardware and attempt to write to it? If this is a definitive test, we can run the experiment and update the report so you have a sense of how often this happens. Since this is a

Boolean clauses in ComplexPhraseQuery

2018-10-04 Thread Chuming Chen
Hi All, Does Solr support boolean clauses inside ComplexPhraseQuery? For example: {!complexphrase inOrder=true} NOT (field: “value is this” OR field: “value is that”) Thanks, Chuming

Re: Filtering group query results

2018-10-04 Thread Shawn Heisey
On 10/4/2018 7:10 AM, Greenhorn Techie wrote: We have a requirement where we need to perform a group query in Solr where results are grouped by user-name (which is a field in our indexes). We then need to filter the results based on numFound response parameter present under each group. In

Re: Modify the log directory for dih

2018-10-04 Thread Shawn Heisey
On 10/4/2018 12:30 AM, lala wrote: Hi, I am using: Solr: 7.4 OS: windows7 I start solr using a service on startup. In that case, I really have no idea where anything is on your system. There is no service installation from the Solr project for Windows -- either you obtained that from

Re: solr and diversification

2018-10-04 Thread Diego Ceccarelli (BLOOMBERG/ LONDON)
The use case is on ranking news, Joel. And yes, I have the feeling that it might improve relevance, and in 2011/2012 there was a lot of work on this in academia. Thanks Tim, I'll check out MMR. From: solr-user@lucene.apache.org At: 09/28/18 20:24:44 To: solr-user@lucene.apache.org Subject:

Filtering group query results

2018-10-04 Thread Greenhorn Techie
Hi, We have a requirement where we need to perform a group query in Solr where results are grouped by user-name (which is a field in our indexes). We then need to filter the results based on numFound response parameter present under each group. In essence, we want to return results only where

Re: Update Request Processors are Not Chained

2018-10-04 Thread Furkan KAMACI
I found the problem :) The problem is that the processors are not combined into one chain. On Thu, Oct 4, 2018 at 3:57 PM Furkan KAMACI wrote: > I've defined my update processors as: > > > class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory"> > >

Update Request Processors are Not Chained

2018-10-04 Thread Furkan KAMACI
I've defined my update processors as: content en,tr language_code other true true true signature false content 3 org.apache.solr.update.processor.TextProfileSignature

Re: SPLITSHARD throwing OutOfMemory Error

2018-10-04 Thread Atita Arora
Hi Andrzej, We have been weighing a lot of other reasons to upgrade our Solr for a very long time, like better authentication handling, backups using CDCR, and the new replication mode, and this has probably just given us another reason to upgrade. Thank you so much for the suggestion, I think it's good to

Re: SPLITSHARD throwing OutOfMemory Error

2018-10-04 Thread Andrzej Białecki
I know it’s not much help if you’re stuck with Solr 6.1 … but Solr 7.5 comes with an alternative strategy for SPLITSHARD that doesn’t consume as much memory and consumes almost no additional disk space on the leader. This strategy can be turned on with the “splitMethod=link” parameter. > On 4
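For readers who want to see what such a request looks like, here is a minimal sketch that only builds the Collections API URL with splitMethod=link; it does not send anything. The host, collection name and shard name below are placeholders assumed for illustration, not values from this thread, and the parameter is only expected to work on Solr 7.5 or later.

```python
from urllib.parse import urlencode


def build_splitshard_url(base_url, collection, shard, split_method="link"):
    """Build a Collections API SPLITSHARD request URL.

    base_url, collection and shard are caller-supplied; the values used
    in the example call below are hypothetical placeholders.
    """
    params = {
        "action": "SPLITSHARD",
        "collection": collection,
        "shard": shard,
        # "link" selects the alternative split strategy mentioned above
        "splitMethod": split_method,
    }
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


# Hypothetical host, collection and shard names, for illustration only.
url = build_splitshard_url("http://localhost:8983/solr", "mycollection", "shard1")
print(url)
```

The printed URL can then be issued with any HTTP client against a test cluster before trying it on a large index.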

Re: SPLITSHARD throwing OutOfMemory Error

2018-10-04 Thread Atita Arora
Hi Edwin, Thanks for following up on this. So here are the configs: Memory - 30G - 20G to Solr, Disk - 1TB, Index = ~500G, and I think the reason this could be happening is that during the shard split, the unsplit index + split index persists on the instance and may

Re: SPLITSHARD throwing OutOfMemory Error

2018-10-04 Thread Zheng Lin Edwin Yeo
Hi Atita, What is the amount of memory that you have in your system? And what is your index size? Regards, Edwin On Tue, 25 Sep 2018 at 22:39, Atita Arora wrote: > Hi, > > I am working on a test setup with Solr 6.1.0 cloud with 1 collection > sharded across 2 shards with no replication. When

Re: CMS GC - Old Generation collection never finishes (due to GC Allocation Failure?)

2018-10-04 Thread Ere Maijala
Hi, In addition to what others wrote already, there are a couple of things that might trigger a sudden memory allocation surge that you can't really account for: 1. Deep paging, especially in a sharded index. Don't allow it and you'll be much happier. 2. Faceting without docValues

Re: Modify the log directory for dih

2018-10-04 Thread lala
Hi, I am using: Solr: 7.4 OS: windows7 I start solr using a service on startup. Additional info: I am developing a web application that uses solr as search engine, I use DIH to index folders in solr using the FileListEntityProcessor. What I need is logging each index operation in a file that I