Re: SolrCloud fails to create new collections

2014-02-06 Thread Ray Cheng
I restarted one of the Solr servers in the SolrCloud cluster and then domain creation worked. I am not sure why this happened, whether anybody else has seen it, or when the problem will happen again. Thanks, Ray On Wednesday, February 5, 2014 11:19 AM, Ray Cheng rch...@rocketmail.com wrote: Some more

Re: Optimize and replication: some questions battery.

2014-02-06 Thread Luis Cappa Banda
Hi Chris, Thank you very much for your response! It was very instructive. I knew some performance tips to improve search and I configured a very low merge factor (<mergeFactor>2</mergeFactor>) to boost search operations instead of indexing ones. I haven't got a deep knowledge of internal Lucene

Re: Optimize and replication: some questions battery.

2014-02-06 Thread Toke Eskildsen
On Thu, 2014-02-06 at 10:22 +0100, Luis Cappa Banda wrote: I knew some performance tips to improve search and I configured a very low merge factor (<mergeFactor>2</mergeFactor>) to boost search operations instead of indexing ones. That would give you a small search speed increase and a huge

Re: geofilt customization

2014-02-06 Thread Sohan Kalsariya
Let me explain my problem in more detail. I am working with the website http://allevents.in (it is an events discovery and promotion platform) and I am applying Solr search to this site. So whenever anyone visits this website and searches for events, for example: Concerts in London. For that I am using

Re: Optimize and replication: some questions battery.

2014-02-06 Thread Luis Cappa Banda
Hi Toke! Thanks for answering. That's it: I mention index corruption as a precaution, not because I have already noticed it. During some tests in the past I found that a mergeFactor of 2 improves search speed more than a little compared with common merge factors such as 10, for example. Of

Re: Optimize Index in solr 4.6

2014-02-06 Thread Shawn Heisey
On 2/5/2014 11:20 PM, Sesha Sendhil Subramanian wrote: I am running solr cloud with 10 shards. I do a batch indexing once every day and once indexing is done I call optimize. I see that optimize happens on each shard one at a time and not in parallel. Is it possible for the optimize to happen

Re: geofilt customization

2014-02-06 Thread Angel Tchorbadjiiski
Hey Sohan! On 06.02.2014 11:13, Sohan Kalsariya wrote: Let me explain my problem in more detail. I am working with the website http://allevents.in (it is an events discovery and promotion platform) and I am applying Solr search to this site. So whenever anyone visits this website and searches the

NumberFormatException while replicating an index with SolrCloud

2014-02-06 Thread Ugo Matrangolo
Hi, I ran into a weird error while replicating the index using SolrCloud. On a 4.6.1 cluster the indexing replica process fails for most of the documents with an error like this (on the replica side): 2014-02-06 11:55:45,249 [qtp-75] DEBUG org.apache.solr.update.processor.LogUpdateProcessor -

Re: geofilt customization

2014-02-06 Thread Yonik Seeley
On Thu, Feb 6, 2014 at 5:13 AM, Sohan Kalsariya sohankalsar...@gmail.com wrote: So, in short, I want to ask: is it possible not to define the `distance` parameter, so I can get my desired results? No, the distance parameter is required for geofilt. If you are still looking to calculate
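As a rough illustration of the point that geofilt always needs a distance, the filter query can be assembled like this. It is only a sketch: the helper name `geofilt_params`, the field name `location`, and the coordinates are hypothetical, while `sfield`, `pt`, and `d` are the standard geofilt local parameters.

```python
def geofilt_params(sfield, lat, lon, d_km):
    """Build request parameters for a geofilt filter query.

    sfield: name of the spatial field (hypothetical in this example)
    lat, lon: center point of the search
    d_km: radius in kilometres; geofilt cannot work without it
    """
    if d_km is None:
        # Mirrors the answer in the thread: the distance is required.
        raise ValueError("geofilt requires a distance (d)")
    fq = "{!geofilt sfield=%s pt=%s,%s d=%s}" % (sfield, lat, lon, d_km)
    return {"q": "*:*", "fq": fq}


# Example: everything within 10 km of a point in central London.
params = geofilt_params("location", 51.5, -0.12, 10)
print(params["fq"])
```

Leaving the distance out raises an error here rather than silently building an incomplete filter.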

Highlight results in Arabic are backword

2014-02-06 Thread Fatima Issawi
Hello, I am getting highlight results in Arabic, but the order of the words is backwards. Querying on that field gives me the correct result, though. Is there a setting I’m missing? An extract from an example query from my Solr Console is below: { responseHeader: { status: 0,

Performance impact using edismax over dismax

2014-02-06 Thread Srinivasa7
Hi All, I have a requirement to search for the entire string in the query parameter, so I have switched my query parser to edismax and am sending the query as 'actual query string'+OR+actual+query+string. It is returning the desired results; however, I am wondering whether there is a

Re: Highlight results in Arabic are backword

2014-02-06 Thread Steve Rowe
Hi Fatima, I don’t think there’s an actual problem, it just looks like it because the program you’re using to look at the JSON makes a different choice for laying out the highlighting results than it does for the field values. In fact, all the bytes are the same, and in the same order for

Intercept updates and cascade loading of Index.

2014-02-06 Thread soodyogesh
I would like to know whether there is a way to intercept, or get a callback, whenever new documents are added to a Collection. My use case is that I would have one Collection split across shards, and whenever a new document arrives in the Collection I would like to get a callback with the document itself so that I can take some of

Re: Performance impact using edismax over dismax

2014-02-06 Thread Jack Krupansky
Are you using the pf parameter to automatically add the quoted phrase? You can include a boost in pf so that documents with the full phrase rank higher. Generally, dismax and edismax should give comparable performance, but some parameter settings (e.g., pf, pf2, pf3) do add somewhat to query

RE: Partial Word Search

2014-02-06 Thread Teague James
Jack, Thanks for responding! I had tried configuring this asymmetrically before with no luck, so I tried it again, and still no luck. My understanding is that the default behavior for Solr is OR and I do not have a 'q.op=' anywhere that would change that behavior. Since it is only a 1 term search

Re: Performance impact using edismax over dismax

2014-02-06 Thread Srinivasa7
Hi Jack, I am not using the pf parameter; rather, I am sending the query in quotes. So for the sample query string 'east ende rs' I am querying "east ende rs"+OR+east+ende+rs with defType in solrconfig set to edismax. Other than that I am not using any parameters (pf, pf2, pf3). Thanks, Srinivasa --

Re: Newbie question on Deduplication overWriteDupes flag

2014-02-06 Thread Chris Hostetter
: How do I achieve, add if not there, fail if duplicate is found. I though You can use the optimistic concurrency features to do this, by including a _version_=-1 field value in the document. This will instruct Solr that the update should only be processed if the document does not already
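A minimal sketch of what such an update body could look like when sent as JSON. The helper name `create_only_doc` and the sample field values are hypothetical; the only part taken from the message is the `_version_` value of -1.

```python
import json


def create_only_doc(doc):
    """Return a JSON update body that should succeed only if the id is new.

    A negative _version_ asks Solr to reject the update when a document
    with the same unique key already exists.
    """
    guarded = dict(doc)          # copy, so the caller's dict is untouched
    guarded["_version_"] = -1    # "document must not already exist"
    return json.dumps([guarded])


# Hypothetical document; the body would be POSTed to the update handler.
body = create_only_doc({"id": "doc-1", "title": "first version"})
print(body)
```

When the document already exists, Solr is expected to answer with a version conflict error instead of overwriting it, so the client should be prepared to handle that response.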

Re: Partial Word Search

2014-02-06 Thread Jack Krupansky
Did you remember to completely reindex your data after changing the analyzer? Also, use the Solr admin UI Analysis page to verify the analysis for the test cases, both index and query. The fact that you are using the default query operator of OR means that even the symmetric analysis should

RE: Partial Word Search

2014-02-06 Thread Teague James
Update: RESOLVED. On a hunch I decided to forgo trying to separate the EdgeNGramFilterFactory from this one column and instead apply it to all columns that are copied into the 'text' field that Solr uses for searching. I moved the filter factory into fieldType 'text_general' which is the type that

Re: Performance impact using edismax over dismax

2014-02-06 Thread Jack Krupansky
Use the pf parameter and then you won't have to modify the original query at all! And you can add a boost for the phrase, which is a common practice. pf=search-field^10.0 -- Jack Krupansky -Original Message- From: Srinivasa7 Sent: Thursday, February 6, 2014 11:21 AM To:
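The suggestion above can be sketched as a set of request parameters. This is an illustration only: the helper name `edismax_params` and the field name `title` are hypothetical, and the boost of 10.0 simply follows the example in the message.

```python
from urllib.parse import urlencode


def edismax_params(user_query, field, phrase_boost=10.0):
    """Build edismax parameters that leave the user's query untouched.

    pf boosts documents in which the whole query appears as a phrase,
    so the client no longer has to add a quoted copy of the query itself.
    """
    return {
        "defType": "edismax",
        "q": user_query,                        # sent exactly as typed
        "qf": field,                            # field searched term by term
        "pf": "%s^%s" % (field, phrase_boost),  # phrase boost on the same field
    }


params = edismax_params("east ende rs", "title")
print(urlencode(params))
```

With this in place the client sends only the original query, and the phrase handling lives in one parameter that can be tuned or moved into the request handler defaults.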

Commit Issue in Solr 3.4

2014-02-06 Thread samarth s
Hi, I have been using Solr version 3.4 in a project for more than a year. It is only now that I have started facing a weird problem of never-ending, back-to-back commit cycles. Looking at the InfoStream logs, I can say that as soon as one commit cycle is done with another one

Re: Writing a customize updateRequestHandler

2014-02-06 Thread Chris Hostetter
: I want to write a custom updateRequestHandler. : Can you please guide me through the steps I need to perform for that? https://people.apache.org/~hossman/#xyproblem XY Problem Your question appears to be an XY Problem ... that is: you are dealing with X, you are assuming Y will help you, and you are

Re: Commit Issue in Solr 3.4

2014-02-06 Thread Shawn Heisey
On 2/6/2014 9:56 AM, samarth s wrote: Size of index = 260 GB; Total Docs = 100mn; Usual writing speed = 50K per hour; autoCommit-maxDocs = 400,000; autoCommit-maxTime = 1,500,000 (25 mins); merge factor = 10; M/c memory = 30 GB, Xmx = 20 GB; Server - Jetty; OS - CentOS 6. With 30GB of RAM (is
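For orientation, the quoted numbers can be put side by side with a little arithmetic. The sketch below assumes a steady indexing rate, which is an assumption of this example and not something stated in the thread; the function name `autocommit_trigger` is hypothetical.

```python
def autocommit_trigger(docs_per_hour, max_docs, max_time_ms):
    """Compare how long each autoCommit threshold takes to be reached.

    Returns the name of the threshold hit first and both durations in hours.
    Assumes documents arrive at a constant rate.
    """
    hours_to_max_docs = max_docs / docs_per_hour
    hours_to_max_time = max_time_ms / 1000 / 3600
    first = "maxTime" if hours_to_max_time < hours_to_max_docs else "maxDocs"
    return first, hours_to_max_docs, hours_to_max_time


# Values quoted in the message: 50K docs/hour, maxDocs 400,000, maxTime 1,500,000 ms.
first, docs_hours, time_hours = autocommit_trigger(50_000, 400_000, 1_500_000)
print(first, docs_hours, round(time_hours * 60))
```

Under that assumption the time limit (25 minutes) is reached long before the document limit (8 hours of indexing), so maxTime is the setting that would actually drive automatic commits.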

Re: Max Limit to Schema Fields - Solr 4.X

2014-02-06 Thread Erick Erickson
Sometimes you can spoof the many fields problem by using prefixes on the data. Rather than fielda, fieldb... have a single field and index values like fielda_value, fieldb_value into it. Then do the right thing when searching. Watch tokenization though. Best Erick On Feb 5, 2014 4:59 AM, Mike
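The prefixing trick can be sketched on the client side as follows. The function names and the sample record are hypothetical, and the sketch assumes the target field is not tokenized (for example a multi-valued string field), so that the underscore-joined value stays a single term.

```python
def to_prefixed_values(record):
    """Fold many logical fields into values for one multi-valued field.

    {"fielda": "red"} becomes ["fielda_red"], so the schema needs only
    a single real field however many logical fields exist.
    """
    return ["%s_%s" % (name, value) for name, value in sorted(record.items())]


def prefixed_term(name, value):
    """Build the matching search term for one logical field and value."""
    return "%s_%s" % (name, value)


values = to_prefixed_values({"fieldb": "blue", "fielda": "red"})
print(values)
print(prefixed_term("fielda", "red"))
```

The same function must be used at index time and at query time; if the field were tokenized on the underscore, the prefix and the value would be split apart and the trick would stop working.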

Re: Intercept updates and cascade loading of Index.

2014-02-06 Thread Shalin Shekhar Mangar
There is no callback in Solr. You can either have your indexer application provide a callback or you can write a custom UpdateRequestProcessor which can extract the relevant fields into a new document and use SolrJ to write it to another collection. On Thu, Feb 6, 2014 at 9:17 PM, soodyogesh

Re: high memory usage with small data set

2014-02-06 Thread Erick Erickson
Check the admin page for the number of used cache entries as time passes. I'm wondering if you're consuming lots of memory but it's not apparent at first, your caches might be filling up over time... FWIW, Erick On Feb 5, 2014 8:19 AM, Johannes Siegert johannes.sieg...@marktjagd.de wrote: Hi

Re: NumberFormatException while replicating an index with SolrCloud

2014-02-06 Thread Ugo Matrangolo
Hi, I have just found out what the problem was: Solr does not support non-native types like BigDecimals. Moving my schema fields to plain float solved the problem. Regards, Ugo On Thu, Feb 6, 2014 at 12:32 PM, Ugo Matrangolo ugo.matrang...@gmail.com wrote: Hi, I ran into a weird error while

Tf-Idf for a specific query

2014-02-06 Thread David Miller
Hi guys, I need to obtain the Tf-idf score from Solr for a certain set of documents. The catch is that I need the IDF (or DF) to be calculated on the documents returned by the specific query and not the entire corpus. Please give me a hint on whether Solr has this feature or if I can

Re: 4.3.1 SC - IndexWriter issues causing replication + failures

2014-02-06 Thread Tim Vaillancourt
Some more info to provide: -Replication almost never completes following the "this IndexWriter is closed" stack traces. -When the replication begins after the "this IndexWriter is closed" error, over a few hours the replica eventually fills the disk to 100% with index files under data/. There are so many

Re: Newbie question on Deduplication overWriteDupes flag

2014-02-06 Thread Alexandre Rafalovitch
A follow up question on this (as it is kind of new functionality). What happens if several documents are submitted and one of them fails due to that? Do they get rolled back or only one? Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn:

Re: Highlight results in Arabic are backword

2014-02-06 Thread Alexandre Rafalovitch
Arabic is complex. Basically, don't trust anything you see until you put that content on the screen with the surrounding tag marked with attribute dir='rtl' (e.g. <p dir='rtl'>arabic test</p>). Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn:
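A small sketch of that check: wrap the stored text in an element marked right-to-left before judging the word order. The helper name `rtl_paragraph` is hypothetical, and the sketch assumes plain field text; a highlight snippet that already contains markup such as emphasis tags would need to be handled without escaping those tags.

```python
from html import escape


def rtl_paragraph(text):
    """Wrap plain text in a paragraph that renders right-to-left.

    The text is escaped first so stray angle brackets cannot break the markup.
    """
    return "<p dir='rtl'>%s</p>" % escape(text)


# Write the result to an .html file and open it in a browser to inspect it.
print(rtl_paragraph("arabic test"))
```

Viewing the value this way removes the JSON viewer's own layout choices from the picture, which is what the advice above is aiming at.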

Re: Intercept updates and cascade loading of Index.

2014-02-06 Thread Alexandre Rafalovitch
You could probably use a commit hook script (in solrconfig.xml) and then pull the relevant documents/fields (if they are stored) into the other instances either with DIH ( http://wiki.apache.org/solr/DataImportHandler#SolrEntityProcessor ) or just as a triggered import. Otherwise, you may have

Re: Intercept updates and cascade loading of Index.

2014-02-06 Thread soodyogesh
Thanks for the reply. Does that mean I need to compile Solr with my custom request processor, or is there a way to extend the existing framework and plug in a new implementation? Thanks. -- View this message in context:

Re: geofilt customization

2014-02-06 Thread Sohan Kalsariya
Thank you. On Thu, Feb 6, 2014 at 6:38 PM, Yonik Seeley yo...@heliosearch.com wrote: On Thu, Feb 6, 2014 at 5:13 AM, Sohan Kalsariya sohankalsar...@gmail.com wrote: So, in short, I want to ask: is it possible not to define the `distance` parameter, so I can get my desired results?

Facet pivot and distributed search

2014-02-06 Thread Geert Van Huychem
Hi, I'm using Solr 4.5 in a multi-core environment. I've set up - one core per document type: text, rss, tweet and external documents. - one distrib core which basically distributes the query to the 4 cores mentioned above. Facet pivot works on each core individually, but when I send the exact

Re: Facet pivot and distributed search

2014-02-06 Thread Shalin Shekhar Mangar
Yes, this is an open issue. https://issues.apache.org/jira/browse/SOLR-2894 On Fri, Feb 7, 2014 at 1:13 PM, Geert Van Huychem ge...@iframeworx.be wrote: Hi, I'm using Solr 4.5 in a multi-core environment. I've set up - one core per document type: text, rss, tweet and external documents. - one