Re: Best SSD block size for large SOLR indexes

2014-03-19 Thread Salman Akram
Thanks for the info. The articles were really useful, but it still seems I have to do my own testing to find the right page size? I thought that for large indexes there would already be some tests done in the SOLR community. Side note: We are heavily using Microsoft technology (.NET etc.) for development so by

Re: Best SSD block size for large SOLR indexes

2014-03-19 Thread Salman Akram
We do have a couple of commodity SSDs already and they perform well. However, our user queries are very complex and quite a few of them take more than a minute, so we really had to do something about it. Comparing this beast with putting the whole index into RAM, the beast still seemed the better option. Also we

Re: Partial Counts in SOLR

2014-03-19 Thread Salman Akram
Anyone? On Mon, Mar 17, 2014 at 12:03 PM, Salman Akram salman.ak...@northbaysolutions.net wrote: Below is one of the sample slow queries that takes minutes! ((stock or share*) w/10 (sale or sell* or sold or bought or buy* or purchase* or repurchase*)) w/10 (executive or director) If a filter

Re: About enableLazyFieldLoading and memory

2014-03-19 Thread Miguel
An interesting check would be to disable compression on stored fields and see whether your searcher works better. Disabling compression should increase the size of the stored data, and the searcher should be quicker. I have read that to disable compression all you need to do is to write a new codec that uses a stored fields

Re: About enableLazyFieldLoading and memory

2014-03-19 Thread david . davila
That could be an interesting test. Unfortunately I don't have time to do that now, but maybe in the future. To avoid this memory consumption we have reduced the DocumentCache, and we don't have any problems. Besides, big queries that can cause problems are never made twice, so the

How to secure Solr admin page?

2014-03-19 Thread Tony Xue
Hi all, I was following the instructions in the official wiki: https://wiki.apache.org/solr/SolrSecurity But I don't have any idea what directory I should put between <url-pattern></url-pattern> to secure only the admin page. I tried <url-pattern>/admin/*</url-pattern> but it didn't work.

Re: SolrCloud constantly crashes after upgrading to Solr 4.7

2014-03-19 Thread Martin de Vries
We are running stable now for a full day, so the bug has been fixed. Many thanks! Martin

Support for wildcard queries in elevate.xml

2014-03-19 Thread Bratislav Stojanovic
Hi, I have searched the mailing list archives but couldn't find the right answer so far. I want to elevate some results using the instructions from the QueryElevationComponent page, but I'm not sure how to set queries in the *elevate.xml* file. My query looks like this: (content:foobar OR text:foobar) AND

Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Colin R
We run a central database of 14M (and growing) photos with dates, captions, keywords, etc. We are currently upgrading from old Lucene servers to the latest Solr running on a couple of dedicated servers (6 core, 36GB, 500SSD). Planning on using Solr Cloud. We take in thousands of changes each day

Re: SolrCloud constantly crashes after upgrading to Solr 4.7

2014-03-19 Thread Steve Rowe
I’m glad it’s working for you now, thanks for reporting back. - Steve On Mar 19, 2014, at 5:32 AM, Martin de Vries mar...@downnotifier.com wrote: We are running stable now for a full day, so the bug has been fixed. Many thanks! Martin

frange and field with hyphen

2014-03-19 Thread Marcin Rzewucki
Hi everyone, I ran into the following issue recently. I'm trying to use frange on a field which has a hyphen in its name: <lst name="params"> <str name="debugQuery">true</str> <str name="indent">on</str> <str name="q">*:*</str> <str name="wt">xml</str> <arr name="fq"> <str> {!frange l=1 u=99}sub(if(1,

Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Toke Eskildsen
On Wed, 2014-03-19 at 11:55 +0100, Colin R wrote: We run a central database of 14M (and growing) photos with dates, captions, keywords, etc. We are currently upgrading from old Lucene servers to the latest Solr running on a couple of dedicated servers (6 core, 36GB, 500SSD). Planning on using

Re: Indexing large documents

2014-03-19 Thread Alexei Martchenko
Even the most non-structured data has to have some breakpoint. I've seen projects running solr that used to index whole books one document per chapter plus a synopsis boosted doc. The question here is how you need to search and match those docs. alexei martchenko Facebook

Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Colin R
Hi Toke Thanks for replying. My question is really regarding index architecture. One big or many small (with merged big ones) We probably get 5-10K photos added each day. Others are updated, some are deleted. Updates need to happen quite fast (e.g. within minutes of our Databases receiving

Re: frange and field with hyphen

2014-03-19 Thread Marcin Rzewucki
Wow, that was a fast reply :) It works. Thank you! On 19 March 2014 13:24, Jack Krupansky j...@basetechnology.com wrote: For any improperly named field (one that doesn't use the Java identifier conventions), you simply need to use the field function with the field name in apostrophes:
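The quoted advice is cut off before its example, so the following is only a minimal sketch of the idea, not the text of the original reply. It assumes a hypothetical field called my-field and shows the hyphenated name wrapped in field('...') inside an frange filter, then URL-encoded as request parameters. The helper name frange_filter is made up for illustration.

```python
from urllib.parse import urlencode


def frange_filter(field_name, lower, upper):
    """Build an frange filter that wraps the field name in field('...').

    Assumption: the field name does not itself contain an apostrophe.
    """
    return "{!frange l=%s u=%s}field('%s')" % (lower, upper, field_name)


# "my-field" is a hypothetical field name used only for illustration.
params = {"q": "*:*", "fq": frange_filter("my-field", 1, 99)}
query_string = urlencode(params)
print(query_string)
```

Building the filter in one helper keeps the quoting in a single place, which matters because a bare my-field would otherwise be parsed as the subtraction my minus field.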

Sort by exact match

2014-03-19 Thread Rok Rejc
Hi all, I have a field in the index - let's call it Name. Name can have one or more words. I want to query all documents which match by name (full or partial match), and order the results: - first display result(s) with exact matches - after that display results with partial matches and order them

Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Toke Eskildsen
On Wed, 2014-03-19 at 13:28 +0100, Colin R wrote: My question is really regarding index architecture. One big or many small (with merged big ones) One difference is that having a single index/collection gives you better ranked searches within each collection. If you only use date/filename

Re: Partial Counts in SOLR

2014-03-19 Thread Erick Erickson
Yes, that'll be slow. Wildcards are, at best, interesting and at worst resource consumptive. Especially when you're doing this kind of positioning information as well. Consider looking at the problem sideways. That is, what is your purpose in searching for, say, buy*? You want to find buy,

Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Colin R
Hi Toke Our current configuration is Lucene 2.(something) with a RAILO/CFML app server. 10K drives, Quad Core, 16GB, Two servers. But the indexing and searching are starting to fail and our developer is no longer with us, so it is quicker to rebuild than to fix all the code. Our existing config is lots

Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Erick Erickson
Oh My. 2(something) is ancient, I second your move to scrap the current situation and start over. I'm really curious what the _reason_ for such a complex setup is/was. I second Toke's comments. This is actually quite small by modern Solr/Lucene standards. Personally I would index them all to a

Re: frange and field with hyphen

2014-03-19 Thread Erick Erickson
Jack's solution works, but I really, truly, strongly recommend that you follow the usual Java variable-naming conventions for your fields. In fact, I tend to use only lower case and underscores. The reason is that you'll run into this again and again and again. Your front-end will forget to put
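As a rough illustration of the naming habit described above (only lower case and underscores), here is a small check that could run over schema field names before they are used. The rule itself is an assumption drawn from the message: digits are allowed after the first character, which the message does not state, and the function name is hypothetical.

```python
import re

# Assumed convention: lower-case letters, digits and underscores,
# not starting with a digit. Hyphens and upper case are rejected.
FIELD_NAME = re.compile(r"^[a-z_][a-z0-9_]*$")


def is_safe_field_name(name):
    """Return True if the name follows the assumed lower_case convention."""
    return FIELD_NAME.match(name) is not None


for candidate in ["sub_total", "sub-total", "SubTotal", "1st_price"]:
    print(candidate, is_safe_field_name(candidate))
```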

Re: Sort by exact match

2014-03-19 Thread Erick Erickson
Sorting applies to the entire result set; there's no notion of sorting some docs one way and others another way. So I don't know any OOB way to do what you want. I don't know what your response time requirements are, but you could do this by firing off two queries and collating the results. If
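The two-query idea can be sketched on the client side roughly as follows. This is not Solr code: it assumes the application has already run an exact-match query and a partial-match query and holds each result as a list of dicts with a unique id key. The function name and the sample documents are hypothetical.

```python
def collate(exact_docs, partial_docs, key="id"):
    """Merge two result lists: exact matches first, then partial matches.

    Documents already seen in the exact list are skipped in the partial
    list, and the original order inside each list is preserved.
    """
    seen = set()
    merged = []
    for doc in list(exact_docs) + list(partial_docs):
        if doc[key] not in seen:
            seen.add(doc[key])
            merged.append(doc)
    return merged


# Hypothetical results from the two queries.
exact = [{"id": "7", "Name": "john smith"}]
partial = [{"id": "3", "Name": "john"}, {"id": "7", "Name": "john smith"}]
print(collate(exact, partial))
```

The deduplication matters because a document that matches exactly will normally also come back from the partial-match query.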

join and filter query with AND

2014-03-19 Thread Marcin Rzewucki
Hi, I have the following issue with the join query parser and a filter query. For such a query: <str name="q">*:*</str> <str name="fq"> (({!join from=inner_id to=outer_id fromIndex=othercore}city:Stara Zagora)) AND (prod:214) </str> I got an error: <lst name="error"> <str name="msg"> org.apache.solr.search.SyntaxError:

searche for single char number when ngram min is 3

2014-03-19 Thread Andreas Owen
Is there a way to tell NGramFilterFactory while indexing that numbers shall never be tokenized? Then the query should be able to find numbers. Or do I have to change the ngram min for numbers to 1, if that is possible? So to speak, put the whole number in as a token and not all possible tokens. Or can I

Re: More heap usage in Solr during indexing

2014-03-19 Thread solr2020
We autocommit every five minutes.

Re: searche for single char number when ngram min is 3

2014-03-19 Thread Jack Krupansky
Interesting point. I think it would be nice to have an option to treat numeric sequences (or maybe with commas and decimal point as well) as integral tokens that won't be split by ngramming. It's worth a Jira. OTOH, you have to make a value judgment whether a query for 3.14 should only exact

Re: Best SSD block size for large SOLR indexes

2014-03-19 Thread Shawn Heisey
On 3/19/2014 12:09 AM, Salman Akram wrote: Thanks for the info. The articles were really useful, but it still seems I have to do my own testing to find the right page size? I thought that for large indexes there would already be some tests done in the SOLR community. Side note: We are heavily using

Re: join and filter query with AND

2014-03-19 Thread Erick Erickson
It looks to me like you're feeding this from some kind of text file and you really _do_ have a line break after Stara. Or you have a line break in the string you paste into the URL, or something similar. Kind of shooting in the dark though. Erick On Wed, Mar 19, 2014 at 8:48 AM, Marcin Rzewucki

Re: underscore in query error

2014-03-19 Thread Erick Erickson
Attachments don't come through the user list very well, you might have to put it up on pastebin or some such and provide a link. But my guess is that your analysis chain is doing something interesting you don't expect, the analyzer output you tried to paste would help here. Also, if you could

Re: Newbie Question: Master Index or 100s Small Index

2014-03-19 Thread Shawn Heisey
On 3/19/2014 4:55 AM, Colin R wrote: My question is an architecture one. These photos are currently indexed and searched in three ways. 1: The 14M pictures from above are split into a few hundred indexes that feed a single website. This means index sizes of between 100 and 500,000 entries

underscore in query error

2014-03-19 Thread Andreas Owen
If I use the underscore in the query I don't get any results. If I remove the underscore it finds the docs with underscore. Can I tell solr to search through the ngtf instead of the wdf or is there any better solution? Query: yh_cug I attached a doc with the analyzer output

Re: Partial Counts in SOLR

2014-03-19 Thread Salman Akram
This was one example. Users can even add phrase searches with wildcards/proximity etc so can't really use stemming. Sharding is definitely something we are already looking into. On Wed, Mar 19, 2014 at 6:59 PM, Erick Erickson erickerick...@gmail.com wrote: Yes, that'll be slow. Wildcards are,

Filter in terms component

2014-03-19 Thread Jilani Shaik
Hi, I have a huge index and am using Solr. I need the terms component with filtering by a field. Please let me know whether there is any way to get this. Please provide me some pointers, even if it means developing this by going through Lucene. Please suggest. Thanks, Jilani

Re: Filter in terms component

2014-03-19 Thread Ahmet Arslan
Hi Jilani, What features of the terms component are you after? If it is just terms.prefix, it could be simulated with the facet component and the facet.prefix parameter. The faceting component respects filter queries. On Wednesday, March 19, 2014 8:58 PM, Jilani Shaik jilani24...@gmail.com wrote: Hi,
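A minimal sketch of the substitution suggested above: request parameters that ask the facet component for terms with a given prefix while a filter query restricts the document set. The field names, the prefix and the filter value are hypothetical, as is the helper name. Only standard Solr request parameters (q, rows, fq, facet, facet.field, facet.prefix, facet.limit) are used.

```python
from urllib.parse import urlencode


def prefix_facet_params(field, prefix, filter_query, limit=10):
    """Build parameters that list terms of `field` starting with `prefix`,
    counted only over documents matching `filter_query`."""
    return [
        ("q", "*:*"),
        ("rows", "0"),  # only the facet counts are wanted, not documents
        ("fq", filter_query),
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),
        ("facet.limit", str(limit)),
    ]


# Hypothetical field names and values, for illustration only.
params = prefix_facet_params("title", "sol", "category:books")
print(urlencode(params))
```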

Re: Filter in terms component

2014-03-19 Thread Jilani Shaik
Hi Ahmet, I have gone through the facet component, but our application has 300+ million docs, and it is very time consuming with this component, and it also uses cache. So I have gone through the terms component where Solr is reading the index for field terms, is there any approach where I can get the

Re: Filter in terms component

2014-03-19 Thread Ahmet Arslan
Hi, If you just need counts, maybe you can make use of http://wiki.apache.org/solr/FunctionQuery#Relevance_Functions Ahmet On Wednesday, March 19, 2014 9:49 PM, Jilani Shaik jilani24...@gmail.com wrote: Hi Ahmet, I have gone through the facet component, as our application has 300+ million

w/10 ? [was: Partial Counts in SOLR]

2014-03-19 Thread T. Kuro Kurosaka
In the thread Partial Counts in SOLR, Salman gave us this sample query: ((stock or share*) w/10 (sale or sell* or sold or bought or buy* or purchase* or repurchase*)) w/10 (executive or director) I'm not familiar with this w/10 notation. What does this mean, and what parser(s) supports this

Excessive Heap Usage from docValues?

2014-03-19 Thread tradergene
Hello All, I'm hoping to get your assistance in debugging what seems like a memory issue. I have a Solr index with about 32 million docs. Each doc is relatively small but has multiple dynamic fields that are storing INTs. The initial problem that I had to resolve is that we were running into

Re: How to return more fields on Solr 4.5.1 Suggester?

2014-03-19 Thread Ahmet Arslan
Hey Omer, Create a copy of movie_title and use edgy_text described here: http://searchhub.org/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/ With this approach you can request whatever field you want with the fl parameter. Ahmet On Monday, March 17, 2014 3:48 PM, Erick Erickson

Re: Indexing large documents

2014-03-19 Thread Tom Burton-West
Hi Stephen, We regularly index documents in the range of 500KB-8GB on machines that have about 10GB devoted to Solr. In order to avoid OOM's on Solr versions prior to Solr 4.0, we use a separate indexing machine(s) from the search server machine(s) and also set the termIndexInterval to 8 times

Re: Zookeeper exceptions - SEVERE

2014-03-19 Thread Chris W
Thanks. Temporarily got over the problem by specifying custom limits through jute.maxbuffer=customlimit On Tue, Mar 18, 2014 at 9:45 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Sorry guys I spoke too fast. I looked at the code again. No it doesn't correlate with commits at all.

Re: w/10 ? [was: Partial Counts in SOLR]

2014-03-19 Thread Otis Gospodnetic
Hi, Guessing it's surround query parser's support for within backed by span queries. Otis Solr ElasticSearch Support http://sematext.com/ On Mar 19, 2014 4:44 PM, T. Kuro Kurosaka k...@healthline.com wrote: In the thread Partial Counts in SOLR, Salman gave us this sample query: ((stock or

Re: Excessive Heap Usage from docValues?

2014-03-19 Thread Otis Gospodnetic
Hi, Which type of doc values? See Wiki or reference guide for a list of types. Otis Solr ElasticSearch Support http://sematext.com/ On Mar 19, 2014 5:02 PM, tradergene nos...@krevets.com wrote: Hello All, I'm hoping to get your assistance in debugging what seems like a memory issue. I

Re: searche for single char number when ngram min is 3

2014-03-19 Thread Alexandre Rafalovitch
Does NGram factory support keyword token-type protection? If so, it could be just a matter of marking a number as keyword. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events

Solr4.7 No live SolrServers available to handle this request

2014-03-19 Thread Sathya
Hi Friends, I am new to Solr. I have 5 Solr nodes on 5 different machines. When I index the data, sometimes a *No live SolrServers available to handle this request* exception occurs on 1 or 2 machines. I don't know why it happens or how to solve this. Kindly help me to solve this issue.