Re: Handling categories( level one and two) based navigation

2013-08-14 Thread tamanjit.bin...@yahoo.co.in
This may be helpful, especially the last bit: http://blog.griddynamics.com/2011/06/solr-experience-search-parent-child.html

Re: Indexing hangs when more than 1 server in a cluster

2013-08-14 Thread Kevin Osborn
Interesting, that did work. Do you or anyone else have any ideas or what I should look at? While soft commit is not a requirement in my project, my understanding is that it should help performance. On the same index, I will be doing both a large number of queries as well as updates. If I have to

Re: SolrCloud: Programmatically create multiple collections?

2013-08-14 Thread xinwu
Hey Shawn. Thanks for your reply. I just want to access the base_url easily by a short instanceDir name. -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Programmatically-create-multiple-collections-tp3916927p4084480.html Sent from the Solr - User mailing list

Re: SolrCloud: Programmatically create multiple collections?

2013-08-14 Thread xinwu
Thank you Ani. -- View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-Programmatically-create-multiple-collections-tp3916927p4084485.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: SOLR4 Spatial sorting and query string

2013-08-14 Thread roySolr
Great, it works very well. In Solr 4.5 I will use geodist() again! Thanks David -- View this message in context: http://lucene.472066.n3.nabble.com/SOLR4-Spatial-sorting-and-query-string-tp4084318p4084487.html Sent from the Solr - User mailing list archive at Nabble.com.

Who's cleaning the Fieldcache?

2013-08-14 Thread Andrea Gazzarini
After doing some replications (replicationOnOptimize) I see - on master filesystem files that belong to two segments (I suppose the oldest is just a commit point) - on master admin console (SolrIndexReader{this=4f2452c6,r=ReadOnlyDirectoryReader@4f2452c6,refCnt=1,*segments=**1*}) but on

Re: autocomplete feature - where to begin

2013-08-14 Thread Mysurf Mail
Thanks. Will read it now :-) On Tue, Aug 13, 2013 at 8:33 PM, Cassandra Targett casstarg...@gmail.com wrote: The autocomplete feature in Solr is built on the spell checker component, and is called Suggester, which is why you've seen both of those mentioned. It's implemented with a

Re: Ping request uses wrong default search field?

2013-08-14 Thread Bram Van Dam
On 08/13/2013 06:12 PM, Chris Hostetter wrote: So if you have a defaults df in your solrconfig.xml it's going to override the defaultSearchField in schema.xml Alright, thanks for clarifying that. Makes sense now that I know the default in schema.xml is deprecated. - Bram

PostingsHighlighter returning fields which don't match

2013-08-14 Thread ses
We are trying out the new PostingsHighlighter with Solr 4.2.1 and finding that the highlighting section of the response includes self-closing tags for all the fields in hl.fl (by default for edismax it is all fields in qf) where there are no highlighting matches. In contrast the same query on Solr

Solr with custom tokenizer

2013-08-14 Thread Алексей Курган
There is a problem with a custom tokenizer for Solr. We have developed our own tokenizer that extracts phone numbers from the text and adds them as additional tokens to the token stream. Unfortunately, these additional tokens are not indexed by Solr. For example, the text Hello (111) 222-33-44 all!

Wrong leader election leads to shard removal

2013-08-14 Thread Manuel Le Normand
Hello, My Solr cluster runs on RH Linux with the tomcat7 servlet container. NumOfShards=40, replicationFactor=2, 40 servers each has 2 replicas. Solr 4.3 For experimental reasons I split my cluster into 2 sub-clusters, each containing a single replica of each shard. When connecting back these sub-clusters the

Clusterstate says state:recovering, but Core says I see state: null?

2013-08-14 Thread Tor Egil
Setup: 3 zk servers 3 solr 4.4 servers (1 shard with 2 replicas) Every now and then Solr gets trapped recovering Clusterstate says: Leader says: and the restarted replica says: I've tried removing the data directory and restarting the replica, but it ends up in the same loop. I must kill

get term frequency, just only keywords search

2013-08-14 Thread danielitos85
I need to get the term frequency in Solr 4, but only for the keywords I search for and not for all the keywords that a field contains. I'll try to explain with an example. I have these fields: <field name="id" type="string" indexed="true" stored="true" required="true" /> <field name="foods"

RE: get term frequency, just only keywords search

2013-08-14 Thread Markus Jelsma
Try the TermsComponent. It will return one or more terms and their counts for a given field only. -Original message- From:danielitos85 danydany@gmail.com Sent: Wednesday 14th August 2013 11:30 To: solr-user@lucene.apache.org Subject: get term frequency, just only keywords

RE: get term frequency, just only keywords search

2013-08-14 Thread danielitos85
Thanks for your answer. I'm trying to use TermsComponent, but the output is similar. TermsComponent returns all the terms (*and not just the term that I searched for*) and their counts for a given field. Is something wrong?

Re: SOLR4 Spatial sorting and query string

2013-08-14 Thread roySolr
Hello, I have a question about performance with a lot of points and spatial search. First I will explain my situation: we have some product data and want to store the geo location of every store that sells the product. I use a multivalued coordinates field with the geo data: arr

RE: get term frequency, just only keywords search

2013-08-14 Thread Markus Jelsma
Why? Using terms.limit or a ^term$ regex should limit the response to the exact term right? -Original message- From:danielitos85 danydany@gmail.com Sent: Wednesday 14th August 2013 12:20 To: solr-user@lucene.apache.org Subject: RE: get term frequency, just only keywords search
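A minimal sketch of the request parameters suggested above, assuming a TermsComponent request handler is configured and a field named foods (both are assumptions; the thread does not show the configuration). It only builds the query string and sends nothing to a server.

```python
import re
from urllib.parse import urlencode

def terms_params(field, term, limit=1):
    """Build TermsComponent parameters that should match exactly one term."""
    return {
        "terms": "true",
        "terms.fl": field,
        # Anchor the escaped term so only the exact term can match.
        "terms.regex": "^" + re.escape(term) + "$",
        "terms.limit": str(limit),
    }

# Hypothetical field and term, for illustration only.
query_string = urlencode(terms_params("foods", "pizza"))
```

The resulting string can be appended to whatever handler path exposes the TermsComponent in your solrconfig.xml.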

cross-core join functionquery could use a local qf parameter for parameter dereferencing

2013-08-14 Thread Paul Blanchaert
I'm using the same cross-core join functionquery quite extensively in my use case: for facet queries, filter queries and field aliasing. I tried to dereference the join query syntax with local parameters to enable a clean client request, but this seemed not so simple and only possible via introducing

Search against one field and boost against different fields..

2013-08-14 Thread deepakinniah
Is it possible to search on one field and boost on different fields, like q=text:deepak&defType=edismax&qf=name^10? If not, is there any other way to achieve this?

RE: get term frequency, just only keywords search

2013-08-14 Thread danielitos85
thanks a lot Markus ;) If I use regex parameter it works -- View this message in context: http://lucene.472066.n3.nabble.com/get-term-frequency-just-only-keywords-search-tp4084510p4084525.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Measuring SOLR performance

2013-08-14 Thread Dmitry Kan
Hi Roman, This looks much better, thanks! The ordinary non-comparison mode works. I'll post here if there are other findings. Thanks for the quick turnarounds, Dmitry On Wed, Aug 14, 2013 at 1:32 AM, Roman Chyla roman.ch...@gmail.com wrote: Hi Dmitry, oh yes, late night fixes... :) The latest

Unable to deploy solr 4.3.0 on jboss EAP 6.1 in mode full JavaEE 6

2013-08-14 Thread Roland Everaert
Hi, For the past months I have deployed and used SOLR 4.3.0 on a JBOSS EAP 6.1 using the standalone configuration. Now due to the addition of a new service, I have to start jboss with a modified version of the standalone-full.xml configuration file, because the service uses JavaEE 6. The only

RE: get term frequency, just only keywords search

2013-08-14 Thread danielitos85
Sorry, but now that I have looked more closely at the results, it doesn't return what I needed. I have two documents indexed. The text of my first document: ice-cream pizza pizza pizza. The text of my second document: pizza tomato. It returns the following: <response> <result name="response"

Re: Indexing hangs when more than 1 server in a cluster

2013-08-14 Thread Erick Erickson
right, SOLR-5081 is possible but somewhat unlikely given the fact that you actually don't have very many nodes in your cluster. soft commits aren't relevant to the tlog, but here's the thing. Your tlogs may get replayed when you restart solr. If they're large, this may take a long time. When you

Re: Clusterstate says state:recovering, but Core says I see state: null?

2013-08-14 Thread Mark Miller
What does the cluster state and leader say? Anything interesting you can pull from the logs? - Mark On Aug 14, 2013, at 4:58 AM, Tor Egil trefs...@gmail.com wrote: Setup: 3 zk servers 3 solr 4.4 servers (1 shard with 2 replicas) Every now and then Solr gets trapped recovering

Re: Wrong leader election leads to shard removal

2013-08-14 Thread Manuel Le Normand
Does this sound like the scenario that happened: By removing the index dir from replica 2 I also removed the tlog from which the zookeeper extracts the version of the two replicas and decides which one should be elected to leader. As replica 2 had no tlog, the zk didn't have any way to compare

Re: get term frequency, just only keywords search

2013-08-14 Thread Jack Krupansky
You can use the termfreq or tf function query in your field list to return the term frequency for a term, like: fl=id,tf(foods,'pizza') -- Jack Krupansky -Original Message- From: danielitos85 Sent: Wednesday, August 14, 2013 5:29 AM To: solr-user@lucene.apache.org Subject: get term

Re: Search against one field and boost against different fields..

2013-08-14 Thread Jack Krupansky
Try the bq boost query parameter: bq=name:deepak^10 Or, make the main query term mandatory and the boost term optional: q=+text:deepak name:deepak^10 -- Jack Krupansky -Original Message- From: deepakinniah Sent: Wednesday, August 14, 2013 2:38 AM To: solr-user@lucene.apache.org
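The first suggestion can be sketched as a set of request parameters. This is an illustration under assumptions: the field names text and name come from the question, the helper name is hypothetical, and the edismax parser is assumed because the question uses defType=edismax.

```python
from urllib.parse import urlencode

def boosted_query(term, search_field="text", boost_field="name", boost=10):
    """Search one field and add an optional boost query on another field."""
    return {
        "defType": "edismax",
        # Main query: restricts matching to the search field.
        "q": "%s:%s" % (search_field, term),
        # Boost query: raises the score of documents that also match here.
        "bq": "%s:%s^%d" % (boost_field, term, boost),
    }

params = boosted_query("deepak")
query_string = urlencode(params)
```

Documents that match only the main query are still returned; the boost query affects ranking, not the result set.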

Re: Indexing hangs when more than 1 server in a cluster

2013-08-14 Thread Jason Hellman
Kevin, I wouldn't have considered using softCommits at all based on what I understand from your use case. You appear to be loading in large batches, and softCommits are better aligned to NRT search where there is a steady stream of smaller updates that need to be available immediately. As

Re: SOLR4 Spatial sorting and query string

2013-08-14 Thread Smiley, David W.
Roy, How fast/slow this is is dependent on the total number of points in documents that match the search results. If one of those documents has 1000 points but most have a handful then it isn't such a big deal. The bigger problem is: https://issues.apache.org/jira/browse/LUCENE-4698 ~ David

Re: PostingsHighlighter returning fields which don't match

2013-08-14 Thread Jack Krupansky
No, there is no option to disable that feature of the postings highlighter. This code in PostingsSolrHighlighter.java: protected NamedList<Object> encodeSnippets(String[] keys, String[] fieldNames, Map<String,String[]> snippets) { NamedList<Object> list = new SimpleOrderedMap<Object>(); for (int i

Re: PostingsHighlighter returning fields which don't match

2013-08-14 Thread Robert Muir
On Wed, Aug 14, 2013 at 3:53 AM, ses stew...@ssims.co.uk wrote: We are trying out the new PostingsHighlighter with Solr 4.2.1 and finding that the highlighting section of the response includes self-closing tags for all the fields in hl.fl (by default for edismax it is all fields in qf) where

Re: SolrCloud: Programmatically create multiple collections?

2013-08-14 Thread Shawn Heisey
On 8/14/2013 12:34 AM, xinwu wrote: Hey Shawn .Thanks for your reply. I just want to access the base_url easily by a short instanceDir name. For index updates and queries, you *can* access it by the /solr/mycollection name. Although there may be no core by that name, the base URL will work.

Re: Clusterstate says state:recovering, but Core says I see state: null?

2013-08-14 Thread Tor Egil
This is from the leader log. (There are other statements in between, but I think they are irrelevant). First of all it reads the zookeeper state. This happens now and then. Then I guess the replica says Hey, I'm alive, please start the recover process: From the replica log, which tries to

Re: Wrong leader election leads to shard removal

2013-08-14 Thread Mark Miller
On Aug 14, 2013, at 9:01 AM, Manuel Le Normand manuel.lenorm...@gmail.com wrote: Does this sound like the scenario that happened: By removing the index dir from replica 2 I also removed the tlog Did you also remove the tlog dir? It's normally: data/index data/tlog from which the

Re: Clusterstate says state:recovering, but Core says I see state: null?

2013-08-14 Thread Tor Egil
Mark, just to be sure: can you see the raw-formatted text in my original (and last) post? It was left out of the text you quoted... -- View this message in context: http://lucene.472066.n3.nabble.com/Clusterstate-says-state-recovering-but-Core-says-I-see-state-null-tp4084504p4084577.html

Re: Clusterstate says state:recovering, but Core says I see state: null?

2013-08-14 Thread Shawn Heisey
On 8/14/2013 8:04 AM, Tor Egil wrote: This is from the leader log. (There are other statements in between, but I think they are irrelevant). First of all it reads the zookeeper state. This happens now and then. Then I guess the replica says Hey, I'm alive, please start the recover process:

Huge discrepancy between QTime and ElapsedTime

2013-08-14 Thread Jean-Sebastien Vachon
Hi All, I am running some benchmarks to tune our Solr 4.3 cloud and noticed that while the reported QTime is quite satisfactory (100 ms or so), the elapsed time is quite large (around 5 seconds). The collection contains 12.8M documents and the index size on disk is about 35 GB. I have only

Re: Huge discrepancy between QTime and ElapsedTime

2013-08-14 Thread Scott Lundgren
Jean-Sebastien, We have had similar issues. In our cases, our QTime varied between 100ms and as much as 120s (that's right, 120,000ms). The times were so long that they resulted in timeouts upstream. In our case, we have settled in on the following hypothesis: The actual retrieval time (clock

Re: Clusterstate says state:recovering, but Core says I see state: null?

2013-08-14 Thread Mark Miller
On Aug 14, 2013, at 10:24 AM, Shawn Heisey s...@elyograg.org wrote: I'm not sure what happens if you swap cores in a SolrCloud environment. It's possible that this kind of swapping could lead to a very unstable system. Yeah, it's totally not supported. I've threatened to make a JIRA issue

Re: Clusterstate says state:recovering, but Core says I see state: null?

2013-08-14 Thread Mark Miller
On Aug 14, 2013, at 10:04 AM, Tor Egil trefs...@gmail.com wrote: The name of the core swap is used because I would like to upload new configs to the swap core, and then do an actual swap when the core is up and running with new data…. I would do this with two collections and a collection

Re: Clusterstate says state:recovering, but Core says I see state: null?

2013-08-14 Thread Michael Della Bitta
We build new collections nightly (identifiers change for us) and change aliases once they're done. Easy and effective. Michael Della Bitta
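The alias approach can be sketched as a URL builder for the Collections API CREATEALIAS action. The base URL, alias name and collection name below are hypothetical; the sketch only builds the URL and does not call a server.

```python
from urllib.parse import urlencode

def create_alias_url(base_url, alias, collection):
    """Build a Collections API URL that points an alias at a collection."""
    params = {
        "action": "CREATEALIAS",
        "name": alias,
        "collections": collection,
    }
    return base_url.rstrip("/") + "/admin/collections?" + urlencode(params)

# Hypothetical names: repoint the "live" alias at tonight's build.
url = create_alias_url("http://localhost:8983/solr", "live", "products_20130814")
```

Clients keep querying the alias name, so repointing it at a freshly built collection takes the place of a core swap.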

Re: Huge discrepancy between QTime and ElapsedTime

2013-08-14 Thread Shawn Heisey
On 8/14/2013 9:09 AM, Jean-Sebastien Vachon wrote: I am running some benchmarks to tune our Solr 4.3 cloud and noticed that while the reported QTime is quite satisfactory (100 ms or so), the elapsed time is quite large (around 5 seconds). The collection contains 12.8M documents and the index

Re: get term frequency, just only keywords search

2013-08-14 Thread danielitos85
Thanks Jack, I'm trying to use the tf function, but I don't understand: why does it return a float value and not an integer? At the start of this topic I explained an example where I used term frequency, but it doesn't work the way I need because it returns the term frequency for all the terms of my field.

Re: Indexing hangs when more than 1 server in a cluster

2013-08-14 Thread Kevin Osborn
Thanks so much for your help and for the explanations. Eventually, we will be doing several batches in parallel. But at least now I know where to look and can do some testing on various scenarios. Since we may be doing a lot of heavy uploading (while still doing a lot of queries), having a

Re: get term frequency, just only keywords search

2013-08-14 Thread Jack Krupansky
The rationale for float vs. integer is probably that function queries are primarily intended for computing a boost for scoring which is a float. The term vectors search component is designed to provide all term vectors for each selected document. -- Jack Krupansky -Original Message-

Re: Shard splitting failure, with and without composite hashing

2013-08-14 Thread mewmewball
Hey guys, I filed a jira for this and apparently this problem has been fixed in Lucene but didn't make it into the 4.4 release. Please see jira for more info about patches: https://issues.apache.org/jira/browse/SOLR-5144.

Re: Huge discrepancy between QTime and ElapsedTime

2013-08-14 Thread Yonik Seeley
On Wed, Aug 14, 2013 at 12:39 PM, Shawn Heisey s...@elyograg.org wrote: You also have grouping enabled. From what I understand, that can be slow. If you turn that off, what happens to your elapsed times? QTime would include that. It includes everything up until the point where the response
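One way to see the gap discussed in this thread is to compare the QTime reported in the response header with the wall-clock time measured by the client. The sketch below assumes a JSON response (wt=json) and takes the HTTP call as a caller-supplied function, so no particular client library is implied; the helper name is hypothetical.

```python
import json
import time

def timing_gap(fetch):
    """Return (QTime, elapsed, elapsed - QTime) in milliseconds.

    fetch is any callable that performs the request and returns the raw
    JSON response body as a string.
    """
    started = time.perf_counter()
    body = fetch()
    elapsed_ms = (time.perf_counter() - started) * 1000.0
    qtime_ms = json.loads(body)["responseHeader"]["QTime"]
    return qtime_ms, elapsed_ms, elapsed_ms - qtime_ms
```

A large third value suggests time spent outside what QTime covers, such as writing and transferring the response.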

Re: get term frequency, just only keywords search

2013-08-14 Thread danielitos85
Thanks. I tried the termfreq function and it works for me, but my last question is: if I have a query like q=pizza+tomato&fl=id,termfreq(myfield,**), how do I pass my query (pizza+tomato) as the second parameter of the termfreq function?

RE: Huge discrepancy between QTime and ElapsedTime

2013-08-14 Thread Jean-Sebastien Vachon
Thanks Shawn and Scott for your feedback. It is really appreciated. -Original Message- From: Shawn Heisey [mailto:s...@elyograg.org] Sent: August-14-13 12:39 PM To: solr-user@lucene.apache.org Subject: Re: Huge discrepancy between QTime and ElapsedTime On 8/14/2013 9:09 AM,

Distance sort on a multi-value field

2013-08-14 Thread Jeff Wartes
I'm still pondering aggregate-type operations for scoring multi-valued fields (original thread: http://goo.gl/zOX53f ), and it occurred to me that distance-sort with SpatialRecursivePrefixTreeFieldType must be doing something like that. Somewhat surprisingly I don't see this in the documentation

Load a list of values in a solr field and query over its items

2013-08-14 Thread Utkarsh Sengar
Hello, Is it possible to load a list in a Solr field and query for items in that list? example_core1: document1: FieldName=user_ids Value=8,6,1,9,3,5,7 FieldName=allText Value=text to be searched over with title and description document2: FieldName=user_ids Value=8738,624623,7272.82272,733

Re: Load a list of values in a solr field and query over its items

2013-08-14 Thread Aloke Ghoshal
Should work once you set up both fields as multiValued (http://wiki.apache.org/solr/SchemaXml#Common_field_options). On Thu, Aug 15, 2013 at 12:07 AM, Utkarsh Sengar utkarsh2...@gmail.com wrote: Hello, Is it possible to load a list in a Solr field and query for items in that list?

Re: Load a list of values in a solr field and query over its items

2013-08-14 Thread Utkarsh Sengar
Never mind, got my answer here: http://stackoverflow.com/a/5800830/231917 <field name="tags">tag1</field> <field name="tags">tag2</field> ... <field name="tags">tagn</field> Once you have all the values indexed, you can search or filter results by any value, e.g. you can find all documents with tag1 using a query like
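As a rough illustration of the multi-valued approach, the sketch below shapes a document for the JSON update format and builds a filter for one value. The field names user_ids and allText come from the question; the id field and the helper names are assumptions, and user_ids is assumed to be declared multiValued in the schema.

```python
import json

def build_doc(doc_id, user_ids, text):
    """Shape one document; a list value maps to a multi-valued field."""
    return {
        "id": doc_id,                            # assumed unique key field
        "user_ids": [str(u) for u in user_ids],  # one entry per value
        "allText": text,
    }

def user_filter(user_id):
    """Filter query matching documents whose user_ids contains user_id."""
    return "user_ids:%s" % user_id

doc = build_doc("document1", [8, 6, 1], "text to be searched over")
payload = json.dumps([doc])  # body for a JSON update request
```

Each list entry is indexed as its own value, so a filter on a single id matches any document that contains it.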

Re: Distance sort on a multi-value field

2013-08-14 Thread Smiley, David W.
On 8/14/13 2:26 PM, Jeff Wartes jwar...@whitepages.com wrote: I'm still pondering aggregate-type operations for scoring multi-valued fields (original thread: http://goo.gl/zOX53f ), and it occurred to me that distance-sort with SpatialRecursivePrefixTreeFieldType must be doing something like

Re: get term frequency, just only keywords search

2013-08-14 Thread Jack Krupansky
q=pizza+tomato&fl=id,termfreq(myfield,'pizza'),termfreq(myfield,'tomato') -- Jack Krupansky -Original Message- From: danielitos85 Sent: Wednesday, August 14, 2013 1:22 PM To: solr-user@lucene.apache.org Subject: Re: get term frequency, just only keywords search Thanks. I tried to use
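Since termfreq takes one term per call, a client has to expand the query into one call per keyword. A small sketch of that expansion, assuming whitespace-separated keywords and a hypothetical helper name:

```python
def termfreq_fields(field, query, base_fields=("id",)):
    """Build an fl value with one termfreq() call per query keyword."""
    calls = []
    for term in query.split():
        if "'" in term:
            # Quoting rules are out of scope for this sketch.
            raise ValueError("terms containing quotes are not handled")
        calls.append("termfreq(%s,'%s')" % (field, term))
    return ",".join(list(base_fields) + calls)

fl = termfreq_fields("myfield", "pizza tomato")
```

For the example query this yields the same fl value as in the reply above.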

unformatted: Clusterstate says state:recovering, but Core says I see state: null?

2013-08-14 Thread Tor Egil
This time without formatting ;-) This is from the leader log. (There are other statements in between, but I think they are irrelevant). First of all it reads the zookeeper state. This happens now and then. Then I guess the replica says Hey, I'm alive, please start the recover process:

RE: Solr4 update and query performance question

2013-08-14 Thread Joshi, Shital
We didn't copy/paste Solr3 config to solr4. We started with Solr4 config and only updated new searcher queries and few other things. There is no batching while updating/inserting documents in Solr3, is that correct? Committing 1000 documents in Solr3 takes 19 seconds while in Solr4 it takes

Re: Ping request uses wrong default search field?

2013-08-14 Thread Chris Hostetter
: On 08/13/2013 06:12 PM, Chris Hostetter wrote: : So if you have a defaults df in your solrconfig.xml it's going to : override the defaultSearchField in schema.xml : : Alright, thanks for clarifying that. Makes sense now that I know the default : in schema.xml is deprecated. ok -- but that's

Re: Who's cleaning the Fieldcache?

2013-08-14 Thread Chris Hostetter
: why? Those are my sort fields and they are occupying a lot of space (doubled : in this case but I see that sometimes I have three or four old segment : references) : : Is there something I can do to remove those old references? I tried to reload : the core and it seems the old references are

Re: Who's cleaning the Fieldcache?

2013-08-14 Thread Robert Muir
On Wed, Aug 14, 2013 at 5:29 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : why? Those are my sort fields and they are occupying a lot of space (doubled : in this case but I see that sometimes I have three or four old segment : references) : : Is there something I can do to remove

Re: Who's cleaning the Fieldcache?

2013-08-14 Thread Chris Hostetter
: FieldCaches are managed using a WeakHashMap - so once the IndexReaders : associated with those FieldCaches are no longer used, they will be garbage : collected when and if the JVM's garbage collector gets around to it. : : if they sit around after you are done with them, they might look
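The weak-map behaviour described here can be illustrated outside the JVM. The sketch below uses Python's weakref.WeakKeyDictionary as an analogue of a WeakHashMap keyed by readers; the Reader class is a hypothetical stand-in, and nothing in it is Lucene code.

```python
import gc
import weakref

class Reader:
    """Hypothetical stand-in for an index reader used as a cache key."""
    def __init__(self, name):
        self.name = name

def demo():
    cache = weakref.WeakKeyDictionary()
    reader = Reader("segment_1")
    cache[reader] = [0] * 1000   # cached per-reader data
    before = len(cache)
    del reader                   # drop the last strong reference to the key
    gc.collect()                 # ask the collector to run
    return before, len(cache)
```

On CPython the entry disappears as soon as the last reference is dropped, so demo() returns (1, 0). A JVM gives no such timing guarantee, which is why stale entries can remain visible until a collection actually happens.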

Re: Distance sort on a multi-value field

2013-08-14 Thread Jeff Wartes
Hm, Give me all the stores that only have branches in this area might be a plausible use case for farthest distance. That's essentially a contains question though, so maybe that's already supported? I guess it depends on how contains/intersects/etc handle multi-values. I feel like multi-value

Re: Who's cleaning the Fieldcache?

2013-08-14 Thread Robert Muir
On Wed, Aug 14, 2013 at 5:58 PM, Chris Hostetter hossman_luc...@fucit.org wrote: : FieldCaches are managed using a WeakHashMap - so once the IndexReaders : associated with those FieldCaches are no longer used, they will be garbage : collected when and if the JVM's garbage collector get

Re: Indexing hangs when more than 1 server in a cluster

2013-08-14 Thread Kevin Osborn
Actually, I thought it worked last night, but that may have just been a fluke. Today, it is not working. This is what I have done. I have turned off autoCommit and softAutoCommit. My updates are not sending any softCommit messages. I am sending over data in chunks of 500 records. At the end of
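The chunked upload described here can be sketched as a simple batching helper. The thread does not show the client code, so the function name and the use of a plain list are assumptions; sending each batch and committing are left to the caller.

```python
def chunked(records, size=500):
    """Yield consecutive slices of records with at most size items each."""
    for start in range(0, len(records), size):
        yield records[start:start + size]

# 1200 hypothetical records split into batches of 500, 500 and 200.
batches = list(chunked(list(range(1200))))
```

Keeping the batch size in one place makes it easy to test different sizes against the cluster.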

Re: Indexing hangs when more than 1 server in a cluster

2013-08-14 Thread Kevin Osborn
I may have a bit of good news. The ulimit of open files was set to 4096. I just chose a random high limit (10) and it seems to be working better now. I still have more testing to do though, but the initial results are hopeful. On Wed, Aug 14, 2013 at 4:22 PM, Kevin Osborn
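To check the open-file limit mentioned here, a process can read its own limits. This sketch uses Python's standard resource module (Unix only) and reports the limits of the Python process itself, which match the Solr process only if both were started under the same ulimit settings.

```python
import resource

def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard

soft_limit, hard_limit = open_file_limits()
```

The soft value is the limit actually enforced on the process.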

Re: Alternative searches

2013-08-14 Thread Chris Hostetter
: Can someone explain how one would go about providing alternative searches for a query… similar to Amazon. : For example say I search for Red Dump Truck : : - 0 results for Red Dump Truck : - 500 results for Red Truck : - 350 results for Dump Truck Well -- it depends on what you want to