Giving OpenSearcher as false

2013-08-19 Thread Prasi S
Hi, 1. What is the impact or use of setting openSearcher to true? <autoCommit> <maxTime>${solr.autoCommit.maxTime:15000}</maxTime> <openSearcher>true</openSearcher> 2. If the value is false, does this create the index in a temp file and then commit? Regards, Prasi

Regarding mointoring the solr

2013-08-19 Thread prabu palanisamy
Hi, my Solr 3.5.0 instance, indexed from a Wikipedia dump with Java 1.6, is working perfectly. I run the Solr server on CentOS release 5.7 (Final), and the client, on Ubuntu 11.04, accesses the Solr server from my local system. The problem is that it is taking too much time. This problem does not arise when

Create term vector from text

2013-08-19 Thread Domma, Achim
Hi, the TermVectorComponent allows me to retrieve data about the terms of a document, including tf-idf. Is it possible to get this data for a text, but without storing it in SOLR? As far as I figured out, the AnalysisComponent comes close, but does not return the core specific frequencies.

Re: Giving OpenSearcher as false

2013-08-19 Thread Shalin Shekhar Mangar
Comments inline: On Mon, Aug 19, 2013 at 12:20 PM, Prasi S <prasi1...@gmail.com> wrote: Hi, 1. What is the impact , use of giving opensearcher as true <autoCommit> <maxTime>${solr.autoCommit.maxTime:15000}</maxTime> <openSearcher>true</openSearcher> From the Solr reference guide:

Re: State sharing

2013-08-19 Thread Shalin Shekhar Mangar
As you noted, sharing S's ResponseBuilder is not possible because once the handler's process method is complete, the http action is deemed complete and the pipe is broken. You cannot send the client any further responses anymore. One way to solve this problem is to maintain the state of the job

Re: get term frequency, just only keywords search

2013-08-19 Thread danielitos85
Thanks Jack, but what if my keyword search is two words, for example "french fries"? What is the right syntax? -- View this message in context: http://lucene.472066.n3.nabble.com/get-term-frequency-just-only-keywords-search-tp4084510p4085399.html Sent from the Solr - User mailing list archive at

Re: Version Conflict on Atomic Update

2013-08-19 Thread Syao Work
Your _version_ does not match. On Fri, Aug 9, 2013 at 7:08 PM, Bruno René Santos <brunor...@gmail.com> wrote: Using the document interface on the Solr admin i try to update the following document: { "responseHeader": { "status": 0, "QTime": 1, "params": { "indent": "true", "q": "*:*", "_": "1376064413493", "wt":

Re: struggling with solr.WordDelimiterFilterFactory

2013-08-19 Thread vicky desai
Hi, I have created a new index, so reindexing shouldn't be the issue. The Analysis page shows me the correct result, and a match should be found as per the Analysis page. But there is no output on the actual query. The output of the debug query is as follows: <str name="rawquerystring">content:speedPost</str> <str

Re: Share splitting at 23 million documents - OOM

2013-08-19 Thread Bastian Mathes
Hi Greg, I am a colleague of Harald and had a look at his experiments last week. You are right, unpacking a fresh Solr 4.4, feeding a small number of documents (in my case 144) and trying to split the shard is not working. I get the same error message (maxValue must be non-negative) that was

Re: Indexing an XML file in Apache Solr

2013-08-19 Thread Abhiroop
Funnily enough, just today I was looking at Lux for searching through my XML file. What I have inferred is that I need to format my XML to fit Solr's format. Do I have to code that manually, or is there some kind of parser that, when fed the XML, formats it into the Solr version? I

Re: struggling with solr.WordDelimiterFilterFactory

2013-08-19 Thread Erick Erickson
Well, the case of your parsedQuery field _name_ (i.e. content) does not match the case of your field definition, (i.e. Content). This may just be an artifact however. That said, the MultiPhraseQuery is probably coming from your request handler definition. Can we see that too? Erick On Mon, Aug

Re: struggling with solr.WordDelimiterFilterFactory

2013-08-19 Thread vicky desai
Hi Erick, These are the request handlers defined in solrconfig.xml: <requestHandler name="/analysis/field" class="solr.FieldAnalysisRequestHandler" /> <requestHandler name="standard" class="solr.StandardRequestHandler" default="true" /> <requestHandler name="/update"

Issue in Swap Space display at Solr Admin

2013-08-19 Thread Vladimir Vagaitsev
Hi, I've found an issue in the display of Swap Space on the Solr Admin page. When swap space is not used, the admin page shows a NaN percent of usage. Since used and total space are stored in double variables, the result of dividing the used space (0.0MB) by the total space (0.0MB) is NaN. Maybe
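A zero-total guard is one way to avoid the NaN described in this report. The following is a minimal sketch in Python, not the admin page's actual code; the function name `swap_usage_percent` is hypothetical. Python raises ZeroDivisionError for 0.0/0.0 where Java and JavaScript floating-point division yields NaN, but the guard is the same idea.

```python
def swap_usage_percent(used_mb: float, total_mb: float) -> float:
    """Return swap usage as a percentage.

    Hypothetical helper: when no swap is configured (total is 0),
    report 0.0 instead of dividing zero by zero.
    """
    if total_mb <= 0.0:
        return 0.0
    return 100.0 * used_mb / total_mb


if __name__ == "__main__":
    print(swap_usage_percent(0.0, 0.0))       # no swap configured
    print(swap_usage_percent(512.0, 2048.0))  # a quarter of swap in use
```

Whether "no swap" is better shown as 0% or as "n/a" is a display decision this sketch does not settle.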

SolrCloud Zookeeper Exception

2013-08-19 Thread Prasi S
Hi, I have set up SolrCloud with version 4.4. There is one external ZooKeeper and two instances of Solr (4 shards in total, 2 shards in each instance). I was using DIH to index from SQL Server. It was indexing fine initially. Later, when I shut down Solr and ZooKeeper and then restarted them, I get

Re: Regarding mointoring the solr

2013-08-19 Thread Ados1984
Not sure of any solr specific tool but you can use jprofiler to see what is causing delay under the hood. Andy, On Aug 19, 2013, at 3:08 AM, prabu palanisamy pr...@serendio.com wrote: Hi My solr 3.5.0 indexed by wikipedia dump with java 1.6 is working perfectly. I run the solr server in

Re: struggling with solr.WordDelimiterFilterFactory

2013-08-19 Thread vicky desai
Hi, another observation while testing: docs having the following values for the content field 1. content:speedPost 2. content:sPeedpost 3. content:speEdpost 4. content:speedposT all match the query q=content:speedPost. So basically, if there is one letter in the entire word that is camel-cased, then it

Prevent Some Keywords at Analyzer Step

2013-08-19 Thread Furkan KAMACI
Hi; I want to write an analyzer that will prevent some special words. For example, if the sentence to be indexed is "diet follower", it will tokenize it like this: token 1) diet token 2) follower token 3) diet follower. How can I do that with Solr?

Re: Regarding mointoring the solr

2013-08-19 Thread sivaprasad
You can look at this tool http://sematext.com/spm/solr-performance-monitoring/

Facing Solr performance during query search

2013-08-19 Thread sivaprasad
Hi, Last week we configured a Solr master/slave setup. All the Solr search requests are routed to the slave. After this configuration, we are seeing drastic performance problems with Solr. Can anyone explain what the reason might be? And, how to disable optimizing the index, warming the

Negation words

2013-08-19 Thread venkatesham.gu...@igate.com
I am searching with a keyword, and if that keyword is attached to a negation (not, could not, etc.) in the document, that document should not be matched. For example, I have a document with text like "I have not wheezed since I have been taking Spiriva." I am searching with a keyword wheeze should not

Use case of Spatial search

2013-08-19 Thread Shishir Jain
Hi, I have a very standard use case of Spatial search. Was trying to figure out how to do it in Solr, but couldn't figure out a standard way of doing it. Please point me to any document which explains this use case or how this specific use case can be implemented in Solr. The Use case is: There

Re: struggling with solr.WordDelimiterFilterFactory

2013-08-19 Thread Aloke Ghoshal
Hi Vicky, Please check whether you have a second multiValued field named content defined in your schema.xml. It is typically part of the default schema definition and is different from the one you initially posted, which had Content with a capital C. Here's the debugQuery on my system (with both

Re: struggling with solr.WordDelimiterFilterFactory

2013-08-19 Thread vicky desai
Hi Aloke, I have multiple fields in my schema which are of type text. I tried the same case on all the fields. It is not working for me on any of them. If possible, can you please post your dummy solrconfig.xml and schema.xml? I can replace them and check.

Re: Negation words

2013-08-19 Thread Raymond Wiker
wheezed AND NOT "not wheezed" or +wheezed -"not wheezed" perhaps? Note: this assumes that you meant to search with the keyword wheezed and not wheeze. On Mon, Aug 19, 2013 at 2:38 PM, venkatesham.gu...@igate.com venkatesham.gu...@igate.com wrote: I am searching with a keyword and if that

Re: struggling with solr.WordDelimiterFilterFactory

2013-08-19 Thread Aloke Ghoshal
Here you go, it is the default 4.2.1 schema.xml ( http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_2_1/solr/example/solr/solr.xml), with the following additions: <!-- Added these fields --> <field name="Content" type="text_general" indexed="true" stored="true" multiValued="false"/> <field

Penalizing absent words

2013-08-19 Thread Rafael Calsaverini
Hi there, is there a way to penalize a document's score for lacking a particular term? It would be quite nice if I could add a negative term to the score, which is proportional to the idf of a word that is not present in a given field of that document. Thanks for your time, Rafael Calsaverini

Re: struggling with solr.WordDelimiterFilterFactory

2013-08-19 Thread Aloke Ghoshal
Location of the schema.xml: http://svn.apache.org/repos/asf/lucene/dev/tags/lucene_solr_4_2_1/solr/example/solr/collection1/conf/schema.xml On Mon, Aug 19, 2013 at 6:52 PM, Aloke Ghoshal alghos...@gmail.com wrote: Here you go, it is the default 4.2.1 schema.xml (

Re: Indexing an XML file in Apache Solr

2013-08-19 Thread Michael Sokolov
Abhiroop, I'm cc-ing the lux mailing list since this thread might not be of interest to all of solr-user; I'd suggest following up on that list. But to answer your actual question: see the documentation here http://luxdb.org/REST-API.html#LuxUpdateProcessor where it explains what to do.

Re: Prevent Some Keywords at Analyzer Step

2013-08-19 Thread Jack Krupansky
Your example doesn't prevent any keywords. You need to elaborate the specific requirements with more detail. Given a long stream of text, what tokenization do you expect in the index? -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Monday, August 19, 2013 8:07 AM To:

Re: Negation words

2013-08-19 Thread Jack Krupansky
Solr has tools for lexical analysis of text, not deeper syntax and semantics of text. IOW, Solr supports keyword search, not natural language search. -- Jack Krupansky -Original Message- From: venkatesham.gu...@igate.com Sent: Monday, August 19, 2013 8:38 AM To:

Re: struggling with solr.WordDelimiterFilterFactory

2013-08-19 Thread vicky desai
Hi Aloke, After taking the schema.xml and solrconfig.xml with the changes you mentioned, it worked fine. However, simply making these changes in schema.xml doesn't work. So it seems like there is an issue in some configuration in solrconfig.xml. I will figure that out and post it here. Anyways thanks a

Re: State sharing

2013-08-19 Thread Jack Krupansky
Generally, you shouldn't be trying to maintain, let alone share state in Solr itself. It sounds like you need an application layer between your application clients and Solr which could then maintain whatever state it needs. -- Jack Krupansky -Original Message- From: Peyman Faratin

Re: Problems installing Solr4 in Jetty9

2013-08-19 Thread Steve Rowe
https://issues.apache.org/jira/browse/SOLR-5173 On Aug 18, 2013, at 8:43 PM, Steve Rowe sar...@gmail.com wrote: bq. I thought that when Steve moved it from the test module to the core, he handled it so that it would not go out in the dist. Mea culpa. @Chris Collins, I think you're

Re: get term frequency, just only keywords search

2013-08-19 Thread Jack Krupansky
french fries is a phrase, not a term or a keyword. It consists of two terms or keywords, french and fries. They have to be treated separately. -- Jack Krupansky -Original Message- From: danielitos85 Sent: Monday, August 19, 2013 4:30 AM To: solr-user@lucene.apache.org Subject: Re:

Re: get term frequency, just only keywords search

2013-08-19 Thread danielitos85
Isn't there a way to get the termFreq for a search like "french fries" (a sentence)?

Re: Create term vector from text

2013-08-19 Thread Jack Krupansky
The Solr Terms Component will give you the terms in the index and the document frequency of each. https://cwiki.apache.org/confluence/display/solr/The+Terms+Component -- Jack Krupansky -Original Message- From: Domma, Achim Sent: Monday, August 19, 2013 3:09 AM To:

Re: Solr 4.3 and above core swap

2013-08-19 Thread richardg
<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:solr.NRTCachingDirectoryFactory}"/> I commented out the lockType, so it should be using the default of native according to the documentation. There is nothing special about our file systems. Thanks

Re: Penalizing absent words

2013-08-19 Thread Erik Hatcher
You could penalize by boosting documents that have the term. Can you give a concrete example? How dynamic is the absent words list? Erik On Aug 19, 2013, at 09:26 , Rafael Calsaverini wrote: Hi there, is there a way to penalize a document's score for lacking a particular term?

Re: get term frequency, just only keywords search

2013-08-19 Thread Jack Krupansky
Term frequency is about terms, nothing else. So, by definition, a phrase or any other collection of terms does not have a termfreq - in Lucene. -- Jack Krupansky -Original Message- From: danielitos85 Sent: Monday, August 19, 2013 9:59 AM To: solr-user@lucene.apache.org Subject: Re:

Re: get term frequency, just only keywords search

2013-08-19 Thread danielitos85
OK, I understand (thanks), but if I search for a sentence and add debugQuery=on, in the explain output I get termFreq=2.0, and it is right. Is it possible to obtain that parameter?

Re: Prevent Some Keywords at Analyzer Step

2013-08-19 Thread Furkan KAMACI
Let's assume that my sentence is that: *Alice is a diet follower* My special keyword = *diet follower* Tokens will be: Token 1) Alice Token 2) is Token 3) a Token 4) diet Token 5) follower Token 6) *diet follower* 2013/8/19 Jack Krupansky j...@basetechnology.com Your example doesn't
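To pin down the output requested in this message, here is a minimal sketch in Python. It is not a Solr analyzer; the function name `tokenize_with_phrases` and the whitespace tokenization are assumptions made only for illustration. It emits every word as a token and then appends each configured phrase that occurs in the text as one extra token.

```python
def tokenize_with_phrases(text, phrases):
    """Split on whitespace, then append each special phrase found in the text.

    Hypothetical helper: it illustrates the expected token list only and
    says nothing about how a Solr analysis chain would produce it.
    """
    words = text.split()
    lowered = [w.lower() for w in words]
    tokens = list(words)
    for phrase in phrases:
        parts = phrase.lower().split()
        n = len(parts)
        # Slide a window of n words over the text and compare case-insensitively.
        for i in range(len(lowered) - n + 1):
            if lowered[i:i + n] == parts:
                tokens.append(" ".join(words[i:i + n]))
    return tokens


if __name__ == "__main__":
    print(tokenize_with_phrases("Alice is a diet follower", ["diet follower"]))
```

For the sentence in the message, this yields the six tokens listed there, with "diet follower" as the last one.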

Re: Prevent Some Keywords at Analyzer Step

2013-08-19 Thread Jack Krupansky
Okay, but what is it that you are trying to prevent?? And, diet follower is a phrase, not a keyword or term. So, I'm still baffled as to what you are really trying to do. Try explaining it in plain English. And given this same input, how would it be queried? -- Jack Krupansky

Re: Issue in Swap Space display at Solr Admin

2013-08-19 Thread Stefan Matheis
Vladimir, would you mind attaching the output of /solr/admin/system?wt=json ? The last 20 lines or so should be enough; I'm only interested in the system key, which contains the memory information. Is that completely missing, or literally 0? - Stefan On Monday, August 19, 2013 at

RE: Regarding mointoring the solr

2013-08-19 Thread Boogie Shafer
Re: monitoring performance trends: a free option we use, which is lightweight and works well for collecting the general Java stats out of Solr, is the sFlow agent for Java. In concert with a host sflowd setup, you can gather the JVM and system stats at decently dense intervals (default is 30s)

Re: Prevent Some Keywords at Analyzer Step

2013-08-19 Thread Dan Davis
This is an interesting topic - my employer is a medical library and there are many keywords that may need to be aliased in various ways, and 2 or 3 word phrases that perhaps should be treated specially. Jack, can you give me an example of how to do that sort of thing? Perhaps I need to buy

Re: Regarding mointoring the solr

2013-08-19 Thread Shawn Heisey
On 8/19/2013 11:10 AM, Boogie Shafer wrote: the not often mentioned stats URL is another interface which you could scrape for stats (although i just noticed this url doesnt seem to work in my 4.4.0 test environment (it does work on the 4.2.1 hosts) so something may have changed, or my 4.4 env

spatial search, geofilt does not work

2013-08-19 Thread Mingfeng Yang
My Solr index has a field called author_geo which contains the author's location, and I am trying to get all docs whose authors are within 10 km of 35.0,35.0 using the following query: curl '

Re: spatial search, geofilt does not work

2013-08-19 Thread Mingfeng Yang
BTW: my schema.xml contains the following related lines. <fieldType name="location" class="solr.LatLonType" subFieldSuffix="_coordinate"/> <field name="author_geo" type="location" indexed="true" stored="true"/> <dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="false"/> On Mon, Aug 19, 2013 at 2:02

RE: Regarding mointoring the solr

2013-08-19 Thread Boogie Shafer
Thanks for that. That URL with the core name explicitly called out seems to work correctly on both 4.4 (using the new-style config for solr.xml) and 4.2.1 (using the old-style config.xml) From: Shawn Heisey <s...@elyograg.org> Sent: Monday, August 19,

custom hashing across cloud shards

2013-08-19 Thread Katie McCorkell
Hey All, If you don't specify numShards at the start, then you can do custom hashing, because Solr will just write the document to whatever shard you send it to. However, when I don't specify numshards, I'm having trouble creating more than one shard. It makes one shard and the others I add are

Re: Facing Solr performance during query search

2013-08-19 Thread Erick Erickson
Not until you tell us a lot more about your symptoms. What are your replication intervals? autowarm settings? how are you measuring drastic reductions? What have you tried in terms of diagnosing the problem? Please review: http://wiki.apache.org/solr/UsingMailingLists Best Erick On Mon, Aug

Re: Use case of Spatial search

2013-08-19 Thread Erick Erickson
I think you can do this by a combination of standard function queries, see: http://wiki.apache.org/solr/FunctionQuery#if and geodist, see: http://wiki.apache.org/solr/SpatialSearch#geodist_-_The_distance_function WARNING: I haven't tried this myself, but it seems like it would work. The trick is

Re: get term frequency, just only keywords search

2013-08-19 Thread Erick Erickson
There are a series of functions that can deal with _some_ relevance data, see: http://wiki.apache.org/solr/FunctionQuery#Relevance_Functions Best Erick On Mon, Aug 19, 2013 at 10:25 AM, danielitos85 <danydany@gmail.com> wrote: ok I understand it (thanks) but if I search a sentence and type

Re: Percolate feature?

2013-08-19 Thread Chris Hostetter
: Let's talk about the real use case. We are marketplace that sells : products that users have listed. For certain popular, high risk or : restricted keywords we charge the seller an extra fee/ban the listing. : We now have sellers purposely misspelling their listings to circumvent : this fee.

Re: custom hashing across cloud shards

2013-08-19 Thread Erick Erickson
Right, you can't just tell Solr to create a single shard (i.e. by not specifying numshards) then expect to be able to do anything except index to a single shard. All the nodes will be replicas of the single shard. From there it really doesn't matter what you do, the documents will be routed to all

Re: get term frequency, just only keywords search

2013-08-19 Thread Jack Krupansky
The Lucene PhraseQuery goes through a lot of effort to calculate phrase frequency (phraseFreq) - but that is not the same as term frequency (don't confuse terms and phrases). Feel free to pick that number out of the debugQuery output, or from the XML variant of the explain output. For
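The distinction drawn here between term frequency and phrase frequency can be illustrated with a small sketch in Python. This is only a toy over a list of tokens, with hypothetical function names; it counts exact adjacent matches and does not reproduce Lucene's PhraseQuery scoring, which also handles slop.

```python
def term_freq(tokens, term):
    """Number of times a single term occurs in the token list."""
    return tokens.count(term)


def phrase_freq(tokens, phrase_terms):
    """Number of positions where the terms occur adjacently and in order."""
    n = len(phrase_terms)
    return sum(
        1
        for i in range(len(tokens) - n + 1)
        if tokens[i:i + n] == phrase_terms
    )


if __name__ == "__main__":
    doc = "french fries and more french fries with french toast".split()
    print(term_freq(doc, "french"))                # occurrences of the term french
    print(term_freq(doc, "fries"))                 # occurrences of the term fries
    print(phrase_freq(doc, ["french", "fries"]))   # occurrences of the phrase
```

In this sample document the term french occurs three times, but the phrase occurs only twice, which is why the two numbers cannot be used interchangeably.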

What filter to use to search with spaces omitted/included between words?

2013-08-19 Thread Utkarsh Sengar
I have a field which consists of a store name. How can I make sure that these queries return relevant results when searched against this column: *Example1: Best Buy* q=best (tokenizer filter makes this work) q=bestbuy q=buy (tokenizer filter makes this work) q=best buy (lower case filter makes

Custom Sort(0.2*relervanceScore + 0.8*numberic_field_value)

2013-08-19 Thread 刘健
Hello: I want to get the final search result sorted by (0.2*relevance score + 0.8*specified_numeric_field). I know that if I use "bf" in edismax (e.g. bf=field(value)), I can get a result sorted by (relevance score + field(value)), but I don't know how to implement the result sorted

Re: Custom Sort(0.2*relervanceScore + 0.8*numberic_field_value)

2013-08-19 Thread Jack Krupansky
Edismax applies the multiplicative boost (boost) after applying the additive boost functions (bf). I think (0.2*relevance score + 0.8*specified_numeric_field) should be equivalent to: 0.2*(relevance score + (0.8/0.2)*specified_numeric_field) or 0.2*(relevance score + 4.0*
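The factoring above matters for sorting because a constant positive factor such as 0.2 does not change the order of the results. A minimal sketch in Python, with made-up scores and field values used purely for illustration, checks that both formulas rank the same documents identically.

```python
def weighted(score, field_value):
    """The requested sort key: 0.2*score + 0.8*field."""
    return 0.2 * score + 0.8 * field_value


def rescaled(score, field_value):
    """The same key divided by 0.2: score + 4.0*field."""
    return score + 4.0 * field_value


def rank(docs, key_fn):
    """Return document ids sorted by key_fn, highest first."""
    return sorted(docs, key=lambda doc_id: key_fn(*docs[doc_id]), reverse=True)


if __name__ == "__main__":
    # Hypothetical (relevance score, numeric field) pairs.
    docs = {"a": (1.5, 0.2), "b": (0.4, 0.9), "c": (2.0, 0.1)}
    print(rank(docs, weighted))
    print(rank(docs, rescaled))
```

Both calls print the same ordering, so sorting by score plus 4.0 times the field is enough when only the order is needed.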

Re: Custom Sort(0.2*relervanceScore + 0.8*numberic_field_value)

2013-08-19 Thread 刘健
Thank you very much! Then could you tell me how to implement relevance_score*numeric_field/(relevance_score + numeric_field)? I think it's better to sort by the harmonic mean. -- Original -- From: Jack Krupansky <j...@basetechnology.com>; Date: Tue, Aug 20, 2013
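For reference, the expression in this message is half of the harmonic mean of the two values, so it sorts documents in the same order as the harmonic mean itself. A minimal sketch in Python, with hypothetical function names and a zero guard added as an assumption:

```python
def product_over_sum(score, field_value):
    """score*field / (score + field), the expression from the message.

    Returns 0.0 when the sum is zero to avoid dividing by zero
    (an assumption of this sketch, not something stated in the thread).
    """
    total = score + field_value
    if total == 0:
        return 0.0
    return score * field_value / total


def harmonic_mean(score, field_value):
    """Harmonic mean of two values: 2*a*b / (a + b)."""
    return 2.0 * product_over_sum(score, field_value)


if __name__ == "__main__":
    print(product_over_sum(1.0, 3.0))
    print(harmonic_mean(1.0, 3.0))
```

How to express this as a Solr sort is a separate question that this sketch does not answer.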

Re: SOLR4 Spatial sorting and query string

2013-08-19 Thread David Smiley (@MITRE.org)
This is a known limitation. From CHANGES.txt: * SOLR-2345: Enhanced geodist() to work with an RPT field, provided that the field is referenced via 'sfield' and the query point is constant. (David Smiley) The reason why that limitation is there relates to the fact that the function query

Re: spatial search, geofilt does not work

2013-08-19 Thread David Smiley (@MITRE.org)
Thank goodness for Solr's feature of echo'ing params back in the response as it helps diagnose problems like this. In your case, the filter query that Solr is seeing isn't what you (seemed) to have given on the command line: fq:!geofilt sfield=author_geo Clearly wrong. Try escaping the braces
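One way to sidestep shell and curl handling of braces altogether is to percent-encode the filter query before it goes into the URL. The sketch below uses Python's standard library; the base URL is hypothetical, and the geofilt parameters are reconstructed from the field name and the "10 km of 35.0,35.0" description in the original question, since the actual curl command is not shown in full.

```python
from urllib.parse import urlencode


def geofilt_url(base_url, field, lat, lon, distance_km):
    """Build a select URL whose fq is a percent-encoded {!geofilt} filter."""
    # Doubled braces in the f-string produce literal { and } characters.
    fq = f"{{!geofilt sfield={field} pt={lat},{lon} d={distance_km}}}"
    params = {"q": "*:*", "fq": fq}
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and handler path.
    print(geofilt_url("http://localhost:8983/solr/select", "author_geo", 35.0, 35.0, 10))
```

With the braces encoded as %7B and %7D, the client has nothing left to interpret as a glob pattern, and Solr decodes the parameter back to the original local-params syntax.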

Re: Use case of Spatial search

2013-08-19 Thread David Smiley (@MITRE.org)
Shishir, Use the location_rpt type and index circles of the business and the distance they serve with this syntax: <field name="myfieldName">Circle(lat,lon d=degreesRadius)</field> Your query shape is then simply a point; use bbox query parser with d=0. This approach should scale *great* at query