Solr edismax NOT operator behavior

2012-07-27 Thread Alok Bhandari
Hello,

I am using Edismax parser and query submitted by application is of the
format 

price:1000 AND ( NOT ( launch_date:[2007-06-07T00:00:00.000Z TO
2009-04-07T23:59:59.999Z] AND product_type:electronic)).

While executing this query, Solr gives unexpected results. I suspect it is because
of the AND ( NOT portion of the query.
Can anyone please explain how this structure is handled?

I am using Solr 3.6.

Any help is appreciated.

Thanks
Alok





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-edismax-NOT-operator-behavior-tp3997663.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: leaks in solr

2012-07-27 Thread roz dev
In my case, I see only 1 searcher and no field cache - still Old Gen is almost
full at 22 GB.

Does it have to do with the index or some other configuration?

-Saroj

On Thu, Jul 26, 2012 at 7:41 PM, Lance Norskog goks...@gmail.com wrote:

 What does the Statistics page in the Solr admin say? There might be
 several searchers open: org.apache.solr.search.SolrIndexSearcher

 Each searcher holds open different generations of the index. If
 obsolete index files are held open, it may be old searchers. How big
 are the caches? How long does it take to autowarm them?

 On Thu, Jul 26, 2012 at 6:15 PM, Karthick Duraisamy Soundararaj
 karthick.soundara...@gmail.com wrote:
  Mark,
  We use solr 3.6.0 on freebsd 9. Over a period of time, it
  accumulates lots of space!
 
  On Thu, Jul 26, 2012 at 8:47 PM, roz dev rozde...@gmail.com wrote:
 
  Thanks Mark.
 
  We are never calling commit or optimize with openSearcher=false.
 
  As per logs, this is what is happening
 
 
 openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}
 
  --
  But, We are going to use 4.0 Alpha and see if that helps.
 
  -Saroj
 
 
 
 
 
 
 
 
 
 
  On Thu, Jul 26, 2012 at 5:12 PM, Mark Miller markrmil...@gmail.com
  wrote:
 
   I'd take a look at this issue:
   https://issues.apache.org/jira/browse/SOLR-3392
  
   Fixed late April.
  
   On Jul 26, 2012, at 7:41 PM, roz dev rozde...@gmail.com wrote:
  
it was from 4/11/12
   
-Saroj
   
On Thu, Jul 26, 2012 at 4:21 PM, Mark Miller markrmil...@gmail.com
 
   wrote:
   
   
On Jul 26, 2012, at 3:18 PM, roz dev rozde...@gmail.com wrote:
   
Hi Guys
   
I am also seeing this problem.
   
I am using SOLR 4 from Trunk and seeing this issue repeat every
 day.
   
Any inputs about how to resolve this would be great
   
-Saroj
   
   
Trunk from what date?
   
- Mark
   
   
   
   
   
   
   
   
   
   
  
   - Mark Miller
   lucidimagination.com
  
  
  
  
  
  
  
  
  
  
  
  
 



 --
 Lance Norskog
 goks...@gmail.com



too many instances of org.tartarus.snowball.Among in the heap

2012-07-27 Thread roz dev
Hi All

I am trying to find out the reason for very high memory use and ran jmap
-histo.

It is showing that I have too many instances of org.tartarus.snowball.Among.

Any ideas what this is for and why I am getting so many of them?

num     #instances      #bytes          Class description
---------------------------------------------------------------------------
1:      46728110        1869124400      org.tartarus.snowball.Among
2:      5244210         1840458960      byte[]
3:      526519495969839368              char[]
4:      10008928        864769280       int[]
5:      10250527        410021080       java.util.LinkedHashMap$Entry
6:      4672811         268474232       org.tartarus.snowball.Among[]
7:      8072312         258313984       java.util.HashMap$Entry
8:      466514          246319392       org.apache.lucene.util.fst.FST$Arc[]
9:      1828542         237600432       java.util.HashMap$Entry[]
10:     3834312         153372480       java.util.TreeMap$Entry
11:     2684700         128865600       org.apache.lucene.util.fst.Builder$UnCompiledNode
12:     4712425         113098200       org.apache.lucene.util.BytesRef
13:     3484836         111514752       java.lang.String
14:     2636045         105441800       org.apache.lucene.index.FieldInfo
15:     1813561         101559416       java.util.LinkedHashMap
16:     6291619         100665904       java.lang.Integer
17:     2684700         85910400        org.apache.lucene.util.fst.Builder$Arc
18:     956998          84215824        org.apache.lucene.index.TermsHashPerField
19:     2892957         69430968        org.apache.lucene.util.AttributeSource$State
20:     2684700         64432800        org.apache.lucene.util.fst.Builder$Arc[]
21:     685595          60332360        org.apache.lucene.util.fst.FST
22:     933451          59210944        java.lang.Object[]
23:     957043          53594408        org.apache.lucene.util.BytesRefHash
24:     591463          42585336        org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader
25:     424801          40780896        org.tartarus.snowball.ext.EnglishStemmer
26:     424801          40780896        org.apache.lucene.analysis.miscellaneous.WordDelimiterFilter
27:     1549670         37192080        org.apache.lucene.index.Term
28:     849602          33984080        org.apache.lucene.analysis.miscellaneous.WordDelimiterFilter$WordDelimiterConcatenation
29:     424801          27187264        org.apache.lucene.analysis.core.WhitespaceTokenizer
30:     478499          26795944        org.apache.lucene.index.FreqProxTermsWriterPerField
31:     535521          25705008        org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray
32:     219081          24537072        org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter
33:     478499          22967952        org.apache.lucene.index.FieldInvertState
34:     956998          22967952        org.apache.lucene.index.TermsHashPerField$PostingsBytesStartArray
35:     478499          22967952        org.apache.lucene.index.TermVectorsConsumerPerField
36:     478499          22967952        org.apache.lucene.index.NormsConsumerPerField
37:     316582          22793904        org.apache.lucene.store.MMapDirectory$MMapIndexInput
38:     906708          21760992        org.apache.lucene.util.AttributeSource$State[]
39:     906708          21760992        org.apache.lucene.analysis.tokenattributes.OffsetAttributeImpl
40:     883588          21206112        java.util.ArrayList
41:     438192          21033216        org.apache.lucene.store.RAMOutputStream
42:     860601          20654424        java.lang.StringBuilder
43:     424801          20390448        org.apache.lucene.analysis.miscellaneous.WordDelimiterIterator
44:     424801          20390448        org.apache.lucene.analysis.core.StopFilter
45:     424801          20390448        org.apache.lucene.analysis.miscellaneous.KeywordMarkerFilter
46:     424801          20390448        org.apache.lucene.analysis.snowball.SnowballFilter
47:     839390          20145360        org.apache.lucene.index.DocumentsWriterDeleteQueue$TermNode


-Saroj


Re: Upgrade solr 1.4.1 to 3.6

2012-07-27 Thread alexander81
Yes, the index.
Do you know of any link/documentation about upgrading from Solr 1.4.1 to 3.6?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Upgrade-solr-1-4-1-to-3-6-tp3996952p3997678.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Skip first word

2012-07-27 Thread Finotti Simone
Hi Chantal,

if I understand correctly, this implies that I have to populate different 
fields according to their length. Since I'm not aware of any logical condition 
you can apply to the copyField directive, it means that this logic has to be 
implemented by the process that populates the Solr core. Is this assumption 
correct?

That's kind of bad, because I'd like to have this kind of rule in the Solr 
configuration. Of course, if that's the only way... :)

Thank you 


Inizio: Chantal Ackermann [c.ackerm...@it-agenten.com]
Inviato: giovedì 26 luglio 2012 18.32
Fine: solr-user@lucene.apache.org
Oggetto: Re: Skip first word

Hi,

use two fields:
1. KeywordTokenizer (= single token) with ngram minsize=1 and maxsize=2 for 
inputs of length < 3,
2. the other one tokenized as appropriate with minsize=3 and longer for all 
longer inputs


Cheers,
Chantal


Am 26.07.2012 um 09:05 schrieb Finotti Simone:

 Hi Ahmet,
 business asked me to apply EdgeNGram with minGramSize=1 on the first term and 
 with minGramSize=3 on the latter terms.

 We are developing a search suggestion mechanism, the idea is that if the user 
 types "D", the engine should suggest "Dolce & Gabbana", but if we type "G", 
 it should suggest other brands. Only if users type "Gab" it should suggest 
 "Dolce & Gabbana".

 Thanks
 S
 
 Inizio: Ahmet Arslan [iori...@yahoo.com]
 Inviato: mercoledì 25 luglio 2012 18.10
 Fine: solr-user@lucene.apache.org
 Oggetto: Re: Skip first word

 is there a tokenizer and/or a combination of filter to
 remove the first term from a field?

 For example:
 The quick brown fox

 should be tokenized as:
 quick
 brown
 fox

 There is no such filter that i know of. Though, you can implement one with 
 modifying source code of LengthFilterFactory or StopFilterFactory. They both 
 remove tokens. Out of curiosity, what is the use case for this?











R: Skip first word

2012-07-27 Thread Finotti Simone
Could you elaborate on it, please? 

thanks
S


Inizio: in.abdul [in.ab...@gmail.com]
Inviato: giovedì 26 luglio 2012 20.36
Fine: solr-user@lucene.apache.org
Oggetto: Re: Skip first word

That is the best option; I have also used the shingle filter factory.
On Jul 26, 2012 10:03 PM, Chantal Ackermann-2 [via Lucene] 
ml-node+s472066n399748...@n3.nabble.com wrote:

 Hi,

 use two fields:
 1. KeywordTokenizer (= single token) with ngram minsize=1 and maxsize=2
 for inputs of length  3,
 2. the other one tokenized as appropriate with minsize=3 and longer for
 all longer inputs


 Cheers,
 Chantal


 Am 26.07.2012 um 09:05 schrieb Finotti Simone:

  Hi Ahmet,
  business asked me to apply EdgeNGram with minGramSize=1 on the first
 term and with minGramSize=3 on the latter terms.
 
  We are developing a search suggestion mechanism, the idea is that if the
 user types D, the engine should suggest Dolce  Gabbana, but if we type
 G, it should suggest other brands. Only if users type Gab it should
 suggest Dolce  Gabbana.
 
  Thanks
  S
  
  Inizio: Ahmet Arslan [[hidden 
  email]http://user/SendEmail.jtp?type=nodenode=3997480i=0]

  Inviato: mercoledì 25 luglio 2012 18.10
  Fine: [hidden email]http://user/SendEmail.jtp?type=nodenode=3997480i=1
  Oggetto: Re: Skip first word
 
  is there a tokenizer and/or a combination of filter to
  remove the first term from a field?
 
  For example:
  The quick brown fox
 
  should be tokenized as:
  quick
  brown
  fox
 
  There is no such filter that i know of. Though, you can implement one
 with modifying source code of LengthFilterFactory or StopFilterFactory.
 They both remove tokens. Out of curiosity, what is the use case for this?
 
 
 
 



 --
  If you reply to this email, your message will be added to the discussion
 below:
 http://lucene.472066.n3.nabble.com/Skip-first-word-tp3997277p3997480.html





-
THANKS AND REGARDS,
SYED ABDUL KATHER
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Skip-first-word-tp3997277p3997509.html
Sent from the Solr - User mailing list archive at Nabble.com.




Re: Skip first word

2012-07-27 Thread Chantal Ackermann
Hi Simone,

no, I meant that you populate the two fields with the same input - best done via 
the copyField directive.

The first field will contain ngrams of size 1 and 2. The other field will 
contain ngrams of size 3 and longer (you might want to set a decent maxsize 
there).

The query for the autocomplete list uses the first field when the input (typed 
in by the user) is one or two characters long. Your example was: "D", "G", or 
then "Do" or "Ga". The query would then search only the single-token field, which 
contains for the input "Dolce & Gabbana" only the ngrams "D" and "Do". So, only 
the input "D" or "Do" would result in a hit on "Dolce & Gabbana".
Once the user has typed the third letter: "Dol" or "Gab", you query the 
second, more tokenized field, which would contain for "Dolce & Gabbana" the 
ngrams "Dol", "Dolc", "Dolce", "Gab", "Gabb", "Gabba" etc.
Both inputs "Gab" and "Dol" would then return "Dolce & Gabbana".

1. First field type:

<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="2" 
side="front"/>

2. Second field type:

<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<!-- maybe add WordDelimiter etc. -->
<filter class="solr.EdgeNGramFilterFactory" minGramSize="3" maxGramSize="10" 
side="front"/>

3. field declarations:

<field name="short_prefix" type="short_ngram" … />
<field name="long_prefix" type="long_ngram" … />

<copyField source="short_prefix" dest="long_prefix" />
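
A rough query-side sketch to go with this (SolrJ; the field names match the declarations above, everything else is a placeholder):

    // pick the field to query based on how many characters the user has typed so far
    String prefix = userInput.trim();                       // 'userInput' comes from the UI (assumed)
    String field = prefix.length() < 3 ? "short_prefix" : "long_prefix";
    SolrQuery q = new SolrQuery(field + ":" + ClientUtils.escapeQueryChars(prefix));
    q.setRows(10);
    QueryResponse suggestions = server.query(q);            // 'server' is an existing SolrServer (assumed)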


Chantal

Am 27.07.2012 um 11:05 schrieb Finotti Simone:

 Hi Chantal,
 
 if I understand correctly, this implies that I have to populate different 
 fields according to their lenght. Since I'm not aware of any logical 
 condition you can apply to copyField directive, it means that this logic has 
 to be implementend by the process that populates the Solr core. Is this 
 assumption correct?
 
 That's kind of bad, because I'd like to have this kind of rules in the Solr 
 configuration. Of course, if that's the only way... :)
 
 Thank you 
 
 
 Inizio: Chantal Ackermann [c.ackerm...@it-agenten.com]
 Inviato: giovedì 26 luglio 2012 18.32
 Fine: solr-user@lucene.apache.org
 Oggetto: Re: Skip first word
 
 Hi,
 
 use two fields:
 1. KeywordTokenizer (= single token) with ngram minsize=1 and maxsize=2 for 
 inputs of length  3,
 2. the other one tokenized as appropriate with minsize=3 and longer for all 
 longer inputs
 
 
 Cheers,
 Chantal
 
 
 Am 26.07.2012 um 09:05 schrieb Finotti Simone:
 
 Hi Ahmet,
 business asked me to apply EdgeNGram with minGramSize=1 on the first term 
 and with minGramSize=3 on the latter terms.
 
 We are developing a search suggestion mechanism, the idea is that if the 
 user types D, the engine should suggest Dolce  Gabbana, but if we type 
 G, it should suggest other brands. Only if users type Gab it should 
 suggest Dolce  Gabbana.
 
 Thanks
 S
 
 Inizio: Ahmet Arslan [iori...@yahoo.com]
 Inviato: mercoledì 25 luglio 2012 18.10
 Fine: solr-user@lucene.apache.org
 Oggetto: Re: Skip first word
 
 is there a tokenizer and/or a combination of filter to
 remove the first term from a field?
 
 For example:
 The quick brown fox
 
 should be tokenized as:
 quick
 brown
 fox
 
 There is no such filter that i know of. Though, you can implement one with 
 modifying source code of LengthFilterFactory or StopFilterFactory. They both 
 remove tokens. Out of curiosity, what is the use case for this?
 
 
 
 
 
 
 
 
 



dynamic EdgeNGramFilter

2012-07-27 Thread Alexander Helhorn

hi

is there a possibility to configure the minGramSize (EdgeNGramFilter) 
dynamically while searching for a term?


All my content is indexed with minGramSize=3 and that is OK, but when I 
want to search for a term like *communic*... Solr should not return results 
like *com*puter, *com*mander, *com*a, ...


I know I can avoid this when I use quotes like "communic", but isn't 
there a better way? It would be nice if I could tell Solr (for 
instance with a query parameter) how many characters must be 
identical with the search term -- a dynamic minGramSize.


I hope someone can help me.

--
Mit freundlichen Grüßen
Alexander Helhorn
BA-Student/IT-Service

Kommunale Immobilien Jena
Paradiesstr. 6
07743 Jena

Tel.: 0 36 41 49- 55 11
Fax:  0 36 41 49- 11 55 11
E-Mail: alexander.helh...@jena.de
Internet: www.kij.de







Re: Skip first word

2012-07-27 Thread Finotti Simone
Brilliant!
Thank you very much :)


Inizio: Chantal Ackermann [c.ackerm...@it-agenten.com]
Inviato: venerdì 27 luglio 2012 11.20
Fine: solr-user@lucene.apache.org
Oggetto: Re: Skip first word

Hi Simone,

no I meant that you populate the two fields with the same input - best done via 
copyField directive.

The first field will contain ngrams of size 1 and 2. The other field will 
contain ngrams of size 3 and longer (you might want to set a decent maxsize 
there).

The query for the autocomplete list uses the first field when the input (typed 
in by the user) is one or two characters long. Your example was: D, G, or 
than Do or Ga. The result would search only on the single token field that 
contains for the input Dolce  Gabbana only the ngrams D and Do. So, only 
the input D or Do would result in a hit on Dolce  Gabbana.
Once the user has typed in the third letter: Dol or Gab, you query the 
second, more tokenized field which would contain for Dolce  Gabbana the 
ngrams Dol Dolc Dolce Gab Gabb Gabba etc.
Both inputs Gab and Dol would then return Dolce  Gabbana.

1. First  field type:

tokenizer class=solr.KeywordTokenizerFactory/
filter class=solr.EdgeNGramFilterFactory minGramSize=1 maxGramSize=2 
side=front/

2. Secong field type:

tokenizer class=solr.WhitespaceTokenizerFactory/
!-- maybe add WordDelimiter etc. --
filter class=solr.EdgeNGramFilterFactory minGramSize=3 maxGramSize=10 
side=front/

3. field declarations:

field name=short_prefix type=short_ngram … /
field name=long_prefix type=long_ngram … /

copyField source=short_prefix dest=long_prefix /


Chantal

Am 27.07.2012 um 11:05 schrieb Finotti Simone:

 Hi Chantal,

 if I understand correctly, this implies that I have to populate different 
 fields according to their lenght. Since I'm not aware of any logical 
 condition you can apply to copyField directive, it means that this logic has 
 to be implementend by the process that populates the Solr core. Is this 
 assumption correct?

 That's kind of bad, because I'd like to have this kind of rules in the Solr 
 configuration. Of course, if that's the only way... :)

 Thank you

 
 Inizio: Chantal Ackermann [c.ackerm...@it-agenten.com]
 Inviato: giovedì 26 luglio 2012 18.32
 Fine: solr-user@lucene.apache.org
 Oggetto: Re: Skip first word

 Hi,

 use two fields:
 1. KeywordTokenizer (= single token) with ngram minsize=1 and maxsize=2 for 
 inputs of length  3,
 2. the other one tokenized as appropriate with minsize=3 and longer for all 
 longer inputs


 Cheers,
 Chantal


 Am 26.07.2012 um 09:05 schrieb Finotti Simone:

 Hi Ahmet,
 business asked me to apply EdgeNGram with minGramSize=1 on the first term 
 and with minGramSize=3 on the latter terms.

 We are developing a search suggestion mechanism, the idea is that if the 
 user types D, the engine should suggest Dolce  Gabbana, but if we type 
 G, it should suggest other brands. Only if users type Gab it should 
 suggest Dolce  Gabbana.

 Thanks
 S
 
 Inizio: Ahmet Arslan [iori...@yahoo.com]
 Inviato: mercoledì 25 luglio 2012 18.10
 Fine: solr-user@lucene.apache.org
 Oggetto: Re: Skip first word

 is there a tokenizer and/or a combination of filter to
 remove the first term from a field?

 For example:
 The quick brown fox

 should be tokenized as:
 quick
 brown
 fox

 There is no such filter that i know of. Though, you can implement one with 
 modifying source code of LengthFilterFactory or StopFilterFactory. They both 
 remove tokens. Out of curiosity, what is the use case for this?
















Solr - customize Fragment using hl.fragmenter and hl.regex.pattern

2012-07-27 Thread meghana


I want Solr highlighting in a specific format.

Below is the string format for which I need to provide the highlighting feature:
---
130s: LISTEN! LISTEN! 138s: [THUMP] 143s: WHAT IS THAT? 144s: HEAR THAT?
152s: EVERYBODY, SHH. SHH. 156s: STAY UP THERE. 163s: [BOAT CREAKING] 165s:
WHAT IS THAT? 167s: [SCREAMING] 191s: COME ON! 192s: OH, GOD! 193s: AAH!
249s: OK. WE'VE HAD SOME PROBLEMS 253s: AT THE FACILITY. 253s: WHAT WE'RE
ATTEMPTING TO ACHIEVE 256s: HERE HAS NEVER BEEN DONE. 256s: WE'RE THIS CLOSE
259s: TO THE REACTIVATION 259s: OF A HUMAN BRAIN CELL. 260s: DOCTOR, THE 200
MILLION 264s: I'VE SUNK INTO THIS COMPANY 264s: IS DUE IN GREAT PART 266s:
TO YOUR RESEARCH.
---

After a user search, I want to provide the user a fragment in the format below:

*Previous Line of Highlight + Line containing Highlight + Next Line of
Highlight*

For example, if the user searched for the term "hear", then one typical highlight
fragment should look like this:

<str>143s: WHAT IS THAT? 144s: <em>HEAR</em> THAT? 152s: EVERYBODY, SHH.
SHH.</str>

The above is my ultimate plan, but right now I am trying to get fragments
which start with "ns:", where n is a number between 0 to 

I use hl.regex.slop=0.6 and hl.fragsize=120, and below is the regex for
that:

\b(?=\s*\d{1,4}s:){50,200}

Using the above regular expression, my fragments do not always start with "ns:".

Please suggest how I can achieve the ultimate plan.

Thanks




--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-customize-Fragment-using-hl-fragmenter-and-hl-regex-pattern-tp3997693.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Skip first word

2012-07-27 Thread Chantal Ackermann
You're welcome :-)
C


Re: too many instances of org.tartarus.snowball.Among in the heap

2012-07-27 Thread Bernd Fehling
It is something internal to the Snowball analyzer (stemmer).

To find out more you should take a heap dump and look into it with the
Eclipse Memory Analyzer (MAT): http://www.eclipse.org/mat/
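
For example (the pid and file name are placeholders):

jmap -dump:format=b,file=/tmp/solr-heap.hprof <solr-jvm-pid>

and then open the .hprof file in MAT.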

Regards,
Bernd


Am 27.07.2012 09:53, schrieb roz dev:
 Hi All
 
 I am trying to find out the reason for very high memory use and ran JMAP
 -hist
 
 It is showing that i have too many instances of org.tartarus.snowball.Among
 
 Any ideas what is this for and why am I getting so many of them
 
 num   #instances#bytes  Class description
 --
 *1:  467281101869124400  org.tartarus.snowball.Among
 *
 2:  5244210 1840458960  byte[]
 3:  526519495969839368  char[]
 4:  10008928864769280   int[]
 5:  10250527410021080
 java.util.LinkedHashMap$Entry
 6:  4672811 268474232   org.tartarus.snowball.Among[]
 *7:  8072312 258313984   java.util.HashMap$Entry*
 8:  466514  246319392   org.apache.lucene.util.fst.FST$Arc[]
 9:  1828542 237600432   java.util.HashMap$Entry[]
 10: 3834312 153372480   java.util.TreeMap$Entry
 11: 2684700 128865600
 org.apache.lucene.util.fst.Builder$UnCompiledNode
 12: 4712425 113098200   org.apache.lucene.util.BytesRef
 13: 3484836 111514752   java.lang.String
 14: 2636045 105441800   org.apache.lucene.index.FieldInfo
 15: 1813561 101559416   java.util.LinkedHashMap
 16: 6291619 100665904   java.lang.Integer
 17: 2684700 85910400
 org.apache.lucene.util.fst.Builder$Arc
 18: 956998  84215824
 org.apache.lucene.index.TermsHashPerField
 19: 2892957 69430968
 org.apache.lucene.util.AttributeSource$State
 20: 2684700 64432800
 org.apache.lucene.util.fst.Builder$Arc[]
 21: 685595  60332360org.apache.lucene.util.fst.FST
 22: 933451  59210944java.lang.Object[]
 23: 957043  53594408org.apache.lucene.util.BytesRefHash
 24: 591463  42585336
 org.apache.lucene.codecs.BlockTreeTermsReader$FieldReader
 25: 424801  40780896
 org.tartarus.snowball.ext.EnglishStemmer
 26: 424801  40780896
 org.apache.lucene.analysis.miscellaneous.WordDelimiterFilter
 27: 1549670 37192080org.apache.lucene.index.Term
 28: 849602  33984080
 org.apache.lucene.analysis.miscellaneous.WordDelimiterFilter$WordDelimiterConcatenation
 29: 424801  27187264
 org.apache.lucene.analysis.core.WhitespaceTokenizer
 30: 478499  26795944
 org.apache.lucene.index.FreqProxTermsWriterPerField
 31: 535521  25705008
 org.apache.lucene.index.FreqProxTermsWriterPerField$FreqProxPostingsArray
 32: 219081  24537072
 org.apache.lucene.codecs.BlockTreeTermsWriter$TermsWriter
 33: 478499  22967952
 org.apache.lucene.index.FieldInvertState
 34: 956998  22967952
 org.apache.lucene.index.TermsHashPerField$PostingsBytesStartArray
 35: 478499  22967952
 org.apache.lucene.index.TermVectorsConsumerPerField
 36: 478499  22967952
 org.apache.lucene.index.NormsConsumerPerField
 37: 316582  22793904
 org.apache.lucene.store.MMapDirectory$MMapIndexInput
 38: 906708  21760992
 org.apache.lucene.util.AttributeSource$State[]
 39: 906708  21760992
 org.apache.lucene.analysis.tokenattributes.OffsetAttributeImpl
 40: 883588  21206112java.util.ArrayList
 41: 438192  21033216
 org.apache.lucene.store.RAMOutputStream
 42: 860601  20654424java.lang.StringBuilder
 43: 424801  20390448
 org.apache.lucene.analysis.miscellaneous.WordDelimiterIterator
 44: 424801  20390448
 org.apache.lucene.analysis.core.StopFilter
 45: 424801  20390448
 org.apache.lucene.analysis.miscellaneous.KeywordMarkerFilter
 46: 424801  20390448
 org.apache.lucene.analysis.snowball.SnowballFilter
 47: 839390  20145360
 org.apache.lucene.index.DocumentsWriterDeleteQueue$TermNode
 
 
 -Saroj
 

-- 
*
Bernd FehlingUniversitätsbibliothek Bielefeld
Dipl.-Inform. (FH)LibTec - Bibliothekstechnologie
Universitätsstr. 25 und Wissensmanagement
33615 Bielefeld
Tel. +49 521 106-4060   bernd.fehling(at)uni-bielefeld.de

BASE - Bielefeld Academic Search Engine - www.base-search.net
*




Re: too many instances of org.tartarus.snowball.Among in the heap

2012-07-27 Thread Alexandre Rafalovitch
Try taking a couple of thread dumps and see where in the stack the
snowball classes show up. That might give you a clue.
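
For example, a couple of dumps a few seconds apart (pid and file names are placeholders):

jstack <solr-jvm-pid> > /tmp/threads-1.txt
sleep 10
jstack <solr-jvm-pid> > /tmp/threads-2.txt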

Did you customize the parameters to the stemmer? If so, maybe it has
problems with the file you gave it.

Just some generic thoughts that might help.

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Fri, Jul 27, 2012 at 3:53 AM, roz dev rozde...@gmail.com wrote:
 Hi All

 I am trying to find out the reason for very high memory use and ran JMAP
 -hist

 It is showing that i have too many instances of org.tartarus.snowball.Among

 Any ideas what is this for and why am I getting so many of them

 num   #instances#bytes  Class description
 --
 *1:  467281101869124400  org.tartarus.snowball.Among
 *
 2:  5244210 1840458960  byte[]


Re: leaks in solr

2012-07-27 Thread Karthick Duraisamy Soundararaj
I have tons of these open.
searcherName : Searcher@24be0446 main
caching : true
numDocs : 1331167
maxDoc : 1338549
reader : SolrIndexReader{this=5585c0de,r=ReadOnlyDirectoryReader@5585c0de
,refCnt=1,segments=18}
readerDir : org.apache.lucene.store.NIOFSDirectory@
/usr/local/solr/highlander/data/..@2f2d9d89
indexVersion : 1336499508709
openedAt : Fri Jul 27 09:45:16 EDT 2012
registeredAt : Fri Jul 27 09:45:19 EDT 2012
warmupTime : 0

In my custom handler, I have the following implementation (it's not
the full code, but it gives an overall idea of the implementation):
  class CustomHandler extends SearchHandler {

        void handleRequestBody(SolrQueryRequest req, SolrQueryResponse
rsp)

             SolrCore core = req.getCore();
             Vector<SimpleOrderedMap<Object>> requestParams =
new   Vector<SimpleOrderedMap<Object>>();
            /* parse the params in such a way that
                            requestParams[i] = parameters of the ith
request
              */
            ..

          try {
               Vector<LocalSolrQueryRequest> subQueries = new
 Vector<LocalSolrQueryRequest>(solrcore, requestParams[i]);

               for (i = 0; i < subQueryCount; i++) {
                  ResponseBuilder rb = new ResponseBuilder();
                  rb.req = req;

                  handleRequestBody(req, rsp, rb); // this would
call SearchHandler's handleRequestBody, whose signature I have modified
             }
         } finally {
              for (i = 0; i < subQueries.size(); i++)
                 subQueries.get(i).close();
         }
      }

*Search Handler Changes*
  class SearchHandler {
        void handleRequestBody(SolrQueryRequest req, SolrQueryResponse
rsp, ResponseBuilder rb, ArrayList<Component> comps) {
           //  ResponseBuilder rb = new ResponseBuilder();

           ..
         }
        void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp)
{
         ResponseBuilder rb = new ResponseBuilder(req, rsp, new
ResponseBuilder());
         handleRequestBody(req, rsp, rb, comps);
         }
  }


I don't see the old index searcher getting closed after warming up the
new one... Because I replicate every 5 minutes, it crashes in 2 hours.

On Fri, Jul 27, 2012 at 3:36 AM, roz dev rozde...@gmail.com wrote:

 in my case, I see only 1 searcher, no field cache - still Old Gen is almost
 full at 22 GB

 Does it have to do with index or some other configuration

 -Saroj

 On Thu, Jul 26, 2012 at 7:41 PM, Lance Norskog goks...@gmail.com wrote:

  What does the Statistics page in the Solr admin say? There might be
  several searchers open: org.apache.solr.search.SolrIndexSearcher
 
  Each searcher holds open different generations of the index. If
  obsolete index files are held open, it may be old searchers. How big
  are the caches? How long does it take to autowarm them?
 
  On Thu, Jul 26, 2012 at 6:15 PM, Karthick Duraisamy Soundararaj
  karthick.soundara...@gmail.com wrote:
   Mark,
   We use solr 3.6.0 on freebsd 9. Over a period of time, it
   accumulates lots of space!
  
   On Thu, Jul 26, 2012 at 8:47 PM, roz dev rozde...@gmail.com wrote:
  
   Thanks Mark.
  
   We are never calling commit or optimize with openSearcher=false.
  
   As per logs, this is what is happening
  
  
 
 openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}
  
   --
   But, We are going to use 4.0 Alpha and see if that helps.
  
   -Saroj
  
  
  
  
  
  
  
  
  
  
   On Thu, Jul 26, 2012 at 5:12 PM, Mark Miller markrmil...@gmail.com
   wrote:
  
I'd take a look at this issue:
https://issues.apache.org/jira/browse/SOLR-3392
   
Fixed late April.
   
On Jul 26, 2012, at 7:41 PM, roz dev rozde...@gmail.com wrote:
   
 it was from 4/11/12

 -Saroj

 On Thu, Jul 26, 2012 at 4:21 PM, Mark Miller 
 markrmil...@gmail.com
  
wrote:


 On Jul 26, 2012, at 3:18 PM, roz dev rozde...@gmail.com wrote:

 Hi Guys

 I am also seeing this problem.

 I am using SOLR 4 from Trunk and seeing this issue repeat every
  day.

 Any inputs about how to resolve this would be great

 -Saroj


 Trunk from what date?

 - Mark










   
- Mark Miller
lucidimagination.com
   
   
   
   
   
   
   
   
   
   
   
   
  
 
 
 
  --
  Lance Norskog
  goks...@gmail.com
 



how solr will apply regex fragmenter

2012-07-27 Thread meghana
I was looking at the regex fragmenter for customizing my highlight fragments. I was
wondering how the regex fragmenter works within Solr and googled for it, but
didn't find any results.

Can anybody tell me how the regex fragmenter works within Solr?

And when the regex fragmenter applies the regex on fragments, do I first get a fragment
using the default Solr operation and then apply the regex on it? Or does it directly
apply the regex on the search term and then return the fragment?











--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-solr-will-apply-regex-fragmenter-tp3997749.html
Sent from the Solr - User mailing list archive at Nabble.com.


Problem with Solr 4.0-ALPHA and JSON response

2012-07-27 Thread Federico Valeri
Hi all, I'm new to Solr and I have a problem with the JSON format. This is my Java
client code:

PrintWriter out = res.getWriter();
res.setContentType("text/plain");
String query = req.getParameter("query");
SolrServer solr = new HttpSolrServer(solrServer);
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("qt", "/select");
params.set("q", "contenuto:(" + query + ")");
params.set("hl", true);
params.set("hl.fl", "id,contenuto,score");
params.set("wt", "json");

QueryResponse response = solr.query(params);
log.debug(response.toString());
out.print(response.toString());
out.flush();

Now the problem is that I receive the response but it doesn't trigger the
JavaScript callback function.
I see wt=javabin in the SolrCore.execute log, even though I set wt=json in the
parameters - is this normal?
This is the jQuery call to the server:

$.getJSON('solrServer.html', {query:
escape($('input[name=query]:visible').val())}, function(data){
var view = '';
for (var i=0; i<data.response.docs.length; i++) {
view += '<p>'+data.response.docs[i].contenuto+'</p>';
}
$('#placeholder').html(view);
});

Thanks for reading.


Deduplication in SolrCloud

2012-07-27 Thread Daniel Brügge
Hi,

in my old Solr setup I have used the deduplication feature in the update
chain with a couple of fields.

<updateRequestProcessorChain name="dedupe">
  <processor class="solr.processor.SignatureUpdateProcessorFactory">
    <bool name="enabled">true</bool>
    <str name="signatureField">signature</str>
    <bool name="overwriteDupes">false</bool>
    <str name="fields">uuid,type,url,content_hash</str>
    <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

This worked fine. But when I now use this in my 2-shard SolrCloud setup while
inserting 150,000 documents,
I always get an error:

*INFO: end_commit_flush*
*Jul 27, 2012 3:29:36 PM org.apache.solr.common.SolrException log*
*SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError:
unable to create new native thread*
* at
org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:456)
*
* at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:284)
*

I am inserting the documents via CSV import and curl command and split them
also into 50k chunks.
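
Each chunk is sent with a plain curl call roughly like the following (host, handler path and file name are placeholders):

curl "http://localhost:8983/solr/update/csv?commit=false" \
     -H "Content-type: text/plain; charset=utf-8" \
     --data-binary @chunk-01.csv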

Without the dedupe chain, the import finishes after 40secs.

The curl command writes to one of my shards.


Do you have an idea why this happens? Should I reduce the fields to one? I
have read that not using the id as
dedupe fields could be an issue?


I have searched for deduplication with SolrCloud and I am wondering if it
is already working correctly? see e.g.
http://lucene.472066.n3.nabble.com/SolrCloud-deduplication-td3984657.html

Thanks  regards

Daniel


RE: Deduplication in SolrCloud

2012-07-27 Thread Markus Jelsma
This issue doesn't really describe your problem but a more general problem of 
distributed deduplication:
https://issues.apache.org/jira/browse/SOLR-3473
 
 
-Original message-
 From:Daniel Brügge daniel.brue...@googlemail.com
 Sent: Fri 27-Jul-2012 17:38
 To: solr-user@lucene.apache.org
 Subject: Deduplication in SolrCloud
 
 Hi,
 
 in my old Solr Setup I have used the deduplication feature in the update
 chain
 with couple of fields.
 
 updateRequestProcessorChain name=dedupe
  processor class=solr.processor.SignatureUpdateProcessorFactory
 bool name=enabledtrue/bool
  str name=signatureFieldsignature/str
 bool name=overwriteDupesfalse/bool
  str name=fieldsuuid,type,url,content_hash/str
 str
 name=signatureClassorg.apache.solr.update.processor.Lookup3Signature/str
  /processor
 processor class=solr.LogUpdateProcessorFactory /
  processor class=solr.RunUpdateProcessorFactory /
 /updateRequestProcessorChain
 
 This worked fine. When I now use this in my 2 shards SolrCloud setup when
 inserting 150.000 documents,
 I am always getting an error:
 
 *INFO: end_commit_flush*
 *Jul 27, 2012 3:29:36 PM org.apache.solr.common.SolrException log*
 *SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError:
 unable to create new native thread*
 * at
 org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:456)
 *
 * at
 org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:284)
 *
 
 I am inserting the documents via CSV import and curl command and split them
 also into 50k chunks.
 
 Without the dedupe chain, the import finishes after 40secs.
 
 The curl command writes to one of my shards.
 
 
 Do you have an idea why this happens? Should I reduce the fields to one? I
 have read that not using the id as
 dedupe fields could be an issue?
 
 
 I have searched for deduplication with SolrCloud and I am wondering if it
 is already working correctly? see e.g.
 http://lucene.472066.n3.nabble.com/SolrCloud-deduplication-td3984657.html
 
 Thanks  regards
 
 Daniel
 


question(s) re lucene spatial toolkit aka LSP aka spatial4j

2012-07-27 Thread solr-user
hopefully someone is using the lucene spatial toolkit aka LSP aka spatial4j,
and can answer this question

we are using this spatial tool for doing searches.  overall, it seems to
work very well.  however, finding documentation is difficult.

I have a couple of questions:

1. I have a geohash field in my solr schema that contains indexed geographic
polygon data.  I want to find all docs where that polygon intersects a given
lat/long.  I was experimenting with returning distance in the resultset and
with sorting by distance and found that the following query works.  However,
I don't know what distance means in the query, i.e. is it the distance from the
point to the polygon centroid, to the closest outer edge of the polygon, or is it
a useless random value, etc.? Does anyone know?

http://solrserver:solrport/solr/core0/select?q=*:*&fq={!v=$geoq%20cache=false}&geoq=wkt_search:%22Intersects(Circle(-97.057%2047.924%20d=0.01))%22&sort=query($geoq)+asc&fl=catchment_wkt1_trimmed,school_name,latitude,longitude,dist:query($geoq,-1),loc_city,loc_state

2. some of the polygons, being geographic representations, are very big (ie
state/province polygons).  when solr starts processing a spatial query (like
the one above), I can see (INFO: Building Cache [xx]) it fills in some
sort of memory cache
(org.apache.lucene.spatial.strategy.util.ShapeFieldCache) of the indexed
polygon data.  We are encountering Java OOM issues when this occurs (even
when we boosted the memory to 7 GB). I know that some of the polygons can have
more than 2300 points, but heavy trimming isn't really an option due to
level of detail issues. Can we control this caching, or the indexing of the
polygons, in any way to reduce the memory requirements??



--
View this message in context: 
http://lucene.472066.n3.nabble.com/question-s-re-lucene-spatial-toolkit-aka-LSP-aka-spatial4j-tp3997757.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Bulk Indexing

2012-07-27 Thread Zhang, Lisheng
Hi,

Previously I asked a similar question, and I have not fully implemented it yet.

My plan is:
1) use Solr only for search, not for indexing
2) have a separate java process to index (calling lucene API directly, maybe
   can call Solr API, I need to check more details).

As other people pointed out earlier, the problem with the above plan is that Solr
does not know when to reload the IndexSearcher (namely the underlying IndexReader)
after indexing is done, since the indexer and Solr are two separate processes.

My plan is to let Solr not to cache any IndexReader (each time when performing
search, just create a new IndexSearcher), because:

1) our app is made of many lucene indexed data folders (in Solr language, many
   cores), caching IndexSearcher would be too expensive.
2) in my experience, without caching search is still quite fast (this is 
   maybe partially due to the fact our indexed data is not large, per folder).

This is just my plan (not fully implemented yet).
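
A minimal sketch of the "new IndexSearcher per search" part against the Lucene 3.x API (path, field and query are made up):

    Directory dir = FSDirectory.open(new File("/data/index/folder-42"));
    IndexReader reader = IndexReader.open(dir);      // open a fresh reader for this search
    IndexSearcher searcher = new IndexSearcher(reader);
    try {
        TopDocs hits = searcher.search(new TermQuery(new Term("body", "solr")), 10);
        // ... read hits.scoreDocs ...
    } finally {
        searcher.close();                            // release immediately; nothing is cached
        reader.close();
    }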

Best regards, Lisheng

-Original Message-
From: Sohail Aboobaker [mailto:sabooba...@gmail.com]
Sent: Friday, July 27, 2012 6:56 AM
To: solr-user@lucene.apache.org
Subject: Bulk Indexing


Hi,

We have created a search service which is responsible for providing
interface between Solr and rest of our application. It basically takes one
document at a time and updates or adds it to appropriate index.

Now, in application, we have processes, that add products (our document are
based on products) in bulk using a data bulk load process. At this point,
we use the same search service to add the documents in a loop. These can be
up to 20,000 documents in one load.

In a recent solr user discussion, it seems like this is a no-no strategy
with red flags all around it.

What are other alternatives?

Thanks,

Regards,
Sohail Aboobaker.


Re: Bulk Indexing

2012-07-27 Thread Alexandre Rafalovitch
Haven't tried this but:
1) I think SOLR 4 supports on-the-fly core attach/detach/select. Can
somebody confirm this?
2) If 1) is true, run everything as two cores.
3) One core is live in production
4) Second core is detached from SOLR and attached to something like
SolrJ, which I believe can index without going over network
5) Once SolrJ finished bulk import indexing, switch the cores around

Or if you are not live, just use SolrJ to run the indexing and then
attach the finished core to SOLR.
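
The swap in step 5 can be done with a single CoreAdmin call, roughly (host, port and core names are placeholders):

http://localhost:8983/solr/admin/cores?action=SWAP&core=live&other=rebuild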

Regards,
   Alex.
Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all
at once. Lately, it doesn't seem to be working.  (Anonymous  - via GTD
book)


On Fri, Jul 27, 2012 at 9:55 AM, Sohail Aboobaker sabooba...@gmail.com wrote:
 Hi,

 We have created a search service which is responsible for providing
 interface between Solr and rest of our application. It basically takes one
 document at a time and updates or adds it to appropriate index.

 Now, in application, we have processes, that add products (our document are
 based on products) in bulk using a data bulk load process. At this point,
 we use the same search service to add the documents in a loop. These can be
 up to 20,000 documents in one load.

 In a recent solr user discussion, it seems like this is a no-no strategy
 with red flags all around it.

 What are other alternatives?

 Thanks,

 Regards,
 Sohail Aboobaker.


Solr not getting OpenText document name and metadata

2012-07-27 Thread eShard
Hi,
I'm currently using ManifoldCF (v.5.1) to crawl OpenText (v10.5) and the
output is sent to Solr (4.0 alpha).
All I see in the index is an id equal to the OpenText download URL and a version
(a big integer value).
What I don't see is the document name from OpenText or any of the OpenText
metadata.
Does anyone know how I can get this data? Because I can't even search by
document name or by document extension!
Only a few of the documents actually have a title in the Solr index, but the
OpenText name of the document is nowhere to be found.
If I know some text within the document, I can search for that.
I'm using the default schema with Tika as the extraction handler.
I'm also using uprefix=attr to get all of the ignored properties, but most
of those are useless.
Please advise...



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-not-getting-OpenText-document-name-and-metadata-tp3997786.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Bulk Indexing

2012-07-27 Thread Sohail Aboobaker
We will be using a Solr 3.x version. I was wondering if we need to worry
about this, as we have only 10k index entries at a time. It sounds like a
very low number, and we have only one document type at this point.

Should we worry about directly using SolrJ for indexing and searching for
this low volume simple schema?


Re: Solr edismax NOT operator behavior

2012-07-27 Thread Jack Krupansky
"can any one explain" - add the debugQuery=true option to your request and 
Solr will give an explanation, including the parsed query and the Lucene 
scoring of documents.
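
For example, a rough SolrJ sketch of such a request (the server URL is a placeholder, and on Solr 3.6 the client class name may differ):

    SolrQuery q = new SolrQuery("price:1000 AND (NOT (launch_date:[2007-06-07T00:00:00.000Z TO 2009-04-07T23:59:59.999Z] AND product_type:electronic))");
    q.set("defType", "edismax");
    q.set("debugQuery", "true");
    QueryResponse rsp = new HttpSolrServer("http://localhost:8983/solr").query(q);
    System.out.println(rsp.getDebugMap());   // parsed query plus per-document score explanations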


If you think Solr is wrong, show us a sample document that either is 
supposed to appear that doesn't, or doesn't appear and should. How are the 
results unexpected?


Then do simple queries, each using the id value for the unexplained document 
and each of the clauses in your expression.


-- Jack Krupansky

-Original Message- 
From: Alok Bhandari

Sent: Friday, July 27, 2012 1:55 AM
To: solr-user@lucene.apache.org
Subject: Solr edismax NOT operator behavior

Hello,

I am using Edismax parser and query submitted by application is of the
format

price:1000 AND ( NOT ( launch_date:[2007-06-07T00:00:00.000Z TO
2009-04-07T23:59:59.999Z] AND product_type:electronic)).

Solr while executing gives unexpected result. I am suspecting it is because
of the AND ( NOT  portion of the query .
Please can any one explain me how this structure is handled.

I am using solr 3.6

Any help is appreciated ..

Thanks
Alok





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-edismax-NOT-operator-behavior-tp3997663.html
Sent from the Solr - User mailing list archive at Nabble.com. 



RE: Bulk Indexing

2012-07-27 Thread Lan
I assume you're indexing on the same server that is used to execute search
queries. Adding 20K documents in bulk could cause the Solr server to 'stop
the world', where the server would stop responding to queries.

My suggestions are:
- Set up master/slave replication to insulate your clients from 'stop the world'
events during indexing.
- Update in batches, with a single commit at the end of the batch (see the sketch below).
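
A minimal SolrJ sketch of the batching idea (server URL, field names and batch size are made up; older SolrJ versions use CommonsHttpSolrServer instead of HttpSolrServer):

    SolrServer solr = new HttpSolrServer("http://localhost:8983/solr");
    List<SolrInputDocument> batch = new ArrayList<SolrInputDocument>();
    for (Product p : products) {               // 'products' is the bulk-load input (assumed)
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", p.getId());         // field names are examples only
        doc.addField("name", p.getName());
        batch.add(doc);
        if (batch.size() == 1000) {            // push a chunk instead of one doc per request
            solr.add(batch);
            batch.clear();
        }
    }
    if (!batch.isEmpty()) solr.add(batch);
    solr.commit();                             // one commit at the end, not per document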



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Bulk-Indexing-tp3997745p3997815.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: leaks in solr

2012-07-27 Thread Karthick Duraisamy Soundararaj
Hello all,
            When I run this in Eclipse with a set of queries, it
works fine, but when I run it on the test production server, the searchers are
leaked. Any hint would be appreciated. I have not used CoreContainer.

Considering that the SearchHandler runs fine, I am not able to think
of a reason why my extended version wouldn't work. Does anyone have any
idea?

On Fri, Jul 27, 2012 at 10:19 AM, Karthick Duraisamy Soundararaj 
karthick.soundara...@gmail.com wrote:

 I have tons of these open.
 searcherName : Searcher@24be0446 main
 caching : true
 numDocs : 1331167
 maxDoc : 1338549
 reader : SolrIndexReader{this=5585c0de,r=ReadOnlyDirectoryReader@5585c0de
 ,refCnt=1,segments=18}
 readerDir : org.apache.lucene.store.NIOFSDirectory@
 /usr/local/solr/highlander/data/..@2f2d9d89
 indexVersion : 1336499508709
 openedAt : Fri Jul 27 09:45:16 EDT 2012
 registeredAt : Fri Jul 27 09:45:19 EDT 2012
 warmupTime : 0

 In my custom handler, I have the following code
 I have the following problem
 Although in my custom handler, I have the following implementation(its not
 the full code but it gives an overall idea of the implementation) and it

   class CustomHandler extends SearchHandler {

 void handleRequestBody(SolrQueryRequest req,SolrQueryResponse
 rsp)

  SolrCore core= req.getCore();
  vectorSimpleOrderedMapObject requestParams =
 new   vectorSimpleOrderedMapObject();
 /*parse the params such a way that
  requestParams[i] -= parameter of the ith
 request
   */
 ..

   try {
vectorLocalSolrQueryRequests subQueries = new
  vectorLocalSolrQueryRequests(solrcore, requestParams[i]);

for(i=0;isubQueryCount;i++) {
   ResponseBuilder rb = new ResponseBuilder()
   rb.req = req;

   handlerRequestBody(req,rsp,rb); //this would
 call search handler's handler request body, whose signature, i have modified
  }
  } finally {
   for(i=0; isubQueries.size();i++)
  subQueries.get(i).close();
  }
   }

 *Search Handler Changes*
   class SearchHandler {
 void handleRequestBody(SolrQueryRequest req, SolrQueryResponse
 rsp, ResponseBuilder rb, ArrayListComponent comps) {
//  ResponseBuilder rb = new ResponseBuilder()  ;

..
  }
 void handleRequestBody(SolrQueryRequest req,
 SolrQueryResponse) {
  ResponseBuilder rb = new ResponseBuilder(req,rsp, new
 ResponseBuilder());
  handleRequestBody(req, rsp, rb, comps) ;
  }
   }


 I don see the index old index searcher geting closed after warming up the
 new guy... Because I replicate every 5 mintues, it crashes in 2 hours..

  On Fri, Jul 27, 2012 at 3:36 AM, roz dev rozde...@gmail.com wrote:

 in my case, I see only 1 searcher, no field cache - still Old Gen is
 almost
 full at 22 GB

 Does it have to do with index or some other configuration

 -Saroj

 On Thu, Jul 26, 2012 at 7:41 PM, Lance Norskog goks...@gmail.com wrote:

  What does the Statistics page in the Solr admin say? There might be
  several searchers open: org.apache.solr.search.SolrIndexSearcher
 
  Each searcher holds open different generations of the index. If
  obsolete index files are held open, it may be old searchers. How big
  are the caches? How long does it take to autowarm them?
 
  On Thu, Jul 26, 2012 at 6:15 PM, Karthick Duraisamy Soundararaj
  karthick.soundara...@gmail.com wrote:
   Mark,
   We use solr 3.6.0 on freebsd 9. Over a period of time, it
   accumulates lots of space!
  
   On Thu, Jul 26, 2012 at 8:47 PM, roz dev rozde...@gmail.com wrote:
  
   Thanks Mark.
  
   We are never calling commit or optimize with openSearcher=false.
  
   As per logs, this is what is happening
  
  
 
 openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}
  
   --
   But, We are going to use 4.0 Alpha and see if that helps.
  
   -Saroj
  
  
  
  
  
  
  
  
  
  
   On Thu, Jul 26, 2012 at 5:12 PM, Mark Miller markrmil...@gmail.com
   wrote:
  
I'd take a look at this issue:
https://issues.apache.org/jira/browse/SOLR-3392
   
Fixed late April.
   
On Jul 26, 2012, at 7:41 PM, roz dev rozde...@gmail.com wrote:
   
 it was from 4/11/12

 -Saroj

 On Thu, Jul 26, 2012 at 4:21 PM, Mark Miller 
 markrmil...@gmail.com
  
wrote:


 On Jul 26, 2012, at 3:18 PM, roz dev rozde...@gmail.com
 wrote:

 Hi Guys

 I am also seeing this problem.

 I am using SOLR 4 from Trunk and seeing this 

Re: leaks in solr

2012-07-27 Thread Karthick Duraisamy Soundararaj
Just to clarify, the leak happens every time a new searcher is opened.

On Fri, Jul 27, 2012 at 8:28 PM, Karthick Duraisamy Soundararaj 
karthick.soundara...@gmail.com wrote:

 Hello all,
 While running in my eclipse and run a set of queries, this
 works fine, but when I run it in test production server, the searchers are
 leaked. Any hint would be appreciated. I have not used CoreContainer.

 Considering that the SearchHandler is running fine, I am not able to think
 of a reason why my extended version wouldnt work.. Does anyone have any
 idea?

 On Fri, Jul 27, 2012 at 10:19 AM, Karthick Duraisamy Soundararaj 
 karthick.soundara...@gmail.com wrote:

 I have tons of these open.
 searcherName : Searcher@24be0446 main
 caching : true
 numDocs : 1331167
 maxDoc : 1338549
 reader : SolrIndexReader{this=5585c0de,r=ReadOnlyDirectoryReader@5585c0de
 ,refCnt=1,segments=18}
 readerDir : org.apache.lucene.store.NIOFSDirectory@
 /usr/local/solr/highlander/data/..@2f2d9d89
 indexVersion : 1336499508709
 openedAt : Fri Jul 27 09:45:16 EDT 2012
 registeredAt : Fri Jul 27 09:45:19 EDT 2012
 warmupTime : 0

 In my custom handler, I have the following code
 I have the following problem
 Although in my custom handler, I have the following implementation(its
 not the full code but it gives an overall idea of the implementation) and it

   class CustomHandler extends SearchHandler {

 void handleRequestBody(SolrQueryRequest req,SolrQueryResponse
 rsp)

  SolrCore core= req.getCore();
  vectorSimpleOrderedMapObject requestParams =
 new   vectorSimpleOrderedMapObject();
 /*parse the params such a way that
  requestParams[i] -= parameter of the
 ith request
   */
 ..

   try {
vectorLocalSolrQueryRequests subQueries = new
  vectorLocalSolrQueryRequests(solrcore, requestParams[i]);

for(i=0;isubQueryCount;i++) {
   ResponseBuilder rb = new ResponseBuilder()
   rb.req = req;

   handlerRequestBody(req,rsp,rb); //this
 would call search handler's handler request body, whose signature, i have
 modified
  }
  } finally {
   for(i=0; isubQueries.size();i++)
  subQueries.get(i).close();
  }
   }

 *Search Handler Changes*
   class SearchHandler {
 void handleRequestBody(SolrQueryRequest req,
 SolrQueryResponse rsp, ResponseBuilder rb, ArrayListComponent comps) {
//  ResponseBuilder rb = new ResponseBuilder()  ;

..
  }
 void handleRequestBody(SolrQueryRequest req,
 SolrQueryResponse) {
  ResponseBuilder rb = new ResponseBuilder(req,rsp,
 new ResponseBuilder());
  handleRequestBody(req, rsp, rb, comps) ;
  }
   }


 I don see the index old index searcher geting closed after warming up the
 new guy... Because I replicate every 5 mintues, it crashes in 2 hours..

  On Fri, Jul 27, 2012 at 3:36 AM, roz dev rozde...@gmail.com wrote:

 in my case, I see only 1 searcher, no field cache - still Old Gen is
 almost
 full at 22 GB

 Does it have to do with index or some other configuration

 -Saroj

 On Thu, Jul 26, 2012 at 7:41 PM, Lance Norskog goks...@gmail.com
 wrote:

  What does the Statistics page in the Solr admin say? There might be
  several searchers open: org.apache.solr.search.SolrIndexSearcher
 
  Each searcher holds open different generations of the index. If
  obsolete index files are held open, it may be old searchers. How big
  are the caches? How long does it take to autowarm them?
 
  On Thu, Jul 26, 2012 at 6:15 PM, Karthick Duraisamy Soundararaj
  karthick.soundara...@gmail.com wrote:
   Mark,
   We use solr 3.6.0 on freebsd 9. Over a period of time, it
   accumulates lots of space!
  
   On Thu, Jul 26, 2012 at 8:47 PM, roz dev rozde...@gmail.com wrote:
  
   Thanks Mark.
  
   We are never calling commit or optimize with openSearcher=false.
  
   As per logs, this is what is happening
  
  
 
 openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false}
  
   --
   But, We are going to use 4.0 Alpha and see if that helps.
  
   -Saroj
  
  
  
  
  
  
  
  
  
  
   On Thu, Jul 26, 2012 at 5:12 PM, Mark Miller markrmil...@gmail.com
 
   wrote:
  
I'd take a look at this issue:
https://issues.apache.org/jira/browse/SOLR-3392
   
Fixed late April.
   
On Jul 26, 2012, at 7:41 PM, roz dev rozde...@gmail.com wrote:
   
 it was from 4/11/12

 -Saroj

 On Thu, Jul 26, 2012 at 4:21 PM, Mark Miller 
 markrmil...@gmail.com
  
wrote:


Re: leaks in solr

2012-07-27 Thread Lance Norskog
A finally clause can throw exceptions. Can this throw an exception?
 subQueries.get(i).close();

 If so, each close() call should be in a try-catch block.
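
A minimal sketch of that change, reusing the subQueries loop from the earlier snippet ('log' stands in for whatever logger the handler uses):

    } finally {
        for (int i = 0; i < subQueries.size(); i++) {
            try {
                subQueries.get(i).close();
            } catch (Exception e) {
                // log and keep closing the remaining requests instead of aborting the loop
                log.error("Failed to close sub-request " + i, e);
            }
        }
    }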

On Fri, Jul 27, 2012 at 5:28 PM, Karthick Duraisamy Soundararaj
karthick.soundara...@gmail.com wrote:
 Hello all,
 While running in my eclipse and run a set of queries, this
 works fine, but when I run it in test production server, the searchers are
 leaked. Any hint would be appreciated. I have not used CoreContainer.

 Considering that the SearchHandler is running fine, I am not able to think
 of a reason why my extended version wouldnt work.. Does anyone have any
 idea?

 On Fri, Jul 27, 2012 at 10:19 AM, Karthick Duraisamy Soundararaj 
 karthick.soundara...@gmail.com wrote:

 I have tons of these open.
 searcherName : Searcher@24be0446 main
 caching : true
 numDocs : 1331167
 maxDoc : 1338549
 reader : SolrIndexReader{this=5585c0de,r=ReadOnlyDirectoryReader@5585c0de
 ,refCnt=1,segments=18}
 readerDir : org.apache.lucene.store.NIOFSDirectory@
 /usr/local/solr/highlander/data/..@2f2d9d89
 indexVersion : 1336499508709
 openedAt : Fri Jul 27 09:45:16 EDT 2012
 registeredAt : Fri Jul 27 09:45:19 EDT 2012
 warmupTime : 0

 In my custom handler, I have the following code
 I have the following problem
 Although in my custom handler, I have the following implementation(its not
 the full code but it gives an overall idea of the implementation) and it

   class CustomHandler extends SearchHandler {

 void handleRequestBody(SolrQueryRequest req,SolrQueryResponse
 rsp)

  SolrCore core= req.getCore();
  vectorSimpleOrderedMapObject requestParams =
 new   vectorSimpleOrderedMapObject();
 /*parse the params such a way that
  requestParams[i] -= parameter of the ith
 request
   */
 ..

   try {
vectorLocalSolrQueryRequests subQueries = new
  vectorLocalSolrQueryRequests(solrcore, requestParams[i]);

for(i=0;isubQueryCount;i++) {
   ResponseBuilder rb = new ResponseBuilder()
   rb.req = req;

   handlerRequestBody(req,rsp,rb); //this would
 call search handler's handler request body, whose signature, i have modified
  }
  } finally {
   for(i=0; isubQueries.size();i++)
  subQueries.get(i).close();
  }
   }

 *Search Handler Changes*
   class SearchHandler {
 void handleRequestBody(SolrQueryRequest req, SolrQueryResponse
 rsp, ResponseBuilder rb, ArrayListComponent comps) {
//  ResponseBuilder rb = new ResponseBuilder()  ;

..
  }
 void handleRequestBody(SolrQueryRequest req,
 SolrQueryResponse) {
  ResponseBuilder rb = new ResponseBuilder(req,rsp, new
 ResponseBuilder());
  handleRequestBody(req, rsp, rb, comps) ;
  }
   }


 I don see the index old index searcher geting closed after warming up the
 new guy... Because I replicate every 5 mintues, it crashes in 2 hours..


Re: Deduplication in SolrCloud

2012-07-27 Thread Lance Norskog
Should the old Signature code be removed? Given that the goal is to
have everyone use SolrCloud, maybe this kind of landmine should be
removed?

On Fri, Jul 27, 2012 at 8:43 AM, Markus Jelsma
markus.jel...@openindex.io wrote:
 This issue doesn't really describe your problem but a more general problem of 
 distributed deduplication:
 https://issues.apache.org/jira/browse/SOLR-3473


 -Original message-
 From:Daniel Brügge daniel.brue...@googlemail.com
 Sent: Fri 27-Jul-2012 17:38
 To: solr-user@lucene.apache.org
 Subject: Deduplication in SolrCloud

 Hi,

 in my old Solr setup I have used the deduplication feature in the update
 chain with a couple of fields.

 <updateRequestProcessorChain name="dedupe">
   <processor class="solr.processor.SignatureUpdateProcessorFactory">
     <bool name="enabled">true</bool>
     <str name="signatureField">signature</str>
     <bool name="overwriteDupes">false</bool>
     <str name="fields">uuid,type,url,content_hash</str>
     <str name="signatureClass">org.apache.solr.update.processor.Lookup3Signature</str>
   </processor>
   <processor class="solr.LogUpdateProcessorFactory" />
   <processor class="solr.RunUpdateProcessorFactory" />
 </updateRequestProcessorChain>

 This worked fine. When I now use this in my 2-shard SolrCloud setup to insert
 150,000 documents, I am always getting an error:

 INFO: end_commit_flush
 Jul 27, 2012 3:29:36 PM org.apache.solr.common.SolrException log
 SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError: unable to create new native thread
     at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:456)
     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:284)

 I am inserting the documents via CSV import with the curl command, and I split
 them into 50k chunks.

 Without the dedupe chain, the import finishes after 40secs.

 The curl command writes to one of my shards.


 Do you have an idea why this happens? Should I reduce the fields to one? I
 have read that not using the id as one of the dedupe fields could be an
 issue?
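
 For reference, a minimal sketch of what the signature processor computes per document,
 assuming the Lookup3Signature API from Solr 3.x/4.x (the class and field values below
 are made up for illustration); reducing the field list only changes which values feed
 this hash, so on its own it seems unlikely to affect the native-thread problem:

 import org.apache.solr.update.processor.Lookup3Signature;

 // Hypothetical illustration: the dedupe processor feeds each configured field's
 // value into the signature and stores the resulting hash in the signatureField.
 public class SignatureDemo {
     public static void main(String[] args) {
         Lookup3Signature sig = new Lookup3Signature();
         sig.add("9a1f-uuid-example");        // uuid (made-up value)
         sig.add("article");                  // type
         sig.add("http://example.com/page");  // url
         sig.add("d41d8cd98f00");             // content_hash
         byte[] hash = sig.getSignature();
         System.out.println(new java.math.BigInteger(1, hash).toString(16));
     }
 }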


 I have searched for deduplication with SolrCloud and I am wondering if it
 is already working correctly there; see e.g.
 http://lucene.472066.n3.nabble.com/SolrCloud-deduplication-td3984657.html

 Thanks & regards

 Daniel




-- 
Lance Norskog
goks...@gmail.com


Re: querying using filter query and lots of possible values

2012-07-27 Thread Chris Hostetter

: the list of IDs is constant for a longer time. I will take a look at
: these join thematic.
: Maybe another solution would be to really create a whole new
: collection or set of documents containing the aggregated documents (from the
: ids) from scratch and to execute queries on this collection. Then this
: would take
: some time, but maybe it's worth it because the querying will thank you.

Another avenue to consider...

http://lucene.apache.org/solr/api-4_0_0-ALPHA/org/apache/solr/schema/ExternalFileField.html

...would allow you to map values in your source_id field to some numeric 
values (many to many) and these numeric values would then be accessible in 
functions -- so you could use something like fq={!frange ...} to select 
all docs with value 67 where your external file field says that value 67 
is mapped to the following thousand source_id values.

the external file field can then be modified at any time just by doing a 
commit on your index.
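
To make that concrete, here is one possible shape of such a setup, assuming the
ExternalFileField behaviour in 3.6/4.0; all field, type, and file names below are
made up for illustration:

schema.xml:

  <fieldType name="sourceGroup" class="solr.ExternalFileField"
             keyField="source_id" defVal="0" valType="pfloat"/>
  <field name="source_group" type="sourceGroup" indexed="false" stored="false"/>

a file named external_source_group in the core's data directory, with one
"source_id=value" line per mapping (many source_ids can share the same value):

  src-0001=67
  src-0002=67
  src-0003=12

and the filter then becomes:

  fq={!frange l=67 u=67}field(source_group)

After editing the file, a commit makes the new mapping visible without reindexing.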



-Hoss


Re: leaks in solr

2012-07-27 Thread Karthick Duraisamy Soundararaj
First, no, because I do the following:
for (i = 0; i < subQueries.size(); i++) {
  subQueries.get(i).close();
}

Second, I don't see any exception until the first searcher leak happens.

On Fri, Jul 27, 2012 at 9:04 PM, Lance Norskog goks...@gmail.com wrote:

 A finally clause can throw exceptions. Can this throw an exception?
  subQueries.get(i).close();

  If so, each close() call should be in a try-catch block.
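
A sketch of that suggestion, with a made-up helper name, assuming Solr 3.6 APIs; each
close() gets its own try/catch so one failing request cannot prevent the remaining
ones from being closed:

import java.util.List;
import org.apache.solr.request.SolrQueryRequest;

// Made-up helper illustrating the point: close everything, even if one close() throws.
public final class RequestCloser {
    public static void closeAll(List<? extends SolrQueryRequest> subQueries) {
        for (SolrQueryRequest subReq : subQueries) {
            try {
                subReq.close();
            } catch (RuntimeException e) {
                // log and keep going; the remaining requests still get closed
                e.printStackTrace();
            }
        }
    }
}

The handler's finally block would then call closeAll(subQueries) instead of looping
and closing directly.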


Re: leaks in solr

2012-07-27 Thread Karthick Duraisamy Soundararaj
SimpleOrderedMap<Object> commonRequestParams;           // holds the common request params
Vector<SimpleOrderedMap<Object>> subQueryRequestParams; // holds the request params of the sub-queries

I use the above to create multiple LocalSolrQueryRequests. To add a little more
information, I create a new ResponseBuilder for each request.

I also hold a reference to the query component as a private member in my
CustomHandler. Considering that the component is initialized only once
during start-up, I assume this isn't a cause for concern.
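
To illustrate (a sketch only, assuming Solr 3.6 APIs; the factory class name is made
up): the common params can act as defaults underneath each sub-query's params when the
per-sub-query request is built:

import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.SimpleOrderedMap;
import org.apache.solr.core.SolrCore;
import org.apache.solr.request.LocalSolrQueryRequest;

// Made-up factory: sub-query params override the common params, which act as defaults.
public class SubRequestFactory {
    public static LocalSolrQueryRequest build(SolrCore core,
                                              SimpleOrderedMap<Object> commonRequestParams,
                                              SimpleOrderedMap<Object> subQueryParams) {
        SolrParams merged = SolrParams.wrapDefaults(
                SolrParams.toSolrParams(subQueryParams),
                SolrParams.toSolrParams(commonRequestParams));
        return new LocalSolrQueryRequest(core, merged); // the caller is responsible for close()
    }
}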


Re: leaks in solr

2012-07-27 Thread Karthick Duraisamy Soundararaj
subQueries.get(i).close() is nothing but pulling the reference from the
vector and closing it. So yes, it wouldn't throw an exception.

Vector<LocalSolrQueryRequest> subQueries

Please let me know if you need any more information.
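
The underlying contract that close() stands in for is plain reference counting on the
searcher; a minimal sketch, assuming Solr 3.6 APIs (the demo class is made up for
illustration):

import org.apache.solr.core.SolrCore;
import org.apache.solr.search.SolrIndexSearcher;
import org.apache.solr.util.RefCounted;

// Made-up demo: every core.getSearcher() must be paired with a decref(),
// otherwise the searcher can never be closed and old generations pile up.
public class SearcherRefDemo {
    public static int docCount(SolrCore core) {
        RefCounted<SolrIndexSearcher> holder = core.getSearcher();
        try {
            return holder.get().maxDoc();   // use the searcher while the reference is held
        } finally {
            holder.decref();                // give the reference back; skipping this leaks
        }
    }
}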
