Re: JVM Heap utilization & Memory leaks with Solr

2009-08-20 Thread Rahul R
All these 3700 fields are single valued non-boolean fields. Thanks

Regards
Rahul

On Wed, Aug 19, 2009 at 8:33 PM, Fuad Efendi f...@efendi.ca wrote:


 Hi Rahul,

 JRockit could be used at least in a test environment to monitor JVM (and
 troubleshoot SOLR, licensed for-free for developers!); they have even
 Eclipse plugin now, and it is licensed by Oracle (BEA)... But, of course,
 in
 large companies test environment is in hands of testers :)


 But... 3700 fields will create (over time) 3700 arrays, each of size
 5,000,000!!! Even if most of the fields are empty for most of the documents...
 This applies to non-tokenized single-valued non-boolean fields only (Lucene
 internals, FieldCache)... and it won't be GC-collected after user log-off...
 prefer a dedicated box for SOLR.

 -Fuad


 -Original Message-
 From: Rahul R [mailto:rahul.s...@gmail.com]
 Sent: August-19-09 6:19 AM
 To: solr-user@lucene.apache.org
  Subject: Re: JVM Heap utilization & Memory leaks with Solr

 Fuad,
 We have around 5 million documents and around 3700 fields. All documents
 will not have values for all the fields. JRockit is not approved for use
 within my organization. But thanks for the info anyway.

 Regards
 Rahul

 On Tue, Aug 18, 2009 at 9:41 AM, Funtick f...@efendi.ca wrote:

 
  BTW, you should really prefer JRockit which really rocks!!!
 
  Mission Control has the necessary tooling; and JRockit produces a _nice_
  exception stacktrace (explaining almost everything) even in case of OOM,
  which the Sun JVM still fails to produce.
 
 
  SolrServlet still catches Throwable:
 
 } catch (Throwable e) {
   SolrException.log(log,e);
   sendErr(500, SolrException.toStr(e), request, response);
 } finally {
 
 
 
 
 
  Rahul R wrote:
  
   Otis,
   Thank you for your response. I know there are a few variables here but
  the
   difference in memory utilization with and without shards somehow leads
 me
   to
   believe that the leak could be within Solr.
  
   I tried using a profiling tool - Yourkit. The trial version was free
 for
   15
   days. But I couldn't find anything of significance.
  
   Regards
   Rahul
  
  
   On Tue, Aug 4, 2009 at 7:35 PM, Otis Gospodnetic
   otis_gospodne...@yahoo.com
   wrote:
  
   Hi Rahul,
  
   A) There are no known (to me) memory leaks.
   I think there are too many variables for a person to tell you what
   exactly
   is happening, plus you are dealing with the JVM here. :)
  
   Try jmap -histo:live PID-HERE | less and see what's using your memory.
  
   Otis
   --
   Sematext is hiring -- http://sematext.com/about/jobs.html?mls
   Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
  
  
  
   - Original Message 
From: Rahul R rahul.s...@gmail.com
To: solr-user@lucene.apache.org
Sent: Tuesday, August 4, 2009 1:09:06 AM
 Subject: JVM Heap utilization & Memory leaks with Solr
   
I am trying to track memory utilization with my Application that
 uses
   Solr.
Details of the setup :
-3rd party Software : Solaris 10, Weblogic 10, jdk_150_14, Solr
 1.3.0
- Hardware : 12 CPU, 24 GB RAM
   
For testing during PSR I am using a smaller subset of the actual
 data
   that I
want to work with. Details of this smaller sub-set :
- 5 million records, 4.5 GB index size
   
Observations during PSR:
A) I have allocated 3.2 GB for the JVM(s) that I used. After all
 users
logout and doing a force GC, only 60 % of the heap is reclaimed. As
   part
   of
the logout process I am invalidating the HttpSession and doing a
   close()
   on
CoreContainer. From my application's side, I don't believe I am
  holding
   on
to any resource. I wanted to know if there are known issues
  surrounding
memory leaks with Solr ?
B) To further test this, I tried deploying with shards. 3.2 GB was
   allocated
to each JVM. All JVMs had 96 % free heap space after start up. I got
   varying
results with this.
 Case 1 : Used 6 weblogic domains. My application was deployed on 1
   domain.
I split the 5 million index into 5 parts of 1 million each and used
   them
   as
shards. After multiple users used the system and doing a force GC,
   around
   94
- 96 % of heap was reclaimed in all the JVMs.
Case 2: Used 2 weblogic domains. My application was deployed on 1
   domain.
   On
the other, I deployed the entire 5 million part index as one shard.
   After
 multiple users used the system and doing a force GC, around 76 % of
  the
   heap
was reclaimed in the shard JVM. And 96 % was reclaimed in the JVM
  where
   my
application was running. This result further convinces me that my
application can be absolved of holding on to memory resources.
   
I am not sure how to interpret these results ? For searching, I am
   using
Without Shards : EmbeddedSolrServer
With Shards :CommonsHttpSolrServer
In terms of Solr objects this is what differs in my code between
  normal
search and shards

Re: JVM Heap utilization & Memory leaks with Solr

2009-08-19 Thread Rahul R
Fuad,
We have around 5 million documents and around 3700 fields. All documents
will not have values for all the fields. JRockit is not approved for use
within my organization. But thanks for the info anyway.

Regards
Rahul

On Tue, Aug 18, 2009 at 9:41 AM, Funtick f...@efendi.ca wrote:


 BTW, you should really prefer JRockit which really rocks!!!

 Mission Control has the necessary tooling; and JRockit produces a _nice_
 exception stacktrace (explaining almost everything) even in case of OOM,
 which the Sun JVM still fails to produce.


 SolrServlet still catches Throwable:

} catch (Throwable e) {
  SolrException.log(log,e);
  sendErr(500, SolrException.toStr(e), request, response);
} finally {





 Rahul R wrote:
 
  Otis,
  Thank you for your response. I know there are a few variables here but
 the
  difference in memory utilization with and without shards somehow leads me
  to
  believe that the leak could be within Solr.
 
  I tried using a profiling tool - Yourkit. The trial version was free for
  15
  days. But I couldn't find anything of significance.
 
  Regards
  Rahul
 
 
  On Tue, Aug 4, 2009 at 7:35 PM, Otis Gospodnetic
  otis_gospodne...@yahoo.com
  wrote:
 
  Hi Rahul,
 
  A) There are no known (to me) memory leaks.
  I think there are too many variables for a person to tell you what
  exactly
  is happening, plus you are dealing with the JVM here. :)
 
  Try jmap -histo:live PID-HERE | less and see what's using your memory.
 
  Otis
  --
  Sematext is hiring -- http://sematext.com/about/jobs.html?mls
  Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
 
 
 
  - Original Message 
   From: Rahul R rahul.s...@gmail.com
   To: solr-user@lucene.apache.org
   Sent: Tuesday, August 4, 2009 1:09:06 AM
    Subject: JVM Heap utilization & Memory leaks with Solr
  
   I am trying to track memory utilization with my Application that uses
  Solr.
   Details of the setup :
   -3rd party Software : Solaris 10, Weblogic 10, jdk_150_14, Solr 1.3.0
   - Hardware : 12 CPU, 24 GB RAM
  
   For testing during PSR I am using a smaller subset of the actual data
  that I
   want to work with. Details of this smaller sub-set :
   - 5 million records, 4.5 GB index size
  
   Observations during PSR:
   A) I have allocated 3.2 GB for the JVM(s) that I used. After all users
   logout and doing a force GC, only 60 % of the heap is reclaimed. As
  part
  of
   the logout process I am invalidating the HttpSession and doing a
  close()
  on
   CoreContainer. From my application's side, I don't believe I am
 holding
  on
   to any resource. I wanted to know if there are known issues
 surrounding
   memory leaks with Solr ?
   B) To further test this, I tried deploying with shards. 3.2 GB was
  allocated
   to each JVM. All JVMs had 96 % free heap space after start up. I got
  varying
   results with this.
    Case 1 : Used 6 weblogic domains. My application was deployed on 1
  domain.
   I split the 5 million index into 5 parts of 1 million each and used
  them
  as
   shards. After multiple users used the system and doing a force GC,
  around
  94
   - 96 % of heap was reclaimed in all the JVMs.
   Case 2: Used 2 weblogic domains. My application was deployed on 1
  domain.
  On
   the other, I deployed the entire 5 million part index as one shard.
  After
    multiple users used the system and doing a force GC, around 76 % of
 the
  heap
   was reclaimed in the shard JVM. And 96 % was reclaimed in the JVM
 where
  my
   application was running. This result further convinces me that my
   application can be absolved of holding on to memory resources.
  
   I am not sure how to interpret these results ? For searching, I am
  using
   Without Shards : EmbeddedSolrServer
   With Shards :CommonsHttpSolrServer
   In terms of Solr objects this is what differs in my code between
 normal
   search and shards search (distributed search)
  
   After looking at Case 1, I thought that the CommonsHttpSolrServer was
  more
   memory efficient but Case 2 proved me wrong. Or could there still be
  memory
   leaks in my application ? Any thoughts, suggestions would be welcome.
  
   Regards
   Rahul
 
 
 
 

 --
 View this message in context:
 http://www.nabble.com/JVM-Heap-utilization---Memory-leaks-with-Solr-tp24802380p25018165.html
  Sent from the Solr - User mailing list archive at Nabble.com.




RE: JVM Heap utilization & Memory leaks with Solr

2009-08-19 Thread Fuad Efendi

Hi Rahul,

JRockit could be used at least in a test environment to monitor the JVM (and
troubleshoot SOLR; it is licensed for free for developers!); they even have an
Eclipse plugin now, and it is licensed by Oracle (BEA)... But, of course, in
large companies the test environment is in the hands of testers :)


But... 3700 fields will create (over time) 3700 arrays, each of size
5,000,000!!! Even if most of the fields are empty for most of the documents...
This applies to non-tokenized single-valued non-boolean fields only (Lucene
internals, FieldCache)... and it won't be GC-collected after user log-off...
prefer a dedicated box for SOLR.

-Fuad
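
A back-of-the-envelope sketch of the worst case described above, assuming
roughly 8 bytes of FieldCache entry per document per cached field (the
per-document estimate quoted elsewhere in this thread; actual Lucene 2.4
FieldCache layouts vary by field type). The class and numbers are illustrative
only, not code from the thread:

public class FieldCacheEstimate {
    public static void main(String[] args) {
        long docs = 5000000L;        // documents in Rahul's test index
        long fields = 3700L;         // single-valued, non-tokenized fields in the schema
        long bytesPerEntry = 8L;     // rough per-document assumption from this thread
        long worstCase = docs * fields * bytesPerEntry;
        // Worst case if every field ever gets sorted or faceted on:
        // 5,000,000 * 3,700 * 8 = 148,000,000,000 bytes, roughly 138 GB,
        // far beyond a 3.2 GB heap. In practice only the fields actually used
        // for sorting/faceting get cached, but those entries are not released
        // while the underlying IndexReader stays open.
        System.out.println(worstCase / (1024.0 * 1024 * 1024) + " GB worst case");
    }
}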


-Original Message-
From: Rahul R [mailto:rahul.s...@gmail.com] 
Sent: August-19-09 6:19 AM
To: solr-user@lucene.apache.org
Subject: Re: JVM Heap utilization & Memory leaks with Solr

Fuad,
We have around 5 million documents and around 3700 fields. All documents
will not have values for all the fields. JRockit is not approved for use
within my organization. But thanks for the info anyway.

Regards
Rahul

On Tue, Aug 18, 2009 at 9:41 AM, Funtick f...@efendi.ca wrote:


 BTW, you should really prefer JRockit which really rocks!!!

 Mission Control has the necessary tooling; and JRockit produces a _nice_
 exception stacktrace (explaining almost everything) even in case of OOM,
 which the Sun JVM still fails to produce.


 SolrServlet still catches Throwable:

} catch (Throwable e) {
  SolrException.log(log,e);
  sendErr(500, SolrException.toStr(e), request, response);
} finally {





 Rahul R wrote:
 
  Otis,
  Thank you for your response. I know there are a few variables here but
 the
  difference in memory utilization with and without shards somehow leads
me
  to
  believe that the leak could be within Solr.
 
  I tried using a profiling tool - Yourkit. The trial version was free for
  15
  days. But I couldn't find anything of significance.
 
  Regards
  Rahul
 
 
  On Tue, Aug 4, 2009 at 7:35 PM, Otis Gospodnetic
  otis_gospodne...@yahoo.com
  wrote:
 
  Hi Rahul,
 
  A) There are no known (to me) memory leaks.
  I think there are too many variables for a person to tell you what
  exactly
  is happening, plus you are dealing with the JVM here. :)
 
  Try jmap -histo:live PID-HERE | less and see what's using your memory.
 
  Otis
  --
  Sematext is hiring -- http://sematext.com/about/jobs.html?mls
  Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR
 
 
 
  - Original Message 
   From: Rahul R rahul.s...@gmail.com
   To: solr-user@lucene.apache.org
   Sent: Tuesday, August 4, 2009 1:09:06 AM
    Subject: JVM Heap utilization & Memory leaks with Solr
  
   I am trying to track memory utilization with my Application that uses
  Solr.
   Details of the setup :
   -3rd party Software : Solaris 10, Weblogic 10, jdk_150_14, Solr 1.3.0
   - Hardware : 12 CPU, 24 GB RAM
  
   For testing during PSR I am using a smaller subset of the actual data
  that I
   want to work with. Details of this smaller sub-set :
   - 5 million records, 4.5 GB index size
  
   Observations during PSR:
   A) I have allocated 3.2 GB for the JVM(s) that I used. After all
users
   logout and doing a force GC, only 60 % of the heap is reclaimed. As
  part
  of
   the logout process I am invalidating the HttpSession and doing a
  close()
  on
   CoreContainer. From my application's side, I don't believe I am
 holding
  on
   to any resource. I wanted to know if there are known issues
 surrounding
   memory leaks with Solr ?
   B) To further test this, I tried deploying with shards. 3.2 GB was
  allocated
   to each JVM. All JVMs had 96 % free heap space after start up. I got
  varying
   results with this.
    Case 1 : Used 6 weblogic domains. My application was deployed on 1
  domain.
   I split the 5 million index into 5 parts of 1 million each and used
  them
  as
   shards. After multiple users used the system and doing a force GC,
  around
  94
   - 96 % of heap was reclaimed in all the JVMs.
   Case 2: Used 2 weblogic domains. My application was deployed on 1
  domain.
  On
   the other, I deployed the entire 5 million part index as one shard.
  After
    multiple users used the system and doing a force GC, around 76 % of
 the
  heap
   was reclaimed in the shard JVM. And 96 % was reclaimed in the JVM
 where
  my
   application was running. This result further convinces me that my
   application can be absolved of holding on to memory resources.
  
   I am not sure how to interpret these results ? For searching, I am
  using
   Without Shards : EmbeddedSolrServer
   With Shards :CommonsHttpSolrServer
   In terms of Solr objects this is what differs in my code between
 normal
   search and shards search (distributed search)
  
   After looking at Case 1, I thought that the CommonsHttpSolrServer was
  more
   memory efficient but Case 2 proved me wrong. Or could there still be
  memory
   leaks in my application ? Any thoughts, suggestions would be welcome.
  
   Regards
   Rahul

Re: JVM Heap utilization & Memory leaks with Solr

2009-08-17 Thread Funtick

Can you tell me please how many non-tokenized single-valued fields your
schema uses, and how many documents?
Thanks,
Fuad


Rahul R wrote:
 
 My primary issue is not an Out of Memory error at run time. It is memory leaks:
 heap space not being released even after doing a force GC. So after some time,
 as progressively more heap gets utilized, I start running out of memory.
 The verdict, however, seems unanimous that there are no known memory leak
 issues within Solr. I am still looking at my application to analyse the
 problem. Thank you.
 
 On Thu, Aug 13, 2009 at 10:58 PM, Fuad Efendi f...@efendi.ca wrote:
 
 Most OutOfMemoryExceptions (if not 100%) happening with SOLR are because of
 http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/FieldCache.html
 - it is used internally in Lucene to cache Field values and document IDs.

 My very long-term observation: SOLR can run without any problems for a few
 days/months, and an unpredictable OOM happens just because someone tried a
 sorted search, which will populate an array with the IDs of ALL documents in
 the index.

 The only solution: calculate exactly the amount of RAM needed for FieldCache...
 For instance, for 100,000,000 documents a single instance of FieldCache may
 require 8*100,000,000 bytes (8 bytes per document ID?), which is almost 1Gb
 (at least!)


 I didn't notice any memory leaks after I started to use 16Gb of RAM for the
 SOLR instance (almost a year without any restart!)




 -Original Message-
 From: Rahul R [mailto:rahul.s...@gmail.com]
 Sent: August-13-09 1:25 AM
 To: solr-user@lucene.apache.org
  Subject: Re: JVM Heap utilization & Memory leaks with Solr

 *You should try to generate heap dumps and analyze the heap using a tool
 like the Eclipse Memory Analyzer. Maybe it helps spotting a group of
 objects holding a large amount of memory*

 The tool that I used also allows capturing heap snapshots. Eclipse had a
 lot of pre-requisites. You need to apply some three or five patches before
 you can start using it. My observations with this tool were that some
 HashMaps were taking up a lot of space, although I could not pin it down to
 the exact HashMap. These would either be weblogic's or Solr's. I will
 anyway give Eclipse's a try and see how it goes. Thanks for your input.

 Rahul

 On Wed, Aug 12, 2009 at 2:15 PM, Gunnar Wagenknecht
 gun...@wagenknecht.orgwrote:

  Rahul R schrieb:
   I tried using a profiling tool - Yourkit. The trial version was free
 for
  15
   days. But I couldn't find anything of significance.
 
  You should try to generate heap dumps and analyze the heap using a tool
  like the Eclipse Memory Analyzer. Maybe it helps spotting a group of
  objects holding a large amount of memory.
 
  -Gunnar
 
  --
  Gunnar Wagenknecht
  gun...@wagenknecht.org
  http://wagenknecht.org/
 
 



 
 

-- 
View this message in context: 
http://www.nabble.com/JVM-Heap-utilization---Memory-leaks-with-Solr-tp24802380p25017767.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: JVM Heap utilization & Memory leaks with Solr

2009-08-17 Thread Funtick

BTW, you should really prefer JRockit which really rocks!!!

Mission Control has the necessary tooling; and JRockit produces a _nice_
exception stacktrace (explaining almost everything) even in case of OOM,
which the Sun JVM still fails to produce.


SolrServlet still catches Throwable:

} catch (Throwable e) {
  SolrException.log(log,e);
  sendErr(500, SolrException.toStr(e), request, response);
} finally {





Rahul R wrote:
 
 Otis,
 Thank you for your response. I know there are a few variables here but the
 difference in memory utilization with and without shards somehow leads me
 to
 believe that the leak could be within Solr.
 
 I tried using a profiling tool - Yourkit. The trial version was free for
 15
 days. But I couldn't find anything of significance.
 
 Regards
 Rahul
 
 
 On Tue, Aug 4, 2009 at 7:35 PM, Otis Gospodnetic
 otis_gospodne...@yahoo.com
 wrote:
 
 Hi Rahul,

 A) There are no known (to me) memory leaks.
 I think there are too many variables for a person to tell you what
 exactly
 is happening, plus you are dealing with the JVM here. :)

 Try jmap -histo:live PID-HERE | less and see what's using your memory.

 Otis
 --
 Sematext is hiring -- http://sematext.com/about/jobs.html?mls
 Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



 - Original Message 
  From: Rahul R rahul.s...@gmail.com
  To: solr-user@lucene.apache.org
  Sent: Tuesday, August 4, 2009 1:09:06 AM
   Subject: JVM Heap utilization & Memory leaks with Solr
 
  I am trying to track memory utilization with my Application that uses
 Solr.
  Details of the setup :
  -3rd party Software : Solaris 10, Weblogic 10, jdk_150_14, Solr 1.3.0
  - Hardware : 12 CPU, 24 GB RAM
 
  For testing during PSR I am using a smaller subset of the actual data
 that I
  want to work with. Details of this smaller sub-set :
  - 5 million records, 4.5 GB index size
 
  Observations during PSR:
  A) I have allocated 3.2 GB for the JVM(s) that I used. After all users
  logout and doing a force GC, only 60 % of the heap is reclaimed. As
 part
 of
  the logout process I am invalidating the HttpSession and doing a
 close()
 on
  CoreContainer. From my application's side, I don't believe I am holding
 on
  to any resource. I wanted to know if there are known issues surrounding
  memory leaks with Solr ?
  B) To further test this, I tried deploying with shards. 3.2 GB was
 allocated
  to each JVM. All JVMs had 96 % free heap space after start up. I got
 varying
  results with this.
  Case 1 : Used 6 weblogic domains. My application was deployed on 1
 domain.
  I split the 5 million index into 5 parts of 1 million each and used
 them
 as
  shards. After multiple users used the system and doing a force GC,
 around
 94
  - 96 % of heap was reclaimed in all the JVMs.
  Case 2: Used 2 weblogic domains. My application was deployed on 1
 domain.
 On
  the other, I deployed the entire 5 million part index as one shard.
 After
  multiple users used the system and doing a force GC, around 76 % of the
 heap
  was reclaimed in the shard JVM. And 96 % was reclaimed in the JVM where
 my
  application was running. This result further convinces me that my
  application can be absolved of holding on to memory resources.
 
  I am not sure how to interpret these results ? For searching, I am
 using
  Without Shards : EmbeddedSolrServer
  With Shards :CommonsHttpSolrServer
  In terms of Solr objects this is what differs in my code between normal
  search and shards search (distributed search)
 
  After looking at Case 1, I thought that the CommonsHttpSolrServer was
 more
  memory efficient but Case 2 proved me wrong. Or could there still be
 memory
  leaks in my application ? Any thoughts, suggestions would be welcome.
 
  Regards
  Rahul


 
 

-- 
View this message in context: 
http://www.nabble.com/JVM-Heap-utilization---Memory-leaks-with-Solr-tp24802380p25018165.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: JVM Heap utilization & Memory leaks with Solr

2009-08-16 Thread Rahul R
My primary issue is not an Out of Memory error at run time. It is memory leaks:
heap space not being released even after doing a force GC. So after some time,
as progressively more heap gets utilized, I start running out of memory.
The verdict, however, seems unanimous that there are no known memory leak
issues within Solr. I am still looking at my application to analyse the
problem. Thank you.

On Thu, Aug 13, 2009 at 10:58 PM, Fuad Efendi f...@efendi.ca wrote:

 Most OutOfMemoryExceptions (if not 100%) happening with SOLR are because of
 http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/FieldCache.html
 - it is used internally in Lucene to cache Field values and document IDs.

 My very long-term observation: SOLR can run without any problems for a few
 days/months, and an unpredictable OOM happens just because someone tried a
 sorted search, which will populate an array with the IDs of ALL documents in
 the index.

 The only solution: calculate exactly the amount of RAM needed for FieldCache...
 For instance, for 100,000,000 documents a single instance of FieldCache may
 require 8*100,000,000 bytes (8 bytes per document ID?), which is almost 1Gb
 (at least!)


 I didn't notice any memory leaks after I started to use 16Gb of RAM for the
 SOLR instance (almost a year without any restart!)




 -Original Message-
 From: Rahul R [mailto:rahul.s...@gmail.com]
 Sent: August-13-09 1:25 AM
 To: solr-user@lucene.apache.org
  Subject: Re: JVM Heap utilization & Memory leaks with Solr

 *You should try to generate heap dumps and analyze the heap using a tool
 like the Eclipse Memory Analyzer. Maybe it helps spotting a group of
 objects holding a large amount of memory*

 The tool that I used also allows capturing heap snapshots. Eclipse had a
 lot of pre-requisites. You need to apply some three or five patches before
 you can start using it. My observations with this tool were that some
 HashMaps were taking up a lot of space, although I could not pin it down to
 the exact HashMap. These would either be weblogic's or Solr's. I will
 anyway give Eclipse's a try and see how it goes. Thanks for your input.

 Rahul

 On Wed, Aug 12, 2009 at 2:15 PM, Gunnar Wagenknecht
 gun...@wagenknecht.orgwrote:

  Rahul R schrieb:
   I tried using a profiling tool - Yourkit. The trial version was free
 for
  15
   days. But I couldn't find anything of significance.
 
  You should try to generate heap dumps and analyze the heap using a tool
  like the Eclipse Memory Analyzer. Maybe it helps spotting a group of
  objects holding a large amount of memory.
 
  -Gunnar
 
  --
  Gunnar Wagenknecht
  gun...@wagenknecht.org
  http://wagenknecht.org/
 
 





RE: JVM Heap utilization & Memory leaks with Solr

2009-08-13 Thread Fuad Efendi
Most OutOfMemoryExceptions (if not 100%) happening with SOLR are because of
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/FieldCache.html
- it is used internally in Lucene to cache Field values and document IDs.

My very long-term observation: SOLR can run without any problems for a few
days/months, and an unpredictable OOM happens just because someone tried a
sorted search, which will populate an array with the IDs of ALL documents in
the index.

The only solution: calculate exactly the amount of RAM needed for FieldCache...
For instance, for 100,000,000 documents a single instance of FieldCache may
require 8*100,000,000 bytes (8 bytes per document ID?), which is almost 1Gb
(at least!)


I didn't notice any memory leaks after I started to use 16Gb of RAM for the
SOLR instance (almost a year without any restart!)
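
As a minimal illustration of the trigger described above, a single sorted query
is enough to make Lucene fill a FieldCache entry with one slot per document in
the index. This is only a sketch using the Solr 1.3 SolrJ classes named
elsewhere in the thread; the URL and sort field are made up:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class SortedQueryTrigger {
    public static void main(String[] args) throws Exception {
        // Hypothetical Solr URL and sort field, for illustration only.
        CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        SolrQuery q = new SolrQuery("*:*");
        q.setRows(10);
        // Sorting on a non-tokenized single-valued field populates a FieldCache
        // array sized to the whole index, regardless of how few rows are returned.
        q.addSortField("price", SolrQuery.ORDER.asc);
        server.query(q);
    }
}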




-Original Message-
From: Rahul R [mailto:rahul.s...@gmail.com] 
Sent: August-13-09 1:25 AM
To: solr-user@lucene.apache.org
Subject: Re: JVM Heap utilization & Memory leaks with Solr

*You should try to generate heap dumps and analyze the heap using a tool
like the Eclipse Memory Analyzer. Maybe it helps spotting a group of
objects holding a large amount of memory*

The tool that I used also allows capturing heap snapshots. Eclipse had a
lot of pre-requisites. You need to apply some three or five patches before
you can start using it. My observations with this tool were that some
HashMaps were taking up a lot of space, although I could not pin it down to
the exact HashMap. These would either be weblogic's or Solr's. I will
anyway give Eclipse's a try and see how it goes. Thanks for your input.

Rahul

On Wed, Aug 12, 2009 at 2:15 PM, Gunnar Wagenknecht
gun...@wagenknecht.orgwrote:

 Rahul R schrieb:
  I tried using a profiling tool - Yourkit. The trial version was free for
 15
  days. But I couldn't find anything of significance.

 You should try to generate heap dumps and analyze the heap using a tool
 like the Eclipse Memory Analyzer. Maybe it helps spotting a group of
 objects holding a large amount of memory.

 -Gunnar

 --
 Gunnar Wagenknecht
 gun...@wagenknecht.org
 http://wagenknecht.org/






Re: JVM Heap utilization & Memory leaks with Solr

2009-08-12 Thread Gunnar Wagenknecht
Rahul R schrieb:
 I tried using a profiling tool - Yourkit. The trial version was free for 15
 days. But I couldn't find anything of significance.

You should try to generate heap dumps and analyze the heap using a tool
like the Eclipse Memory Analyzer. Maybe it helps spotting a group of
objects holding a large amount of memory.

-Gunnar

-- 
Gunnar Wagenknecht
gun...@wagenknecht.org
http://wagenknecht.org/
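
For reference, one way to capture such a dump programmatically is the HotSpot
diagnostic MBean, which writes an .hprof file the Eclipse Memory Analyzer can
open. This is only a sketch: it assumes a Sun/Oracle Java 6+ JVM (on the
jdk 1.5.0_14 setup in this thread, jmap would be the usual route instead), and
the output path is made up.

import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;

public class HeapDumper {
    public static void main(String[] args) throws Exception {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        HotSpotDiagnosticMXBean diag = ManagementFactory.newPlatformMXBeanProxy(
                server, "com.sun.management:type=HotSpotDiagnostic", HotSpotDiagnosticMXBean.class);
        // live=true runs a GC first, so the dump only contains reachable objects,
        // which is what you want when hunting a leak that survives a forced GC.
        diag.dumpHeap("/tmp/solr-heap.hprof", true);
    }
}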



Re: JVM Heap utilization & Memory leaks with Solr

2009-08-12 Thread Rahul R
*You should try to generate heap dumps and analyze the heap using a tool
like the Eclipse Memory Analyzer. Maybe it helps spotting a group of
objects holding a large amount of memory*

The tool that I used also allows capturing heap snapshots. Eclipse had a
lot of pre-requisites. You need to apply some three or five patches before
you can start using it. My observations with this tool were that some
HashMaps were taking up a lot of space, although I could not pin it down to
the exact HashMap. These would either be weblogic's or Solr's. I will
anyway give Eclipse's a try and see how it goes. Thanks for your input.

Rahul

On Wed, Aug 12, 2009 at 2:15 PM, Gunnar Wagenknecht
gun...@wagenknecht.orgwrote:

 Rahul R schrieb:
  I tried using a profiling tool - Yourkit. The trial version was free for
 15
  days. But I couldn't find anything of significance.

 You should try to generate heap dumps and analyze the heap using a tool
 like the Eclipse Memory Analyzer. Maybe it helps spotting a group of
 objects holding a large amount of memory.

 -Gunnar

 --
 Gunnar Wagenknecht
 gun...@wagenknecht.org
 http://wagenknecht.org/




Re: JVM Heap utilization & Memory leaks with Solr

2009-08-04 Thread Otis Gospodnetic
Hi Rahul,

A) There are no known (to me) memory leaks.
I think there are too many variables for a person to tell you what exactly is 
happening, plus you are dealing with the JVM here. :)

Try jmap -histo:live PID-HERE | less and see what's using your memory.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



- Original Message 
 From: Rahul R rahul.s...@gmail.com
 To: solr-user@lucene.apache.org
 Sent: Tuesday, August 4, 2009 1:09:06 AM
 Subject: JVM Heap utilization & Memory leaks with Solr
 
 I am trying to track memory utilization with my Application that uses Solr.
 Details of the setup :
 -3rd party Software : Solaris 10, Weblogic 10, jdk_150_14, Solr 1.3.0
 - Hardware : 12 CPU, 24 GB RAM
 
 For testing during PSR I am using a smaller subset of the actual data that I
 want to work with. Details of this smaller sub-set :
 - 5 million records, 4.5 GB index size
 
 Observations during PSR:
 A) I have allocated 3.2 GB for the JVM(s) that I used. After all users
 logout and doing a force GC, only 60 % of the heap is reclaimed. As part of
 the logout process I am invalidating the HttpSession and doing a close() on
 CoreContainer. From my application's side, I don't believe I am holding on
 to any resource. I wanted to know if there are known issues surrounding
 memory leaks with Solr ?
 B) To further test this, I tried deploying with shards. 3.2 GB was allocated
 to each JVM. All JVMs had 96 % free heap space after start up. I got varying
 results with this.
 Case 1 : Used 6 weblogic domains. My application was deployed on 1 domain.
 I split the 5 million index into 5 parts of 1 million each and used them as
 shards. After multiple users used the system and doing a force GC, around 94
 - 96 % of heap was reclaimed in all the JVMs.
 Case 2: Used 2 weblogic domains. My application was deployed on 1 domain. On
 the other, I deployed the entire 5 million part index as one shard. After
 multiple users used the system and doing a force GC, around 76 % of the heap
 was reclaimed in the shard JVM. And 96 % was reclaimed in the JVM where my
 application was running. This result further convinces me that my
 application can be absolved of holding on to memory resources.
 
 I am not sure how to interpret these results ? For searching, I am using
 Without Shards : EmbeddedSolrServer
 With Shards :CommonsHttpSolrServer
 In terms of Solr objects this is what differs in my code between normal
 search and shards search (distributed search)
 
 After looking at Case 1, I thought that the CommonsHttpSolrServer was more
 memory efficient but Case 2 proved me wrong. Or could there still be memory
 leaks in my application ? Any thoughts, suggestions would be welcome.
 
 Regards
 Rahul



Re: JVM Heap utilization & Memory leaks with Solr

2009-08-04 Thread Rahul R
Otis,
Thank you for your response. I know there are a few variables here but the
difference in memory utilization with and without shards somehow leads me to
believe that the leak could be within Solr.

I tried using a profiling tool - Yourkit. The trial version was free for 15
days. But I couldn't find anything of significance.

Regards
Rahul


On Tue, Aug 4, 2009 at 7:35 PM, Otis Gospodnetic otis_gospodne...@yahoo.com
 wrote:

 Hi Rahul,

 A) There are no known (to me) memory leaks.
 I think there are too many variables for a person to tell you what exactly
 is happening, plus you are dealing with the JVM here. :)

 Try jmap -histo:live PID-HERE | less and see what's using your memory.

 Otis
 --
 Sematext is hiring -- http://sematext.com/about/jobs.html?mls
 Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR



 - Original Message 
  From: Rahul R rahul.s...@gmail.com
  To: solr-user@lucene.apache.org
  Sent: Tuesday, August 4, 2009 1:09:06 AM
   Subject: JVM Heap utilization & Memory leaks with Solr
 
  I am trying to track memory utilization with my Application that uses
 Solr.
  Details of the setup :
  -3rd party Software : Solaris 10, Weblogic 10, jdk_150_14, Solr 1.3.0
  - Hardware : 12 CPU, 24 GB RAM
 
  For testing during PSR I am using a smaller subset of the actual data
 that I
  want to work with. Details of this smaller sub-set :
  - 5 million records, 4.5 GB index size
 
  Observations during PSR:
  A) I have allocated 3.2 GB for the JVM(s) that I used. After all users
  logout and doing a force GC, only 60 % of the heap is reclaimed. As part
 of
  the logout process I am invalidating the HttpSession and doing a close()
 on
  CoreContainer. From my application's side, I don't believe I am holding
 on
  to any resource. I wanted to know if there are known issues surrounding
  memory leaks with Solr ?
  B) To further test this, I tried deploying with shards. 3.2 GB was
 allocated
  to each JVM. All JVMs had 96 % free heap space after start up. I got
 varying
  results with this.
  Case 1 : Used 6 weblogic domains. My application was deployed on 1
 domain.
  I split the 5 million index into 5 parts of 1 million each and used them
 as
  shards. After multiple users used the system and doing a force GC, around
 94
  - 96 % of heap was reclaimed in all the JVMs.
  Case 2: Used 2 weblogic domains. My application was deployed on 1 domain.
 On
  the other, I deployed the entire 5 million part index as one shard. After
  multiple users used the system and doing a force GC, around 76 % of the
 heap
  was reclaimed in the shard JVM. And 96 % was reclaimed in the JVM where
 my
  application was running. This result further convinces me that my
  application can be absolved of holding on to memory resources.
 
  I am not sure how to interpret these results ? For searching, I am using
  Without Shards : EmbeddedSolrServer
  With Shards :CommonsHttpSolrServer
  In terms of Solr objects this is what differs in my code between normal
  search and shards search (distributed search)
 
  After looking at Case 1, I thought that the CommonsHttpSolrServer was
 more
  memory efficient but Case 2 proved me wrong. Or could there still be
 memory
  leaks in my application ? Any thoughts, suggestions would be welcome.
 
  Regards
  Rahul




JVM Heap utilization & Memory leaks with Solr

2009-08-03 Thread Rahul R
I am trying to track memory utilization with my Application that uses Solr.
Details of the setup :
 -3rd party Software : Solaris 10, Weblogic 10, jdk_150_14, Solr 1.3.0
- Hardware : 12 CPU, 24 GB RAM

For testing during PSR I am using a smaller subset of the actual data that I
want to work with. Details of this smaller sub-set :
- 5 million records, 4.5 GB index size

Observations during PSR:
A) I have allocated 3.2 GB for the JVM(s) that I used. After all users
logout and doing a force GC, only 60 % of the heap is reclaimed. As part of
the logout process I am invalidating the HttpSession and doing a close() on
CoreContainer. From my application's side, I don't believe I am holding on
to any resource. I wanted to know if there are known issues surrounding
memory leaks with Solr ?
B) To further test this, I tried deploying with shards. 3.2 GB was allocated
to each JVM. All JVMs had 96 % free heap space after start up. I got varying
results with this.
Case 1 : Used 6 weblogic domains. My application was deployed on 1 domain.
I split the 5 million index into 5 parts of 1 million each and used them as
shards. After multiple users used the system and doing a force GC, around 94
- 96 % of heap was reclaimed in all the JVMs.
Case 2: Used 2 weblogic domains. My application was deployed on 1 domain. On
the other, I deployed the entire 5 million part index as one shard. After
multiple users used the system and doing a force GC, around 76 % of the heap
was reclaimed in the shard JVM. And 96 % was reclaimed in the JVM where my
application was running. This result further convinces me that my
application can be absolved of holding on to memory resources.

I am not sure how to interpret these results ? For searching, I am using
Without Shards : EmbeddedSolrServer
With Shards :CommonsHttpSolrServer
In terms of Solr objects this is what differs in my code between normal
search and shards search (distributed search)

After looking at Case 1, I thought that the CommonsHttpSolrServer was more
memory efficient but Case 2 proved me wrong. Or could there still be memory
leaks in my application ? Any thoughts, suggestions would be welcome.

Regards
Rahul
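
A rough sketch of the two search paths contrasted above. Class names follow the
Solr 1.3 SolrJ API named in the message; the URLs, core name and shard list are
hypothetical, and the real application code is not shown in the thread:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.core.CoreContainer;

public class SearchPaths {

    // Without shards: in-process search against a core held by a CoreContainer.
    static QueryResponse embeddedSearch(CoreContainer cores) throws Exception {
        SolrServer server = new EmbeddedSolrServer(cores, "core0"); // hypothetical core name
        return server.query(new SolrQuery("*:*"));
    }

    // With shards: HTTP search plus a shards parameter for distributed search.
    static QueryResponse shardedSearch() throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://host1:7001/solr"); // hypothetical URL
        SolrQuery q = new SolrQuery("*:*");
        q.set("shards", "host1:7001/solr,host2:7001/solr"); // hypothetical shard list
        return server.query(q);
    }
}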