Re: JVM Heap utilization Memory leaks with Solr
All these 3700 fields are single-valued, non-boolean fields.

Thanks and regards,
Rahul

On Wed, Aug 19, 2009 at 8:33 PM, Fuad Efendi f...@efendi.ca wrote:
> [...]
Re: JVM Heap utilization Memory leaks with Solr
Fuad,

We have around 5 million documents and around 3700 fields. All documents will not have values for all the fields. JRockit is not approved for use within my organization, but thanks for the info anyway.

Regards,
Rahul

On Tue, Aug 18, 2009 at 9:41 AM, Funtick f...@efendi.ca wrote:
> [...]
RE: JVM Heap utilization Memory leaks with Solr
Hi Rahul,

JRockit could be used at least in a test environment to monitor the JVM (and to troubleshoot SOLR; it is licensed for free for developers). There is even an Eclipse plugin now, and it is licensed by Oracle (BEA). But, of course, in large companies the test environment is in the hands of testers :)

Note, however: 3700 fields will create (over time) 3700 arrays, each of size 5,000,000, even if most fields are empty for most documents. This applies to non-tokenized, single-valued, non-boolean fields only; it is Lucene internals (the FieldCache), and it won't be GC-collected after users log off. Prefer a dedicated box for SOLR.

-Fuad

-----Original Message-----
From: Rahul R [mailto:rahul.s...@gmail.com]
Sent: August-19-09 6:19 AM
To: solr-user@lucene.apache.org
Subject: Re: JVM Heap utilization Memory leaks with Solr

> [...]
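Fuad's warning about 3700 arrays of 5,000,000 entries can be quantified with back-of-the-envelope arithmetic. A minimal sketch, assuming (as a worst case) one 8-byte cache entry per document per field; the actual per-field cost in Lucene depends on the field type, so treat the numbers as an upper bound:

```java
public class FieldCacheEstimate {
    public static void main(String[] args) {
        long numDocs = 5_000_000L;   // documents in the index
        long numFields = 3700L;      // single-valued, non-boolean fields
        long bytesPerEntry = 8L;     // assumed worst-case entry size

        // One array per cached field, one entry per document,
        // allocated even for fields that are empty in most documents
        long perFieldBytes = numDocs * bytesPerEntry;
        long worstCaseBytes = perFieldBytes * numFields;

        System.out.println("Per-field array: " + perFieldBytes / (1024 * 1024) + " MB");
        System.out.println("All 3700 fields: " + worstCaseBytes / (1024L * 1024 * 1024) + " GB");
    }
}
```

Even granting that real entries may be 4-byte ints and that only fields actually used for sorting or faceting get cached, the order of magnitude (tens of MB per field, over a hundred GB for all fields) explains why a 3.2 GB heap cannot survive caching across thousands of fields.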
Re: JVM Heap utilization Memory leaks with Solr
Can you tell me please how many non-tokenized, single-valued fields your schema uses, and how many documents?

Thanks,
Fuad

Rahul R wrote:
> [...]

--
View this message in context: http://www.nabble.com/JVM-Heap-utilization---Memory-leaks-with-Solr-tp24802380p25017767.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: JVM Heap utilization Memory leaks with Solr
BTW, you should really prefer JRockit, which really rocks! Mission Control has the necessary tooling, and JRockit produces a _nice_ exception stacktrace (explaining almost everything) even in case of an OOM, which the Sun JVM still fails to produce.

SolrServlet still catches Throwable:

    } catch (Throwable e) {
        SolrException.log(log, e);
        sendErr(500, SolrException.toStr(e), request, response);
    } finally {

Rahul R wrote:
> [...]

--
View this message in context: http://www.nabble.com/JVM-Heap-utilization---Memory-leaks-with-Solr-tp24802380p25018165.html
Sent from the Solr - User mailing list archive at Nabble.com.
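The point of the SolrServlet snippet above is that catch (Throwable) also traps java.lang.Error subclasses such as OutOfMemoryError, so the servlet can still log a stacktrace. A minimal standalone sketch of that behavior; the oversized array request is just an assumption chosen to fail allocation on any normal heap:

```java
public class CatchOom {
    static String tryHugeAllocation() {
        try {
            // Requesting Integer.MAX_VALUE longs (~16 GB) fails on any normal JVM
            long[] huge = new long[Integer.MAX_VALUE];
            return "allocated " + huge.length;
        } catch (Throwable t) {
            // OutOfMemoryError is an Error, not an Exception,
            // but a Throwable handler still catches it
            return "caught: " + t.getClass().getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryHugeAllocation());
    }
}
```

A plain catch (Exception e) handler would let the OutOfMemoryError propagate, which is why SolrServlet's broader catch is what makes the 500 response and logged stacktrace possible.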
Re: JVM Heap utilization Memory leaks with Solr
My primary issue is not an Out of Memory error at run time. It is memory leaks: heap space not being released even after doing a force GC. So over time, as progressively more heap gets utilized, I start running out of memory. The verdict however seems unanimous that there are no known memory leak issues within Solr, so I am still looking at my application to analyse the problem. Thank you.

On Thu, Aug 13, 2009 at 10:58 PM, Fuad Efendi f...@efendi.ca wrote:
> [...]
RE: JVM Heap utilization Memory leaks with Solr
Most OutOfMemoryExceptions (if not 100%) happening with SOLR are because of
http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/FieldCache.html
which is used internally in Lucene to cache field values by document ID.

My very long-term observation: SOLR can run without any problems for days or months, and then an unpredictable OOM happens just because someone tried a sorted search, which populates an array with the IDs of ALL documents in the index. The only solution: calculate exactly the amount of RAM needed for the FieldCache. For instance, for 100,000,000 documents a single instance of FieldCache may require 8 * 100,000,000 bytes (8 bytes per document ID?), which is almost 1 GB (at least!). I didn't notice any memory leaks after I started to use 16 GB RAM for the SOLR instance (almost a year without any restart!).

-----Original Message-----
From: Rahul R [mailto:rahul.s...@gmail.com]
Sent: August-13-09 1:25 AM
To: solr-user@lucene.apache.org
Subject: Re: JVM Heap utilization Memory leaks with Solr

> [...]
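Fuad's "almost 1 GB" figure is easy to check. A minimal sketch of the arithmetic, assuming (as he does) 8 bytes per document entry; a cache of 4-byte ints would halve this:

```java
public class FieldCacheSizing {
    public static void main(String[] args) {
        long docs = 100_000_000L;   // documents in the index
        long bytesPerDoc = 8L;      // assumed entry size, per Fuad's figure

        long bytes = docs * bytesPerDoc;              // 800,000,000 bytes
        double gib = bytes / (1024.0 * 1024 * 1024);  // ~0.75 GiB

        System.out.printf("One FieldCache array: %d bytes (~%.2f GiB)%n", bytes, gib);
    }
}
```

So one sorted field over 100 million documents costs roughly three quarters of a gibibyte, and each additional sorted or faceted field adds another array of the same size.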
Re: JVM Heap utilization Memory leaks with Solr
Rahul R wrote:
> I tried using a profiling tool - Yourkit. The trial version was free for 15 days. But I couldn't find anything of significance.

You should try to generate heap dumps and analyze the heap using a tool like the Eclipse Memory Analyzer. Maybe it helps spotting a group of objects holding a large amount of memory.

-Gunnar

--
Gunnar Wagenknecht
gun...@wagenknecht.org
http://wagenknecht.org/
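Heap dumps for the Eclipse Memory Analyzer can be produced externally with jmap -dump:live,format=b,file=heap.hprof PID, or from inside the JVM via the HotSpot diagnostic MBean. A minimal sketch of the in-process route; the output path is an assumption, and the com.sun.management API makes this HotSpot-specific:

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDumper {
    public static void dump(String path, boolean liveOnly) throws Exception {
        // Obtain the HotSpot diagnostic MBean from the platform MBean server
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        // liveOnly=true forces a GC first, so only reachable objects land in the dump
        bean.dumpHeap(path, liveOnly);
    }

    public static void main(String[] args) throws Exception {
        dump("heap.hprof", true);
        System.out.println("wrote heap.hprof");
    }
}
```

Dumping with liveOnly=true is the interesting mode for a leak hunt like Rahul's: whatever survives the implied GC is exactly the retained set the Memory Analyzer should explain.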
Re: JVM Heap utilization Memory leaks with Solr
> You should try to generate heap dumps and analyze the heap using a tool like the Eclipse Memory Analyzer. Maybe it helps spotting a group of objects holding a large amount of memory.

The tool that I used also allows capturing heap snapshots. Eclipse had a lot of prerequisites; you need to apply some three or five patches before you can start using it. My observation with this tool was that some HashMaps were taking up a lot of space, although I could not pin it down to the exact HashMap. These would be either weblogic's or Solr's. I will anyway give Eclipse's a try and see how it goes. Thanks for your input.

Rahul

On Wed, Aug 12, 2009 at 2:15 PM, Gunnar Wagenknecht gun...@wagenknecht.org wrote:
> [...]
Re: JVM Heap utilization Memory leaks with Solr
Hi Rahul,

A) There are no known (to me) memory leaks. I think there are too many variables for a person to tell you what exactly is happening, plus you are dealing with the JVM here. :)

Try jmap -histo:live PID-HERE | less and see what's using your memory.

Otis
--
Sematext is hiring -- http://sematext.com/about/jobs.html?mls
Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR

----- Original Message ----
From: Rahul R rahul.s...@gmail.com
To: solr-user@lucene.apache.org
Sent: Tuesday, August 4, 2009 1:09:06 AM
Subject: JVM Heap utilization Memory leaks with Solr

> [...]
Re: JVM Heap utilization Memory leaks with Solr
Otis,

Thank you for your response. I know there are a few variables here, but the difference in memory utilization with and without shards somehow leads me to believe that the leak could be within Solr. I tried using a profiling tool - Yourkit. The trial version was free for 15 days, but I couldn't find anything of significance.

Regards,
Rahul

On Tue, Aug 4, 2009 at 7:35 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote:
> [...]
JVM Heap utilization Memory leaks with Solr
I am trying to track memory utilization with my application that uses Solr. Details of the setup:

- 3rd party software: Solaris 10, Weblogic 10, jdk_150_14, Solr 1.3.0
- Hardware: 12 CPU, 24 GB RAM

For testing during PSR I am using a smaller subset of the actual data that I want to work with. Details of this smaller subset:

- 5 million records, 4.5 GB index size

Observations during PSR:

A) I have allocated 3.2 GB for the JVM(s) that I used. After all users log out and a forced GC, only 60% of the heap is reclaimed. As part of the logout process I am invalidating the HttpSession and doing a close() on CoreContainer. From my application's side, I don't believe I am holding on to any resource. I wanted to know if there are known issues surrounding memory leaks with Solr?

B) To further test this, I tried deploying with shards. 3.2 GB was allocated to each JVM. All JVMs had 96% free heap space after start up. I got varying results with this.

Case 1: Used 6 weblogic domains. My application was deployed on 1 domain. I split the 5-million-document index into 5 parts of 1 million each and used them as shards. After multiple users used the system and a forced GC, around 94-96% of heap was reclaimed in all the JVMs.

Case 2: Used 2 weblogic domains. My application was deployed on 1 domain. On the other, I deployed the entire 5-million-document index as one shard. After multiple users used the system and a forced GC, around 76% of the heap was reclaimed in the shard JVM, and 96% was reclaimed in the JVM where my application was running. This result further convinces me that my application can be absolved of holding on to memory resources. I am not sure how to interpret these results.

For searching, I am using:
- Without shards: EmbeddedSolrServer
- With shards: CommonsHttpSolrServer

In terms of Solr objects, this is what differs in my code between normal search and shards search (distributed search). After looking at Case 1, I thought that the CommonsHttpSolrServer was more memory efficient, but Case 2 proved me wrong. Or could there still be memory leaks in my application? Any thoughts or suggestions would be welcome.

Regards
Rahul
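The reclamation percentages in the post above are easier to compare as absolute retained heap after the forced GC. A minimal sketch of the arithmetic, with all figures taken from the post:

```java
public class RetainedHeap {
    static double retainedGb(double heapGb, double reclaimedPercent) {
        // Heap still occupied after forced GC = heap size * (1 - reclaimed fraction)
        return heapGb * (1.0 - reclaimedPercent / 100.0);
    }

    public static void main(String[] args) {
        double heap = 3.2; // GB allocated per JVM
        System.out.printf("No shards, 60%% reclaimed: %.2f GB retained%n", retainedGb(heap, 60));
        System.out.printf("Case 1, ~95%% reclaimed:  %.2f GB retained%n", retainedGb(heap, 95));
        System.out.printf("Case 2 shard JVM, 76%%:   %.2f GB retained%n", retainedGb(heap, 76));
    }
}
```

So roughly 1.28 GB stays resident without shards versus about 0.16 GB per JVM in Case 1 and 0.77 GB in the Case 2 shard JVM. That pattern, more retention in whichever JVM searches the larger index, would be consistent with the FieldCache arrays discussed elsewhere in the thread surviving session logout, rather than with a leak in the application tier.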