On 7/9/2014 6:02 AM, yuvaraj ponnuswamy wrote:
> Hi,
>
> I am getting the OutOfMemory error "java.lang.OutOfMemoryError: Java heap 
> space" often in production because a particular TreeMap is taking up too 
> much memory in the JVM.
>
> When I looked into the config files, I have an entity called 
> UserQryDocument where I am fetching data from certain tables.
> It has a sub-entity called "UserLocation" where I am using the 
> CachedSqlEntityProcessor to get the fields from the cache. It holds 
> about 200,000 records in total:
> processor="CachedSqlEntityProcessor" cacheKey="user_pin" 
> cacheLookup="UserQueryDocumentNonAuthor.DocKey">
>
> I have several other entities like this, where I also use the 
> CachedSqlEntityProcessor in the sub-entity.
>
> When I looked into the heap dump (java_pid57.hprof), I can see that a 
> TreeMap is causing the problem, but I am not able to find which entity 
> is responsible. I am using the IBM Heap Analyzer to look into the dump.
>
> Can you please let me know whether there is another way to find out 
> which entity is causing this issue, or another tool to analyze and 
> debug the out-of-memory error down to the exact entity?
>
> I have attached the entity definitions from dataconfig.xml and a heap 
> analyzer screenshot.

JDBC drivers have a habit of loading the entire result set into RAM. 
Also, you are using the cached processor, which will effectively do
the same thing.  With millions of DB rows, this is going to require a
LOT of heap memory.  You'll want to change your JDBC connection so that
it doesn't load the entire result set, and you may also need to turn off
entity caching in Solr.  You didn't mention which database you're using. 
Here's how to fix MySQL and SQL Server so they don't load the entire
result set; the requirements for another database are likely to be
different:

https://wiki.apache.org/solr/DataImportHandlerFaq#I.27m_using_DataImportHandler_with_a_MySQL_database._My_table_is_huge_and_DataImportHandler_is_going_out_of_memory._Why_does_DataImportHandler_bring_everything_to_memory.3F
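
For MySQL, the fix described there is to set batchSize="-1" on the DIH
dataSource, which DIH turns into a streaming fetch size on the JDBC
driver instead of buffering the whole result set.  A minimal sketch
(the connection details here are placeholders):

  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://dbhost/dbname"
              batchSize="-1"
              user="dbuser"
              password="dbpass"/>

For SQL Server, the same FAQ suggests adding
responseBuffering=adaptive;selectMethod=cursor to the connection URL
instead.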

The best way to make DIH perform well is to use JOIN so that you can get
all your data with one entity and one SELECT query.  Let the database do
all the heavy lifting instead of having Solr send millions of queries. 
GROUP_CONCAT on the SQL side and a RegexTransformer 'splitBy' can
sometimes be used to get multiple values into a single field.
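
As a sketch of that approach (the table and column names here are
invented, and GROUP_CONCAT is MySQL-specific), a single flattened
entity might look like this:

  <entity name="UserQryDocument"
          transformer="RegexTransformer"
          query="SELECT u.user_pin, u.name,
                        GROUP_CONCAT(l.location SEPARATOR '|') AS location
                 FROM users u
                 LEFT JOIN user_location l ON l.user_pin = u.user_pin
                 GROUP BY u.user_pin">
    <!-- RegexTransformer splits the concatenated string back into a
         multivalued field; splitBy takes a regex, so the pipe
         character is escaped -->
    <field column="location" splitBy="\|"/>
  </entity>

The database does one pass over the joined tables, and DIH never has
to cache anything or issue a sub-query per row.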

Thanks,
Shawn
