[jira] [Updated] (OFBIZ-10592) OutOfMemory and stucked JobPoller issue

Rohit Hukkeri (JIRA) Thu, 21 Feb 2019 22:32:19 -0800


     [ 
https://issues.apache.org/jira/browse/OFBIZ-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Rohit Hukkeri updated OFBIZ-10592:
----------------------------------
    Description: 
 
This installation consists of two instances of OFBiz (v13.07.03), served via 
an Apache Tomcat webserver, along with a load balancer.
The database server is MariaDB.
 
We had the first problems about 3 weeks ago, when front1 (OFBiz instance 1) 
suddenly stopped serving web requests; front2, by contrast, kept working 
correctly.
 
We checked the log files and saw that async services were failing; the 
failure was accompanied by this error line:
 
*_Thread "AsyncAppender-async" java.lang.OutOfMemoryError: GC overhead limit 
exceeded_*
 
We analyzed the situation with our system specialists, who told us that the 
application was putting heavy stress on machine resources (CPU always at or 
near 100%, RAM usage rapidly increasing) until the JVM ran out of memory.
This high resource consumption occurred only when the ofbiz1 instance was 
started with the JobPoller enabled; with the JobPoller disabled, OFBiz ran 
with low resource usage. 
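 
A minimal sketch of how heap occupancy and cumulative GC time could be sampled 
from inside the JVM, to correlate them with JobPoller activity. HotSpot raises 
"GC overhead limit exceeded" when, by default, about 98% of the time is spent 
in garbage collection while less than 2% of the heap is recovered. The class 
name HeapWatch and its method names are hypothetical; only the standard 
java.lang.management API is used.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of OFBiz.
class HeapWatch {
    /** Used heap as a fraction of the maximum heap, or -1 if no maximum is defined. */
    static double heapUsedFraction() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        return max <= 0 ? -1.0 : (double) heap.getUsed() / max;
    }

    /** Approximate time in milliseconds spent in GC so far, summed over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.printf("heap used: %.1f%%, GC time so far: %d ms%n",
                heapUsedFraction() * 100, totalGcMillis());
    }
}
```

Logging these two values periodically, once with the JobPoller enabled and once 
with it disabled, would show whether the heap fills steadily or in bursts.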
 
We then turned to the db, first of all to check its size; the result was 
disconcerting: 45 GB, mostly in four tables: SERVER_HIT (about 18 GB), 
VISIT (about 15 GB), ENTITY_SYNC_REMOVE (about 8 GB), VISITOR (about 2 GB).
All the other tables were on the order of a few MB each.
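 
A size check like the one above could be sketched over JDBC as follows. This 
is an illustration, not the procedure that was actually used: the class 
TableSizes and the method largestTables are hypothetical, the schema name is 
supplied by the caller, and the figures in information_schema.tables are 
estimates for InnoDB tables.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of OFBiz.
class TableSizes {
    // Data plus index size per table, in GB, largest first.
    static final String QUERY =
            "SELECT table_name, (data_length + index_length) / 1073741824 AS size_gb "
          + "FROM information_schema.tables WHERE table_schema = ? "
          + "ORDER BY (data_length + index_length) DESC LIMIT 10";

    /** Returns the ten largest tables of the given schema, keyed by table name. */
    static Map<String, Double> largestTables(Connection conn, String schema)
            throws SQLException {
        Map<String, Double> sizes = new LinkedHashMap<>();
        try (PreparedStatement ps = conn.prepareStatement(QUERY)) {
            ps.setString(1, schema);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    sizes.put(rs.getString(1), rs.getDouble(2));
                }
            }
        }
        return sizes;
    }
}
```

The caller would pass an open connection to the MariaDB server and the name of 
the OFBiz schema; with multitenancy enabled, each tenant schema would need to 
be checked separately.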
 
The first thing we did was to clear all those tables, which considerably 
reduced the db size.
After the cleanup, we started ofbiz1 again with the JobPoller component 
enabled; this caused a lot of old scheduled/queued jobs to execute.
Apart from the start-up phase, the resource usage of the machine stabilized 
at normal to low values (CPU 1-10%).
OFBiz seemed to work (web requests were served), but we noticed that the 
JobPoller no longer scheduled or ran jobs. 
The number of jobs in "Pending" state in the JobSandbox entity was small 
(about 20); no Queued, no Failed, no jobs in other states.
In addition, unfortunately, after a few hours the JVM ran out of memory 
again.
 
Our JVM has a maximum heap size of 20 GB (the machine has 32 GB), so I do not 
think it is too small.
Our next step is to set up the application locally against the same 
production db to see what happens.
 
Now that I have explained the situation, I would like to ask, in your 
opinion/experience:
 
Could the JobPoller component be the root (and only) cause of the JVM's 
OutOfMemory error?
 
Could this issue be related to OFBIZ-5710?
 
Could dumping and analyzing the JVM heap help us understand what is filling 
the memory, or would it be a waste of time?
 
Is there something that we did not consider, or missed, during the whole 
process of problem analysis?
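 
On the heap dump question, a minimal sketch of how a dump could be triggered 
programmatically on a HotSpot JVM, for later inspection in a heap analyzer. 
The class name HeapDumper and the output file name are hypothetical. The same 
result can usually be obtained without code, either by starting the JVM with 
-XX:+HeapDumpOnOutOfMemoryError or by running jmap against the process. With 
a 20 GB heap, the dump file can be expected to be of a comparable size.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of OFBiz; requires a HotSpot-based JVM.
class HeapDumper {
    /**
     * Writes a heap dump of the current JVM to the given .hprof file.
     * The target file must not exist yet, otherwise dumpHeap fails.
     */
    static Path dump(Path target, boolean liveObjectsOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(target.toString(), liveObjectsOnly);
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heapdump");
        Path file = dump(dir.resolve("ofbiz-sketch.hprof"), true);
        System.out.println("wrote " + file + " (" + Files.size(file) + " bytes)");
    }
}
```

Passing true for liveObjectsOnly restricts the dump to reachable objects, 
which is normally what matters when looking for whatever retains the memory.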
 
 
Thank you all for your attention and your help; any suggestion or advice 
would be greatly appreciated.
 
Kind regards,
Giulio



> OutOfMemory and stucked JobPoller issue
> ---------------------------------------
>
>                 Key: OFBIZ-10592
>                 URL: https://issues.apache.org/jira/browse/OFBIZ-10592
>             Project: OFBiz
>          Issue Type: Bug
>          Components: ALL APPLICATIONS
>    Affects Versions: Release Branch 13.07
>         Environment: Two-instance OFBiz installation on two machines, 
> connected to an Apache HTTPD instance which acts as a proxy.
> Requests to the OFBiz instances are handled and load balanced.
> OFBiz version : 13.07.03 with Multitenant enabled
> OS: Ubuntu Linux 16.04 LTS
> RDBMS: MariaDB v10.1.24
>            Reporter: Giulio Speri
>            Assignee: Giulio Speri
>            Priority: Critical
>         Attachments: alloc_tree_600k_12102018.png, 
> jvm_ofbiz1_profi_telem.png, jvm_prof_ofbiz1_telem2.png, 
> ofbiz1_jvm_profil_nojobpoller.png, recorder_object_600k_12102018.png, 
> telemetry_ovrl_600k_12102018.png



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
