[
https://issues.apache.org/jira/browse/OODT-383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris A. Mattmann resolved OODT-383.
------------------------------------
Resolution: Won't Fix
- moving to Avro should address this.
> Workflow Manager Client - Add Connection Limit Option
> -----------------------------------------------------
>
> Key: OODT-383
> URL: https://issues.apache.org/jira/browse/OODT-383
> Project: OODT
> Issue Type: New Feature
> Components: workflow manager
> Environment: CentOS 5/6
> Reporter: Cameron Goodale
> Assignee: Cameron Goodale
> Priority: Minor
> Fix For: 0.11
>
> Attachments: modscag-v2-job-runner.py
>
>
> When using the wmgr-client to run thousands of jobs, it is easy to
> overwhelm the XML-RPC connection pool to the workflow manager. I was using a
> simple Python script to submit 10K jobs; the workflow manager couldn't
> handle the jobs quickly enough, and many jobs were dropped as a result.
> One fix I implemented in my Python code was to use lsof to check the number
> of ESTABLISHED connections to the workflow manager. If the workflow manager
> had more than, say, 30 connections, my program would go to sleep and try
> submitting jobs later.
> I would like to enhance the wmgr-client shell script with an option to limit
> the number of connections to the wmgr. By default, this limit would not be set.
> If the connection limit is reached, the wmgr-client would sleep for 10
> seconds, and re-check the number of connections. This loop would continue
> until the number of connections dropped below the specified limit. Once the
> connection count drops below the target number, the wmgr-client would resume
> submitting jobs to the wmgr.
> On my production server I was using lsof to gather the number of connections
> to the wmgr. I am not sure if we can always rely on lsof being installed on
> all machines, so we might need to use a more universal method (maybe in Java).
> Here is the lsof command I used, with some grep and wc sprinkled in:
> {{/usr/sbin/lsof -i :9001 | grep ESTABLISHED | wc}}
> This assumes you are running wmgr on localhost:9001 and lsof is installed at
> /usr/sbin/lsof
> Any other thoughts or ideas to work this out would be appreciated.
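
The throttling loop described in the issue can be sketched in Python as follows. This is only an illustration, not the contents of the attached modscag-v2-job-runner.py and not an existing wmgr-client option. The function names (`count_established`, `wait_for_capacity`) and their parameters are hypothetical. The defaults (port 9001, /usr/sbin/lsof, 10-second sleep) are taken from the description, and the sketch assumes lsof is installed at that path.

```python
import subprocess
import time


def count_established(port=9001, lsof_path="/usr/sbin/lsof"):
    """Count ESTABLISHED connections on a port (hypothetical helper).

    Mirrors the `lsof -i :9001 | grep ESTABLISHED` pipeline from the issue.
    Raises FileNotFoundError if lsof is not installed at lsof_path.
    """
    result = subprocess.run(
        [lsof_path, "-i", ":%d" % port],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
    )
    # lsof exits non-zero when nothing matches; its output is then empty,
    # so the count below is simply zero.
    return sum(1 for line in result.stdout.splitlines() if "ESTABLISHED" in line)


def wait_for_capacity(limit, count_fn=count_established,
                      sleep_fn=time.sleep, interval=10):
    """Block until count_fn() is below limit; return how many sleeps it took.

    limit=None means no limit is set (the proposed default), so return at once.
    """
    if limit is None:
        return 0
    waits = 0
    while count_fn() >= limit:
        sleep_fn(interval)  # back off, then re-check the connection count
        waits += 1
    return waits
```

A job runner would call `wait_for_capacity(30)` before each submission. Because the counting function is passed in as an argument, the lsof call could be swapped for a more portable check without changing the loop.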
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)