Re: UIMA DUCC limit max memory of node

2016-11-01 Thread Eddie Epstein
Hi,

You are right that ducc.agent.node.metrics.fake.memory.size will override
the agent's computation of total usable memory. This must be set as a java
property on the agent. To set this for all agents, add the following line
to site.ducc.properties in the resources folder and restart DUCC.
ducc.agent.jvm.args= -Xmx500M -Dducc.agent.node.metrics.fake.memory.size=N
where N is in KB.
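
As a worked example of the unit conversion, here is a minimal Python sketch that builds the property line for a 50 GB cap. It assumes 1 GB = 1024 * 1024 KB; the helper names are made up for illustration and are not part of DUCC.

```python
def fake_memory_size_kb(gib):
    """Convert a memory cap in GB (assumed to mean GiB) to KB."""
    return gib * 1024 * 1024


def agent_jvm_args(gib, heap="500M"):
    """Build the site.ducc.properties line for the given cap (hypothetical helper)."""
    return (
        "ducc.agent.jvm.args= -Xmx{heap} "
        "-Dducc.agent.node.metrics.fake.memory.size={kb}"
    ).format(heap=heap, kb=fake_memory_size_kb(gib))


# A 50 GB cap gives N = 52428800.
print(agent_jvm_args(50))
```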

DUCC uses cgset -r cpu.shares=M to control a container's CPU. M is computed
as
   container-size-in-bytes / total-memory-size-in-KB
so the maximum value for M in a DUCC container is 1024.
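
The arithmetic can be checked with a small Python sketch. It assumes 1 KB = 1024 bytes and integer division; how the agent actually rounds is not stated here, and the function name is hypothetical.

```python
BYTES_PER_KB = 1024  # assumed


def cpu_shares(container_bytes, total_memory_kb):
    """cpu.shares as described: container size in bytes / total memory in KB."""
    return container_bytes // total_memory_kb


total_kb = 50 * 1024 * 1024  # a node reporting 50 GB, expressed in KB
total_bytes = total_kb * BYTES_PER_KB

full = cpu_shares(total_bytes, total_kb)       # container as large as the node
half = cpu_shares(total_bytes // 2, total_kb)  # container half as large
print(full, half)
```

A container that fills the node gets 1024 shares and one half that size gets 512, which matches the stated maximum.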

cpu.shares controls CPU usage in a relative way. A container with
cpu.shares=1024 will potentially get twice the CPU of a container with 512
shares. Note that if a container is using less than its share, other
containers will be allowed to get more than their share.
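
A simplified Python model of this behavior is sketched below. It is not the kernel scheduler, only an illustration of proportional sharing in which unused share is handed to busy containers; the function name, container names, and the 3-CPU capacity are made up for the example.

```python
def allocate_cpu(shares, demand, capacity):
    """Split `capacity` CPUs by relative shares, redistributing unused share."""
    alloc = {name: 0.0 for name in shares}
    active = set(shares)
    remaining = float(capacity)
    while active and remaining > 1e-12:
        total = sum(shares[n] for n in active)
        # Containers whose demand fits within their proportional slice.
        satisfied = {
            n for n in active
            if demand[n] - alloc[n] <= remaining * shares[n] / total
        }
        if not satisfied:
            # Everyone wants more than their slice: split strictly by shares.
            for n in active:
                alloc[n] += remaining * shares[n] / total
            break
        for n in satisfied:
            remaining -= demand[n] - alloc[n]
            alloc[n] = demand[n]
        active -= satisfied
    return alloc


shares = {"a": 1024, "b": 512}
# Both containers busy: "a" gets twice what "b" gets.
print(allocate_cpu(shares, {"a": 100.0, "b": 100.0}, 3.0))
# "a" mostly idle: "b" is allowed more than its nominal share.
print(allocate_cpu(shares, {"a": 0.5, "b": 100.0}, 3.0))
```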

For newer OS, e.g. RHEL7, processes not put into a specific container are
put into the default container with cpu.shares = 1024. So if you break a
machine in half using fake.memory, and if DUCC were to fill its half up
with work, then the two halves of the box would have equal shares. Sounds
good for your scenario.

However, note that cpu.shares works for CPUs, not cores, so things may not
be so nice if hyperthreading is enabled. For example, consider a machine
with 32 cores and 2-way hyperthreading. A process burning 32 CPUs may
pretty much "max out" the machine even though there are 32 unused CPUs
available. To limit the DUCC half of a machine to only half the real
machine resources would require changing agent code to use the "*Ceiling
Enforcement Tunable Parameters*" which are absolute.
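
For illustration only, the following Python sketch shows the arithmetic behind an absolute limit. It assumes the ceiling parameters meant here are cpu.cfs_period_us and cpu.cfs_quota_us, as listed under that heading in Red Hat's resource management documentation. The cgroup name "ducc" is hypothetical, and the DUCC agent does not set these values today.

```python
def cfs_quota_us(cpu_limit, period_us=100000):
    """Quota allowing `cpu_limit` CPUs' worth of run time per period."""
    return int(cpu_limit * period_us)


period = 100000                  # 100 ms scheduling period (assumed value)
quota = cfs_quota_us(32, period)  # cap at 32 CPUs' worth of time

# "ducc" is a made-up cgroup name for this example.
print("cgset -r cpu.cfs_period_us={} ducc".format(period))
print("cgset -r cpu.cfs_quota_us={} ducc".format(quota))
```

Whether 32 is the right number on a hyperthreaded machine is exactly the caveat above; the sketch only shows how a chosen CPU count maps to a quota.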

Eddie


On Tue, Nov 1, 2016 at 9:31 AM, Daniel Baumartz <
bauma...@stud.uni-frankfurt.de> wrote:

> Hi Eddie,
>
> ok, I will try to explain with more detail, maybe this is not how ducc is
> being used normally. We want to set up some nodes which are not exclusively
> used for ducc. For example, one of the nodes may have 100 GB, but we want
> the usable memory for ducc to only be 50 GB, not all free memory. (We also
> want to limit the CPU usage, for example only use 32 of 64 cores, but we
> have not tried to set this up yet.)
>
> We could not find any setting to achieve this behavior, so we tried using
> cgroups to limit the max usable memory for ducc. This did not work because
> ducc gets its memory info from /proc/meminfo which ignores the cgroups
> settings. After reading through the code it seems only setting
> "ducc.agent.node.metrics.fake.memory.size" (not setting up test mode) is
> doing something similar to what we want: "Comment from
> NodeMemInfoCollector.java: if running ducc in simulation mode skip memory
> adjustment. Report free memory = fakeMemorySize". But I am not sure if we
> can use this safely since it is for testing.
>
> So we basically want to give ducc an upper limit of usable memory.
>
> I hope it is a bit more clear what we want to achieve.
>
> Thanks again,
> Daniel
>
>
> Quoting Eddie Epstein:
>
>
>> Hi Daniel,
>>
>> For each node Ducc sums RSS for all "system" user processes and excludes
>> that from Ducc usable memory on the node. System users are defined by a
>> ducc.properties setting with default value:
>> ducc.agent.node.metrics.sys.gid.max = 500
>>
>> Ducc's simulation mode is intended for creating a scaled out cluster of
>> fake nodes for testing purposes.
>>
>> The only mechanism available for reserving additional memory is to have
>> Ducc run some dummy process that stays up forever. This could be a Ducc
>> service that is automatically started when Ducc starts. This could get
>> complicated for a heterogeneous set of machines and/or Ducc classes.
>>
>> Can you be more precise about which features you are looking for to limit
>> the resource use of Ducc machines?
>>
>> Thanks,
>> Eddie
>>
>>
>> On Mon, Oct 31, 2016 at 10:03 AM, Daniel Baumartz <
>> bauma...@stud.uni-frankfurt.de> wrote:
>>
>>> Hi,
>>>
>>> I am trying to set up nodes for Ducc that should not use all the memory on
>>> the machine. I tried to limit the memory with cgroups, but it seems Ducc is
>>> getting the memory info from /proc/meminfo which ignores the cgroups
>>> settings.
>>>
>>> Did I miss an option to specify the max usable memory? Could I safely use
>>> "ducc.agent.node.metrics.fake.memory.size" from the simulation settings?
>>> Or is there a better way to do this?
>>>
>>> Thanks,
>>> Daniel
>>>
>>>
>>>
>
>
>


Re: Broken conections after ACTIVEMQ restart

2016-11-01 Thread Jaroslaw Cwiklik
Nelson, using the same snapshot I see a different behavior of UIMA-AS
service when a broker is bounced:

12:21:35.452 - 27:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleTempQueueFailure:
WARNING: Service: Test Aggregate TAE Runtime Exception
12:21:35.452 - 27:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleTempQueueFailure:
WARNING: Jms Listener Failed. Endpoint:
temp-queue://ID:bluejws65-33772-1478017289483-1:1\
:1 Managed By: tcp://localhost:61616 Reason: javax.jms.JMSException:
java.io.EOFException
12:21:35.453 - 36:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.onException:
WARNING: Service: Test Aggregate TAE Runtime Exception
12:21:35.453 - 36:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.onException:
WARNING: Jms Listener Failed. Endpoint: SecondLevelTaeQueue Managed By:
tcp://localhost:61616 Reason\
: javax.jms.JMSException: java.io.EOFException
12:21:35.453 - 27:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:Test Aggregate TAE Listener Unable To Connect To
Broker: tcp\
://localhost:61616 Retrying Until Successful ...
12:21:35.453 - 36:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:Test Aggregate TAE Listener Unable To Connect To
Broker: tcp\
://localhost:61616 Retrying Until Successful ...
12:21:35.454 - 46:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer$3.destroy:
WARNING:
org.apache.activemq.ConnectionFailedException: The JMS connection has
failed: java.io.EOFException
at
org.apache.activemq.ActiveMQConnection.checkClosedOrFailed(ActiveMQConnection.java:1448)
at
org.apache.activemq.ActiveMQConnection.doStop(ActiveMQConnection.java:580)
at
org.apache.activemq.ActiveMQConnection.stop(ActiveMQConnection.java:569)
at
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer$3.run(UimaDefaultMessageListenerContainer.java:1134)
Caused by: java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:403)
at
org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:268)
at
org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:240)
at
org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:232)
at
org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:215)
at java.lang.Thread.run(Thread.java:780)

12:21:48.687 - 23:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:Test Aggregate TAE Listener Recovered Connection
to Broker: \
tcp://localhost:61616 - Ready to Process Again
12:21:48.709 - 36:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:Test Aggregate TAE Listener Recovered Connection
to Broker: \
tcp://localhost:61616 - Ready to Process Again
12:21:48.814 - 27:
org.apache.uima.adapter.jms.activemq.JmsInputChannel.createListenerOnTempQueue(1059):
INFO: Test Aggregate
TAE-JmsInputChannel.createListenerOnTempQueue()-starting new Listener
12:21:48.866 - 27:
org.apache.uima.adapter.jms.activemq.JmsInputChannel.createListenerOnTempQueue:
INFO: Service:Test Aggregate TAE Unable to refresh temp destination -
retrying in 5 seconds until success \
...
12:21:53.869 - 27:
org.apache.uima.adapter.jms.activemq.JmsInputChannel.createListenerOnTempQueue:
INFO: Service:Test Aggregate TAE succesfully refreshed temp
destination:temp-queue://ID:bluejws65-33772-14\
78017289483-1:3:1 - FreeCas Queue:true
12:21:53.869 - 27:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:Test Aggregate TAE Listener Recovered Connection
to Broker: \
tcp://localhost:61616 - Ready to Process Again




When running the scenario, can you please check whether UIMA_HOME points to
the new code (the snapshot)? Perhaps it points to an older version of UIMA-AS.
I am not sure this is the case, but it is worth checking.
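
One quick way to do that check is sketched below in Python. It assumes the jars live under a lib directory inside UIMA_HOME; the function name is hypothetical. Run it in the same environment that launches the service.

```python
import glob
import os


def describe_uima_home(env=os.environ):
    """Return UIMA_HOME and the jar file names found under its lib directory."""
    home = env.get("UIMA_HOME")
    if not home:
        return None, []
    # Assumption: jars are in $UIMA_HOME/lib; adjust if the layout differs.
    pattern = os.path.join(home, "lib", "*.jar")
    jars = sorted(os.path.basename(p) for p in glob.glob(pattern))
    return home, jars


home, jars = describe_uima_home()
print("UIMA_HOME =", home)
for jar in jars:
    print("  ", jar)
```

Jar names or timestamps that do not match the snapshot build would suggest the service is still picking up the older install.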

Jerry

On Tue, Nov 1, 2016 at 9:17 AM, nelson rivera 
wrote:

> I tried the snapshot that you gave me, and I understand your explanation
> of the queue recovery perfectly, but it does not work for me, and I get
> the same behavior. With the latest snapshot, the uima-as log never shows
> recovery of a new queue after a broker restart (restart = execute
> /bin/startBroker.sh, which starts in the foreground, then press CTRL+C,
> then execute startBroker.sh again); it only shows:
>
> 08:58:13.825 - 19:
> org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.onException:
> WARNING: Jms Listener Failed. Endpoint:
> ID:nelson-H81-M1-35307-1478005075345-1:1:1 Managed By:
> tcp://localhost:61616 Reason: 

Re: UIMA DUCC limit max memory of node

2016-11-01 Thread Daniel Baumartz

Hi Eddie,

ok, I will try to explain with more detail, maybe this is not how ducc  
is being used normally. We want to set up some nodes which are not  
exclusively used for ducc. For example, one of the nodes may have 100  
GB, but we want the usable memory for ducc to only be 50 GB, not all  
free memory. (We also want to limit the CPU usage, for example only  
use 32 of 64 cores, but we have not tried to set this up yet.)


We could not find any setting to achieve this behavior, so we tried  
using cgroups to limit the max usable memory for ducc. This did not  
work because ducc gets its memory info from /proc/meminfo which  
ignores the cgroups settings. After reading through the code it seems  
only setting "ducc.agent.node.metrics.fake.memory.size" (not setting  
up test mode) is doing something similar to what we want: "Comment  
from NodeMemInfoCollector.java: if running ducc in simulation mode  
skip memory adjustment. Report free memory = fakeMemorySize". But I am  
not sure if we can use this safely since it is for testing.


So we basically want to give ducc an upper limit of usable memory.

I hope it is a bit more clear what we want to achieve.

Thanks again,
Daniel


Quoting Eddie Epstein:


Hi Daniel,

For each node Ducc sums RSS for all "system" user processes and excludes
that from Ducc usable memory on the node. System users are defined by a
ducc.properties setting with default value:
ducc.agent.node.metrics.sys.gid.max = 500
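
A minimal Python sketch of that accounting, under stated assumptions: processes are given as (gid, rss_kb) pairs, and a process counts as a "system" process when its gid is at or below the threshold. Whether the real comparison is inclusive is not stated here, and the function name is hypothetical.

```python
def ducc_usable_kb(total_kb, processes, sys_gid_max=500):
    """Total memory minus the summed RSS of 'system' user processes."""
    system_rss_kb = sum(rss for gid, rss in processes if gid <= sys_gid_max)
    return total_kb - system_rss_kb


procs = [
    (0, 2000000),      # system process, about 2 GB RSS
    (480, 1000000),    # system process, about 1 GB RSS
    (1001, 30000000),  # ordinary user process: not excluded by this rule
]
print(ducc_usable_kb(100 * 1024 * 1024, procs))
```

In this model, memory used by ordinary (non-system) users is not subtracted, which is why it does not help with reserving half of a shared node.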

Ducc's simulation mode is intended for creating a scaled out cluster of
fake nodes for testing purposes.

The only mechanism available for reserving additional memory is to have
Ducc run some dummy process that stays up forever. This could be a Ducc
service that is automatically started when Ducc starts. This could get
complicated for a heterogeneous set of machines and/or Ducc classes.

Can you be more precise about which features you are looking for to limit
the resource use of Ducc machines?

Thanks,
Eddie


On Mon, Oct 31, 2016 at 10:03 AM, Daniel Baumartz <
bauma...@stud.uni-frankfurt.de> wrote:


Hi,

I am trying to set up nodes for Ducc that should not use all the memory on
the machine. I tried to limit the memory with cgroups, but it seems Ducc is
getting the memory info from /proc/meminfo which ignores the cgroups
settings.

Did I miss an option to specify the max usable memory? Could I safely use
"ducc.agent.node.metrics.fake.memory.size" from the simulation settings?
Or is there a better way to do this?

Thanks,
Daniel








Re: Broken conections after ACTIVEMQ restart

2016-11-01 Thread nelson rivera
I tried the snapshot that you gave me, and I understand your explanation
of the queue recovery perfectly, but it does not work for me, and I get
the same behavior. With the latest snapshot, the uima-as log never shows
recovery of a new queue after a broker restart (restart = execute
/bin/startBroker.sh, which starts in the foreground, then press CTRL+C,
then execute startBroker.sh again); it only shows:

08:58:13.825 - 19:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.onException:
WARNING: Jms Listener Failed. Endpoint:
ID:nelson-H81-M1-35307-1478005075345-1:1:1 Managed By:
tcp://localhost:61616 Reason: javax.jms.JMSException:
java.io.EOFException
08:58:13.825 - 19:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleTempQueueFailure:
WARNING: Service: XClusterAnalyzerAggregate Runtime Exception
08:58:13.825 - 19:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleTempQueueFailure:
WARNING: Jms Listener Failed. Endpoint:
temp-queue://ID:nelson-H81-M1-35307-1478005075345-1:1:1 Managed By:
tcp://localhost:61616 Reason: javax.jms.JMSException:
java.io.EOFException
08:58:13.825 - 19:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:XClusterAnalyzerAggregate Listener Unable To
Connect To Broker: tcp://localhost:61616 Retrying ...
08:58:13.839 - 15:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.onException:
WARNING: Service: XClusterAnalyzerAggregate Runtime Exception
08:58:13.844 - 46:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.onException:
WARNING: Jms Listener Failed. Endpoint: XClusterAnalyzerAggregate
Managed By: tcp://localhost:61616 Reason: javax.jms.JMSException:
java.io.EOFException
08:58:13.844 - 46:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:XClusterAnalyzerAggregate Listener Unable To
Connect To Broker: tcp://localhost:61616 Retrying ...
08:58:13.876 - 19:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:XClusterAnalyzerAggregate Listener
Established Connection to Broker: tcp://localhost:61616
08:58:16.719 - 46:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:XClusterAnalyzerAggregate Listener
Established Connection to Broker: tcp://localhost:61616
08:59:16.887 - 15:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.onException:
WARNING: Jms Listener Failed. Endpoint: XClusterAnalyzerAggregate
Managed By: tcp://localhost:61616 Reason:
javax.jms.IllegalStateException: The Consumer is closed
08:59:16.887 - 15:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:XClusterAnalyzerAggregate Listener Unable To
Connect To Broker: tcp://localhost:61616 Retrying ...
08:59:16.900 - 15:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:XClusterAnalyzerAggregate Listener
Established Connection to Broker: tcp://localhost:61616
08:59:16.917 - 14:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.onException:
WARNING: Service: XClusterAnalyzerAggregate Runtime Exception
08:59:16.917 - 14:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.onException:
WARNING: Jms Listener Failed. Endpoint: XClusterAnalyzerAggregate
Managed By: tcp://localhost:61616 Reason:
javax.jms.IllegalStateException: The Consumer is closed
08:59:16.917 - 14:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:XClusterAnalyzerAggregate Listener Unable To
Connect To Broker: tcp://localhost:61616 Retrying ...
08:59:16.948 - 14:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:XClusterAnalyzerAggregate Listener
Established Connection to Broker: tcp://localhost:61616
09:01:17.93 - 14:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.onException:
WARNING: Jms Listener Failed. Endpoint: XClusterAnalyzerAggregate
Managed By: tcp://localhost:61616 Reason:
javax.jms.IllegalStateException: The Consumer is closed
09:01:17.93 - 14:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:XClusterAnalyzerAggregate Listener Unable To
Connect To Broker: tcp://localhost:61616 Retrying ...
09:01:17.109 - 14:
org.apache.uima.adapter.jms.activemq.UimaDefaultMessageListenerContainer.handleListenerSetupFailure:
WARNING: Uima AS Service:XClusterAnalyzerAggregate Listener
Established Connection to Broker: tcp://localhost:61616
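
To compare two such logs quickly, a small Python sketch like the following could count the relevant messages. The search strings are copied from the log excerpts in this thread; the function name is hypothetical. Whitespace is collapsed first because long log lines are wrapped when pasted.

```python
import re


def broker_recovery_summary(log_text):
    """Count listener failures and check for a temp-queue refresh message."""
    # Pasted logs wrap long lines, so normalize all whitespace first.
    flat = re.sub(r"\s+", " ", log_text)
    return {
        "listener_failures": flat.count("Jms Listener Failed"),
        "consumer_closed": flat.count("The Consumer is closed"),
        "temp_queue_refreshed": "refreshed temp destination" in flat,
    }


sample = """WARNING: Jms Listener Failed. Endpoint: XClusterAnalyzerAggregate
Managed By: tcp://localhost:61616 Reason:
javax.jms.IllegalStateException: The Consumer is closed"""
print(broker_recovery_summary(sample))
```

A log with listener failures but no temp-queue refresh message matches the behavior reported here.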



And in the console output of service:
Could not