Re: Accessing uima as pipeline from a REST interface

2014-02-26 Thread Mihaela M
The REST service will be called by multiple clients concurrently (I have a web 
application that calls this service). On the web server, a new thread is 
created for each request; that thread uses the service instance to call the 
functionality and return the results. If I create only one instance of 
UimaAsynchronousEngine and call its sendAndReceiveCAS() method, which is 
synchronized and blocking, wouldn't that mean the client requests are handled 
serially rather than in parallel? 
My understanding is that, because the client blocks until the reply arrives and 
the method is synchronized, other threads cannot send further CASes to the 
pipeline and have them processed in parallel. Please correct me if I'm wrong.
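
The concern above comes down to plain Java semantics, independent of UIMA-AS: a blocking method serializes its callers only if it holds a shared lock while it waits. The sketch below illustrates that difference with a hypothetical stand-in class (SharedClientSketch and its two methods are assumptions made for illustration, not part of the UIMA-AS API). A CountDownLatch acts as a rendezvous, so a call returns true only if all caller threads were inside the method at the same time. It says nothing about how the real blocking send-and-receive call is implemented; that still has to be confirmed against the UIMA-AS source or by measurement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a shared client; NOT the UIMA-AS API.
class SharedClientSketch {
    private final CountDownLatch rendezvous;

    SharedClientSketch(int expectedCallers) {
        this.rendezvous = new CountDownLatch(expectedCallers);
    }

    // Blocking but not synchronized: several threads can wait inside at once.
    boolean blockingCall() throws InterruptedException {
        rendezvous.countDown();
        return rendezvous.await(500, TimeUnit.MILLISECONDS);
    }

    // Same body, but the monitor lock admits only one caller at a time.
    synchronized boolean synchronizedBlockingCall() throws InterruptedException {
        rendezvous.countDown();
        return rendezvous.await(500, TimeUnit.MILLISECONDS);
    }

    // Runs `callers` threads against one shared instance and reports whether
    // every one of them saw all the others inside the call simultaneously.
    static boolean allOverlapped(int callers, boolean useSynchronized) {
        SharedClientSketch client = new SharedClientSketch(callers);
        ExecutorService pool = Executors.newFixedThreadPool(callers);
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < callers; i++) {
                results.add(pool.submit(() -> useSynchronized
                        ? client.synchronizedBlockingCall()
                        : client.blockingCall()));
            }
            boolean all = true;
            for (Future<Boolean> result : results) {
                all &= result.get();
            }
            return all;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("unsynchronized calls overlap: " + allOverlapped(4, false));
        System.out.println("synchronized calls overlap:   " + allOverlapped(4, true));
    }
}
```

If the real method behaves like the unsynchronized variant, one shared client is enough for parallel requests; if it behaves like the synchronized one, requests queue up behind each other.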

Thanks,
Mihaela




On Wednesday, February 26, 2014 4:48 PM, Jaroslaw Cwiklik  
wrote:
 
Mihaela, does your REST service provide threading to handle client
requests? If so, you can consider using a shared instance of the
UimaAsynchronousEngine client. Each thread would call sendAndReceiveCAS() and
block until the reply comes. I think this would be the most efficient way of
doing this.


Jerry C




Accessing uima as pipeline from a REST interface

2014-02-25 Thread Mihaela M
Hello,

I have a UIMA-AS pipeline with several annotators running in parallel. On top 
of this I want to build a REST service that invokes the pipeline for a given 
text and returns the annotations found to the client. The REST service should 
support a high number of concurrent requests.


Because I need a synchronous call to the pipeline, I thought I should use 
UimaAsynchronousEngine's sendAndReceiveCAS(CAS) method. However, this method is 
synchronized and blocks until the pipeline returns the reply, so if I have only 
one instance of UimaAsynchronousEngine and the processing time is long, all 
calls to the web service will be handled serially, not in parallel.

In this case, is it feasible to create a pool of UimaAsynchronousEngine clients 
and have the web service reuse an available client from the pool to call the 
UIMA-AS pipeline synchronously? The pool size would match the CAS pool size of 
the UIMA-AS pipeline, which would also add more running instances of each 
annotator. I know that each such client opens at least two connections to the 
ActiveMQ broker, so I expect some added overhead on the message broker and my 
web server, but I don't know how bad it could be. 
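
The pooling idea in the paragraph above can be sketched with a bounded blocking queue. This is a minimal illustration under stated assumptions: ClientPoolSketch, withClient and idleCount are hypothetical names, and the type parameter C merely stands in for a client type such as UimaAsynchronousEngine. No UIMA-AS calls are made, and client initialization and shutdown are left out.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical fixed-size pool; C stands in for an expensive client object.
// Nothing here is part of the UIMA-AS API.
class ClientPoolSketch<C> {
    private final BlockingQueue<C> idle;

    ClientPoolSketch(int size, Supplier<C> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    // Borrows a client (blocking while none is idle), runs the work, and
    // always hands the client back to the pool afterwards.
    <R> R withClient(Function<C, R> work) {
        C client;
        try {
            client = idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a client", e);
        }
        try {
            return work.apply(client);
        } finally {
            idle.add(client);
        }
    }

    int idleCount() {
        return idle.size();
    }

    public static void main(String[] args) {
        // A StringBuilder plays the role of the pooled client in this demo.
        ClientPoolSketch<StringBuilder> pool =
                new ClientPoolSketch<>(2, StringBuilder::new);
        int length = pool.withClient(c -> c.append("some text").length());
        System.out.println("result=" + length + ", idle=" + pool.idleCount());
    }
}
```

The pool size caps both the number of requests in flight and the number of broker connections, which is the overhead trade-off raised above.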


If I tune the pipeline so that it supports high throughput, what would be the 
best approach for adding this REST client with high throughput as well?

I'm looking forward to any feedback or suggestions.

Thanks,
Mihaela

Re: uima-as 2.3.1 - java.io.IOException: Frame size of 147 MB larger than max allowed 100 MB

2014-02-13 Thread Mihaela M
ghtEncodingEnabled=true, StackTraceEnabled=true}, 
magic=[A,c,t,i,v,e,M,Q]} and remote: WireFormatInfo { version=9, 
properties={CacheSize=1024, MaxFrameSize=209715200, CacheEnabled=true, 
SizePrefixDisabled=false, TcpNoDelayEnabled=true, 
MaxInactivityDurationInitalDelay=1, MaxInactivityDuration=0, 
TightEncodingEnabled=true, StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]}
02/13/2014 05:36:37.147 [DEBUG] [ActiveMQ Transport: tcp://localhost/127.0.0.1:61616] org.apache.activemq.transport.WireFormatNegotiator - Received WireFormat: WireFormatInfo { version=9, properties={CacheSize=1024, MaxFrameSize=209715200, CacheEnabled=true, SizePrefixDisabled=false, TcpNoDelayEnabled=true, MaxInactivityDurationInitalDelay=1, MaxInactivityDuration=0, TightEncodingEnabled=true, StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]}
02/13/2014 05:36:37.147 [DEBUG] [ActiveMQ Transport: tcp://localhost/127.0.0.1:61616] org.apache.activemq.transport.WireFormatNegotiator - tcp://localhost/127.0.0.1:61616 before negotiation: OpenWireFormat{version=9, cacheEnabled=false, stackTraceEnabled=false, tightEncodingEnabled=false, sizePrefixDisabled=false, maxFrameSize=104857600}
02/13/2014 05:36:37.147 [DEBUG] [ActiveMQ Transport: tcp://localhost/127.0.0.1:61616] org.apache.activemq.transport.WireFormatNegotiator - tcp://localhost/127.0.0.1:61616 after negotiation: OpenWireFormat{version=9, cacheEnabled=true, stackTraceEnabled=true, tightEncodingEnabled=true, sizePrefixDisabled=false, maxFrameSize=104857600}

I also logged the thread name, and it seems that, at least for the remote 
primitive annotator, a thread other than the main thread starts a negotiation 
using the default maxFrameSize of 100 MB. The thread name contains "meta". 
Could it be the thread used for checking the state of the component? Is it 
possible that this connection is not using the brokerURL that was passed, but 
the default one?
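
To spot the odd negotiation quickly in a long trace, the maxFrameSize values can be pulled out of the log text and converted to megabytes. The sketch below is a hypothetical helper (FrameSizeLogCheck is not part of ActiveMQ or UIMA-AS) and assumes only the two spellings seen in the trace above, "MaxFrameSize=" and "maxFrameSize=". The sample lines in main are shortened copies of that trace.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for scanning WireFormatNegotiator trace output.
class FrameSizeLogCheck {
    // Accepts both "MaxFrameSize=" (WireFormatInfo) and "maxFrameSize=" (OpenWireFormat).
    private static final Pattern MAX_FRAME_SIZE =
            Pattern.compile("[Mm]axFrameSize=(\\d+)");

    // Returns every maxFrameSize value found in the text, in order of appearance.
    static List<Long> maxFrameSizesInBytes(String logText) {
        List<Long> sizes = new ArrayList<>();
        Matcher matcher = MAX_FRAME_SIZE.matcher(logText);
        while (matcher.find()) {
            sizes.add(Long.parseLong(matcher.group(1)));
        }
        return sizes;
    }

    // Converts bytes to whole megabytes (1 MB = 1024 * 1024 bytes).
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        String sample =
                "Received WireFormat: WireFormatInfo { version=9, "
              + "properties={CacheSize=1024, MaxFrameSize=209715200}}\n"
              + "after negotiation: OpenWireFormat{version=9, maxFrameSize=104857600}";
        for (long bytes : maxFrameSizesInBytes(sample)) {
            System.out.println(bytes + " bytes = " + toMegabytes(bytes) + " MB");
        }
    }
}
```

In the trace above, the remote side advertises 209715200 bytes (200 MB) while the local OpenWireFormat stays at 104857600 bytes (100 MB) after negotiation.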

Any feedback is appreciated. 

Thanks,
Mihaela




On Wednesday, February 12, 2014 4:43 PM, Jaroslaw Cwiklik  
wrote:
 
It seems the ActiveMQ documentation 
(http://activemq.apache.org/configuring-wire-formats.html)
is wrong with respect to the default maxFrameSize being MAX_LONG. I checked the 
ActiveMQ source code and the default is 100 MB:

    public final class OpenWireFormat implements WireFormat {
        public static final int DEFAULT_VERSION = CommandTypes.PROTOCOL_STORE_VERSION;
        public static final int DEFAULT_WIRE_VERSION = CommandTypes.PROTOCOL_VERSION;
        public static final int DEFAULT_MAX_FRAME_SIZE = 100 * 1024 * 1024; // 100 MB   <-
        static final byte NULL_TYPE = CommandTypes.NULL;
        private static final int MARSHAL_CACHE_SIZE = Short.MAX_VALUE / 2;
        private static final int MARSHAL_CACHE_FREE_SPACE = 100;

UIMA-AS doesn't set this value, so the default is used unless it is 
overridden. It seems to me that
either your service or a client is not overriding the default. Please check 
your deployment descriptors to make sure
that you are changing the default in the brokerURL. 
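
One way to audit the descriptors is to check each broker URL for an explicit wireFormat.maxFrameSize option. The sketch below is a hypothetical helper (BrokerUrlCheck is not an ActiveMQ or UIMA-AS class). It only splits the query string of a simple tcp:// URL, does not reproduce ActiveMQ's own URI parsing, and does not handle composite URLs such as failover:(...). The 100 MB fallback is taken from the DEFAULT_MAX_FRAME_SIZE constant in the source excerpt quoted above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for checking a broker URL; not part of ActiveMQ.
class BrokerUrlCheck {
    // Default taken from the OpenWireFormat excerpt quoted above (100 MB).
    static final long DEFAULT_MAX_FRAME_SIZE = 100L * 1024 * 1024;

    // Splits "scheme://host:port?a=1&b=2" into its query options.
    static Map<String, String> queryOptions(String brokerUrl) {
        Map<String, String> options = new LinkedHashMap<>();
        int queryStart = brokerUrl.indexOf('?');
        if (queryStart < 0 || queryStart == brokerUrl.length() - 1) {
            return options; // no query string at all
        }
        for (String pair : brokerUrl.substring(queryStart + 1).split("&")) {
            int equals = pair.indexOf('=');
            if (equals > 0) {
                options.put(pair.substring(0, equals), pair.substring(equals + 1));
            }
        }
        return options;
    }

    // Returns the explicit wireFormat.maxFrameSize, or the default when absent.
    static long effectiveMaxFrameSize(String brokerUrl) {
        String value = queryOptions(brokerUrl).get("wireFormat.maxFrameSize");
        return value == null ? DEFAULT_MAX_FRAME_SIZE : Long.parseLong(value);
    }

    public static void main(String[] args) {
        String tuned = "tcp://127.0.0.1:61616?wireFormat.maxInactivityDuration=0"
                + "&wireFormat.maxFrameSize=209715200&jms.useCompression=true";
        String plain = "tcp://127.0.0.1:61616";
        System.out.println(tuned + " -> " + effectiveMaxFrameSize(tuned));
        System.out.println(plain + " -> " + effectiveMaxFrameSize(plain));
    }
}
```

Any URL that falls back to the default is a candidate for the connection that still negotiates 100 MB.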

Jerry




Re: uima-as 2.3.1 - java.io.IOException: Frame size of 147 MB larger than max allowed 100 MB

2014-02-12 Thread Mihaela M
Hello,

I have upgraded uima-as to version 2.4.2, but I still encounter an issue with 
the wireFormat.maxFrameSize setting for the ActiveMQ broker.
1. I have updated the configuration of the transport connector in the 
activemq.xml file:

            uri="tcp://127.0.0.1:61616?wireFormat.maxInactivityDuration=0&wireFormat.maxFrameSize=209715200&jms.useCompression=true"/>

2. I have set the brokerURL attribute in the uima-as deployment descriptors to 
the value: 
"tcp://127.0.0.1:61616?wireFormat.maxInactivityDuration=0&wireFormat.maxFrameSize=209715200&jms.useCompression=true"
3. I have set the TRACE level for the logger org.apache.activemq.transport.

After applying all of the above settings, I noticed that when I start the 
pipeline, org.apache.activemq.transport.WireFormatNegotiator performs multiple 
negotiations for each remote delegate. All of them use the maxFrameSize of 
200 MB that I specified, except one negotiation that uses a maxFrameSize of 
100 MB.
I do not understand where this 100 MB limit comes from. Does it exist in the 
UIMA client? From what I saw, ActiveMQ uses MAX_LONG for maxFrameSize by 
default, so I really don't know where this 100 MB setting comes from.

Does anyone have an idea why this is happening? Could somebody give me a 
starting point for looking in the uima code?


On another topic, does anybody know whether there are limitations when using 
the "binary" serializer for remote delegates instead of the "xmi" serializer? 
I found in a Jira issue (https://issues.apache.org/jira/browse/UIMA-1196) 
that the "binary" serializer requires all UIMA-AS services to use a common 
type system. Is this still an issue in uima-as 2.4.2?

Thank you!
Mihaela




On Monday, January 27, 2014 4:30 PM, Eddie Epstein  wrote:
 
On Thu, Jan 23, 2014 at 9:28 AM, Thomas Ginter wrote:

> It is likely then that your expansion is happening after the remote
> service is called or else is not yet big enough to be over the 100MB limit.
>

Also note that by default UIMA-AS [Java] services use a delta-CAS
interface. Only changes to the CAS
are returned from a service.

Besides deleting unnecessary FS from the final CAS to be returned, another
option to consider is to use compression on JMS messages:
jms.useCompression=true
This decoration can be added to the broker configuration file,
   $UIMA_HOME/amq/conf/activemq-nojournal.xml

as
   
which will cause messages in all queues to be compressed.
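
To get a feel for how much compression might help before changing the broker configuration, a payload can be compressed offline. The sketch below is an illustration only: it uses the JDK's java.util.zip.Deflater on a made-up, repetitive XML-like string, and it is an assumption that this resembles what jms.useCompression does to a serialized CAS. CompressionSketch and the sample markup are hypothetical, and real ratios depend entirely on the data.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Illustration of DEFLATE on repetitive markup; not ActiveMQ's own code path.
class CompressionSketch {
    // Compresses the input with the JDK's default DEFLATE settings.
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        while (!deflater.finished()) {
            int written = deflater.deflate(buffer);
            out.write(buffer, 0, written);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Builds a made-up, annotation-heavy payload (hypothetical element names).
    static byte[] samplePayload(int annotations) {
        StringBuilder xml = new StringBuilder("<cas>");
        for (int i = 0; i < annotations; i++) {
            xml.append("<Token begin=\"").append(i)
               .append("\" end=\"").append(i + 5).append("\"/>");
        }
        xml.append("</cas>");
        return xml.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] raw = samplePayload(2000);
        byte[] packed = deflate(raw);
        System.out.println("raw bytes: " + raw.length
                + ", compressed bytes: " + packed.length);
    }
}
```

Compression reduces the size of what travels over JMS, but it does not address why the CAS grows so much in the first place.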

Eddie

Re: uima-as 2.3.1 - java.io.IOException: Frame size of 147 MB larger than max allowed 100 MB

2014-01-22 Thread Mihaela M
1. I will upgrade uima-as and review the annotations gathered in the CAS, but 
is there a way to have the CAS reset before it is sent to the client? In my 
case I only want to get the status of the processing, not all the annotations 
found, because they were already handled by the consumers configured in the 
pipeline.

2. Do you know whether the aggregates communicate with clients in the same way 
as with remote CAS consumers? I wonder why it did not complain when sending 
the exploded CAS to the remote consumer, but did when communicating with the 
client.

Thank you!
Mihaela



On Wednesday, January 22, 2014 7:07 PM, Thomas Ginter  
wrote:
 
Mihaela,

There are two things that you should probably do in order to get started with 
these issues.

1.  Upgrade to UIMA-AS 2.4.2 which uses a newer version of ActiveMQ and 
contains numerous bug fixes for UIMA-AS related to how the JMS queues are 
handled.
2.  The UIMA-AS framework adds very little as far as overhead space for the CAS 
objects which means the vast majority of the size expansion from 48KB to 147MB 
is coming from annotations/metadata being added by your service.  Increasing 
the frame size in ActiveMQ may allow your CAS objects to be transferred in JMS 
but it is more important to find out what is causing this dramatic expansion 
and whether or not the service can be written differently so that the expansion 
is much smaller.

Thanks,

Thomas Ginter
801-448-7676
thomas.gin...@utah.edu






uima-as 2.3.1 - java.io.IOException: Frame size of 147 MB larger than max allowed 100 MB

2014-01-22 Thread Mihaela M
Hello,

I have a UIMA pipeline that uses uima-as 2.3.1 and has one aggregate with one 
local annotator, one remote consumer and one remote annotator. It actually has 
more components, but I will go into the exact configuration only if needed.
I have also developed a UIMA client for it using the UimaAsynchronousEngine 
class, its sendCAS method (asynchronous, as far as I understand) and a callback 
listener that waits for the processing to complete.

1. I have noticed that the returned CAS is generally quite big. Is there a way 
to send, at least to the client, a CAS that does not contain all the types that 
the various annotators added? At what point could I remove those things from 
the CAS?
2. I send a 48 KB text message for processing. It gets processed successfully 
by the pipeline, but the pipeline fails to send a reply to the client. The 
exception that I get is:

01/21/2014 07:36:02.978 [ActiveMQ Transport: tcp://localhost/127.0.0.1:61616] [DEBUG] org.apache.activemq.ActiveMQConnection - Async exception with no exception listener: java.io.IOException: Frame size of 147 MB larger than max allowed 100 MB
java.io.IOException: Frame size of 147 MB larger than max allowed 100 MB
    at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:277) ~[activemq-core-5.6.0.jar:5.6.0]
    at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:229) ~[activemq-core-5.6.0.jar:5.6.0]
    at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:221) ~[activemq-core-5.6.0.jar:5.6.0]
    at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:204) ~[activemq-core-5.6.0.jar:5.6.0]
    at java.lang.Thread.run(Thread.java:662) [na:1.6.0_30]
01/21/2014 07:36:03.093 [ActiveMQ Connection Executor: tcp://localhost/127.0.0.1:61616] [DEBUG] org.apache.activemq.transport.tcp.TcpTransport - Stopping transport tcp://localhost/127.0.0.1:61616

As far as I understand, the client connects to the UIMA pipeline via JMS, and a 
temporary reply queue is created to which the pipeline sends the reply for the 
client to consume. After the above exception is thrown, the connection to the 
pipeline is closed and the temp queue is automatically deleted, so the client 
never receives the reply.

I am wondering why this error is not thrown when the aggregate sends the CAS 
to the consumer. The consumer is remote, so the communication between them also 
goes through a JMS queue, and I think the aggregate has a reply queue for the 
consumer as well...

I want to mention that I tried to increase the maxFrameSize on the AMQ broker, 
but without success. It seems to be a bug in AMQ 5.6, which uima-as 2.3.1 uses.

Any feedback is appreciated.

Thank you,
Mihaela