Re: Image processing library

2017-05-14 Thread Aditya gholba
Hi Anath, thanks for showing interest in my project.

I do not have any concrete latency measurements yet, but in most cases the
latency is 1 second or lower when testing locally on my laptop.
As you mentioned, the processing calls (which override the process method
from Toolkit) are time consuming, but none of the operators exceeds the
default timeout for processing an image.

The one exception is the ASASSN operator, which is used in the telescope use
case. It uses a custom algorithm for image classification and therefore has
high latency (I am trying to optimize it further). For this operator I have
extended the allowed processing time using the TIMEOUT_WINDOW_COUNT
attribute, as mentioned by Munagala.

Thank you for your inputs,
Aditya



Re: Image processing library

2017-05-12 Thread Munagala Ramanath
The injunction that tuple processing should be "as fast as possible" is based
on an assumption and a fact:
1. In most cases, users want to maximize application throughput.
2. If a callback (like beginWindow(), process(), endWindow(), etc.) takes too
   long, the platform deems the operator hung and restarts it.
Neither imposes a hard constraint: if, for a particular class of
applications, it is OK to sacrifice throughput to allow some CPU-intensive
computations to occur, that is certainly possible; the constraint of (2) can
be relaxed by simply increasing the TIMEOUT_WINDOW_COUNT attribute, for some
or all operators.
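
As a rough illustration of what relaxing (2) means in wall-clock terms, here
is a minimal sketch. It assumes the timeout budget is TIMEOUT_WINDOW_COUNT
streaming windows of a fixed length; the figures 120 windows and 500 ms are
what I believe the Apex defaults to be and should be checked against the Apex
documentation. The class and method names are hypothetical and not part of
Apex.

```java
// Sketch only: estimates how long a callback may block before the platform
// would consider the operator hung.
// ASSUMPTION: budget = TIMEOUT_WINDOW_COUNT * streaming window size.
// ASSUMPTION: 120 windows and 500 ms are the defaults (verify in Apex docs).
final class TimeoutBudget {

    // Wall-clock budget in milliseconds for a given window count.
    static long timeoutMillis(int timeoutWindowCount, int windowSizeMillis) {
        return (long) timeoutWindowCount * windowSizeMillis;
    }

    // Smallest window count whose budget covers a worst-case callback
    // duration (rounds up to a whole number of windows).
    static int windowsNeeded(long worstCaseMillis, int windowSizeMillis) {
        return (int) ((worstCaseMillis + windowSizeMillis - 1) / windowSizeMillis);
    }

    public static void main(String[] args) {
        System.out.println(timeoutMillis(120, 500));    // 60000
        System.out.println(windowsNeeded(90_000, 500)); // 180
    }
}
```

So an operator whose slowest call takes about 90 seconds would need a count of
at least 180 under these assumptions, plus some headroom. In an application
the value would typically be applied with something like
dag.setAttribute(op, OperatorContext.TIMEOUT_WINDOW_COUNT, n).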
Secondly, nothing prevents an operator from starting worker threads that
asynchronously perform CPU-intensive computations. Naturally, careful
synchronization will be necessary between the main and worker threads to
ensure correctness and timely delivery of results.
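
The worker-thread idea might look roughly like the sketch below. It uses only
java.util.concurrent and does not implement the Apex Operator interface: the
class AsyncWorkerSketch, its process() and endWindow() methods, and the
byte-summing "classifier" are hypothetical stand-ins chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for an operator; NOT the Apex Operator interface.
final class AsyncWorkerSketch {
    private final ExecutorService workers = Executors.newFixedThreadPool(2);
    // Thread-safe hand-off from the worker threads back to the main thread.
    private final ConcurrentLinkedQueue<Integer> done = new ConcurrentLinkedQueue<>();

    // Main thread, once per tuple: queue the heavy work and return at once.
    void process(byte[] image) {
        workers.execute(() -> done.add(expensiveClassify(image)));
    }

    // Main thread, at a window boundary: collect whatever has finished.
    List<Integer> endWindow() {
        List<Integer> out = new ArrayList<>();
        Integer result;
        while ((result = done.poll()) != null) {
            out.add(result);
        }
        return out;
    }

    // Placeholder for CPU-heavy work: just sums the unsigned byte values.
    static int expensiveClassify(byte[] image) {
        int sum = 0;
        for (byte b : image) {
            sum += b & 0xFF;
        }
        return sum;
    }

    // Stop accepting work and wait for in-flight tasks; false on timeout
    // or interruption.
    boolean shutdown(long timeoutMillis) {
        workers.shutdown();
        try {
            return workers.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        AsyncWorkerSketch op = new AsyncWorkerSketch();
        op.process(new byte[] {1, 2, 3});
        op.process(new byte[] {10, 20});
        op.shutdown(5_000);
        // Order of results depends on which worker finishes first.
        System.out.println(op.endWindow());
    }
}
```

Note that results may surface in a later window than the tuple that produced
them, and work still in flight on the workers is not part of the operator's
own state, so a real operator would need to account for both.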
Ram 


Re: Image processing library

2017-05-12 Thread Ananth G
I guess the use cases as documented look really compelling. There might be
more comments from a code review perspective; the points below are from a use
case perspective only.

I was wondering if you have any latency measurements for the tests you ran.

If the image processing calls (in the process function overridden from the
Toolkit class) are time consuming, it might not be an ideal use case for a
streaming engine? A very old "blog" (2012) talks about latencies anywhere
from tens of milliseconds to almost a second, depending on the use case and
image size. Of course there have been hardware improvements since then and
those numbers might no longer hold, hence the question (the latencies also
depend on the hardware being used).

This brings me to a more general question about Apex for the community:
what is considered an acceptable tolerance level in terms of latency for a
streaming compute engine like Apex? Is there a way to tune the acceptable
tolerance level depending on the use case? I keep reading on the mailing
lists that tuple processing is part of the main thread and hence should be
as fast as possible.

Regards
Ananth



Image processing library

2017-05-12 Thread Aditya gholba
Hello,
I have been working on an image processing library for Malhar and a few of
the operators are ready. I would like to merge them into Malhar contrib. You
can read about the operators and the applications I have created so far
here:
<https://docs.google.com/document/d/19OrqHJ_QzbuB0XZ4bzdQ9yjN2dGfDhsuMX6XUjDpqYw/edit>

Link to my GitHub <https://github.com/adiv2/imIO4>

All suggestions and opinions are welcome.


Thanks,
Aditya.