Re: can operators emit on a different from the operator itself thread?

2016-10-12 Thread Amol Kekre
Vlad, I agree that the check should be ON by default. Ability to turn it off for entire app is fine, per port is not needed. Thks Amol On Wed, Oct 12, 2016 at 10:34 PM, Tushar Gosavi wrote: > +1 for on by default and ability to turn it off for entire application. > > - Tushar. > > > On Thu, Oct

Re: can operators emit on a different from the operator itself thread?

2016-10-12 Thread Tushar Gosavi
+1 for on by default and ability to turn it off for entire application. - Tushar. On Thu, Oct 13, 2016 at 11:00 AM, Pradeep A. Dalvi wrote: > +1 for ON by default > +1 for disabling it for all output ports > > With the kind of issues we have observed being faced by developers in the > past, I s

Re: can operators emit on a different from the operator itself thread?

2016-10-12 Thread Pradeep A. Dalvi
+1 for ON by default +1 for disabling it for all output ports With the kind of issues we have observed developers facing in the past, I strongly believe this check should be ON by default. However, at the same time, I feel it should be a one-time check, mostly in the development phase and before g

Re: can operators emit on a different from the operator itself thread?

2016-10-12 Thread Vlad Rozov
I ran a JMH test, and the check takes 1 ns on my MacBook Pro and on the lab machine. This corresponds to a 3% degradation at 30 million events/second. I think we can move forward with the check ON by default. Do we need the ability to turn the check OFF for a specific operator and/or port? My thought is that
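
The 3% figure can be reproduced with simple arithmetic. The sketch below assumes that 30 million events/second is the per-thread throughput before the check is added, so each event has a budget of about 33.3 ns and a 1 ns check adds roughly 3% to it. The class and method names are illustrative only and are not part of Apex or JMH.

```java
/** Illustrative arithmetic only; not part of Apex or JMH. */
final class CheckOverhead {
    /**
     * Returns the per-event cost of the check as a fraction of the
     * per-event time budget implied by the given throughput.
     */
    static double overheadFraction(double checkNanos, double eventsPerSecond) {
        // Time available per event at the stated throughput, in nanoseconds.
        double nanosPerEvent = 1_000_000_000.0 / eventsPerSecond;
        return checkNanos / nanosPerEvent;
    }

    public static void main(String[] args) {
        double fraction = overheadFraction(1.0, 30_000_000.0);
        System.out.println("overhead fraction: " + fraction);
    }
}
```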

Re: can operators emit on a different from the operator itself thread?

2016-10-12 Thread Amol Kekre
In case there turns out to be a penalty, we can introduce a "check for thread affinity" mode that triggers this check. My initial thought is to make this check ON by default. We should wait till benchmarks are available before discussing adding this check. Thks Amol On Wed, Oct 12, 2016 at 11:07

Re: can operators emit on a different from the operator itself thread?

2016-10-12 Thread Sanjay Pujare
A JIRA has been created for adding this thread affinity check https://issues.apache.org/jira/browse/APEXCORE-510 . I have made this enhancement in a branch https://github.com/sanjaypujare/apex-core/tree/malhar-510.thread_affinity and I have been benchmarking the performance with this change. I will

Re: can operators emit on a different from the operator itself thread?

2016-08-12 Thread Sanjay Pujare
You are right. I was subconsciously thinking about the THREAD_LOCAL case with a single container and a simple DAG, and in that case Vlad’s assumption might not be valid, but maybe it is. On 8/11/16, 11:47 AM, "Munagala Ramanath" wrote: If I understand Vlad correctly, what he is saying is th

Re: can operators emit on a different from the operator itself thread?

2016-08-11 Thread Vlad Rozov
Correct, except that it is the Sink, not the Operator, that will need to save the current thread during setup(). The Sink does not need access to an Operator; it is sufficient to rely on the platform to call the setup() method on the Operator thread. Vlad On 8/11/16 11:47, Munagala Ramanath wrote: If I unders

Re: can operators emit on a different from the operator itself thread?

2016-08-11 Thread Munagala Ramanath
If I understand Vlad correctly, what he is saying is that each operator saves currentThread in its own setup() and checks it in its own output methods. The threads in different operators are running potentially on different nodes and/or processes and there will be no connection between them. Ram

Re: can operators emit on a different from the operator itself thread?

2016-08-11 Thread Sanjay Pujare
Name check is expensive, agreed, but there isn’t anything else currently. Ideally the stram engine (considering that it is an engine providing resources like threads etc) should use a ThreadFactory or a ThreadGroup to create operator threads so identification and adding functionality is easier.

Re: can operators emit on a different from the operator itself thread?

2016-08-11 Thread Vlad Rozov
Name verification is too expensive; it will be sufficient to store currentThread during setup() and verify that it is the same during emit. Checks should be supported not only for DefaultOutputPort, so we may have them implemented in various Sinks. Vlad On 8/11/16 10:21, Sanjay Pujare wrote: T
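
The check described here (record the current thread during setup(), then compare it on every emit) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Apex code: `TupleSink` and `ThreadCheckingSink` are hypothetical names standing in for the platform's sink types, and the sketch assumes the platform calls setup() on the operator thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a platform sink; not the Apex Sink interface. */
interface TupleSink<T> {
    void put(T tuple);
}

/** Wraps a sink and rejects tuples put from any thread other than the one that called setup(). */
final class ThreadCheckingSink<T> implements TupleSink<T> {
    private final TupleSink<T> delegate;
    private Thread operatorThread; // recorded once, in setup()

    ThreadCheckingSink(TupleSink<T> delegate) {
        this.delegate = delegate;
    }

    /** Assumed to be invoked by the platform on the operator thread. */
    void setup() {
        operatorThread = Thread.currentThread();
    }

    @Override
    public void put(T tuple) {
        // A reference comparison, much cheaper than inspecting thread names.
        if (Thread.currentThread() != operatorThread) {
            throw new IllegalStateException(
                "tuple emitted on " + Thread.currentThread().getName()
                    + ", which is not the thread that ran setup()");
        }
        delegate.put(tuple);
    }
}

final class ThreadCheckDemo {
    public static void main(String[] args) throws InterruptedException {
        List<String> received = new ArrayList<>();
        ThreadCheckingSink<String> sink = new ThreadCheckingSink<>(received::add);
        sink.setup();
        sink.put("from the operator thread"); // accepted

        Thread rogue = new Thread(() -> {
            try {
                sink.put("from another thread");
            } catch (IllegalStateException e) {
                System.out.println("rejected: " + e.getMessage());
            }
        });
        rogue.start();
        rogue.join();
        System.out.println("tuples delivered: " + received.size());
    }
}
```

Because the comparison is a single reference check per tuple, it avoids the string work that a name-based check would need.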

Re: can operators emit on a different from the operator itself thread?

2016-08-11 Thread Tushar Gosavi
+1. In the thread-local case, emitting from a separate thread would cause the next operator to process data in a different thread, which can cause unrelated problems if the next operator is not thread safe. On Thu, Aug 11, 2016 at 10:51 PM, Sanjay Pujare wrote: > Thinking more about this – all of the “operator” thr

Re: can operators emit on a different from the operator itself thread?

2016-08-11 Thread Sanjay Pujare
Thinking more about this – all of the “operator” threads are created by the Stram engine with appropriate names. So we can put checks in the DefaultOutputPort.emit() or in the various implementations of Sink.put() that the current-thread is one created by the Stram engine (by verifying the name)

Re: can operators emit on a different from the operator itself thread?

2016-08-10 Thread Amol Kekre
+1 on the debug proposal. Even if tuples land up within the window, it breaks all guarantees. A rerun (after restart from a checkpoint) can have tuples in different windows from this thread. A separate thread simply exposes users to unwarranted risks. Thks Amol On Wed, Aug 10, 2016 at 6:05 PM, Vlad

Re: can operators emit on a different from the operator itself thread?

2016-08-10 Thread Vlad Rozov
Tuples emitted between end window and begin window are only one of the possible behaviors that emitting tuples on a thread separate from the operator thread may introduce. It would be good to have both checks in place at run time, and if checking for the operator thread for every emitted tuple is too expensive, we

Re: can operators emit on a different from the operator itself thread?

2016-08-10 Thread Sanjay Pujare
If the goal is to do this validation through static analysis of operator code, I guess it is possible but is going to be non-trivial. And there could be false positives and false negatives. Also I suppose this discussion applies to processor operators (those having both in and out ports) so Ram

Re: can operators emit on a different from the operator itself thread?

2016-08-10 Thread Amol Kekre
Sanjay just reminded me of my typo -> I meant between end_window and start_window :) Thks Amol On Wed, Aug 10, 2016 at 2:36 PM, Sanjay Pujare wrote: > If the goal is to do this validation through static analysis of operator > code, I guess it is possible but is going to be non-trivial. And ther

Re: can operators emit on a different from the operator itself thread?

2016-08-10 Thread Amol Kekre
Sent too soon. A quicker way would be to catch emit happening between start_window and end_window and flag an error. Catching "another thread" for every tuple may have a huge performance hit. Thks Amol On Wed, Aug 10, 2016 at 2:31 PM, Amol Kekre wrote: > > Currently user can code it that way.
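
A window-boundary check of the kind proposed here might look like the sketch below, which flags any tuple emitted while no window is open (that is, after the end of one window and before the beginning of the next, as the follow-up correction in this thread clarifies). `WindowGuardedPort` is a hypothetical class, not an Apex type. The flag is declared `volatile` on the assumption that a stray thread may read it while the operator thread writes it. The check only catches emits that fall outside a window; it does not detect a foreign thread emitting while a window is open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical output port that rejects tuples emitted while no window is open. */
final class WindowGuardedPort<T> {
    private final Consumer<T> downstream;
    // Written by the operator thread, possibly read by a stray thread.
    private volatile boolean windowOpen;

    WindowGuardedPort(Consumer<T> downstream) {
        this.downstream = downstream;
    }

    void beginWindow() {
        windowOpen = true;
    }

    void endWindow() {
        windowOpen = false;
    }

    void emit(T tuple) {
        if (!windowOpen) {
            throw new IllegalStateException("tuple emitted outside a window: " + tuple);
        }
        downstream.accept(tuple);
    }
}

final class WindowGuardDemo {
    public static void main(String[] args) {
        List<Integer> received = new ArrayList<>();
        WindowGuardedPort<Integer> port = new WindowGuardedPort<>(received::add);

        port.beginWindow();
        port.emit(1); // inside a window: accepted
        port.endWindow();

        try {
            port.emit(2); // between windows: rejected
        } catch (IllegalStateException e) {
            System.out.println("flagged: " + e.getMessage());
        }
        System.out.println("tuples delivered: " + received.size());
    }
}
```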

Re: can operators emit on a different from the operator itself thread?

2016-08-10 Thread Amol Kekre
Currently user can code it that way. IMHO Apex should catch this and flag error. Thks Amol On Wed, Aug 10, 2016 at 2:04 PM, Ashwin Chandra Putta < ashwinchand...@gmail.com> wrote: > In a separate thread I mean. > > Regards, > Ashwin. > > On Wed, Aug 10, 2016 at 2:01 PM, Ashwin Chandra Putta < >

Re: can operators emit on a different from the operator itself thread?

2016-08-10 Thread Ashwin Chandra Putta
In a separate thread I mean. Regards, Ashwin. On Wed, Aug 10, 2016 at 2:01 PM, Ashwin Chandra Putta < ashwinchand...@gmail.com> wrote: > + dev@apex.apache.org > - us...@apex.apache.org > > This is one of those best practices that we learn by experience during > operator development. It will save

Re: can operators emit on a different from the operator itself thread?

2016-08-10 Thread Ashwin Chandra Putta
+ dev@apex.apache.org - us...@apex.apache.org This is one of those best practices that we learn by experience during operator development. It will save a lot of time during operator development if we can catch and throw validation error when someone emits tuples in a non separate thread. Regards,