If I understand Vlad correctly, what he is saying is that each operator saves
currentThread in its own setup() and checks it in its own output methods. The
threads in different operators are running potentially on different nodes
and/or processes and there will be no connection between them.
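
A minimal sketch of that per-operator check, in plain Java. ThreadCheckedPort
and ThreadCheckDemo are hypothetical names used only for illustration; they are
not Apex classes, and a real implementation would live in the port or Sink
code. The sketch assumes setup() always runs on the operator thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an output port or Sink; not an Apex class.
class ThreadCheckedPort<T> {
    private final List<T> emitted = new ArrayList<>();
    private Thread ownerThread; // recorded once, in setup()

    // Assumed to be called on the operator thread.
    void setup() {
        ownerThread = Thread.currentThread();
    }

    // A reference comparison is cheap compared with inspecting thread names.
    void emit(T tuple) {
        Thread current = Thread.currentThread();
        if (current != ownerThread) {
            throw new IllegalStateException(
                "emit() called on " + current + ", expected " + ownerThread);
        }
        emitted.add(tuple);
    }

    int emittedCount() {
        return emitted.size();
    }
}

class ThreadCheckDemo {
    // Tries to emit from a worker thread; returns true if the port rejected it.
    static boolean workerEmitRejected(ThreadCheckedPort<String> port) {
        final boolean[] rejected = {false};
        Thread worker = new Thread(() -> {
            try {
                port.emit("from worker");
            } catch (IllegalStateException e) {
                rejected[0] = true;
            }
        }, "worker");
        worker.start();
        try {
            worker.join(); // join() makes the worker's write to rejected[0] visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected[0];
    }

    public static void main(String[] args) {
        ThreadCheckedPort<String> port = new ThreadCheckedPort<>();
        port.setup();                      // records the calling thread as owner
        port.emit("from operator thread"); // same thread, accepted
        System.out.println("worker emit rejected: " + workerEmitRejected(port));
    }
}
```

Each port keeps only its own owner thread, so nothing has to be shared between
operators.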

Ram

On Thu, Aug 11, 2016 at 11:41 AM, Sanjay Pujare <san...@datatorrent.com>
wrote:

> Name check is expensive, agreed, but there isn’t anything else currently.
> Ideally the stram engine (considering that it is an engine providing
> resources like threads etc) should use a ThreadFactory or a ThreadGroup to
> create operator threads so identification and adding functionality is
> easier.
>
> The idea of checking for the same thread between setup() and emit() won’t
> work because the emit() check will have to be in the Sink hierarchy and
> AFAIK a Sink object doesn’t have access to the corresponding operator,
> right? Another more fundamental problem probably is that these threads
> don’t have to match. The emit() for any operator (or rather a Sink related
> to an operator) is ultimately triggered by an emitTuple() on the topmost
> input operator in that path which happens in that input operator’s thread
> which doesn’t have to match the thread calling setup() in the downstream
> operators, right?
>
>
> On 8/11/16, 10:59 AM, "Vlad Rozov" <v.ro...@datatorrent.com> wrote:
>
>     Name verification is too expensive; it will be sufficient to store
>     currentThread during setup() and verify that it is the same during
>     emit. Checks should be supported not only for DefaultOutputPort, so we
>     may have it implemented in various Sinks.
>
>     Vlad
>
>     On 8/11/16 10:21, Sanjay Pujare wrote:
>     > Thinking more about this – all of the “operator” threads are created
>     > by the Stram engine with appropriate names. So we can put checks in
>     > DefaultOutputPort.emit() or in the various implementations of
>     > Sink.put() that the current thread is one created by the Stram engine
>     > (by verifying the name).
>     >
>     > We can even use a special Thread object for operator threads so the
>     > above detection is easier.
>     >
>     >
>     >
>     > On 8/10/16, 6:11 PM, "Amol Kekre" <a...@datatorrent.com> wrote:
>     >
>     >      +1 on the debug proposal. Even if tuples land within the window,
>     >      it breaks all guarantees. A rerun (after restart from a
>     >      checkpoint) can have tuples from this thread in different
>     >      windows. A separate thread simply exposes users to unwarranted
>     >      risks.
>     >
>     >      Thks
>     >      Amol
>     >
>     >
>     >      On Wed, Aug 10, 2016 at 6:05 PM, Vlad Rozov <v.ro...@datatorrent.com> wrote:
>     >
>     >      > Tuples emitted between end and begin windows are only one of the
>     >      > possible behaviors that emitting tuples on a thread separate from
>     >      > the operator thread may introduce. It will be good to have both
>     >      > checks in place at run-time, and if checking for the operator
>     >      > thread for every emitted tuple is too expensive, we may have it
>     >      > enabled only in DEBUG or a mode with more checks in place.
>     >      >
>     >      > Vlad
>     >      >
>     >      >
>     >      >> Sanjay just reminded me of my typo -> I meant between end_window
>     >      >> and start_window :)
>     >      >>
>     >      >> Thks
>     >      >> Amol
>     >      >>
>     >      >> On Wed, Aug 10, 2016 at 2:36 PM, Sanjay Pujare <san...@datatorrent.com>
>     >      >> wrote:
>     >      >>
>     >      >>> If the goal is to do this validation through static analysis of
>     >      >>> operator code, I guess it is possible but is going to be
>     >      >>> non-trivial. And there could be false positives and false
>     >      >>> negatives.
>     >      >>>
>     >      >>> Also I suppose this discussion applies to processor operators
>     >      >>> (those having both in and out ports) so Ram’s example of
>     >      >>> JdbcPollInputOperator may not be applicable here?
>     >      >>>
>     >      >>> On 8/10/16, 2:04 PM, "Ashwin Chandra Putta" <ashwinchand...@gmail.com>
>     >      >>> wrote:
>     >      >>>
>     >      >>>      In a separate thread I mean.
>     >      >>>
>     >      >>>      Regards,
>     >      >>>      Ashwin.
>     >      >>>
>     >      >>>      On Wed, Aug 10, 2016 at 2:01 PM, Ashwin Chandra Putta <
>     >      >>>      ashwinchand...@gmail.com> wrote:
>     >      >>>
>     >      >>>      > + dev@apex.apache.org
>     >      >>>      > - us...@apex.apache.org
>     >      >>>      >
>     >      >>>      > This is one of those best practices that we learn by
>     >      >>>      > experience during operator development. It will save a
>     >      >>>      > lot of time during operator development if we can catch
>     >      >>>      > and throw validation error when someone emits tuples in
>     >      >>>      > a non separate thread.
>     >      >>>      >
>     >      >>>      > Regards,
>     >      >>>      > Ashwin
>     >      >>>      >
>     >      >>>      > On Wed, Aug 10, 2016 at 1:57 PM, Munagala Ramanath <
>     >      >>> r...@datatorrent.com>
>     >      >>>      > wrote:
>     >      >>>      >
>     >      >>>      >> For cases where use of a different thread is needed, it
>     >      >>>      >> can write tuples to a queue from where the operator
>     >      >>>      >> thread pulls them -- JdbcPollInputOperator in Malhar has
>     >      >>>      >> an example.
>     >      >>>      >>
>     >      >>>      >> Ram
>     >      >>>      >>
>     >      >>>      >> On Wed, Aug 10, 2016 at 1:50 PM, hsy...@gmail.com <
>     >      >>> hsy...@gmail.com
>     >      >>>      >> wrote:
>     >      >>>      >>
>     >      >>>      >>> Hey Vlad,
>     >      >>>      >>>
>     >      >>>      >>> Thanks for bringing this up. Is there an easy way to
>     >      >>>      >>> detect unexpected use of the emit method without
>     >      >>>      >>> hurting performance? Or can we at least detect this in
>     >      >>>      >>> debug mode?
>     >      >>>      >>>
>     >      >>>      >>> Regards,
>     >      >>>      >>> Siyuan
>     >      >>>      >>>
>     >      >>>      >>> On Wed, Aug 10, 2016 at 11:27 AM, Vlad Rozov <
>     >      >>> v.ro...@datatorrent.com>
>     >      >>>      >>> wrote:
>     >      >>>      >>>
>     >      >>>      >>>> The short answer is no, creating a worker thread to
>     >      >>>      >>>> emit tuples is not supported by Apex and will lead to
>     >      >>>      >>>> undefined behavior. Operators in Apex have strong
>     >      >>>      >>>> thread affinity and all interaction with the platform
>     >      >>>      >>>> must happen on the operator thread.
>     >      >>>      >>>>
>     >      >>>      >>>> Vlad
>     >      >>>      >>>>
>     >      >>>      >>>
>     >      >>>      >>>
>     >      >>>      >>
>     >      >>>      >
>     >      >>>      >
>     >      >>>      > --
>     >      >>>      >
>     >      >>>      > Regards,
>     >      >>>      > Ashwin.
>     >      >>>      >
>     >      >>>
>     >      >>>
>     >      >>>
>     >      >>>      --
>     >      >>>
>     >      >>>      Regards,
>     >      >>>      Ashwin.
>     >      >>>
>     >      >>>
>     >      >>>
>     >      >>>
>     >      >>>
>     >      >
>     >
>     >
>     >
>
>
>
>
>
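
For reference, the queue hand-off mentioned further down the thread could look
roughly like the sketch below: the worker thread only enqueues, and the
operator thread drains the queue when it emits. QueueBackedInput and
QueueHandOffDemo are hypothetical names for illustration; this is not the
JdbcPollInputOperator code, and the emitted list merely stands in for an
output port.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical input operator; names are illustrative, not Apex or Malhar API.
class QueueBackedInput {
    private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();
    private final List<String> emitted = new ArrayList<>(); // stands in for an output port

    // Safe to call from any worker thread: it only enqueues, never emits.
    void offerFromWorker(String tuple) {
        pending.add(tuple);
    }

    // Intended to be called only on the operator thread.
    // Drains at most maxTuples from the queue and "emits" them.
    int emitTuples(int maxTuples) {
        int count = 0;
        String tuple;
        while (count < maxTuples && (tuple = pending.poll()) != null) {
            emitted.add(tuple);
            count++;
        }
        return count;
    }

    List<String> emitted() {
        return emitted;
    }
}

class QueueHandOffDemo {
    // Runs a worker that enqueues n tuples, then waits for it to finish.
    static void produceOnWorker(QueueBackedInput input, int n) {
        Thread worker = new Thread(() -> {
            for (int i = 0; i < n; i++) {
                input.offerFromWorker("tuple-" + i);
            }
        }, "poller");
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        QueueBackedInput input = new QueueBackedInput();
        produceOnWorker(input, 3);
        // The calling thread plays the operator thread here.
        System.out.println("emitted now: " + input.emitTuples(2));
        System.out.println("emitted later: " + input.emitTuples(10));
    }
}
```

Capping each drain with maxTuples is one way to keep a single emit call from
running past a window boundary; the right limit depends on the application.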
