Name verification is too expensive. It will be sufficient to store currentThread during setup() and verify that it is the same thread during emit. Checks should be supported not only for DefaultOutputPort, so we may have them implemented in the various Sinks as well.
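
A minimal sketch of that check, to make the cost argument concrete. CheckedPort and CheckedPortDemo are hypothetical stand-ins written for illustration; they are not the Apex DefaultOutputPort or Sink classes, and the sketch assumes setup() is always invoked on the operator thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an output port; NOT the Apex DefaultOutputPort API.
final class CheckedPort<T> {
    private final List<T> emitted = new ArrayList<>();
    private Thread operatorThread; // captured once in setup()

    // Assumption: the engine calls setup() on the operator thread.
    void setup() {
        operatorThread = Thread.currentThread();
    }

    void emit(T tuple) {
        // A single reference comparison per tuple: no name lookup, no string matching.
        if (Thread.currentThread() != operatorThread) {
            throw new IllegalStateException(
                "emit() called on thread '" + Thread.currentThread().getName()
                    + "', not on the operator thread");
        }
        emitted.add(tuple);
    }

    int emittedCount() {
        return emitted.size();
    }
}

final class CheckedPortDemo {
    // Returns true if an emit() attempted from a freshly started worker thread is rejected.
    static boolean workerEmitIsRejected(CheckedPort<String> port) {
        final boolean[] rejected = {false};
        Thread worker = new Thread(() -> {
            try {
                port.emit("from-worker");
            } catch (IllegalStateException e) {
                rejected[0] = true;
            }
        }, "worker");
        worker.start();
        try {
            worker.join(); // join() makes the worker's write to rejected[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected[0];
    }

    public static void main(String[] args) {
        CheckedPort<String> port = new CheckedPort<>();
        port.setup();
        port.emit("from-operator-thread");
        System.out.println("accepted tuples: " + port.emittedCount());
        System.out.println("worker emit rejected: " + workerEmitIsRejected(port));
    }
}
```

The same comparison could in principle be placed in a Sink.put() implementation; whether it is cheap enough to leave enabled outside a debug mode would need to be measured.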

Vlad

On 8/11/16 10:21, Sanjay Pujare wrote:
Thinking more about this – all of the “operator” threads are created by the 
Stram engine with appropriate names. So we can put checks in the 
DefaultOutputPort.emit() or in the various implementations of Sink.put() that 
the current-thread is one created by the Stram engine (by verifying the name).

We can even use a special Thread object for operator threads so the above 
detection is easier.
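
A rough sketch of the "special Thread object" idea, for illustration only. OperatorThread and ThreadKindCheck are hypothetical names, not existing Apex classes, and the sketch assumes the engine would create every operator thread through such a subclass. Note that this check only tells operator threads apart from other threads; it does not verify that the caller is the thread that owns a particular operator.

```java
// Hypothetical marker subclass; the name OperatorThread is an assumption, not an Apex class.
class OperatorThread extends Thread {
    OperatorThread(Runnable body, String name) {
        super(body, name);
    }
}

final class ThreadKindCheck {
    // True when the calling thread was created as an OperatorThread.
    static boolean onOperatorThread() {
        return Thread.currentThread() instanceof OperatorThread;
    }

    // What a guard at the top of emit() or put() might look like.
    static void requireOperatorThread() {
        if (!onOperatorThread()) {
            throw new IllegalStateException(
                "called on non-operator thread '" + Thread.currentThread().getName() + "'");
        }
    }

    // Runs onOperatorThread() on a new thread of the requested kind and returns the result.
    static boolean checkOnNewThread(boolean useOperatorThread) {
        final boolean[] result = {false};
        Runnable body = () -> result[0] = onOperatorThread();
        Thread t = useOperatorThread
            ? new OperatorThread(body, "operator-1")
            : new Thread(body, "worker-1");
        t.start();
        try {
            t.join(); // join() makes the write to result[0] visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println("operator thread passes: " + checkOnNewThread(true));
        System.out.println("plain thread passes: " + checkOnNewThread(false));
    }
}
```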



On 8/10/16, 6:11 PM, "Amol Kekre" <a...@datatorrent.com> wrote:

     +1 on debug proposal. Even if the tuples land within the window, it breaks
     all guarantees. A rerun (after restart from a checkpoint) can have tuples
     from this thread in different windows. A separate thread simply exposes
     users to unwarranted risks.

     Thks
     Amol
     On Wed, Aug 10, 2016 at 6:05 PM, Vlad Rozov <v.ro...@datatorrent.com> wrote:

     > Tuples emitted between end and begin windows are only one of the possible
     > behaviors that emitting tuples on a thread separate from the operator
     > thread may introduce. It will be good to have both checks in place at
     > run-time, and if checking for the operator thread for every emitted tuple
     > is too expensive, we may have it enabled only in DEBUG or in a mode with
     > more checks in place.
     >
     > Vlad
     >
     >
     >> Sanjay just reminded me of my typo -> I meant between end_window and
     >> start_window :)
     >>
     >> Thks
     >> Amol
     >>
     >> On Wed, Aug 10, 2016 at 2:36 PM, Sanjay Pujare <san...@datatorrent.com>
     >> wrote:
     >>
     >>> If the goal is to do this validation through static analysis of operator
     >>> code, I guess it is possible but is going to be non-trivial. And there
     >>> could be false positives and false negatives.
     >>>
     >>> Also I suppose this discussion applies to processor operators (those
     >>> having both in and out ports), so Ram’s example of JdbcPollInputOperator
     >>> may not be applicable here?
     >>>
     >>> On 8/10/16, 2:04 PM, "Ashwin Chandra Putta" <ashwinchand...@gmail.com>
     >>> wrote:
     >>>
     >>>      In a separate thread I mean.
     >>>
     >>>      Regards,
     >>>      Ashwin.
     >>>
     >>>      On Wed, Aug 10, 2016 at 2:01 PM, Ashwin Chandra Putta <
     >>>      ashwinchand...@gmail.com> wrote:
     >>>
     >>>      > + dev@apex.apache.org
     >>>      > - us...@apex.apache.org
     >>>      >
     >>>      > This is one of those best practices that we learn by experience
     >>>      > during operator development. It will save a lot of time during
     >>>      > operator development if we can catch and throw a validation error
     >>>      > when someone emits tuples in a non separate thread.
     >>>      >
     >>>      > Regards,
     >>>      > Ashwin
     >>>      >
     >>>      > On Wed, Aug 10, 2016 at 1:57 PM, Munagala Ramanath
     >>>      > <r...@datatorrent.com> wrote:
     >>>      >
     >>>      >> For cases where use of a different thread is needed, it can write
     >>>      >> tuples to a queue from where the operator thread pulls them --
     >>>      >> JdbcPollInputOperator in Malhar has an example.
     >>>      >>
     >>>      >> Ram
     >>>      >>
     >>>      >> On Wed, Aug 10, 2016 at 1:50 PM, hsy...@gmail.com
     >>>      >> <hsy...@gmail.com> wrote:
     >>>      >>
     >>>      >>> Hey Vlad,
     >>>      >>>
     >>>      >>> Thanks for bringing this up. Is there an easy way to detect
     >>>      >>> unexpected use of the emit method without hurting performance?
     >>>      >>> Or, at least, can we detect this in debug mode?
     >>>      >>>
     >>>      >>> Regards,
     >>>      >>> Siyuan
     >>>      >>>
     >>>      >>> On Wed, Aug 10, 2016 at 11:27 AM, Vlad Rozov
     >>>      >>> <v.ro...@datatorrent.com> wrote:
     >>>      >>>
     >>>      >>>> The short answer is no, creating a worker thread to emit tuples
     >>>      >>>> is not supported by Apex and will lead to undefined behavior.
     >>>      >>>> Operators in Apex have strong thread affinity, and all
     >>>      >>>> interaction with the platform must happen on the operator
     >>>      >>>> thread.
     >>>      >>>>
     >>>      >>>> Vlad
     >>>      >>>>
     >>>      >>>
     >>>      >>>
     >>>      >>
     >>>      >
     >>>      >
     >>>      > --
     >>>      >
     >>>      > Regards,
     >>>      > Ashwin.
     >>>      >
     >>>
     >>>
     >>>
     >>>      --
     >>>
     >>>      Regards,
     >>>      Ashwin.
     >>>
     >>>
     >>>
     >>>
     >>>
     >

