[jira] [Commented] (APEXCORE-590) Failed to restart application on MapR

2016-12-16 Thread Pradeep A. Dalvi (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15754469#comment-15754469
 ] 

Pradeep A. Dalvi commented on APEXCORE-590:
---

This is happening because f.getPath() returns the path in the format "maprfs:///", whereas origAppDir contains the path in the format "maprfs:/". As a result, the string replace fails to change origAppDir to
newAppDir, and targetPath ends up the same as origAppPath for the subdirectories.
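A minimal sketch of the mismatch, for illustration only (the variable names follow the comment above; this is not the actual StramClient code):

import org.apache.hadoop.fs.Path;

public class PathMismatchSketch
{
  public static void main(String[] args)
  {
    String origAppDir = "maprfs:/user/dtadmin/datatorrent/apps/application_0004";
    String newAppDir = "maprfs:/user/dtadmin/datatorrent/apps/application_0006";
    // f.getPath() comes back in the triple-slash form:
    String filePath = "maprfs:///user/dtadmin/datatorrent/apps/application_0004/checkpoints";

    // No-op: "maprfs:/user/..." is not a substring of "maprfs:///user/..."
    String targetPath = filePath.replace(origAppDir, newAppDir);
    System.out.println(targetPath.equals(filePath));  // true, the replace failed

    // Comparing scheme-normalized path components sidesteps the problem:
    String normalized = new Path(filePath).toUri().getPath().replace(
        new Path(origAppDir).toUri().getPath(),
        new Path(newAppDir).toUri().getPath());
    System.out.println(normalized);  // /user/dtadmin/datatorrent/apps/application_0006/checkpoints
  }
}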

> Failed to restart application on MapR
> -
>
> Key: APEXCORE-590
> URL: https://issues.apache.org/jira/browse/APEXCORE-590
> Project: Apache Apex Core
>  Issue Type: Bug
>Reporter: Pradeep A. Dalvi
>Assignee: Pradeep A. Dalvi
>
> For restarting an application, we try to copy the previous app state, i.e. the 
> checkpoints directory from the original app. However, checkpoints are not being 
> copied due to an incorrect check of the source and destination directory paths.
> 16/12/16 13:28:32 ERROR fs.MapRFileSystem: Failed to delete path 
> maprfs:/user/dtadmin/datatorrent/apps/application_1481890072066_0006/checkpoints,
>  error: No such file or directory (2)
> 16/12/16 13:28:32 INFO stram.FSRecoveryHandler: Creating 
> maprfs:/user/dtadmin/datatorrent/apps/application_1481890072066_0006/recovery/log
> 16/12/16 13:28:32 INFO stram.StramClient: Ignoring 
> maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/events 
> as it already exists under 
> maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/events
> 16/12/16 13:28:32 INFO stram.StramClient: Ignoring 
> maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/recovery
>  as it already exists under 
> maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/recovery
> 16/12/16 13:28:32 INFO stram.StramClient: Ignoring 
> maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/checkpoints
>  as it already exists under 
> maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/checkpoints
> 16/12/16 13:28:32 INFO stram.StramClient: Set the environment for the 
> application master



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (APEXCORE-590) Failed to restart application on MapR

2016-12-16 Thread Pradeep A. Dalvi (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep A. Dalvi updated APEXCORE-590:
--
Description: 
For restarting an application, we try to copy the previous app state, i.e. the 
checkpoints directory from the original app. However, checkpoints are not being 
copied due to an incorrect check of the source and destination directory paths.

16/12/16 13:28:32 ERROR fs.MapRFileSystem: Failed to delete path 
maprfs:/user/dtadmin/datatorrent/apps/application_1481890072066_0006/checkpoints,
 error: No such file or directory (2)
16/12/16 13:28:32 INFO stram.FSRecoveryHandler: Creating 
maprfs:/user/dtadmin/datatorrent/apps/application_1481890072066_0006/recovery/log
16/12/16 13:28:32 INFO stram.StramClient: Ignoring 
maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/events 
as it already exists under 
maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/events
16/12/16 13:28:32 INFO stram.StramClient: Ignoring 
maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/recovery 
as it already exists under 
maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/recovery
16/12/16 13:28:32 INFO stram.StramClient: Ignoring 
maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/checkpoints
 as it already exists under 
maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/checkpoints
16/12/16 13:28:32 INFO stram.StramClient: Set the environment for the 
application master

  was:
For restarting an application, we try to copy the previous app state, i.e. the 
checkpoints directory from the original app. Before doing so, we try to delete the 
'checkpoints' directory from the newly launched application directory. MapR-FS 
throws an exception since that directory is not present yet, hence we fail to copy 
checkpoints from the original app.
We need to catch the exception around fs.delete(checkpointPath, true); in the 
copyInitialState function of StramClient.

16/12/16 12:48:03 INFO stram.StramClient: Restart from 
maprfs:/user/dtadmin/datatorrent/apps/application_1481890072066_0004
16/12/16 12:48:03 INFO Configuration.deprecation: io.bytes.per.checksum is 
deprecated. Instead, use dfs.bytes-per-checksum
16/12/16 12:48:03 ERROR fs.MapRFileSystem: Failed to delete path 
maprfs:/user/dtadmin/datatorrent/apps/application_1481890072066_0005/checkpoints,
 error: No such file or directory (2)
16/12/16 12:48:03 INFO stram.FSRecoveryHandler: Creating 
maprfs:/user/dtadmin/datatorrent/apps/application_1481890072066_0005/recovery/lo


> Failed to restart application on MapR
> -
>
> Key: APEXCORE-590
> URL: https://issues.apache.org/jira/browse/APEXCORE-590
> Project: Apache Apex Core
>  Issue Type: Bug
>    Reporter: Pradeep A. Dalvi
>    Assignee: Pradeep A. Dalvi
>
> For restarting an application, we try to copy the previous app state, i.e. the 
> checkpoints directory from the original app. However, checkpoints are not being 
> copied due to an incorrect check of the source and destination directory paths.
> 16/12/16 13:28:32 ERROR fs.MapRFileSystem: Failed to delete path 
> maprfs:/user/dtadmin/datatorrent/apps/application_1481890072066_0006/checkpoints,
>  error: No such file or directory (2)
> 16/12/16 13:28:32 INFO stram.FSRecoveryHandler: Creating 
> maprfs:/user/dtadmin/datatorrent/apps/application_1481890072066_0006/recovery/log
> 16/12/16 13:28:32 INFO stram.StramClient: Ignoring 
> maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/events 
> as it already exists under 
> maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/events
> 16/12/16 13:28:32 INFO stram.StramClient: Ignoring 
> maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/recovery
>  as it already exists under 
> maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/recovery
> 16/12/16 13:28:32 INFO stram.StramClient: Ignoring 
> maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/checkpoints
>  as it already exists under 
> maprfs:///user/dtadmin/datatorrent/apps/application_1481890072066_0004/checkpoints
> 16/12/16 13:28:32 INFO stram.StramClient: Set the environment for the 
> application master



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (APEXCORE-590) Failed to restart application on MapR

2016-12-16 Thread Pradeep A. Dalvi (JIRA)
Pradeep A. Dalvi created APEXCORE-590:
-

 Summary: Failed to restart application on MapR
 Key: APEXCORE-590
 URL: https://issues.apache.org/jira/browse/APEXCORE-590
 Project: Apache Apex Core
  Issue Type: Bug
Reporter: Pradeep A. Dalvi
Assignee: Pradeep A. Dalvi


For restarting an application, we try to copy the previous app state, i.e. the 
checkpoints directory from the original app. Before doing so, we try to delete the 
'checkpoints' directory from the newly launched application directory. MapR-FS 
throws an exception since that directory is not present yet, hence we fail to copy 
checkpoints from the original app.
We need to catch the exception around fs.delete(checkpointPath, true); in the 
copyInitialState function of StramClient.

16/12/16 12:48:03 INFO stram.StramClient: Restart from 
maprfs:/user/dtadmin/datatorrent/apps/application_1481890072066_0004
16/12/16 12:48:03 INFO Configuration.deprecation: io.bytes.per.checksum is 
deprecated. Instead, use dfs.bytes-per-checksum
16/12/16 12:48:03 ERROR fs.MapRFileSystem: Failed to delete path 
maprfs:/user/dtadmin/datatorrent/apps/application_1481890072066_0005/checkpoints,
 error: No such file or directory (2)
16/12/16 12:48:03 INFO stram.FSRecoveryHandler: Creating 
maprfs:/user/dtadmin/datatorrent/apps/application_1481890072066_0005/recovery/lo
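A minimal sketch of the guard described above, assuming fs and checkpointPath as in StramClient.copyInitialState (an illustration, not the actual patch):

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckpointDeleteSketch
{
  // MapR-FS throws an IOException when deleting a non-existent path, while
  // HDFS simply returns false, so the delete of the not-yet-created
  // checkpoints directory has to be tolerated before copying state over.
  static void deleteIfPresent(FileSystem fs, Path checkpointPath)
  {
    try {
      fs.delete(checkpointPath, true);  // recursive delete of any stale state
    } catch (IOException e) {
      // ignore: the directory does not exist yet on a fresh restart
    }
  }
}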



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Apex internal documentation.

2016-11-28 Thread Pradeep A. Dalvi
+1

The following might also be a good addition to the Startup of Application section:
 - Handling initial communication with StrAM before & after the application is
in the Running state (trackingURL w/o & w/ SSL, and non-secure & secure mode)


Re: Proposing an operator for log parsing.

2016-11-17 Thread Pradeep A. Dalvi
+1 for the feature

On Thu, 17 Nov 2016 at 16:56 Shraddha Jog  wrote:

> Dear community,
>
> We would like to add an operator in Malhar for parsing different types of
> logs.
> The idea of this operator is to read log data records of known formats such as
> syslog, common log, combined log, extended log, etc. from the upstream in a
> DAG, parse/validate them based on the configured format, and emit the
> validated POJO to the downstream.
>
> We are not targeting log formats from a particular library as such, but the
> default formats for common log, combined log, extended log and syslog.
> Also, if the user has some specific log format, then that could be provided in a
> property and the operator will parse the log according to the given format.
>
> Properties:
> LogFileFormat : Property to define data format for the log data record
> being read at the Input Port. It can be either from the above four default
> log formats or a json specifying fields and regular expression. More
> details can be found in the document.
>
> Ports :
> 1. ParsedLog: This port shall emit the parsed/validated POJO object created
> based on the log format configured by the user.
>
> 2. ErrorPort: This port shall emit the error log data record.
>
> Proposed design can be found here
> <
> https://docs.google.com/document/d/1RoTOUx_0chwTSahGxiIXlgACgRNfXiVv17hMezxFz74/edit?usp=sharing
> >
> .
>
> Thanks,
> Shraddha
>
> --
> This e-mail, including any attached files, may contain confidential and
> privileged information for the sole use of the intended recipient. Any
> review, use, distribution, or disclosure by others is strictly prohibited.
> If you are not the intended recipient (or authorized to receive information
> for the intended recipient), please contact the sender by reply e-mail and
> delete all copies of this message.
>
>
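For reference, a minimal sketch of the proposed port layout using the Apex operator API (LogParserOperator and its parse() helper are hypothetical names, not part of the proposal):

import com.datatorrent.api.DefaultInputPort;
import com.datatorrent.api.DefaultOutputPort;
import com.datatorrent.common.util.BaseOperator;

public class LogParserOperator extends BaseOperator
{
  // ParsedLog: emits the parsed/validated POJO for records matching the configured format
  public final transient DefaultOutputPort<Object> parsedLog = new DefaultOutputPort<>();
  // ErrorPort: emits raw records that failed parsing/validation
  public final transient DefaultOutputPort<byte[]> errorPort = new DefaultOutputPort<>();

  // LogFileFormat: one of the default formats or a JSON spec of fields/regexes
  private String logFileFormat;

  public final transient DefaultInputPort<byte[]> input = new DefaultInputPort<byte[]>()
  {
    @Override
    public void process(byte[] line)
    {
      try {
        parsedLog.emit(parse(line));      // format-specific parsing goes here
      } catch (RuntimeException e) {
        errorPort.emit(line);             // record did not match the format
      }
    }
  };

  private Object parse(byte[] line)
  {
    // placeholder: regex/format matching driven by logFileFormat would go here
    return new String(line);
  }

  public void setLogFileFormat(String logFileFormat) { this.logFileFormat = logFileFormat; }
  public String getLogFileFormat() { return logFileFormat; }
}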


Re: [DISCUSSION] Custom Control Tuples

2016-11-02 Thread Pradeep A. Dalvi
As a rule of thumb in any real-time system, control tuples should
always be handled using priority queues.

We may try to control priorities by defining levels, and control tuples shall
not be held back for delivery at window boundaries.

In short, control tuples shall never be treated like ordinary tuples in real-time
systems.

On Thursday, November 3, 2016, David Yan  wrote:

> Hi all,
>
> I would like to renew the discussion of control tuples.
>
> Last time, we were in a debate about whether:
>
> 1) the platform should enforce that control tuples are delivered at window
> boundaries only
>
> or:
>
> 2) the platform should deliver control tuples just as other tuples and it's
> the operator developers' choice whether to handle the control tuples as
> they arrive or delay the processing till the next window boundary.
>
> To summarize the pros and cons:
>
> Approach 1: If processing control tuples results in changes to the behavior
> of the operator, and idempotency needs to be preserved, the processing must
> be done at window boundaries. This approach will save the operator
> developers the headache of ensuring that. However, this will take away the
> choice from the operator developer if they just need to process the
> control tuples as soon as possible.
>
> Approach 2: The operator has a chance to immediately process control
> tuples. This would be useful if latency is more valued than correctness.
> However, this would open the possibility for operator developers to
> shoot themselves in the foot. This is especially true if there are multiple
> input ports, as there is no easy way to guarantee processing order for
> multiple input ports.
>
> We would like to arrive at a consensus and close this discussion soon this
> time so we can start the work on this important feature.
>
> Thanks!
>
> David
>
> On Tue, Jun 28, 2016 at 10:04 AM, Vlad Rozov  >
> wrote:
>
> > It is not clear how an operator will emit a custom control tuple at window
> > boundaries. One way is to cache/accumulate control tuples in the operator
> > output port till the window closes (END_WINDOW is inserted into the output
> > sink) or to only allow an operator to emit control tuples inside
> > endWindow(). The latter is a slight variation of the operator output port
> > caching behavior with the only difference that now the operator itself is
> > responsible for caching/accumulating control tuples. Note that in many
> > cases it will be necessary to postpone emitting payload tuples that
> > logically come after the custom control tuple till the next window begins.
> >
> > IMO, that is too restrictive, and in a case where an input operator uses a
> > push instead of a poll (for example, it provides an end point where remote
> > agents may connect and publish/push data), control tuples may be used for
> > connect/disconnect/watermark broadcast to (partitioned) downstream
> > operators. In this case the platform just needs to guarantee an order barrier
> > (any tuple emitted prior to a control tuple needs to be delivered prior to
> > the control tuple).
> >
> > Thank you,
> >
> > Vlad
> >
> >
> >
> > On 6/27/16 19:36, Amol Kekre wrote:
> >
> >> I agree with David. Allowing control tuples within a window (along with
> >> data tuples) creates a very dangerous situation where guarantees are
> >> impacted. It is much safer to enable control tuples (send/receive) at
> >> window boundaries (after END_WINDOW of window N, and before BEGIN_WINDOW
> >> of window N+1). My take on David's list is
> >>
> >> 1. -> window boundaries -> Strong +1; there will be a big issue with
> >> guarantees for operators with multiple ports. (see Thomas's response)
> >> 2. -> All downstream windows -> +1, but there are situations; a caveat
> >> could be "only to operators that implement the control tuple
> >> interface/listeners", which effectively translates to "all interested
> >> downstream operators"
> >> 3. Only input operators can create control tuples -> -1; this is
> >> restrictive even though most likely 95% of the time it will be input
> >> operators
> >>
> >> Thks,
> >> Amol
> >>
> >>
> >> On Mon, Jun 27, 2016 at 4:37 PM, Thomas Weise  >
> >> wrote:
> >>
> >>> The windowing we discuss here is in general event time based; arrival
> >>> time is a special case of it.
> >>>
> >>> I don't think state changes can be made independent of the streaming
> >>> window boundary as it would prevent idempotent processing and,
> >>> transitively, exactly once. For that to work, tuples need to be presented
> >>> to the operator in a guaranteed order *within* the streaming window,
> >>> which is not possible with multiple ports (and partitions).
> >>>
> >>> Thomas
> >>>
> >>> On Mon, Jun 27, 2016 at 2:53 PM, David Yan  >
> >>> wrote:
> >>>
> >>>> I think for session tracking, if the session boundaries are allowed to
> >>>> be not aligned with the 

Re: (APEXMALHAR-2290) JDBC operator does not deploy after failure in certain cases

2016-10-12 Thread Pradeep A. Dalvi
+1 for Option 1 to use conn.getMetaData()

--prad

On Wed, Oct 12, 2016 at 11:39 PM, Chinmay Kolhatkar 
wrote:

> Hi Hitesh,
>
> Instead of limiting the row count please use one of the following 2
> approaches:
>
> 1. ResultSet rsColumns = null;
> DatabaseMetaData meta = conn.getMetaData();
> rsColumns = meta.getColumns(null, null, "tablename", null);
> while (rsColumns.next()) {
>   System.out.println(rsColumns.getString("TYPE_NAME"));
>   System.out.println(rsColumns.getString("COLUMN_NAME"));
> }
>
> Example given here:
> http://www.java2s.com/Code/Java/Database-SQL-JDBC/
> GetColumnNameAndTypeForATable.htm
>
> 2. Execute the select statement with an always-false where clause like
> "select * from table where 1 = 2"
>
> I would prefer option 1 over option 2.
>
> -Chinmay.
>
>
> On Wed, Oct 12, 2016 at 8:38 PM, Hitesh Kapoor 
> wrote:
>
> > Hi All,
> >
> > This issue occurs when we try to insert records into a table which has a
> > lot of data.
> > The setup method of JdbcPOJOInsertOutputOperator generates the metadata of
> > the columns in the table. To do so it fires a query of the form "Select *
> > from tablename" and then extracts the required metadata, like column
> > name, data type, and whether it is allowed to be NULL.
> > When the table has a lot of data, this "Select" query takes up a lot of
> > time (more than 30 sec) and the operator gets killed.
> > The fix, as suggested by Sandeep, is straightforward and simple: limit
> > the maximum rows returned by the select query to 1. I am using the JDBC
> > function setMaxRows() to achieve this.
> > I will be opening a PR for the same. This fix won't have corresponding unit
> > test cases and I will test the changes externally via an app.
> >
> > Regards,
> > Hitesh Kapoor
> >
>
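A self-contained sketch of option 1 (the connection URL and table name are placeholders): reading column metadata through DatabaseMetaData keeps setup() independent of the table size, unlike a full "Select *":

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;

public class ColumnMetadataSketch
{
  public static void main(String[] args) throws SQLException
  {
    try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:test")) {
      DatabaseMetaData meta = conn.getMetaData();
      // Fetches column definitions only; no table rows are scanned.
      try (ResultSet rsColumns = meta.getColumns(null, null, "TABLENAME", null)) {
        while (rsColumns.next()) {
          System.out.println(rsColumns.getString("COLUMN_NAME") + " : "
              + rsColumns.getString("TYPE_NAME")
              + " nullable=" + rsColumns.getString("IS_NULLABLE"));
        }
      }
    }
  }
}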


Re: can operators emit on a different from the operator itself thread?

2016-10-12 Thread Pradeep A. Dalvi
+1 for ON by default
+1 for disabling it for all output ports

With the kind of issues we have observed developers facing in the
past, I strongly believe this check should be ON by default.
However, at the same time I feel it shall be a one-time check, mostly in the
development phase and before going into production. Having said that, if
disabling it at the application level, i.e. for all operators and their
respective output ports, would make the implementation simpler, then that can
be targeted first. Thoughts?

--prad

On Thu, Oct 13, 2016 at 7:32 AM, Vlad Rozov  wrote:

> I ran a jmh test and the check takes 1ns on my MacBook Pro and on the lab
> machine. This corresponds to 3% degradation at 30 million events/second. I
> think we can move forward with the check ON by default. Do we need an
> ability to turn OFF the check for a specific operator and/or port? My thought
> is that such an ability is not necessary and it should be OK to disable the
> check for all output ports in an application.
>
> Vlad
>
>
> On 10/12/16 11:56, Amol Kekre wrote:
>
>> In case there turns out to be a penalty, we can introduce a "check for
>> thread affinity" mode that triggers this check. My initial thought is to
>> make this check ON by default. We should wait till benchmarks are
>> available
>> before discussing adding this check.
>>
>> Thks
>> Amol
>>
>>
>> On Wed, Oct 12, 2016 at 11:07 AM, Sanjay Pujare 
>> wrote:
>>
>> A JIRA has been created for adding this thread affinity check
>>> https://issues.apache.org/jira/browse/APEXCORE-510 . I have made this
>>> enhancement in a branch
>>> https://github.com/sanjaypujare/apex-core/tree/malhar-510.
>>> thread_affinity
>>> and I have been benchmarking the performance with this change. I will be
>>> publishing the results in the above JIRA where we can discuss them and
>>> hopefully agree on merging this change.
>>>
>>> On Thu, Aug 11, 2016 at 1:41 PM, Sanjay Pujare 
>>> wrote:
>>>
>>>> You are right, I was subconsciously thinking about the THREAD_LOCAL case
>>>> with a single container and a simple DAG, and in that case Vlad’s
>>>> assumption might not be valid, but maybe it is.
>>>>
>>>> On 8/11/16, 11:47 AM, "Munagala Ramanath"  wrote:
>>>>
>>>>> If I understand Vlad correctly, what he is saying is that each operator
>>>>> saves currentThread in its own setup() and checks it in its own output
>>>>> methods. The threads in different operators are running potentially on
>>>>> different nodes and/or processes and there will be no connection between
>>>>> them.
>>>>>
>>>>> Ram
>>>>>
>>>>> On Thu, Aug 11, 2016 at 11:41 AM, Sanjay Pujare <
>>>>> san...@datatorrent.com> wrote:
>>>>>
>>>>>> Name check is expensive, agreed, but there isn’t anything else currently.
>>>>>> Ideally the stram engine (considering that it is an engine providing
>>>>>> resources like threads etc) should use a ThreadFactory or a ThreadGroup
>>>>>> to create operator threads so identification and adding functionality is
>>>>>> easier.
>>>>>>
>>>>>> The idea of checking for the same thread between setup() and emit() won’t
>>>>>> work because the emit() check will have to be in the Sink hierarchy and
>>>>>> AFAIK a Sink object doesn’t have access to the corresponding operator,
>>>>>> right? Another more fundamental problem probably is that these threads
>>>>>> don’t have to match. The emit() for any operator (or rather a Sink
>>>>>> related to an operator) is ultimately triggered by an emitTuple() on the
>>>>>> topmost input operator in that path which happens in that input
>>>>>> operator’s thread which doesn’t have to match the thread calling setup()
>>>>>> in the downstream operators, right?
>>>>>>
>>>>>> On 8/11/16, 10:59 AM, "Vlad Rozov"  wrote:
>>>>>>
>>>>>>> Name verification is too expensive, it will be sufficient to store
>>>>>>> currentThread during setup() and verify that it is the same during
>>>>>>> emit. Checks should be supported not only for DefaultOutputPort, so we
>>>>>>> may have it implemented in various Sinks.
>>>>>>>
>>>>>>> Vlad
>>>>>>>
>>>>>>> On 8/11/16 10:21, Sanjay Pujare wrote:
>>>>>>>> Thinking more about this – all of the “operator” threads are created
>>>>>>>> by the Stram engine with appropriate names. So we can put checks in
>>>>>>>> the DefaultOutputPort.emit() or in the various implementations of
>>>>>>>> Sink.put() that the current-thread is one created by the Stram engine
>>>>>>>> (by verifying the name).
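A minimal sketch of the check Vlad describes (capture the operator thread in setup() and verify it on emit); ThreadAffinityPort is a hypothetical wrapper for illustration, not the actual apex-core change tracked in APEXCORE-510:

import com.datatorrent.api.DefaultOutputPort;

public class ThreadAffinityPort<T> extends DefaultOutputPort<T>
{
  private transient Thread operatorThread;

  // to be called from the owning operator's setup()
  public void rememberOperatorThread()
  {
    operatorThread = Thread.currentThread();
  }

  @Override
  public void emit(T tuple)
  {
    // fail fast when emit() is called from a thread other than the operator's own
    if (operatorThread != null && operatorThread != Thread.currentThread()) {
      throw new IllegalStateException("emit() called from thread "
          + Thread.currentThread().getName() + ", expected "
          + operatorThread.getName());
    }
    super.emit(tuple);
  }
}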

Re: [VOTE] Hadoop upgrade

2016-10-04 Thread Pradeep A. Dalvi
+1 for 2.6.x

On Tuesday, October 4, 2016, David Yan  wrote:

> Hi all,
>
> Thomas created this ticket for upgrading our Hadoop dependency version a
> couple weeks ago:
>
> https://issues.apache.org/jira/browse/APEXCORE-536
>
> We'd like to get the ball rolling and would like to take a vote from the
> community which version we would like to upgrade to. We have these choices:
>
> 2.2.0 (no upgrade)
> 2.4.x
> 2.5.x
> 2.6.x
>
> We are not considering 2.7.x because we already know that many Apex users
> are using Hadoop distros that are based on 2.6.
>
> Please note that Apex works with all versions of Hadoop higher than or equal
> to the Hadoop version Apex depends on, as long as it's 2.x.x. We are not
> considering Hadoop 3.0.0-alpha yet at this time.
>
> When voting, please keep these in mind:
>
> - The features that are added in 2.4.x, 2.5.x, and 2.6.x respectively, and
> how useful those features are for Apache Apex
> - The Hadoop versions the major distros (Cloudera, Hortonworks, MapR, EMR,
> etc) are supporting
> - The Hadoop versions that typical Apex users are using
>
> Thanks,
>
> David
>


Re: Automated changes git author

2016-09-28 Thread Pradeep A. Dalvi
Build Meister?

On Wed, Sep 28, 2016 at 1:20 PM, Thomas Weise 
wrote:

> What about the name "CI Support"? Does not look like best fit either. Any
> better ideas or keep it?
>
> I will document the outcome in the contributor guidelines.
>
> On Wed, Sep 28, 2016 at 11:13 AM, Pramod Immaneni 
> wrote:
>
> > What, trustworthy jenkins no more? Kidding aside, +1
> >
> > On Wed, Sep 28, 2016 at 11:34 AM, Thomas Weise  wrote:
> >
> > > Hi,
> > >
> > > For changes made by scripts, there has been an undocumented convention
> to
> > > use the following author information (example):
> > >
> > > commit 763d14fca6b84fdda1b6853235e5d4b71ca87fca
> > > Author: CI Support 
> > > Date:   Mon Sep 26 20:36:22 2016 -0700
> > >
> > > Fix trailing whitespace.
> > >
> > > I would suggest we discontinue use of jenk...@datatorrent.com and
> start
> > > using dev@apex.apache.org instead?
> > >
> > > Thanks,
> > > Thomas
> > >
> >
>


Re: [ANNOUNCE] New Apache Apex PMC Member: Chandni Singh

2016-09-12 Thread Pradeep A. Dalvi
Congratulations, Chandni!

Thanks,
Pradeep A. Dalvi

On Mon, Sep 12, 2016 at 9:33 AM, Thomas Weise <t...@apache.org> wrote:

> The Apache Apex PMC is pleased to announce that Chandni Singh is now a PMC
> member. We appreciate all her contributions to the project so far, and are
> looking forward to more.
>
> Congrats Chandni!
> Thomas, for the Apache Apex PMC.
>


Re: [ANNOUNCE] New Apache Apex Committer: Devendra Tagare

2016-08-10 Thread Pradeep A. Dalvi
Congratulations Dev. Welcome aboard.

--prad

On Wed, Aug 10, 2016 at 1:28 PM, Bright Chen  wrote:

> Devendra, Congratulations
>
> Best,
> Bright
>
> > On Aug 10, 2016, at 1:13 PM, Siyuan Hua  wrote:
> >
> > Welcome, Devendra!
> >
> > On Wed, Aug 10, 2016 at 12:28 PM, Thomas Weise  wrote:
> >
> >> The Project Management Committee (PMC) for Apache Apex has asked
> Devendra
> >> Tagare to become a committer and we are pleased to announce that he has
> >> accepted.
> >>
> >> Devendra has been contributing to Apex for several months now, for
> example
> >> the Avro support and JDBC poll. He also did a few Apex meetup
> presentations
> >> and developed sample applications.
> >>
> >> Welcome, Devendra, and congratulations!
> >> Thomas, for the Apache Apex PMC.
> >>
>
>


Re: custom JAVA_HOME

2016-08-03 Thread Pradeep A. Dalvi
+1 for adding support for Application Environment variables.

--prad

On Wed, Aug 3, 2016 at 10:10 AM, Sanjay Pujare 
wrote:

> +1 for Pramod’s idea of allowing all variables supported by YARN
>
> On 8/3/16, 12:56 AM, "Chinmay Kolhatkar"  wrote:
>
> +1 for the idea.
> Are there any in the list that one can see as conflicting with our own
> environment variables? For e.g. LOGNAME?
>
>
> On Wed, Aug 3, 2016 at 4:49 AM, Pramod Immaneni <
> pra...@datatorrent.com>
> wrote:
>
> > How about allowing specification of all environment variables supported by
> > YARN that are non-final, as described below:
> >
> >
> >
> http://atetric.com/atetric/javadoc/org.apache.hadoop/hadoop-yarn-api/0.23.3/org/apache/hadoop/yarn/api/ApplicationConstants.Environment.html
> >
> > Thanks
> >
> > On Tue, Aug 2, 2016 at 3:43 PM, Vlad Rozov 
> > wrote:
> >
> > > Should Apex add JAVA_HOME to DAGContext and allow the application to
> > > specify which JDK to use if there are multiple JDK installations on a
> > > Hadoop cluster?
> > > Yarn already supports custom JAVA_HOME (please see
> > > https://issues.apache.org/jira/browse/YARN-2481).
> > >
> > > Vlad
> > >
> >
>
>
>
>


[jira] [Resolved] (APEXCORE-488) Issues in SSL communication with StrAM

2016-07-25 Thread Pradeep A. Dalvi (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep A. Dalvi resolved APEXCORE-488.
---
Resolution: Fixed

PR has been merged.

> Issues in SSL communication with StrAM
> --
>
> Key: APEXCORE-488
> URL: https://issues.apache.org/jira/browse/APEXCORE-488
> Project: Apache Apex Core
>  Issue Type: Bug
>    Reporter: Pradeep A. Dalvi
>    Assignee: Pradeep A. Dalvi
>
> A couple of issues in SSL communication with StrAM to track application progress:
>  - trackingURL without a protocol scheme makes YARN pick up the default HTTP. 
> This happens even if yarn.http.policy is set to HTTPS_ONLY.
>  - StramAgent always assumes HTTP communication
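For illustration only (not the actual APEXCORE-488 patch): registering the application master with an explicit scheme in the tracking URL, so YARN does not assume plain HTTP:

import org.apache.hadoop.yarn.client.api.AMRMClient;

public class TrackingUrlSketch
{
  // host/ports are placeholders; the point is the explicit https:// prefix
  static void register(AMRMClient<?> rmClient, String host, int rpcPort, int webPort)
      throws Exception
  {
    String trackingUrl = "https://" + host + ":" + webPort;  // scheme included
    rmClient.registerApplicationMaster(host, rpcPort, trackingUrl);
  }
}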



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Using DSL api to construct sql queries

2016-07-22 Thread Pradeep A. Dalvi
+1. From what I read, this change is completely backward compatible.

--prad
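Illustrative only: how a vendor-neutral query could be built with the jooq DSL instead of string concatenation (table/column names and the dialect are placeholders):

import static org.jooq.impl.DSL.field;
import static org.jooq.impl.DSL.table;
import static org.jooq.impl.DSL.using;

import org.jooq.DSLContext;
import org.jooq.SQLDialect;

public class QueryBuildSketch
{
  public static void main(String[] args)
  {
    // The same logical query can be rendered per-dialect by the library.
    DSLContext create = using(SQLDialect.MYSQL);
    String sql = create.select(field("col1"), field("col2"))
        .from(table("tablename"))
        .where(field("col1").eq(42))
        .getSQL();
    System.out.println(sql);
  }
}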

On Thu, Jul 21, 2016 at 11:13 PM, Chinmay Kolhatkar  wrote:

> Yes, there is no clash with the Calcite integration. Calcite is a query planner
> that converts SQL to relational algebra... This is different.
>
> On Fri, Jul 22, 2016 at 11:03 AM, Priyanka Gugale 
> wrote:
>
> > I don't know much about Calcite, but reading the abstract, Calcite seems to
> > be for a different purpose. What I want to achieve here is some library which
> > will let me construct a sql query without worrying about different DB
> > platforms. The library will take care of converting the query to the DB-
> > specific syntax. I am focusing on query construction only and not on
> > planning or execution.
> >
> > -Priyanka
> >
> > On Thu, Jul 21, 2016 at 9:58 PM, Siyuan Hua 
> > wrote:
> >
> > > But is it a duplication of integration with Calcite?
> > >
> > > On Thu, Jul 21, 2016 at 9:26 AM, Timothy Farkas <
> > > timothytiborfar...@gmail.com> wrote:
> > >
> > > > I see, cool :)
> > > >
> > > > On Thu, Jul 21, 2016 at 9:21 AM, Priyanka Gugale 
> > > > wrote:
> > > >
> > > > > Hi Tim,
> > > > >
> > > > > We are not creating our own DSL; jooq is just another query
> > > > > parser/builder like JsqlParser. I am trying to use one of these query
> > > > > DSL libraries to replace the existing code in the operator which is
> > > > > written to construct the queries.
> > > > >
> > > > > -Priyanka
> > > > >
> > > > > On Thu, Jul 21, 2016 at 9:42 PM, Timothy Farkas <
> > > > > timothytiborfar...@gmail.com> wrote:
> > > > >
> > > > > > I don't know the exact context here so please forgive me if I'm
> > > > > > mistaken. I don't think creating our own DSL is the way to go.
> > > > > > Creating a generic DSL is hard. We should support setting the flavor
> > > > > > of SQL being used as a property and then allow standard sql to be
> > > > > > specified. There are already mature Apache License SQL parsers which
> > > > > > support many different SQL implementations.
> > > > > >
> > > > > > https://github.com/JSQLParser/JSqlParser
> > > > > >
> > > > > > Thanks,
> > > > > > Tim
> > > > > >
> > > > > > On Thu, Jul 21, 2016 at 2:19 AM, Priyanka Gugale <
> > pri...@apache.org>
> > > > > > wrote:
> > > > > >
> > > > > > > Looking closely at licensing, it says it *depends on but doesn't
> > > > > > > bundle* those non-ASL-licensed dependencies. As per my understanding,
> > > > > > > those will be included only if we explicitly include them using our
> > > > > > > application pom. Right away we are not using any of those features
> > > > > > > which depend on such third-party licenses.
> > > > > > >
> > > > > > > Anyone have any suggestion over including this library?
> > > > > > >
> > > > > > > Dev,
> > > > > > > Yes, querydsl is an option, but jooq seems more promising. If we
> > > > > > > then see the license is a problem, maybe we can go to querydsl.
> > > > > > >
> > > > > > > -Priyanka
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Jul 20, 2016 at 8:58 PM, Devendra Tagare <
> > > > > > > devend...@datatorrent.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > +1 for using DSL constructs that are vendor agnostic.
> > > > > > > >
> > > > > > > > Checkout https://github.com/querydsl/querydsl (Apache licensed)
> > > > > > > > as well, in case it fits better in terms of implementation.
> > > > > > > >
> > > > > > > > Also, once the DSL work is done, please test and document the
> > > > > > > > behavior (exactly once, at-least once, ...) the operator has with
> > > > > > > > different databases.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Dev
> > > > > > > >
> > > > > > > > On Wed, Jul 20, 2016 at 4:04 AM, Bhupesh Chawda <
> > > > bhup...@apache.org>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > It is a good idea to get rid of vendor specific implementation
> > > > > > > > > differences for SQL.
> > > > > > > > >
> > > > > > > > > However, the licensing does not seem to be straightforward.
> > > > > > > > > Please check: http://www.jooq.org/legal/licensing. Can this be
> > > > > > > > > used as a dependency in Apex?
> > > > > > > > >
> > > > > > > > > ~ Bhupesh
> > > > > > > > >
> > > > > > > > > On Wed, Jul 20, 2016 at 3:06 AM, Priyanka Gugale <
> > > > > pri...@apache.org>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > Malhar JDBC operator does lots of string manipulation and
> > > > > > > > > > other handling to construct sql queries as per user inputs.
> > > > > > > > > > Instead of constructing queries on our own we should use some
> > > > > > > > > > dsl api 

Re: [Proposal] Support storing apps in a Configuration Package

2016-07-21 Thread Pradeep A. Dalvi
+1

On Thu, Jul 21, 2016 at 4:24 PM, Pramod Immaneni 
wrote:

> +1
>
> On Tue, Jul 19, 2016 at 5:37 PM, Sandesh Hegde 
> wrote:
>
> > Hi All,
> >
> > Apex supports configuration package, separates application package from
> the
> > actual configuration. (
> http://docs.datatorrent.com/configuration_packages/
> > )
> >
> > We want to enhance the configuration package by adding support to "add
> > Apps" (json format).
> >
> > UseCase: Multiple users sharing the same app package, but have a
> different
> > view of the golden copy of the app package.
> >
> > Note: This feature is requested by an Apex user.
> >
> > Thanks
> >
>


Re: Bleeding edge branch ?

2016-07-20 Thread Pradeep A. Dalvi
I agree with Sandesh on the following. The official branch from which releases
are cut shall continue taking EOL into consideration. However, we also need
to be prepared for future releases of Hadoop.

--prad

On Wed, Jul 20, 2016 at 10:43 AM, Sandesh Hegde 
wrote:

> @Amol
>
> EOL is important for the master branch. To start the work on the next version
> of Hadoop on a different branch (let us call that master++), we should not
> worry about the EOL. Eventually, master++ becomes master and will continue on
> the later version of Hadoop.
>
>
>
> On Wed, Jul 20, 2016 at 10:30 AM Siyuan Hua 
> wrote:
>
> > Ok, whether branches or forks. I still think we should have at least some
> > materialized version of malhar/core for the big influencers like Java,
> > Hadoop or even Kafka. Java 8, for example, is actually not new. We don't
> > have to be aggressive to try out new features from those right now. But we
> > can at least have some CI run build/test periodically and make sure our
> > current code is future-proof and avoid some future-deprecated code when we
> > add new features. Also, if people ask for it, we can have a link to point
> > them to. BTW, the High-level API can definitely benefit from Java 8. :)
> >
> > Regards,
> > Siyuan
> >
> > On Wed, Jul 20, 2016 at 8:30 AM, Sandesh Hegde 
> > wrote:
> >
> > > Our current model of supporting the oldest supported Hadoop penalizes the
> > > users of the latest Hadoop versions by favoring the slow movers.
> > > Also, we won't benefit from the increased maturity of the Hadoop platform,
> > > as we will be working on a many-years-old version of Hadoop.
> > > We also need to incentivize our customers to upgrade their Hadoop version,
> > > by making use of new features.
> > >
> > > My vote goes to start the work on the Hadoop 2.6 ( or any other
> version )
> > > in a different branch, without waiting for the EOL policies.
> > >
> > > On Tue, Jul 12, 2016 at 1:16 AM Thomas Weise 
> > > wrote:
> > >
> > > > -0
> > > >
> > > > I read the thread twice, it is not clear to me what benefit Apex users
> > > > derive from this exercise. A branch normally contains development work
> > > > that is eventually brought back to the main line and into a release.
> > > > Here, the suggestion seems to be an open-ended effort to play with the
> > > > latest tech; isn't that something anyone (including a group of folks)
> > > > can do in a fork? I don't see value in a permanent branch for that; who
> > > > is going to maintain such code and who will ever use it?
> > > >
> > > > There was a point that we can find out about potential problems with
> > > > later versions. The way to find such issues is to take the releases and
> > > > run them on these later versions (that's what users do), not by
> > > > changing the code!
> > > >
> > > > Regarding Java version: Our users don't use Apex in a vacuum. Please
> > > > have a look at ASF Hadoop and the distros' EOL policies. That will
> > > > answer the question what Java version is appropriate. I would be
> > > > surprised if something that works on Java 7 falls flat on its face with
> > > > Java 8, as a lot of diligence goes into backward compatibility. Again,
> > > > the way to test this is to run verification with existing Apex releases
> > > > on a Java 8 based stack.
> > > >
> > > > Regarding Hadoop version: This has been discussed off record several
> > > > times and there are actual JIRA tickets marked accordingly so that the
> > > > work is done when we move. It is a separate discussion, no need to mix
> > > > Java versions and branching with it. I agree with what David said: if
> > > > someone can show that we can move up to 2.6 based on EOL policies and
> > > > what known Apex users have in production, then we should work on that
> > > > upgrade. The way I imagine it would work is that we have a Hadoop-2.6
> > > > (or whatever version) branch, make all the upgrade-related changes
> > > > there (which should be a list of JIRAs) and then merge it back to
> > > > master when we are satisfied. After that, the branch can be deleted.
> > > >
> > > > Thomas
> > > >
> > > >
> > > >
> > > > On Tue, Jul 12, 2016 at 8:36 AM, Chinmay Kolhatkar <
> > > > chin...@datatorrent.com>
> > > > wrote:
> > > >
> > > > > I'm -0 on this idea.
> > > > >
> > > > > Here is the reason:
> > > > > Unless we see a real case where users want to see everything on the
> > > > > latest, this branch might quickly become low-hanging fruit and
> > > > > eventually get obsolete because it's anyway a "no guarantee" branch.
> > > > >
> > > > > We have a bunch of dependencies which we'll have to take care of to
> > > > > really make it bleeding edge. Especially in malhar, it's a long list.
> > > > > That looks like quite significant work.
> > > > > 

Re: Container & memory resource allocation

2016-07-20 Thread Pradeep A. Dalvi
I've verified 'multiples of minimum-allocation-mb' on the latest Apex. However,
'increment-allocation-mb' was not set during that exercise.
I shall check that param as well.

Thanks,
--prad

On Wednesday, July 20, 2016, Munagala Ramanath <r...@datatorrent.com> wrote:

> Please note that there are multiple sites making the claim that memory
> allocation is in multiples of *yarn.scheduler.minimum-allocation-mb*; this
> may have been true at one time but is no longer true (thanks to Sandesh for
> fact-checking this).
>
> There is a (?new?) parameter, *yarn.scheduler.increment-allocation-mb*,
> which serves this purpose as discussed here:
>
> http://blog.cloudera.com/blog/2013/11/migrating-to-mapreduce-2-on-yarn-for-operators/
>
> Ram
>
> On Tue, Jul 19, 2016 at 11:27 AM, Pradeep A. Dalvi <p...@apache.org> wrote:
>
> > Thanks Chinmay & Ram.
> >
> > Troubleshooting page sounds the appropriate location. I shall raise PR
> with
> > the given suggestions.
> >
> > --prad
> >
> > On Tue, Jul 19, 2016 at 5:49 AM, Munagala Ramanath <r...@datatorrent.com>
> > wrote:
> >
> > > There is already a link to a troubleshooting page at bottom of
> > > https://apex.apache.org/docs.html
> > > That page already has some discussion under the section entitled
> > > "Calculating Container Memory"
> > > so adding new content there seems like the right thing to do.
> > >
> > > Ram
> > >
> > > On Mon, Jul 18, 2016 at 11:27 PM, Chinmay Kolhatkar <
> > > chin...@datatorrent.com> wrote:
> > >
> > > > Hi Pradeep,
> > > >
> > > > This is a great content to add to the documents. These are the common
> > set
> > > > of errors which might get googled and hence great to get indexed as
> > well.
> > > >
> > > > You can take a look at:
> > > > https://github.com/apache/apex-core/tree/master/docs
> > > >
> > > > The docs for apex reside there in markdown format. Probably it's good
> > > > to create a troubleshooting page where all such common questions can
> > > > reside.
> > > >
> > > > After you have the content ready, you can create a pull request to
> > > > apex-core repo which can get merged to apex-core and later deployed
> to
> > > the
> > > > website by committers.
> > > >
> > > > -Chinmay.
> > > >
> > > >
> > > >
> > > >
> > > > On Tue, Jul 19, 2016 at 10:46 AM, Pradeep A. Dalvi <p...@apache.org>
> > > > wrote:
> > > >
> > > >> Container & memory resource allocation has been a common question
> > > >> around, and so I thought it would be good to explain the related
> > > >> configuration parameters.
> > > >>
> > > >> Please feel free to let me know your thoughts.
> > > >>
> > > >> Also, I'm planning to add the following set of information under Apex
> > > >> Docs. How could one add this to Apex Docs?
> > > >>
> > > >> =-=-=-=
> > > >>
> > > >> "Container is running beyond physical memory limits. Current usage: X GB
> > > >> of Y GB physical memory used; A GB of B GB virtual memory used. Killing
> > > >> container."
> > > >>
> > > >> This is basically for some better understanding of the Application
> > > >> Master's container requests & the Resource Manager's memory resource
> > > >> allocation. Please note that these are individual container request
> > > >> params. All these parameters are in MB, i.e. 1024 => 1 GB.
> > > >>
> > > >> - AM's container requests to RM shall contain memory in multiples of
> > > >> *yarn.scheduler.minimum-allocation-mb* & not exceeding
> > > >> *yarn.scheduler.maximum-allocation-mb*
> > > >>    - If *yarn.scheduler.minimum-allocation-mb* is configured as 1024 and
> > > >> the container memory requirement is 1025 ( <= 2048 ), the container will
> > > >> be allocated 2048 MB of memory.
> > > >>
> > > >> - With Apex applications, operator memory can be specified by the
> > > >> property
> > > >> *dt.application.<APP_NAME>.operator.<OPERATOR_NAME>.attr.MEMORY_MB*
> > > >>    - Please note this parameter is at the Operator level; container
> > > >> memory is calculated based on the number of Operators deployed in a
> > > >> container plus additional memory required depending on physical
> > > >> deployment requirements, e.g. unifier or bufferserver
> > > >>    - Wildcard * can be used at APP_NAME and/or OPERATOR_NAME
> > > >>
> > > >> - If container memory is not specified, then the AM would request 1 unit
> > > >> of *yarn.scheduler.minimum-allocation-mb*, and the RM would provision
> > > >> the container taking that into consideration.
> > > >>
> > > >> The Node Manager monitors the memory usage of each of these containers
> > > >> and kills the ones crossing the configured limit.
> > > >>
> > > >> Almost similar stuff is applicable for CPUs.
> > > >>
> > > >> --prad
> > > >>
> > > >
> > > >
> > >
> >
>
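A sketch of the rounding arithmetic described above (not YARN source code): a request is rounded up to the next multiple of the minimum/increment allocation and capped at the maximum:

public class YarnRoundingSketch
{
  static int allocatedMemoryMb(int requestedMb, int incrementMb, int maximumMb)
  {
    // round the request up to the next multiple of the increment
    int rounded = ((requestedMb + incrementMb - 1) / incrementMb) * incrementMb;
    return Math.min(rounded, maximumMb);
  }

  public static void main(String[] args)
  {
    // 1025 MB requested with a 1024 MB increment -> 2048 MB, as in the example above
    System.out.println(allocatedMemoryMb(1025, 1024, 8192));
  }
}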


Re: Container & memory resource allocation

2016-07-19 Thread Pradeep A. Dalvi
Thanks Chinmay & Ram.

Troubleshooting page sounds the appropriate location. I shall raise PR with
the given suggestions.

--prad

On Tue, Jul 19, 2016 at 5:49 AM, Munagala Ramanath <r...@datatorrent.com>
wrote:

> There is already a link to a troubleshooting page at bottom of
> https://apex.apache.org/docs.html
> That page already has some discussion under the section entitled
> "Calculating Container Memory"
> so adding new content there seems like the right thing to do.
>
> Ram
>
> On Mon, Jul 18, 2016 at 11:27 PM, Chinmay Kolhatkar <
> chin...@datatorrent.com
> > wrote:
>
> > Hi Pradeep,
> >
> > This is a great content to add to the documents. These are the common set
> > of errors which might get googled and hence great to get indexed as well.
> >
> > You can take a look at:
> > https://github.com/apache/apex-core/tree/master/docs
> >
> > The docs for apex reside there in markdown format. Probably it's good to
> > create a troubleshooting page where all such common questions can reside.
> >
> > After you have the content ready, you can create a pull request to
> > apex-core repo which can get merged to apex-core and later deployed to
> the
> > website by committers.
> >
> > -Chinmay.
> >
> >
> >
> >
> > On Tue, Jul 19, 2016 at 10:46 AM, Pradeep A. Dalvi <p...@apache.org>
> > wrote:
> >
> >> Container & memory resource allocation has been a common question around
> >> and so I thought it would be good to explain related configuration
> >> parameters.
> >>
> >> Please feel free to let me know your thoughts.
> >>
> >> Also, I'm planning to add the following set of information under Apex Docs.
> >> How could one add this to Apex Docs?
> >>
> >> =-=-=-=
> >>
> >> "Container is running beyond physical memory limits. Current usage: X GB
> >> of Y GB physical memory used; A GB of B GB virtual memory used. Killing
> >> container."
> >>
> >> This is basically for some better understanding of the Application Master's
> >> container requests & the Resource Manager's memory resource allocation.
> >> Please note that these are individual container request params. All these
> >> parameters are in MB, i.e. 1024 => 1 GB.
> >>
> >> - AM's container requests to RM shall contain memory in multiples of
> >> *yarn.scheduler.minimum-allocation-mb* & not exceeding
> >> *yarn.scheduler.maximum-allocation-mb*
> >>    - If *yarn.scheduler.minimum-allocation-mb* is configured as 1024 and
> >> the container memory requirement is 1025 ( <= 2048 ), the container will be
> >> allocated 2048 MB of memory.
> >>
> >> - With Apex applications, operator memory can be specified by the property
> >> *dt.application.<APP_NAME>.operator.<OPERATOR_NAME>.attr.MEMORY_MB*
> >>    - Please note this parameter is at the Operator level; container memory
> >> is calculated based on the number of Operators deployed in a container plus
> >> additional memory required depending on physical deployment requirements,
> >> e.g. unifier or bufferserver
> >>    - Wildcard * can be used at APP_NAME and/or OPERATOR_NAME
> >>
> >> - If container memory is not specified, then the AM would request 1 unit of
> >> *yarn.scheduler.minimum-allocation-mb*, and the RM would provision the
> >> container taking that into consideration.
> >>
> >> Node Manager monitors memory usage of each of these containers and kills
> >> the ones crossing the configured limit.
> >>
> >> Almost similar stuff is applicable for CPUs.
> >>
> >> --prad
> >>
> >
> >
>


[jira] [Updated] (APEXCORE-488) Issues in SSL communication with StrAM

2016-07-11 Thread Pradeep A. Dalvi (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXCORE-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep A. Dalvi updated APEXCORE-488:
--
Assignee: Pradeep A. Dalvi

> Issues in SSL communication with StrAM
> --
>
> Key: APEXCORE-488
> URL: https://issues.apache.org/jira/browse/APEXCORE-488
> Project: Apache Apex Core
>  Issue Type: Bug
>    Reporter: Pradeep A. Dalvi
>    Assignee: Pradeep A. Dalvi
>
> A couple of issues in SSL communication with StrAM to track application progress:
>  - trackingURL without a protocol scheme makes YARN pick up the default HTTP. 
> This happens even if yarn.http.policy is set to HTTPS_ONLY.
>  - StramAgent always assumes HTTP communication



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (APEXCORE-488) Issues in SSL communication with StrAM

2016-07-11 Thread Pradeep A. Dalvi (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371644#comment-15371644
 ] 

Pradeep A. Dalvi commented on APEXCORE-488:
---

I would like to fix these issues.

> Issues in SSL communication with StrAM
> --
>
> Key: APEXCORE-488
> URL: https://issues.apache.org/jira/browse/APEXCORE-488
> Project: Apache Apex Core
>  Issue Type: Bug
>    Reporter: Pradeep A. Dalvi
>
> A couple of issues in SSL communication with StrAM to track application progress:
>  - trackingURL without a protocol scheme makes YARN pick up the default HTTP. 
> This happens even if yarn.http.policy is set to HTTPS_ONLY.
>  - StramAgent always assumes HTTP communication



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [ANNOUNCE] New Apache Apex PMC Member: Siyuan Hua

2016-06-16 Thread Pradeep A. Dalvi
Congrats Siyuan!

-prad

On Thursday, June 16, 2016, Dongming Liang  wrote:

> Congrats, Siyuan!
>
> Thanks,
> - Dongming
>
> Dongming LIANG
> 
> dongming.li...@gmail.com 
>
> On Thu, Jun 16, 2016 at 12:19 AM, Shubham Pathak  >
> wrote:
>
> > Congrats Siyuan !
> >
> > Thanks,
> > Shubham
> >
> > On Thu, Jun 16, 2016 at 11:17 AM, Ashish Tadose  >
> > wrote:
> >
> > > Congratulations Siyuan.
> > >
> > > Ashish
> > >
> > > > On 16-Jun-2016, at 10:49 AM, Aniruddha Thombare <
> > > anirud...@datatorrent.com > wrote:
> > > >
> > > > Congratulations!!!
> > > >
> > > > Thanks,
> > > >
> > > > A
> > > >
> > > > _
> > > > Sent with difficulty, I mean handheld ;)
> > > > On 16 Jun 2016 10:47 am, "Devendra Tagare" <
> devend...@datatorrent.com >
> > > > wrote:
> > > >
> > > >> Congratulations Siyuan
> > > >>
> > > >> Cheers,
> > > >> Dev
> > > >> On Jun 15, 2016 10:13 PM, "Chinmay Kolhatkar" <
> > chin...@datatorrent.com >
> > > >> wrote:
> > > >>
> > > >>> Congrats Siyuan :)
> > > >>>
> > > >>> On Wed, Jun 15, 2016 at 10:05 PM, Priyanka Gugale <
> pri...@apache.org 
> > >
> > > >>> wrote:
> > > >>>
> > >  Congrats Siyuan :)
> > > 
> > >  -Priyanka
> > > 
> > >  On Thu, Jun 16, 2016 at 10:19 AM, Pradeep Kumbhar <
> > > >>> prad...@datatorrent.com 
> > > >
> > >  wrote:
> > > 
> > > > Congratulations Siyuan!!
> > > >
> > > > On Thu, Jun 16, 2016 at 10:17 AM, Teddy Rusli <
> > te...@datatorrent.com 
> > > >>>
> > > > wrote:
> > > >
> > > >> Congrats Siyuan!
> > > >>
> > > >> On Wed, Jun 15, 2016 at 9:28 PM, Ashwin Chandra Putta <
> > > >> ashwinchand...@gmail.com > wrote:
> > > >>
> > > >>> Congratulations Siyuan!!
> > > >>> On Jun 15, 2016 9:26 PM, "Thomas Weise" <tho...@datatorrent.com> wrote:
> > > >>>
> > > >>>> The Apache Apex PMC is pleased to announce that Siyuan Hua is now a
> > > >>>> PMC member. We appreciate all his contributions to the project so
> > > >>>> far, and are looking forward to more.
> > > >>>>
> > > >>>> Welcome, Siyuan, and congratulations!
> > > >>>> Thomas, for the Apache Apex PMC.
> > > >>>
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >> Regards,
> > > >>
> > > >> Teddy Rusli
> > > >>
> > > >
> > > >
> > > >
> > > > --
> > > > *regards,*
> > > > *~pradeep*
> > > >
> > > 
> > > >>>
> > > >>
> > >
> > >
> >
>


[jira] [Assigned] (APEXMALHAR-2082) Data Filter Operator

2016-05-24 Thread Pradeep A. Dalvi (JIRA)

 [ 
https://issues.apache.org/jira/browse/APEXMALHAR-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pradeep A. Dalvi reassigned APEXMALHAR-2082:


Assignee: Pradeep A. Dalvi

> Data Filter Operator 
> -
>
> Key: APEXMALHAR-2082
> URL: https://issues.apache.org/jira/browse/APEXMALHAR-2082
> Project: Apache Apex Malhar
>  Issue Type: New Feature
>    Reporter: Pradeep A. Dalvi
>    Assignee: Pradeep A. Dalvi
>
> A Data Filter Operator which will allow Apex users to filter (select/drop) 
> tuples from an incoming stream based on a certain condition.
> Use case:
> -
> In many cases, not all tuples are of interest to the downstream operators. 
> In such cases, one may want to select/filter out tuples to downstream. Also, one 
> may want to process tuples which did not meet the condition/expression.
> Functionality:
> -
> 1. Any tuple for which the expression could not be evaluated shall be emitted on 
> the error output port.
> 2. The Filter operator shall receive a POJO as the input tuple and emit a POJO on 
> either of the remaining output ports, i.e. truePort and falsePort. As the output 
> ports' names signify, when the condition is met the POJO shall be emitted on the 
> truePort, and if the condition is not met then that POJO shall be emitted on the 
> falsePort.
> 3. The operator needs a condition/expression as an input param. This condition is 
> based on the expression language support we already have in Malhar.
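A rough sketch of the described port layout using the Apex operator API (FilterOperator is a hypothetical name and the expression evaluation is elided; Malhar's expression-language support would slot in at evaluate()):

import com.datatorrent.api.DefaultInputPort;
import com.datatorrent.api.DefaultOutputPort;
import com.datatorrent.common.util.BaseOperator;

public class FilterOperator extends BaseOperator
{
  public final transient DefaultOutputPort<Object> truePort = new DefaultOutputPort<>();
  public final transient DefaultOutputPort<Object> falsePort = new DefaultOutputPort<>();
  public final transient DefaultOutputPort<Object> error = new DefaultOutputPort<>();

  // condition/expression over the incoming POJO's fields
  private String condition;

  public final transient DefaultInputPort<Object> input = new DefaultInputPort<Object>()
  {
    @Override
    public void process(Object tuple)
    {
      try {
        if (evaluate(tuple)) {
          truePort.emit(tuple);        // condition met
        } else {
          falsePort.emit(tuple);       // condition not met
        }
      } catch (RuntimeException e) {
        error.emit(tuple);             // expression could not be evaluated
      }
    }
  };

  private boolean evaluate(Object tuple)
  {
    // placeholder: evaluate 'condition' against the POJO here
    return true;
  }

  public void setCondition(String condition) { this.condition = condition; }
  public String getCondition() { return condition; }
}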



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)