Re: Core release 3.7.0

2018-04-09 Thread Sandesh Hegde
+1

On Mon, Apr 9, 2018 at 11:31 AM Pramod Immaneni 
wrote:

> Also, since we will be changing min jdk version to 8 for next release, I am
> fine with thomas's suggestion to change next version to 4.0. If there are
> no objections I will update the version on master.
>
> Thanks
>
> On Mon, Apr 9, 2018 at 11:19 AM, Pramod Immaneni 
> wrote:
>
> > Can this be moved to the next release as it is not done
> >
> > https://issues.apache.org/jira/browse/APEXCORE-755
> >
> > On Mon, Apr 9, 2018 at 8:32 AM, Pramod Immaneni 
> > wrote:
> >
> >> The branch release-3.7 has been created and master changed to
> >> 3.8.0-SNAPSHOT.
> >>
> >> On Fri, Apr 6, 2018 at 1:03 PM, Pramod Immaneni  >
> >> wrote:
> >>
> >>> Now that this has been merged, I will cut the release branch over the
> >>> weekend.
> >>>
> >>> Thanks
> >>>
> >>> On Fri, Mar 23, 2018 at 12:49 PM, Thomas Weise  wrote:
> >>>
>  https://github.com/apache/apex-core/pull/583
> 
> 
>  On Thu, Mar 22, 2018 at 1:47 PM, Pramod Immaneni <
>  pra...@datatorrent.com>
>  wrote:
> 
>  > Is there anything folks would want to get merged before the release
>  process
>  > starts. Remember this would be the last release supporting Java 7.
>  >
>  > Thanks
>  >
>  > On Sat, Feb 24, 2018 at 5:16 AM, Pramod Immaneni <
>  pra...@datatorrent.com>
>  > wrote:
>  >
>  > > ok
>  > >
>  > > On Mon, Feb 19, 2018 at 1:30 PM, Thomas Weise 
>  wrote:
>  > >
>  > >> I think that should be done after the release, along with 4.x
>  version
>  > >> change.
>  > >>
>  > >>
>  > >> On Mon, Feb 19, 2018 at 10:06 AM, Pramod Immaneni <
>  > pra...@datatorrent.com
>  > >> >
>  > >> wrote:
>  > >>
>  > >> > Shall we move to Java 8 with this release
>  > >> >
>  > >> > On Wed, Jan 31, 2018 at 9:06 AM, Pramod Immaneni <
>  > >> pra...@datatorrent.com>
>  > >> > wrote:
>  > >> >
>  > >> > > I can do it if no one is volunteering.
>  > >> > >
>  > >> > > Thanks
>  > >> > >
>  > >> > > > On Jan 31, 2018, at 7:59 AM, Thomas Weise 
>  wrote:
>  > >> > > >
>  > >> > > > We are still looking for a volunteer to run the release..
>  > >> > > >
>  > >> > > >
>  > >> > > >> On Fri, Jan 19, 2018 at 8:32 AM, Vlad Rozov <
>  vro...@apache.org>
>  > >> > wrote:
>  > >> > > >>
>  > >> > > >> +1. Should be the last release to support Java 7.
>  > >> > > >>
>  > >> > > >> Thank you,
>  > >> > > >>
>  > >> > > >> Vlad
>  > >> > > >>
>  > >> > > >>
>  > >> > > >>> On 1/18/18 18:45, Bhupesh Chawda wrote:
>  > >> > > >>>
>  > >> > > >>> +1 for release
>  > >> > > >>>
>  > >> > > >>> ~ Bhupesh
>  > >> > > >>>
>  > >> > > >>> On Jan 19, 2018 12:14 AM, "Ananth G" <
>  ananthg.a...@gmail.com>
>  > >> wrote:
>  > >> > > >>>
>  > >> > > >>> +1 for the release.
>  > >> > > 
>  > >> > >  Regards,
>  > >> > >  Ananth
>  > >> > > 
>  > >> > >  On Fri, Jan 19, 2018 at 4:41 AM, Chinmay Kolhatkar <
>  > >> > > chin...@apache.org>
>  > >> > >  wrote:
>  > >> > > 
>  > >> > >  +1 for the release.
>  > >> > > >
>  > >> > > > - Chinmay.
>  > >> > > >
>  > >> > > > On 18 Jan 2018 8:58 pm, "Tushar Gosavi" <
>  > tus...@datatorrent.com
>  > >> >
>  > >> > > wrote:
>  > >> > > >
>  > >> > > > +1 for the release
>  > >> > > >>
>  > >> > > >> Regards,
>  > >> > > >> - Tushar.
>  > >> > > >>
>  > >> > > >>
>  > >> > > >> On Thu, Jan 18, 2018 at 3:28 AM, Amol Kekre <
>  > >> a...@datatorrent.com
>  > >> > >
>  > >> > > >>
>  > >> > > > wrote:
>  > >> > > >
>  > >> > > >> +1
>  > >> > > >>>
>  > >> > > >>> Thks,
>  > >> > > >>> Amol
>  > >> > > >>>
>  > >> > > >>>
>  > >> > > >>> E:a...@datatorrent.com | M: 510-449-2606
> <(510)%20449-2606> | Twitter:
>  > >> > @*amolhkekre*
>  > >> > > >>>
>  > >> > > >>> www.datatorrent.com
>  > >> > > >>>
>  > >> > > >>>
>  > >> > > >>> On Wed, Jan 17, 2018 at 1:55 PM, Pramod Immaneni <
>  > >> > > >>>
>  > >> > > >> pra...@datatorrent.com
>  > >> > > >
>  > >> > > >> wrote:
>  > >> > > >>>
>  > >> > > >>> +1
>  > >> > > 
>  > >> > >  On Jan 17, 2018, at 7:25 AM, Thomas Weise <
>  t...@apache.org>
>  > >> > > >
>  > >> > >  wrote:
>  > >> > > 
>  > >> > > > Last release was 3.6.0 in May and following issues are
>  ready
>  > for
>  > >> > > >
>  > >> > 

Re: [RESULT] [VOTE] Major version change for Apex Library (Malhar)

2017-09-01 Thread Sandesh Hegde
Using all the technicalities and loopholes, we can declare many votes
invalid. What purpose does that solve? This thread is dividing the
community; instead of recognizing the differences, if we move forward with
this there is a chance that Apex will alienate many contributors. What's
the end game here? At what cost?

On Fri, Sep 1, 2017 at 9:31 AM Thomas Weise  wrote:

> Yes, you would need a separate discussion/vote on changes not being
> reflected in master that you make to a branch (current procedure).
>
> Regarding procedural vote, the decision to start development towards new
> major release is a longer term decision, not just code change.
>
> https://www.apache.org/foundation/glossary.html#MajorityApproval
>
> "Refers to a vote (sense 1) which has completed with at least three binding
> +1 votes and more +1 votes than -1 votes. ( I.e. , a simple majority with a
> minimum quorum of three positive votes.) Note that in votes requiring
> majority approval a -1 vote is simply a vote against, not a veto. Compare
> Consensus Approval. See also the description of the voting process."
>
>
> For code modifications the rules are different, -1 is a veto that needs to
> have a valid technical reason why the change cannot be made. Otherwise it
> is void. None of the -1s in the vote result provide such justification.
>
> Thanks,
> Thomas
>
>
>
> On Thu, Aug 31, 2017 at 10:06 PM, Pramod Immaneni 
> wrote:
>
> > Thomas,
> >
> > Wouldn't you need to call a separate procedural vote for whether changes
> > cannot be allowed into 3.x without requiring they be submitted to 4.x as
> > there was a disagreement there? Also, I am not sure that the procedural
> > vote argument can be used here for 4.x given that it involves
> modifications
> > to existing code. I would say we should drive towards getting a consensus
> > by addressing the concerns folks have about 4.x.
> >
> > On Thu, Aug 31, 2017 at 8:24 PM, Thomas Weise  wrote:
> >
> > > There wasn't any more discussion on this, so here is the result:
> > >
> > > 1. Version 4.0 as major version change from 3.x
> > > 
> > >
> > > +1 (7)
> > >
> > > Thomas Weise (PMC)
> > > Ananth G
> > > Vlad Rozov (PMC)
> > > Munagala Ramanath (committer)
> > > Pramod Immaneni (PMC)
> > > Sanjay Pujare
> > > David Yan (PMC)
> > >
> > > -1 (3)
> > >
> > > Amol Kekre (PMC)
> > > Sergey Golovko
> > > Ashwin Chandra Putta (committer)
> > >
> > >
> > > 2. Version 1.0 with simultaneous change of Maven artifact IDs
> > > ===
> > >
> > > +1 (5)
> > >
> > > Thomas Weise (PMC)
> > > Ananth G
> > > Vlad Rozov (PMC)
> > > Munagala Ramanath (committer)
> > > David Yan (PMC)
> > >
> > > -1 (5)
> > >
> > > Pramod Immaneni (PMC)
> > > Sanjay Pujare
> > > Amol Kekre (PMC)
> > > Sergey Golovko
> > > Ashwin Chandra Putta (committer)
> > >
> > >
> > > RESULT
> > > ===
> > >
> > > Vote for option 1 (major version 4.x) *passes* with majority rule [1].
> > >
> > > Thanks,
> > > Thomas
> > >
> > >
> > > [1] https://www.apache.org/foundation/voting.html
> > >
> > >
> > > On Mon, Aug 21, 2017 at 6:39 PM, Thomas Weise  wrote:
> > >
> > > > This is to formalize the major version change for Malhar discussed in
> > > [1].
> > > >
> > > > There are two options for major version change. Major version change
> > will
> > > > rename legacy packages to org.apache.apex sub packages while
> retaining
> > > file
> > > > history in git. Other cleanup such as removing deprecated code is
> also
> > > > expected.
> > > >
> > > > 1. Version 4.0 as major version change from 3.x
> > > >
> > > > 2. Version 1.0 with simultaneous change of Maven artifact IDs
> > > >
> > > > Please refer to the discussion thread [1] for reasoning behind both
> of
> > > the
> > > > options.
> > > >
> > > > Please vote on both options. Primary vote for your preferred option,
> > > > secondary for the other. Secondary vote can be used when counting
> > primary
> > > > vote alone isn't conclusive.
> > > >
> > > > Vote will be open for at least 72 hours.
> > > >
> > > > Thanks,
> > > > Thomas
> > > >
> > > > [1] https://lists.apache.org/thread.html/
> > bd1db8a2d01e23b0c0ab98a785f6ee
> > > > 9492a1ac9e52d422568a46e5f3@%3Cdev.apex.apache.org%3E
> > > >
> > >
> >
>


Re: -1 or veto voting

2017-08-24 Thread Sandesh Hegde
Today I saw the -1 below by Thomas on
https://github.com/apache/apex-malhar/pull/666, without technical
justification.

Saumya, the PR author, has created a mail thread to discuss the
justification, but there were no comments in that thread.

So should we consider this an invalid -1?

On Thu, Aug 24, 2017 at 10:08 AM Vlad Rozov  wrote:

> For -1 to be valid there *must* be *technical* justification(s) not to
> proceed with the code change. Without such justification -1 is
> considered to be void/invalid [1].
>
> I don't see any possible *technical* justification not to proceed with
> the package rename as it was done in the past by a large number of
> Apache (and not only Apache) projects  and nothing bad happened (no
> performance degradation, no introduction of security vulnerability) and
> projects remained usable by their users. With the current IDEs, it is a
> question of 5 minutes to complete necessary modifications.
>
> Both Apache Felix and Apache Groovy (as well as Apache Apex) are split
> package projects. There is mix and match of org.apache.* and other
> package names (org.osgi, groovy, com.datatorrent). IMO, this is a bad
> practice and I don't think that the Apex community should use those projects
> as best practice examples. The majority of Apache projects consistently
> use the org.apache package and IMO that simplifies the user and community
> experience.
>
> The majority of Malhar library classes are excluded from the semantic
> versioning check and are not subject to the backward compatibility/stable
> API guarantee. Due to that, there will never be a good reason to change the
> major version, as backward incompatible changes are introduced silently and
> without proper semantic versioning.
>
> Thank you,
>
> Vlad
>
> [1] https://www.apache.org/foundation/voting.html
>
> On 8/23/17 15:17, Sergey Golovko wrote:
> > -1 for the option 2
> >
> > I don't think it makes sense to rush to rename the package name. There
> are
> > Apache Java projects that use the original package names after migration
> to
> > Apache Software Foundation. For instance,
> >
> > Apache Felix  (org.osgi)
> > Apache Groovy  (groovy)
> >
> > Personally I don't like the idea to rename package names for any existing
> > tools and applications. It can just be a big confusion for users without
> > any real benefits.
> >
> > -1 for the option 1
> >
> > I see only one valid reason to change the major version now. It is the
> full
> > refactoring of the code without supporting of any backward compatibility.
> > If we are going to make the package refactoring we need to change the
> major
> > version. If we are not going to do it now, it does not make sense to
> > change the major version. I don't think it makes sense to vote for the
> two
> > options separately.
> >
> > Thanks,
> > Sergey
> >
> >
> > On Wed, Aug 23, 2017 at 6:39 AM, Thomas Weise  wrote:
> >
> >> So far everyone else has voted +1 on option 1. Your -1 is not a veto
> >> (unlike your previous -1 on a pull request), but your response also
> states
> >> "I am for option 1" and that you want to have the branch release-3
> >> included. So why don't you include that into your vote for option 1 as a
> >> condition, since that's what is going to happen anyways.
> >>
> >> Thomas
> >>
> >>
> >> On Tue, Aug 22, 2017 at 6:17 PM, Amol Kekre 
> wrote:
> >>
> >>> On just voting part, I remain -1 on both options
> >>>
> >>> Thks
> >>> Amol
> >>>
> >>>
> >>>
> >> On Tue, Aug 22, 2017 at 10:03 AM, Amol Kekre 
> wrote:
> >>
> >>> I am -1 on option 2. There is no need to do so, as going back on
> versions
> >>> at this stage has consequences to Apex users.
> >>>
> >>> I am for option 1, but I want to propose explicit change to the text.
> >> Based
> >>> on verbatim text, I am voting -1 on option 1. I believe in the original
> >>> discussion thread there was talk about continuing release-3 that should
> >> be
> >>> explicit in the vote.
> >>>
> >>>
>
>
> Thank you,
>
> Vlad
>


Re: Request to close/update progress of JIRA's.

2017-08-07 Thread Sandesh Hegde
Ambarish, those operators still need some work to make them user friendly.

GenericRecord is not Kryo serializable, so AvroReader and the GenericToPojo
converter need to be CONTAINER/THREAD local.
This combo of two operators is best suited for the creation of an Avro Module.
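The constraint above can be pictured with a small self-contained model. In Apex the real mechanism is setting the stream's locality in the DAG; the enum and check below are illustrative plain Java, not the Apex `DAG` API. The point: a tuple type that cannot be serialized is only safe on a stream whose producer and consumer share a JVM.

```java
import java.util.EnumSet;

public class LocalityCheck {
    enum Locality { THREAD_LOCAL, CONTAINER_LOCAL, NODE_LOCAL, REMOTE }

    // Localities that keep producer and consumer in the same JVM, so tuples
    // are passed by reference and never need to be serialized.
    static final EnumSet<Locality> SAME_JVM =
        EnumSet.of(Locality.THREAD_LOCAL, Locality.CONTAINER_LOCAL);

    static boolean isValidStream(boolean tupleSerializable, Locality locality) {
        // A stream is deployable if the tuple can be serialized, or if the
        // locality guarantees no serialization is ever needed.
        return tupleSerializable || SAME_JVM.contains(locality);
    }

    public static void main(String[] args) {
        // A GenericRecord-like tuple: not Kryo serializable.
        System.out.println(isValidStream(false, Locality.CONTAINER_LOCAL)); // true
        System.out.println(isValidStream(false, Locality.REMOTE));          // false
    }
}
```

This is why the AvroReader-to-converter stream has to be THREAD_LOCAL or CONTAINER_LOCAL, and why packaging the pair as a module that fixes the locality is attractive.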

On Mon, Aug 7, 2017 at 7:58 AM Vlad Rozov  wrote:

> Is it a duplicate of APEXMALHAR-2011 or APEXMALHAR-2012?
>
> Thank you,
>
> Vlad
>
> On 8/6/17 23:41, Ambarish Pande wrote:
> > Hello,
> >
> > Yes, I was looking at unassigned JIRA's. But I found out these jira's are
> > already implemented in malhar. e.g. APEXMALHAR-2034
> > . I did not see
> any
> > work log or history.
> >
> > Thank You.
> >
> > On Mon, Aug 7, 2017 at 11:43 AM, Thomas Weise  wrote:
> >
> >> http://apex.apache.org/contributing.html#jira
> >>
> >> JIRAs should not be assigned unless they are actively being worked on.
> >>
> >> Look for JIRAs that are not assigned. If you want to work on one that is
> >> already assigned and believe it is not active, ask on that JIRA if it
> is OK
> >> to reassign it.
> >>
> >> Thomas
> >>
> >>
> >> On Sun, Aug 6, 2017 at 10:38 PM, Ambarish Pande <
> ambar...@datatorrent.com>
> >> wrote:
> >>
> >>> Hello Devs,
> >>>
> >>> As a new developer I am trying to find JIRA's to work on. And I see
> many
> >>> JIRA's that already are completed or being worked on as OPEN. I request
> >>> devs to mark progress on their JIRA's so that it is easy for new
> >>> contributors to choose JIRA's to work on.
> >>>
> >>> Thank You.
> >>>
> >>> --
> >>> *Thanks & Regards,*
> >>>
> >>> **
> >>> *Ambarish Pande*
> >>>
> >>> Associate Software Engineer
> >>>
> >>> E: ambar...@datatorrent.com  |
> >> M:+91-9028293982 <+91%2090282%2093982>
> >>> www.datatorrent.com  |  apex.apache.org
> >>>
> >
> >
>
>


Re: Java packages: legacy -> org.apache.apex

2017-07-15 Thread Sandesh Hegde
Why is there an urgency? Why can't this go into Malhar 4.0 with possibly
other breaking changes?
On Sat, Jul 15, 2017 at 7:57 AM Thomas Weise  wrote:

> Discussing what in the future might become stable needs to be a separate
> thread, it will be a much bigger discussion.
>
> The topic here is to relocate the packages. With a few exceptions
> relocation won't affect the semantic versioning. Semantic versioning is
> essentially not effective for Malhar because almost everything is @Evolving
> (and there are reasons for that.. -> separate topic)
>
> I don't really like the idea of creating bw compatibility stubs for the
> follow up PR. It creates even more clutter in the source tree than there
> already is and so here is an alternative suggestion:
>
>
> https://github.com/tweise/apex-malhar/blob/malhar37-compat/shaded-malhar37/pom.xml
>
> Create a shaded artifact that provides the old com.datatorrent.* classes as
> of release 3.7. Users can include that artifact if they don't want to
> change import statements. At the same time they have an incentive to switch
> to the relocated classes to take advantage of bug fixes and new
> functionality.
>
> I will work on the first PR that does the relocate. In the meantime, we can
> finalize what backward compatibility support we want to provide and how.
>
> Thanks,
> Thomas
>
>
>
>
> On Fri, Jul 14, 2017 at 11:33 AM, Pramod Immaneni 
> wrote:
>
> > How about coming up with a list of @Evolving operators that we would like
> > to see in the eventual stable list and move those along with the not
> > @Evolving ones in org.apache.apex with b/w stubs and leave the rest as
> they
> > are. Then have a follow up JIRA for the rest to be moved over to contrib
> > and be deprecated.
> >
> > Thanks
> >
> > On Fri, Jul 14, 2017 at 10:37 AM, Thomas Weise 
> > wrote:
> >
> > > We need to keep the discussion here on topic, if other things are piled
> > on
> > > top then nothing gets done.
> > >
> > > Most operators today are not stable, they are @Evolving. So perhaps it
> is
> > > necessary to have a separate discussion about that, outside of this
> > thread.
> > > My guess is that there are only a few operators that we could declare
> > > stable. Specifically, under contrib the closest one might have been
> > Kafka,
> > > but that is already superseded by the newer versions.
> > >
> > > Thomas
> > >
> > >
> > > On Fri, Jul 14, 2017 at 10:21 AM, Pramod Immaneni <
> > pra...@datatorrent.com>
> > > wrote:
> > >
> > > > We had a discussion a while back, agreed to relegate non-stable and
> > > > experimental operators to contrib and also added this to contribution
> > > > guidelines. We executed on this and cleaned up the repo by moving
> > > > operators deemed non-suitable for production use at that time to
> > contrib.
> > > > So I wouldn't say the operators that are at the top level today or
> the
> > > ones
> > > > in library are 0.x.x quality. Granted, we may need to do one more
> pass
> > as
> > > > some of the operators may no longer be considered the right
> > > implementations
> > > > with the advent of the windowed operator.
> > > >
> > > > Since contrib used to be the place that housed operators that
> required
> > > > third party libraries, there are some production quality operators in
> > > there
> > > > that need to be pulled out into top level modules.
> > > >
> > > > I think we are in agreement that for operators that we consider
> stable,
> > > we
> > > > should provide a b/w stub. I would add that we strongly consider
> > creating
> > > > the org.apache.apex counterparts of any stable operators that are in
> > > > contrib out in top level modules and have the com.datatorrent stubs
> in
> > > > contrib.
> > > >
> > > > For the operators not considered stable, I would prefer we either
> leave
> > > the
> > > > current package path as is or if we change the package path then
> create
> > > the
> > > > b/w stub as I am not sure how widely they are in use (so, in essence,
> > > > preserve semantic versioning). It would be good if there is a
> followup
> > > JIRA
> > > > that takes a look at what other operators can be moved to contrib
> with
> > > the
> > > > advent of the newer frameworks and understanding.
> > > >
> > > > Thanks
> > > >
> > > > On Fri, Jul 14, 2017 at 9:44 AM, Thomas Weise 
> wrote:
> > > >
> > > > > Most of the operators evolve, as is quite clearly indicated by
> > > @Evolving
> > > > > annotations. So while perhaps 0.x.x would be a more appropriate
> > version
> > > > > number, I don't think you can change that.
> > > > >
> > > > > Thomas
> > > > >
> > > > > On Fri, Jul 14, 2017 at 9:35 AM, Vlad Rozov <
> v.ro...@datatorrent.com
> > >
> > > > > wrote:
> > > > >
> > > > > > If entire library is not stable, its version should be 0.x.x and
> > > there
> > > > > > should not be any semantic versioning enabled or implied. It
> > evolves.
> > > > If
> > > > > > some operators are 
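The "b/w stub" idea discussed in this thread can be sketched in one self-contained file. The class names are hypothetical, and the two nested classes stand in for the two real packages (`com.datatorrent.*` old home, `org.apache.apex.*` new home), which would of course be separate top-level classes in practice.

```java
public class CompatStubDemo {
    // New home: the relocated class under org.apache.apex.* gets all fixes
    // and new functionality going forward.
    static class RelocatedOperator {
        String describe() { return "relocated"; }
    }

    // Old home: an empty, deprecated subclass keeps the legacy
    // com.datatorrent.* name compiling until applications migrate imports.
    @Deprecated
    static class LegacyOperator extends RelocatedOperator {
    }

    public static void main(String[] args) {
        // Existing code built against the old name transparently uses the
        // relocated implementation.
        System.out.println(new LegacyOperator().describe()); // relocated
    }
}
```

The shaded-artifact alternative Thomas proposes achieves the same compatibility without these stubs cluttering the source tree: the old names live only in a separately built compatibility jar.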

Re: anyone else seeing a 404

2017-07-13 Thread Sandesh Hegde
+1

On Thu, Jul 13, 2017 at 2:00 PM Amol Kekre  wrote:

> me too
>
> Thks
> Amol
>
>
> E:a...@datatorrent.com | M: 510-449-2606 <(510)%20449-2606> | Twitter:
> @*amolhkekre*
>
> www.datatorrent.com
>
>
> On Thu, Jul 13, 2017 at 1:31 PM, Pramod Immaneni 
> wrote:
>
> > https://git-wip-us.apache.org/repos/asf?p=apex-core.git
> >
>


Re: impersonation and application path

2017-05-18 Thread Sandesh Hegde
My vote is to make the new proposal the default behavior. Is there a use
case for the current behavior? If not, then there is no need to add the
configuration setting.

On Thu, May 18, 2017 at 3:47 PM Pramod Immaneni 
wrote:

> Sorry typo in sentence "as we are not asking for permissions for a lower
> privilege", please read as "as we are now asking for permissions for a
> lower privilege".
>
> On Thu, May 18, 2017 at 3:44 PM, Pramod Immaneni 
> wrote:
>
> > Apex cli supports impersonation in secure mode. With impersonation, the
> > user running the cli or the user authenticating with hadoop (henceforth
> > referred to as login user) can be different from the effective user with
> > which the actions are performed under hadoop. An example for this is an
> > application can be launched by user A to run in hadoop as user B. This is
> > kind of like the sudo functionality in unix. You can find more details
> > about the functionalilty here
> https://apex.apache.org/docs/apex/security/ in
> > the Impersonation section.
> >
> > What happens today with launching an application with impersonation,
> using
> > the above launch example, is that even though the application runs as
> user
> > B it still uses user A's hdfs path for the application path. The
> > application path is where the artifacts necessary to run the application
> > are stored and where the runtime files like checkpoints are stored. This
> > means that user B needs to have read and write access to user A's
> > application path folders.
> >
> > This may not be allowed in certain environments as it may be a policy
> > violation for the following reason. Because user A is able to impersonate
> > as user B to launch the application, A is considered to be a higher
> > privileged user than B and is given necessary privileges in hadoop to do
> > so. But after launch B needs to access folders belonging to A which could
> > constitute a violation as we are not asking for permissions for a lower
> > privilege user to access resources of a higher privilege user.
> >
> > I would like to propose adding a configuration setting, which when set
> > will use the application path in the impersonated user's home directory
> > (user B) as opposed to impersonating user's home directory (user A). If
> > this setting is not specified then the behavior can default to what it is
> > today for backwards compatibility.
> >
> > Comments, suggestions, concerns?
> >
> > Thanks
> >
>
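The path selection Pramod proposes above could look roughly like the sketch below. The property name `apex.application.path.impersonated` and the `/user/<name>/datatorrent` layout are hypothetical, for illustration only; the real implementation would resolve HDFS home directories through the Hadoop APIs.

```java
import java.util.Properties;

public class AppPathResolver {
    // loginUser = the authenticated user launching the app (user A);
    // effectiveUser = the user the app runs as under impersonation (user B).
    static String resolveAppBasePath(Properties conf, String loginUser, String effectiveUser) {
        // Hypothetical property name, not an actual Apex configuration key.
        boolean useEffectiveUserHome = Boolean.parseBoolean(
            conf.getProperty("apex.application.path.impersonated", "false"));
        // Default (today's behavior): artifacts live under the login user's home,
        // which forces user B to read/write user A's folders.
        String owner = useEffectiveUserHome ? effectiveUser : loginUser;
        return "/user/" + owner + "/datatorrent";
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(resolveAppBasePath(conf, "userA", "userB")); // /user/userA/datatorrent
        conf.setProperty("apex.application.path.impersonated", "true");
        System.out.println(resolveAppBasePath(conf, "userA", "userB")); // /user/userB/datatorrent
    }
}
```

Defaulting the flag to false preserves backward compatibility, exactly as the proposal requires.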


Re: NO_LOCAL_WRITE Error from Stram

2017-03-03 Thread Sandesh Hegde
Please check the Hadoop dependency version in your POM. Also, we need to
move these discussions to users@.

On Fri, Mar 3, 2017 at 2:14 PM Ganelin, Ilya 
wrote:

> Minor amendment: hadoop-2.6.0+cdh5.8.0+1592 (2.6 vs 2.7)
>
>
>
>
>
> - Ilya Ganelin
>
> [image: id:image001.png@01D1F7A4.F3D42980]
>
>
>
> *From: *"Ganelin, Ilya" 
> *Reply-To: *"dev@apex.apache.org" 
> *Date: *Friday, March 3, 2017 at 1:59 PM
> *To: *"dev@apex.apache.org" 
> *Subject: *NO_LOCAL_WRITE Error from Stram
>
>
>
> Hello, all – new error cropping up at Application start and Google does
> not offer any helpful guidance.
>
>
>
> 2017-03-03 16:49:02,558 INFO
> com.datatorrent.common.util.AsyncFSStorageAgent: using
> /opt/cloudera/hadoop/1/yarn/nm/usercache/XX/appcache/application_1483979920683_0448/container_e97905_1483979920683_0448_01_01/tmp/chkp3008940656673217935
> as the basepath for checkpointing.
>
>
>
> 2017-03-03 16:49:02,739 ERROR com.datatorrent.stram.StreamingAppMaster:
> Exiting Application Master
>
> 2017-03-03 16:49:02,739 ERROR com.datatorrent.stram.StreamingAppMaster:
> Exiting Application Master
>
> java.lang.NoSuchFieldError: NO_LOCAL_WRITE
>
> at
> org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1909)
>
> at
> org.apache.hadoop.hdfs.DFSOutputStream.<init>(DFSOutputStream.java:1938)
>
> at
> org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1996)
>
> at
> org.apache.hadoop.hdfs.DFSClient.primitiveCreate(DFSClient.java:1786)
>
> at org.apache.hadoop.fs.Hdfs.createInternal(Hdfs.java:105)
>
> at org.apache.hadoop.fs.Hdfs.createInternal(Hdfs.java:59)
>
> at
> org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:577)
>
> at
> org.apache.hadoop.fs.FileContext$3.next(FileContext.java:680)
>
> at
> org.apache.hadoop.fs.FileContext$3.next(FileContext.java:676)
>
> at
> org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>
> at
> org.apache.hadoop.fs.FileContext.create(FileContext.java:676)
>
> at
> com.datatorrent.common.util.AsyncFSStorageAgent.copyToHDFS(AsyncFSStorageAgent.java:118)
>
> at
> com.datatorrent.stram.plan.physical.PhysicalPlan.initCheckpoint(PhysicalPlan.java:1236)
>
> at
> com.datatorrent.stram.plan.physical.PhysicalPlan.<init>(PhysicalPlan.java:495)
>
> at
> com.datatorrent.stram.StreamingContainerManager.<init>(StreamingContainerManager.java:418)
>
> at
> com.datatorrent.stram.StreamingContainerManager.getInstance(StreamingContainerManager.java:3065)
>
> at
> com.datatorrent.stram.StreamingAppMasterService.serviceInit(StreamingAppMasterService.java:552)
>
> at
> org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
>
> at
> com.datatorrent.stram.StreamingAppMaster.main(StreamingAppMaster.java:102)
>
>
>
> Hadoop version is 2.7.2, Apex-core 3.5.0, Gateway 3.7.0, Java 1.8
>
>
>
> Any thoughts? Thanks!
>
>
>
> - Ilya Ganelin
>
>
> --
>
> The information contained in this e-mail is confidential and/or
> proprietary to Capital One and/or its affiliates and may only be used
> solely in performance of work or services for Capital One. The information
> transmitted herewith is intended only for use by the individual or entity
> to which it is addressed. If the reader of this message is not the intended
> recipient, you are hereby notified that any review, retransmission,
> dissemination, distribution, copying or other use of, or taking of any
> action in reliance upon this information is strictly prohibited. If you
> have received this communication in error, please contact the sender and
> delete the material from your computer.
>
>
-- 
*Join us at Apex Big Data World-San Jose
, April 4, 2017!*


Re: APEXCORE-619 Recovery windowId in future during application relaunch.

2017-03-01 Thread Sandesh Hegde
Instead of treating the stateless operator in a special way and missing
corner cases, just have a dummy checkpoint; then there is no need to handle
corner cases.

There is a name for this solution,
https://en.wikipedia.org/wiki/Null_Object_pattern
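A minimal self-contained sketch of the dummy-checkpoint / Null Object idea follows. This is illustrative only, not Apex internals; the `Long.MAX_VALUE` sentinel matches the convention suggested in the quoted replies below (a stateless operator never constrains the committed window).

```java
public class NullCheckpointDemo {
    interface Checkpoint {
        long windowId();
    }

    // Null object: a shared checkpoint that behaves like "no constraint",
    // so min()-based recovery computations need no stateless special case.
    static final Checkpoint NULL_CHECKPOINT = () -> Long.MAX_VALUE;

    static Checkpoint latestCheckpoint(long[] persistedWindowIds) {
        if (persistedWindowIds.length == 0) {
            return NULL_CHECKPOINT; // stateless operator: nothing on disk
        }
        long max = persistedWindowIds[0];
        for (long w : persistedWindowIds) max = Math.max(max, w);
        final long result = max;
        return () -> result;
    }

    // Committed window = lowest checkpoint across all operators; the null
    // object (Long.MAX_VALUE) never lowers the minimum.
    static long committedWindow(Checkpoint... checkpoints) {
        long min = Long.MAX_VALUE;
        for (Checkpoint c : checkpoints) min = Math.min(min, c.windowId());
        return min;
    }

    public static void main(String[] args) {
        Checkpoint stateful = latestCheckpoint(new long[]{40, 42});
        Checkpoint stateless = latestCheckpoint(new long[]{});
        System.out.println(committedWindow(stateful, stateless)); // 42
    }
}
```

Every code path handles stateful and stateless operators uniformly; the special case lives in exactly one place, the construction of the null object.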



On Wed, Mar 1, 2017 at 2:52 PM Pramod Immaneni 
wrote:

> There is code in various places that deals with stateless operators in a
> special way even though a physical checkpoint does not exist on the disk.
> It is probably a matter of applying similar thought process/logic correctly
> here.
>
> On Wed, Mar 1, 2017 at 2:27 PM, Amol Kekre  wrote:
>
> > hmm! the fact that commitWindowId has moved up (right now in memory of
> > Stram) should mean that a complete set of checkpoints are available, i.e
> > commitWindowId can be derived. Lets say that next checkpoint window also
> > gets checkpointed across the app, commitwindowID is in memory but not
> > written to stram-state yet, then upon relaunch the latest commitwindowID
> > should get computed correctly.
> >
> > This may be just about setting stateless operators to commitWindowid on
> > re-launch? aka bug/feature?
> >
> > Thks
> > Amol
> >
> >
> >
> > E:a...@datatorrent.com | M: 510-449-2606 <(510)%20449-2606> | Twitter:
> @*amolhkekre*
> >
> > www.datatorrent.com  |  apex.apache.org
> >
> > 
> >
> > On Wed, Mar 1, 2017 at 1:41 PM, Pramod Immaneni 
> > wrote:
> >
> > > Do we need to save committedWindowId? Can't it be computed from
> existing
> > > checkpoints by walking through the DAG. We probably do this anyway and
> I
> > > suspect there is a minor bug somewhere in there. If an operator is
> > > stateless you could assume checkpoint as long max for sake of
> computation
> > > and compute the committed window to be the lowest common checkpoint. If
> > > they are all stateless and you end up with long max you can start with
> > > window id that reflects the current timestamp.
> > >
> > > Thanks
> > >
> > > On Wed, Mar 1, 2017 at 1:09 PM, Amol Kekre 
> wrote:
> > >
> > > > CommitWindowId could be computed from the existing checkpoints. That
> > > > solution still needs purge to be done after commitWindowId is
> confirmed
> > > to
> > > > be saved in Stram state. Without ths the commitWindowId computed from
> > the
> > > > checkpoints may have some checkpoints missing.
> > > >
> > > > Thks
> > > > Amol
> > > >
> > > >
> > > >
> > > > E:a...@datatorrent.com | M: 510-449-2606 <(510)%20449-2606> |
> Twitter: @*amolhkekre*
> > > >
> > > > www.datatorrent.com  |  apex.apache.org
> > > >
> > > > 
> > > >
> > > > On Wed, Mar 1, 2017 at 12:36 PM, Pramod Immaneni <
> > pra...@datatorrent.com
> > > >
> > > > wrote:
> > > >
>  > > > > > Can't the committedWindowId be calculated by looking at the physical
> > > plan
> > > > > and the existing checkpoints?
> > > > >
> > > > > On Wed, Mar 1, 2017 at 5:34 AM, Tushar Gosavi 
> > > wrote:
> > > > >
> > > > > > > Help Needed for APEXCORE-619
> > > > > > >
> > > > > > > Issue: When an application is relaunched after a long time with
> > > > > > > stateless operators at the end of the DAG, the stateless operators
> > > > > > > start with a very high windowId. In this case the stateless
> > > > > > > operator ignores all the data received till the upstream operator
> > > > > > > catches up with it. This breaks the *at-least-once* guarantee on
> > > > > > > relaunch of the operator or when the master is killed and the
> > > > > > > application is restarted.
> > > > > > >
> > > > > > > Solutions:
> > > > > > > - Fix the windowId for stateless leaf operators from the upstream
> > > > > > > operator. But it has some issues when we have a join with two
> > > > > > > upstream operators at different windowIds. If we set the windowId
> > > > > > > to min(upstream windowId), then we need to again recalculate the
> > > > > > > new recovery window ids for upstream paths from this operator.
> > > > > > >
> > > > > > > - Another solution is to create an empty file in the checkpoint
> > > > > > > directory for stateless operators. This will help us identify the
> > > > > > > checkpoints of stateless operators during relaunch instead of
> > > > > > > computing from the latest timestamp.
> > > > > > >
> > > > > > > - Bring the entire DAG to committedWindowId. This could be
> > > > > > > achieved by writing committedWindowId in a journal. We need to
> > > > > > > make sure that we are not purging 
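Pramod's suggestion quoted above (treat a stateless operator's checkpoint as `Long.MAX_VALUE`, take the lowest common checkpoint, and fall back to a timestamp-derived windowId when every operator is stateless) can be sketched in plain Java. This is an illustrative model over a flat operator-to-checkpoints map, not Stram's physical-plan walk.

```java
import java.util.List;
import java.util.Map;

public class CommittedWindow {
    static long committedWindowId(Map<String, List<Long>> checkpointsByOperator,
                                  long windowIdFromCurrentTime) {
        long committed = Long.MAX_VALUE;
        for (List<Long> checkpoints : checkpointsByOperator.values()) {
            // Stateless operator: no checkpoint files on disk, so assume
            // Long.MAX_VALUE and let it never constrain the minimum.
            long latest = Long.MAX_VALUE;
            if (!checkpoints.isEmpty()) {
                latest = checkpoints.stream().mapToLong(Long::longValue).max().getAsLong();
            }
            committed = Math.min(committed, latest);
        }
        // All operators stateless: start from a windowId reflecting "now".
        return committed == Long.MAX_VALUE ? windowIdFromCurrentTime : committed;
    }

    public static void main(String[] args) {
        Map<String, List<Long>> plan = Map.of(
            "reader", List.of(40L, 42L),
            "statelessFilter", List.of());
        System.out.println(committedWindowId(plan, 9999L)); // 42
        System.out.println(committedWindowId(Map.of("filter", List.of()), 9999L)); // 9999
    }
}
```

As Amol notes above, this derivation is only safe if checkpoints are not purged before the committed windowId is durably recorded; otherwise the minimum can be computed from an incomplete set.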

Re: At-least once semantics for Kafka to Cassndra ingest

2017-02-14 Thread Sandesh Hegde
The settings mentioned by Sanjay will only guarantee exactly-once at window
boundaries, but not for a partial window processed by the operator; in that
sense the setting is a misnomer.
To achieve exactly-once, there are some preconditions that need to be met,
along with support in the output operator. Here is a blog that gives an
idea of exactly-once:
https://www.datatorrent.com/blog/end-to-end-exactly-once-with-apache-apex/
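The core idea behind that end-to-end exactly-once support can be sketched as follows: the output operator stores the last fully written windowId together with the data, and on replay after a failure it skips tuples from windows already committed to the store. The in-memory "store" below stands in for Cassandra; this is an illustration of the technique, not the Malhar Cassandra operator.

```java
import java.util.ArrayList;
import java.util.List;

public class ExactlyOnceSink {
    final List<String> store = new ArrayList<>(); // stands in for the DB table
    long committedWindowId = -1;                  // persisted alongside the data
    long currentWindowId = -1;

    void beginWindow(long windowId) { currentWindowId = windowId; }

    void write(String tuple) {
        // Replayed window (upstream is at-least-once): the tuples are
        // already in the store, so skip them instead of duplicating.
        if (currentWindowId <= committedWindowId) return;
        store.add(tuple);
    }

    void endWindow() {
        // In a real store this update happens atomically with the data
        // (same transaction/batch), which is the precondition that makes
        // the replay check safe.
        if (currentWindowId > committedWindowId) committedWindowId = currentWindowId;
    }

    public static void main(String[] args) {
        ExactlyOnceSink sink = new ExactlyOnceSink();
        sink.beginWindow(1); sink.write("a"); sink.endWindow();
        // Failure and recovery: window 1 is replayed, then window 2 arrives.
        sink.beginWindow(1); sink.write("a"); sink.endWindow();
        sink.beginWindow(2); sink.write("b"); sink.endWindow();
        System.out.println(sink.store); // [a, b]
    }
}
```

Note the limitation Sandesh raises: replay is skipped at whole-window granularity, so the external store must commit the data and the windowId atomically, or a partially written window can still be duplicated.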

On Tue, Feb 14, 2017 at 2:11 PM Sanjay Pujare 
wrote:

> Have you taken a look at
> http://apex.apache.org/docs/apex/application_development/#exactly-once ?
> i.e. setting that processing mode on all the operators in the pipeline .
>
> Join us at Apex Big Data World-San Jose <
> http://www.apexbigdata.com/san-jose.html>, April 4, 2017!
>
> http://www.apexbigdata.com/san-jose-register.html
>
>
> On 2/14/17, 12:00 PM, "Himanshu Bari"  wrote:
>
> How to ensure that the Kafka to Cassandra ingestion pipeline in Apex
> will
> guarantee exactly once processing semantics.
> Eg. Message was read from Kafka but apex app died before it was written
> successfully to Cassandra.
>
>
>
> --
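The pattern behind the linked blog post, reduced to plain Java (the class and method names below are mine, not an Apex API): exactly-once is built on at-least-once replay plus idempotent writes, where the sink remembers the last fully written windowId and ignores tuples from windows it has already persisted.

```java
import java.util.HashMap;
import java.util.Map;

public class IdempotentSink {
    private final Map<String, String> store = new HashMap<>(); // stand-in for Cassandra
    private long lastWrittenWindow = -1; // in reality, read back from the store on setup
    private long currentWindow;

    public void beginWindow(long windowId) {
        currentWindow = windowId;
    }

    public void write(String key, String value) {
        if (currentWindow <= lastWrittenWindow) {
            return; // replayed window: already persisted, skip
        }
        store.put(key, value);
    }

    public void endWindow() {
        // Ideally committed atomically together with the window's data.
        lastWrittenWindow = Math.max(lastWrittenWindow, currentWindow);
    }

    public String get(String key) {
        return store.get(key);
    }
}
```

With this, replaying a window after a failure (the Kafka-to-Cassandra scenario in the question) cannot produce duplicate writes.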



Re: Setting JAVA Serializer to be used at App Level.

2017-02-05 Thread Sandesh Hegde
The Java serializer comes with a big performance cost, so it is better to
reduce its usage.
Can you please give more detail about your use case?

On Sun, Feb 5, 2017 at 10:05 PM Hitesh Kapoor 
wrote:

Hi Ambarish,

Yes you can plug in your own serializer. You will have to set the
"STREAM_CODEC" port attribute to achieve the same.
You can refer xmlParserApplication from examples repo (
https://github.com/DataTorrent/examples).

Regards,
Hitesh


On Mon, Feb 6, 2017 at 11:07 AM, Ambarish Pande <
ambarish.pande2...@gmail.com> wrote:

> Hello,
>
> Is there a way to set up JAVA Serializer as the default serializer to be
> used for a particular application. Currently, Kryo is the default
> serializer and the library that I am using has compatibility issues with
> Kryo.
>
> Thank You.
>
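For reference, the core of such a Java-serialization codec is just the standard round trip below (plain Java; the Apex StreamCodec wiring around it is not shown here):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class JavaSerde {

    // Object -> byte[] via standard Java serialization.
    public static byte[] toBytes(Serializable obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // byte[] -> Object; the class must be Serializable and on the classpath.
    public static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        }
    }
}
```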


Re: Upgrade Apache Bigtop to Apex Core 3.5.0

2017-01-24 Thread Sandesh Hegde
I would prefer it to be made part of the Apex core release. No need for
another vote.

On Fri, Jan 20, 2017 at 1:53 PM Amol Kekre  wrote:

> +1
>
> Thks
> Amol
>
>
> On Thu, Jan 19, 2017 at 8:55 PM, Priyanka Gugale  >
> wrote:
>
> > +1
> >
> > On Fri, Jan 20, 2017 at 9:25 AM, Chinmay Kolhatkar <
> > chin...@datatorrent.com>
> > wrote:
> >
> > > Sanjay,
> > > It's not a lot of work, just a version change, but primarily following
> > > the Apache process for Bigtop.
> > > The Powered By page of Bigtop is here:
> > > https://cwiki.apache.org/confluence/display/BIGTOP/Powered+By+Bigtop
> > > Looking at the existing content, I would like to know what we can add
> > > there.
> > >
> > > All,
> > > Thanks for feedback. I'll start communication on bigtop mailing list
> for
> > > version upgrade.
> > >
> > >
> > >
> > > On Fri, Jan 20, 2017 at 12:09 AM, Sanjay Pujare <
> san...@datatorrent.com>
> > > wrote:
> > >
> > > > +1 assuming not a lot of work is involved.
> > > >
> > > > What does it take to add a mention of Apex in
> > http://bigtop.apache.org/
> > > ?
> > > >
> > > > On 1/19/17, 8:12 AM, "Pramod Immaneni" 
> wrote:
> > > >
> > > > +1
> > > >
> > > > On Thu, Jan 19, 2017 at 12:53 AM, Chinmay Kolhatkar <
> > > > chin...@datatorrent.com
> > > > > wrote:
> > > >
> > > > > Dear Community,
> > > > >
> > > > > Now that Apex core 3.5.0 is released, is it good time to
> upgrade
> > > > Apache
> > > > > Bigtop for Apex to 3.5.0?
> > > > >
> > > > > -Chinmay.
> > > > >
> > > >
> > > >
> > > >
> > > >
> > >
> >
>


Re: Suggestion on optimise kryo Output

2017-01-16 Thread Sandesh Hegde
Kryo is used in the default implementation of the StreamCodec interface.
Ideally, if the StreamCodec interface itself allowed the buffer to be passed,
then we could also send the buffer from the BufferServer in the future.

On Mon, Jan 9, 2017 at 4:10 PM Bright Chen  wrote:

> Hi,
>
> The kryo Output has some limitations:
>
> - The size of the data is limited. Kryo writes data to a fixed buffer and
> will throw an overflow exception if the data exceeds the buffer size.
> - Output.toBytes() copies the data to a temporary buffer before output,
> which decreases performance and introduces garbage collection.
>
> While tuning the Spillable data structures and Managed State, I created a
> mechanism to share and reuse the memory to avoid the above problems. It can
> be reused in core serialization with a small change. Please see jira:
> https://issues.apache.org/jira/browse/APEXMALHAR-2190
>
>
> Any suggestion or comments, please put in jira:
> https://issues.apache.org/jira/browse/APEXCORE-606
>
>
> thanks
>
> Bright
>
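The copy cost Bright describes can be illustrated in plain Java (this is an illustration of the idea, not the APEXMALHAR-2190 code): a growable buffer that exposes its backing array plus a valid length avoids both the fixed-buffer overflow and the per-record copy that `toBytes()` forces.

```java
import java.util.Arrays;

public class GrowableBuffer {
    private byte[] buf = new byte[32];
    private int len;

    public void write(byte[] src) {
        if (len + src.length > buf.length) {
            // Grow instead of throwing a fixed-buffer overflow.
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, len + src.length));
        }
        System.arraycopy(src, 0, buf, len, src.length);
        len += src.length;
    }

    // Zero-copy view: backing array + valid length, consumable in place.
    public byte[] array() { return buf; }
    public int length() { return len; }

    // Reuse the buffer for the next record: no per-record garbage.
    public void reset() { len = 0; }

    // What a toBytes()-style API effectively does: one fresh copy per call.
    public byte[] toBytes() { return Arrays.copyOf(buf, len); }
}
```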


Re: [VOTE] Apache Apex Core Release 3.5.0 (RC1)

2016-12-07 Thread Sandesh Hegde
+1

Followed the steps mentioned in http://apex.apache.org/verification.html
Verified the launch of an application.

On Tue, Dec 6, 2016 at 10:55 PM Thomas Weise  wrote:

> Dear Community,
>
> Please vote on the following Apache Apex Core 3.5.0 release candidate.
>
> This is a source release with binary artifacts published to Maven.
>
> List of all issues fixed:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?version=12335724=Text=12318823
> User documentation: https://apex.apache.org/docs/apex-3.5/
>
> Staging directory:
> https://dist.apache.org/repos/dist/dev/apex/apache-apex-core-3.5.0-RC1/
> Source zip:
>
> https://dist.apache.org/repos/dist/dev/apex/apache-apex-core-3.5.0-RC1/apache-apex-core-3.5.0-source-release.zip
> Source tar.gz:
>
> https://dist.apache.org/repos/dist/dev/apex/apache-apex-core-3.5.0-RC1/apache-apex-core-3.5.0-source-release.tar.gz
> Maven staging repository:
> https://repository.apache.org/content/repositories/orgapacheapex-1021/
>
> Git source:
>
> https://git-wip-us.apache.org/repos/asf?p=apex-core.git;a=commit;h=refs/tags/v3.5.0-RC1
>  (commit: 6de8828e4f3d5734d0a6f9c1be0aa7057cb60ac8)
>
> PGP key:
> http://pgp.mit.edu:11371/pks/lookup?op=vindex=t...@apache.org
> KEYS file:
> https://dist.apache.org/repos/dist/release/apex/KEYS
>
> More information at:
> http://apex.apache.org
>
> Please try the release and vote; vote will be open for 72 hours.
>
> [ ] +1 approve (and what verification was done)
> [ ] -1 disapprove (and reason why)
>
> http://www.apache.org/foundation/voting.html
>
> How to verify release candidate:
>
> http://apex.apache.org/verification.html
>
> Thanks,
> Thomas
>


Re: "ExcludeNodes" for an Apex application

2016-12-02 Thread Sandesh Hegde
YARN allows the AppMaster to run on a selected node, and Apex can avoid
requesting the blacklisted nodes, so it is possible to keep the Apex
containers off certain nodes.

http://stackoverflow.com/questions/29302659/run-my-own-application-master-on-a-specific-node-in-a-yarn-cluster


On Thu, Dec 1, 2016 at 11:52 PM Amol Kekre  wrote:

> Yarn will deploy the AM (Stram) on a node of its choice, thereby rendering
> any attribute within the app un-enforceable in terms of not deploying the
> master on a node.
>
> Thks
> Amol
>
>
> On Thu, Dec 1, 2016 at 11:19 PM, Milind Barve  wrote:
>
> > Additionally, this would apply to Stram as well i.e. the master should
> also
> > not be deployed on these nodes. Not sure if anti-affinity goes beyond
> > operators.
> >
> > On Fri, Dec 2, 2016 at 12:47 PM, Milind Barve  wrote:
> >
> > > My previous mail explains it, but just forgot to add : -1 to cover this
> > > under anti affinity.
> > >
> > > On Fri, Dec 2, 2016 at 12:46 PM, Milind Barve 
> wrote:
> > >
> > >> While it is possible to extend anti-affinity to take care of this, I
> > feel
> > >> it will cause confusion from a user perspective. As a user, when I
> think
> > >> about anti-affinity, what comes to mind right away is a relative
> > relation
> > >> between operators.
> > >>
> > >> On the other hand, the current ask is not that, but a relation at an
> > >> application level w.r.t. a node. (Further, we might even think of
> > extending
> > >> this at an operator level - which would mean do not deploy an operator
> > on a
> > >> particular node)
> > >>
> > >> We would be better off clearly articulating and allowing users to
> > >> configure it separately as against using anti-affinity.
> > >>
> > >> On Fri, Dec 2, 2016 at 10:03 AM, Bhupesh Chawda <
> > bhup...@datatorrent.com>
> > >> wrote:
> > >>
> > >>> Okay, I think that serves an alternate purpose of detecting any newly
> > >>> gone
> > >>> bad node and excluding it.
> > >>>
> > >>> +1 for covering the original scenario under anti-affinity.
> > >>>
> > >>> ~ Bhupesh
> > >>>
> > >>> On Fri, Dec 2, 2016 at 9:14 AM, Munagala Ramanath <
> r...@datatorrent.com
> > >
> > >>> wrote:
> > >>>
> > >>> > It only takes effect after failures -- no way to exclude from the
> > >>> get-go.
> > >>> >
> > >>> > Ram
> > >>> >
> > >>> > On Dec 1, 2016 7:15 PM, "Bhupesh Chawda" 
> > >>> wrote:
> > >>> >
> > >>> > > As suggested by Sandesh, the parameter
> > >>> > > MAX_CONSECUTIVE_CONTAINER_FAILURES_FOR_BLACKLIST seems to do
> > exactly
> > >>> > what
> > >>> > > is needed.
> > >>> > > Why would this not work?
> > >>> > >
> > >>> > > ~ Bhupesh
> > >>> > >
> > >>> >
> > >>>
> > >>
> > >>
> > >>
> > >> --
> > >> ~Milind bee at gee mail dot com
> > >>
> > >
> > >
> > >
> > > --
> > > ~Milind bee at gee mail dot com
> > >
> >
> >
> >
> > --
> > ~Milind bee at gee mail dot com
> >
>


Re: "ExcludeNodes" for an Apex application

2016-12-01 Thread Sandesh Hegde
I have created a jira, for adding the list of blacklisted nodes,
https://issues.apache.org/jira/browse/APEXCORE-584

On Wed, Nov 30, 2016 at 11:06 PM Sanjay Pujare <san...@datatorrent.com>
wrote:

> Yes, Ram explained to me that in practice this would be a useful feature
> for Apex devops who typically have no control over Hadoop/Yarn cluster.
>
> On 11/30/16, 9:22 PM, "Mohit Jotwani" <mo...@datatorrent.com> wrote:
>
> This is a practical scenario where developers would be required to
> exclude
> certain nodes as they might be required for some mission critical
> applications. It would be good to have this feature.
>
> I understand that Stram should not get into resourcing and still rely
> on
> Yarn, however, as the App Master it should have the right to reject the
> nodes offered by Yarn and request for other resources.
>
>     Regards,
> Mohit
>
> On Thu, Dec 1, 2016 at 2:34 AM, Sandesh Hegde <sand...@datatorrent.com
> >
> wrote:
>
> > Apex has automatic blacklisting of the troublesome nodes, please
> take a
> > look at the following attributes,
> >
> > MAX_CONSECUTIVE_CONTAINER_FAILURES_FOR_BLACKLIST
> > https://www.datatorrent.com/docs/apidocs/com/datatorrent/
> > api/Context.DAGContext.html#MAX_CONSECUTIVE_CONTAINER_
> > FAILURES_FOR_BLACKLIST
> >
> > BLACKLISTED_NODE_REMOVAL_TIME_MILLIS
> >
> > Thanks
> >
> >
> >
> > On Wed, Nov 30, 2016 at 12:56 PM Munagala Ramanath <
> r...@datatorrent.com>
> > wrote:
> >
> > Not sure if this is what Milind had in mind but we often run into
> > situations where the dev group
> > working with Apex has no control over cluster configuration -- to
> make any
> > changes to the cluster they need to
> > go through an elaborate process that can take many days.
> >
> > Meanwhile, if they notice that a particular node is consistently
> causing
> > problems for their
> > app, having a simple way to exclude it would be very helpful since
> it gives
> > them a way
> > to bypass communication and process issues within their own
> organization.
> >
> > Ram
> >
> > On Wed, Nov 30, 2016 at 10:58 AM, Sanjay Pujare <
> san...@datatorrent.com>
> > wrote:
> >
> > > To me both use cases appear to be generic resource management use
> cases.
> > > For example, a randomly rebooting node is not good for any purpose
> esp.
> > > long running apps so it is a bit of a stretch to imagine that
> these nodes
> > > will be acceptable for some batch jobs in Yarn. So such a node
> should be
> > > marked “Bad” or Unavailable in Yarn itself.
> > >
> > > Second use case is also typical anti-affinity use case which
> ideally
> > > should be implemented in Yarn – Milind’s example can also apply to
> > non-Apex
> > > batch jobs. In any case it looks like Yarn still doesn’t have it (
> > > https://issues.apache.org/jira/browse/YARN-1042) so if Apex needs
> it we
> > > will need to do it ourselves.
> > >
> > > On 11/30/16, 10:39 AM, "Munagala Ramanath" <r...@datatorrent.com>
> wrote:
> > >
> > > But then, what's the solution to the 2 problem scenarios that
> Milind
> > > describes ?
> > >
> > > Ram
> > >
> > > On Wed, Nov 30, 2016 at 10:34 AM, Sanjay Pujare <
> > > san...@datatorrent.com>
> > > wrote:
> > >
> > > > I think “exclude nodes” and such is really the job of the
> resource
> > > manager
> > > > i.e. Yarn. So I am not sure taking over some of these tasks
> in Apex
> > > would
> > > > be very useful.
> > > >
> > > > I agree with Amol that apps should be node neutral. Resource
> > > management in
> > > > Yarn together with fault tolerance in Apex should minimize
> the need
> > > for
> > > > this feature although I am sure one can find use cases.
> > > >
> > > >
> > > > On 11/29/16, 10:41 PM, "Amol Kekre" <a...@datatorrent.com>
> wrote:
> > > >
> > > > We do have this feature in Yarn, but that applies to all
> 

Re: [DISCUSSION] Custom Control Tuples

2016-11-30 Thread Sandesh Hegde
I am interested in working on the following subtask

https://issues.apache.org/jira/browse/APEXCORE-581

Thanks


On Wed, Nov 30, 2016 at 2:07 PM David Yan <da...@datatorrent.com> wrote:

> I have created an umbrella ticket for control tuple support:
>
> https://issues.apache.org/jira/browse/APEXCORE-579
>
> Currently it has two subtasks. Please have a look at them and see whether
> I'm missing anything or if you have anything to add. You are welcome to add
> more subtasks or comment on the existing subtasks.
>
> We would like to kick start the implementation soon.
>
> Thanks!
>
> David
>
> On Mon, Nov 28, 2016 at 5:22 PM, Bhupesh Chawda <bhup...@datatorrent.com>
> wrote:
>
> > +1 for the plan.
> >
> > I would be interested in contributing to this feature.
> >
> > ~ Bhupesh
> >
> > On Nov 29, 2016 03:26, "Sandesh Hegde" <sand...@datatorrent.com> wrote:
> >
> > > I am interested in contributing to this feature.
> > >
> > > On Mon, Nov 28, 2016 at 1:54 PM David Yan <da...@datatorrent.com>
> wrote:
> > >
> > > > I think we should probably go ahead with option 1 since this works
> with
> > > > most use cases and prevents developers from shooting themselves in
> the
> > > foot
> > > > in terms of idempotency.
> > > >
> > > > We can have a configuration property that enables option 2 later if
> we
> > > have
> > > > concrete use cases that call for it.
> > > >
> > > > Please share your thoughts if you think you don't agree with this
> plan.
> > > > Also, please indicate if you're interested in contributing to this
> > > feature.
> > > >
> > > > David
> > > >
> > > > On Sun, Nov 27, 2016 at 9:02 PM, Bhupesh Chawda <
> > bhup...@datatorrent.com
> > > >
> > > > wrote:
> > > >
> > > > > It appears that option 1 is more favored due to unavailability of a
> > use
> > > > > case which could use option 2.
> > > > >
> > > > > However, option 2 is problematic in specific cases, like presence
> of
> > > > > multiple input ports for example. In case of a linear DAG where
> > control
> > > > > tuples are flowing in order with the data tuples, it should not be
> > > > > difficult to guarantee idempotency. For example, cases where there
> > > could
> > > > be
> > > > > multiple changes in behavior of an operator during a single window,
> > it
> > > > > should not wait until end window for these changes to take effect.
> > > Since,
> > > > > we don't have a concrete use case right now, perhaps we do not want
> > to
> > > go
> > > > > that road. This feature should be available through a platform
> > > attribute
> > > > > (may be at a later point in time) where the default is option 1.
> > > > >
> > > > > I think option 1 is suitable for a starting point in the
> > implementation
> > > > of
> > > > > this feature and we should proceed with it.
> > > > >
> > > > > ~ Bhupesh
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Nov 11, 2016 at 12:59 AM, David Yan <da...@datatorrent.com
> >
> > > > wrote:
> > > > >
> > > > > > Good question Tushar. The callback should be called only once.
> > > > > > The way to implement this is to keep a list of control tuple
> hashes
> > > for
> > > > > the
> > > > > > given streaming window and only do the callback when the operator
> > has
> > > > not
> > > > > > seen it before.
> > > > > >
> > > > > > Other thoughts?
> > > > > >
> > > > > > David
> > > > > >
> > > > > > On Thu, Nov 10, 2016 at 9:32 AM, Tushar Gosavi <
> > > tus...@datatorrent.com
> > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi David,
> > > > > > >
> > > > > > > What would be the behaviour in case where we have a DAG with
> > > > following
> > > > > > > operators, the number in bracket is number of partitions, X is
> > NxM
> > > > > > > partitioning.
> > > > > > 
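David's suggestion earlier in this thread, tracking control-tuple hashes per streaming window and firing the callback only for unseen ones, can be sketched as follows (the class and method names are mine; the real Apex API was still under discussion):

```java
import java.util.HashSet;
import java.util.Set;

// With NxM partitioning, an operator may receive the same control tuple once
// per upstream partition. Remembering the hashes seen in the current window
// lets it invoke the user callback exactly once per distinct control tuple.
public class ControlTupleDedup {
    private final Set<Integer> seenThisWindow = new HashSet<>();
    private int callbacks;

    public void beginWindow(long windowId) {
        seenThisWindow.clear(); // dedup scope is the streaming window
    }

    public void onControlTuple(Object tuple) {
        if (seenThisWindow.add(tuple.hashCode())) {
            callbacks++; // the user callback would run here, once per distinct tuple
        }
    }

    public int callbackCount() {
        return callbacks;
    }
}
```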

Re: "ExcludeNodes" for an Apex application

2016-11-30 Thread Sandesh Hegde
Apex has automatic blacklisting of the troublesome nodes, please take a
look at the following attributes,

MAX_CONSECUTIVE_CONTAINER_FAILURES_FOR_BLACKLIST
https://www.datatorrent.com/docs/apidocs/com/datatorrent/api/Context.DAGContext.html#MAX_CONSECUTIVE_CONTAINER_FAILURES_FOR_BLACKLIST

BLACKLISTED_NODE_REMOVAL_TIME_MILLIS

Thanks
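A hedged example of setting these attributes (the property names are copied from the javadoc link above; the `dt.attr.*` prefix is the conventional way to set DAG-level attributes in a dt-site.xml, so treat the exact spelling and values as assumptions):

```xml
<!-- Illustrative dt-site.xml fragment, not verified against a running cluster. -->
<property>
  <!-- Blacklist a node after this many consecutive container failures on it. -->
  <name>dt.attr.MAX_CONSECUTIVE_CONTAINER_FAILURES_FOR_BLACKLIST</name>
  <value>3</value>
</property>
<property>
  <!-- Remove a node from the blacklist after this long (here: one hour). -->
  <name>dt.attr.BLACKLISTED_NODE_REMOVAL_TIME_MILLIS</name>
  <value>3600000</value>
</property>
```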



On Wed, Nov 30, 2016 at 12:56 PM Munagala Ramanath 
wrote:

Not sure if this is what Milind had in mind but we often run into
situations where the dev group
working with Apex has no control over cluster configuration -- to make any
changes to the cluster they need to
go through an elaborate process that can take many days.

Meanwhile, if they notice that a particular node is consistently causing
problems for their
app, having a simple way to exclude it would be very helpful since it gives
them a way
to bypass communication and process issues within their own organization.

Ram

On Wed, Nov 30, 2016 at 10:58 AM, Sanjay Pujare 
wrote:

> To me both use cases appear to be generic resource management use cases.
> For example, a randomly rebooting node is not good for any purpose esp.
> long running apps so it is a bit of a stretch to imagine that these nodes
> will be acceptable for some batch jobs in Yarn. So such a node should be
> marked “Bad” or Unavailable in Yarn itself.
>
> Second use case is also typical anti-affinity use case which ideally
> should be implemented in Yarn – Milind’s example can also apply to
non-Apex
> batch jobs. In any case it looks like Yarn still doesn’t have it (
> https://issues.apache.org/jira/browse/YARN-1042) so if Apex needs it we
> will need to do it ourselves.
>
> On 11/30/16, 10:39 AM, "Munagala Ramanath"  wrote:
>
> But then, what's the solution to the 2 problem scenarios that Milind
> describes ?
>
> Ram
>
> On Wed, Nov 30, 2016 at 10:34 AM, Sanjay Pujare <
> san...@datatorrent.com>
> wrote:
>
> > I think “exclude nodes” and such is really the job of the resource
> manager
> > i.e. Yarn. So I am not sure taking over some of these tasks in Apex
> would
> > be very useful.
> >
> > I agree with Amol that apps should be node neutral. Resource
> management in
> > Yarn together with fault tolerance in Apex should minimize the need
> for
> > this feature although I am sure one can find use cases.
> >
> >
> > On 11/29/16, 10:41 PM, "Amol Kekre"  wrote:
> >
> > We do have this feature in Yarn, but that applies to all
> applications.
> > I am
> > not sure if Yarn has anti-affinity. This feature may be used,
> but in
> > general there is danger in an application taking over resource
> > allocation.
> > Another quirk is that big data apps should ideally be
> node-neutral.
> > This is
> > a good idea, if we are able to carve out something where need is
> app
> > specific.
> >
> > Thks
> > Amol
> >
> >
> > On Tue, Nov 29, 2016 at 10:00 PM, Milind Barve <
> mili...@gmail.com>
> > wrote:
> >
> > > We have seen 2 cases mentioned below, where, it would have
> been nice
> > if
> > > Apex allowed us to exclude a node from the cluster for an
> > application.
> > >
> > > 1. A node in the cluster had gone bad (was randomly rebooting)
> and
> > so an
> > > Apex app should not use it - other apps can use it as they
were
> > batch jobs.
> > > 2. A node is being used for a mission critical app (Could be
> an Apex
> > app
> > > itself), but another Apex app which is mission critical should
> not
> > be using
> > > resources on that node.
> > >
> > > Can we have a way in which Stram and YARN can coordinate with each
> > > other to not use a set of nodes for the application? It can be
> > > done in 2 ways -
> > >
> > > 1. Have a list of "exclude" nodes with Stram - when YARN allocates
> > > resources on either of these, Stram rejects them and gets resources
> > > allocated again from YARN
> > > 2. Have a list of nodes that can be used for an app - this can be a
> > > part of config. However, I don't think this would be the right way
> > > to do so, as we will need support from YARN as well. Further, this
> > > might be difficult to change at runtime if need be.
> > >
> > > Any thoughts?
> > >
> > >
> > > --
> > > ~Milind bee at gee mail dot com
> > >
> >
> >
> >
> >
>
>
>
>


Re: [DISCUSSION] Custom Control Tuples

2016-11-28 Thread Sandesh Hegde
I am interested in contributing to this feature.

On Mon, Nov 28, 2016 at 1:54 PM David Yan  wrote:

> I think we should probably go ahead with option 1 since this works with
> most use cases and prevents developers from shooting themselves in the foot
> in terms of idempotency.
>
> We can have a configuration property that enables option 2 later if we have
> concrete use cases that call for it.
>
> Please share your thoughts if you think you don't agree with this plan.
> Also, please indicate if you're interested in contributing to this feature.
>
> David
>
> On Sun, Nov 27, 2016 at 9:02 PM, Bhupesh Chawda 
> wrote:
>
> > It appears that option 1 is more favored due to unavailability of a use
> > case which could use option 2.
> >
> > However, option 2 is problematic in specific cases, like presence of
> > multiple input ports for example. In case of a linear DAG where control
> > tuples are flowing in order with the data tuples, it should not be
> > difficult to guarantee idempotency. For example, cases where there could
> be
> > multiple changes in behavior of an operator during a single window, it
> > should not wait until end window for these changes to take effect. Since,
> > we don't have a concrete use case right now, perhaps we do not want to go
> > that road. This feature should be available through a platform attribute
> > (may be at a later point in time) where the default is option 1.
> >
> > I think option 1 is suitable for a starting point in the implementation
> of
> > this feature and we should proceed with it.
> >
> > ~ Bhupesh
> >
> >
> >
> > On Fri, Nov 11, 2016 at 12:59 AM, David Yan 
> wrote:
> >
> > > Good question Tushar. The callback should be called only once.
> > > The way to implement this is to keep a list of control tuple hashes for
> > the
> > > given streaming window and only do the callback when the operator has
> not
> > > seen it before.
> > >
> > > Other thoughts?
> > >
> > > David
> > >
> > > On Thu, Nov 10, 2016 at 9:32 AM, Tushar Gosavi  >
> > > wrote:
> > >
> > > > Hi David,
> > > >
> > > > What would be the behaviour in case where we have a DAG with
> following
> > > > operators, the number in bracket is number of partitions, X is NxM
> > > > partitioning.
> > > > A(1) X B(4) X C(2)
> > > >
> > > > If A sends a control tuple, it will be sent to all 4 partition of B,
> > > > and from each partition from B it goes to C, i.e each partition of C
> > > > will receive same control tuple originated from A multiple times
> > > > (number of upstream partitions of C). In this case will the callback
> > > > function get called multiple times or just once.
> > > >
> > > > -Tushar.
> > > >
> > > >
> > > > On Fri, Nov 4, 2016 at 12:14 AM, David Yan 
> > > wrote:
> > > > > Hi Bhupesh,
> > > > >
> > > > > Since each input port has its own incoming control tuple, I would
> > > imagine
> > > > > there would be an additional DefaultInputPort.processControl method
> > > that
> > > > > operator developers can override.
> > > > > If we go for option 1, my thinking is that the control tuples would
> > > > always
> > > > > be delivered at the next window boundary, even if the emit method
> is
> > > > called
> > > > > within a window.
> > > > >
> > > > > David
> > > > >
> > > > > On Thu, Nov 3, 2016 at 1:46 AM, Bhupesh Chawda <
> > > bhup...@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > >> I have a question regarding the callback for a control tuple. Will
> > it
> > > be
> > > > >> similar to InputPort::process() method? Something like
> > > > >> InputPort::processControlTuple(t)
> > > > >> ? Or will it be a method of the operator similar to beginWindow()?
> > > > >>
> > > > >> When we say that the control tuple will be delivered at window
> > > boundary,
> > > > >> does that mean all control tuples emitted in that window will be
> > > > processed
> > > > >> together at the end of the window? This would imply that there is
> no
> > > > >> ordering among regular tuples and control tuples.
> > > > >>
> > > > >> I think we should get started with the option 1 - control tuples
> at
> > > > window
> > > > >> boundary, which seems to handle most of the use cases. For some
> > cases
> > > > which
> > > > >> require option 2, we can always build on this.
> > > > >>
> > > > >> ~ Bhupesh
> > > > >>
> > > > >> On Thu, Nov 3, 2016 at 1:35 PM, Thomas Weise 
> > wrote:
> > > > >>
> > > > >> > I don't see how that would work. Suppose you have a file
> splitter
> > > and
> > > > >> > multiple partitions of block readers. The "end of file" event
> > cannot
> > > > be
> > > > >> > processed downstream until all block readers are done. I also
> > think
> > > > that
> > > > >> > this is related to the batch demarcation discussion and there
> > should
> > > > be a
> > > > >> > single generalized mechanism to support this.
> > > > >> >
> > > > >> >
> > > > >> > On Wed, 

Re: Proposal for apex/malhar extensions

2016-11-16 Thread Sandesh Hegde
Do we have any projects today that can benefit from this setup?
Earlier in this mail thread we discussed "contrib (low bar) & graduation"
in Malhar; is that not sufficient?

On Wed, Nov 16, 2016 at 11:19 AM Chinmay Kolhatkar 
wrote:

> @sanjay, yes we can define the process around this.
>
> @pramod, Well, I said apex-malhar because the extensions can be operators
> and and other plugins to apex engine.
> Do you see the use of this for apex-core as well?
>
> -Chinmay.
>
>
> On Wed, Nov 16, 2016 at 7:24 PM, Pramod Immaneni 
> wrote:
>
> > So it would be like a yellow pages for the apex ecosystem. Sounds like a
> > good idea. Why limit it to malhar?
> >
> > On Wed, Nov 16, 2016 at 3:17 AM, Chinmay Kolhatkar 
> > wrote:
> >
> > > Dear Community,
> > >
> > > This is in relation to malhar cleanup work that is ongoing.
> > >
> > > In one of the talks during Apache BigData Europe, I got to know about
> > > Spark-Packages (https://spark-packages.org/) (I believe lot of you
> must
> > be
> > > aware of it).
> > > A Spark package is basically functionality built over and above the
> > > Spark core functionality. A package can initially be present in
> > > someone's public repository, and one can register it with
> > > https://spark-packages.org/; later, as it matures and finds more use,
> > > it gets consumed into the mainstream Spark repository and releases.
> > >
> > > I found this idea quite interesting to keep our apex-malhar releases
> > > cleaner.
> > >
> > > One could have an extension to apex-malhar in their own repository and
> > > just register it with Apache Apex. As it matures and finds more and
> > > more use, we can consume it in mainstream releases.
> > > Advantages to this are multiple:
> > > 1. The entry point for registering extensions with Apache Apex can be
> > > minimal. This way we get more indirect contributions.
> > > 2. Faster way to add more features to the project.
> > > 3. We keep our releases cleaner.
> > > 4. One could progress on the feature set faster, balancing both the
> > > Apache Way and their own enterprise interests.
> > >
> > > Please share your thoughts on this.
> > >
> > > Thanks,
> > > Chinmay.
> > >
> >
>


Re: Malhar release 3.6

2016-11-15 Thread Sandesh Hegde
I have the PR open for the following issue,
https://issues.apache.org/jira/browse/APEXMALHAR-2298

https://github.com/apache/apex-malhar/pull/492

This change was done after a user feedback. Should we get this in for 3.6?

Thanks



On Tue, Nov 15, 2016 at 3:45 PM Thomas Weise  wrote:

> It has been a while and the issues that we were waiting for are still
> unresolved.
>
>
> https://issues.apache.org/jira/issues/?jql=fixVersion%20%3D%203.6.0%20AND%20project%20%3D%20APEXMALHAR%20and%20resolution%20%3D%20unresolved%20ORDER%20BY%20status%20ASC
>
> I would suggest:
>
> APEXMALHAR-2300  -
> defer
> APEXMALHAR-2130  /
> APEXMALHAR-2301  -
> David/Siyuan can you please give an update and recommendation.
> APEXMALHAR-2284  -
> see discussion on the PR and JIRA. Also discussed it with Bhupesh and
> Chinmay today and we think that this join operator should probably be
> replaced with the new operator that is based on WindowOperator.
> APEXMALHAR-2203  -
> defer
>
> Please provide any feedback you may have within a day so that we can
> unblock the release.
>
> Thanks,
> Thomas
>
>
>
>
>
>
>
>
> On Mon, Nov 7, 2016 at 6:36 AM, Chaitanya Chebolu <
> chaita...@datatorrent.com
> > wrote:
>
> > Hi Thomas,
> >
> >I am working on APEXMALHAR-2284 and will open a PR in couple of days.
> >
> > Regards,
> > Chaitanya
> >
> > On Sun, Nov 6, 2016 at 10:51 PM, Thomas Weise  wrote:
> >
> > > Is anyone working on APEXMALHAR-2284 ?
> > >
> > > On Fri, Oct 28, 2016 at 11:00 AM, Bhupesh Chawda <
> > bhup...@datatorrent.com>
> > > wrote:
> > >
> > > > +1
> > > >
> > > > ~ Bhupesh
> > > >
> > > > On Fri, Oct 28, 2016 at 2:29 PM, Tushar Gosavi <
> tus...@datatorrent.com
> > >
> > > > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > - Tushar.
> > > > >
> > > > >
> > > > > On Fri, Oct 28, 2016 at 12:52 PM, Aniruddha Thombare
> > > > >  wrote:
> > > > > > +1
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > A
> > > > > >
> > > > > > _
> > > > > > Sent with difficulty, I mean handheld ;)
> > > > > >
> > > > > > On 28 Oct 2016 12:30 pm, "Siyuan Hua" 
> > > wrote:
> > > > > >
> > > > > >> +1
> > > > > >>
> > > > > >> Thanks!
> > > > > >>
> > > > > >> Sent from my iPhone
> > > > > >>
> > > > > >> > On Oct 26, 2016, at 13:11, Thomas Weise 
> wrote:
> > > > > >> >
> > > > > >> > Hi,
> > > > > >> >
> > > > > >> > I'm proposing another release of Malhar in November. There are
> > 49
> > > > > issues
> > > > > >> > marked for the release, including important bug fixes, new
> > > > > documentation,
> > > > > >> > SQL support and the work for windowed operator state
> management:
> > > > > >> >
> > > > > >> > https://issues.apache.org/jira/issues/?jql=fixVersion%
> > > > > >> 20%3D%203.6.0%20AND%20project%20%3D%20APEXMALHAR%20ORDER%
> > > > > >> 20BY%20status%20ASC
> > > > > >> >
> > > > > >> > Currently there is at least one blocker, the join operator is
> > > broken
> > > > > >> after
> > > > > >> > change in managed state. It also affects the SQL feature.
> > > > > >> >
> > > > > >> > Thanks,
> > > > > >> > Thomas
> > > > > >>
> > > > >
> > > >
> > >
> >
>


Re: Integration with Apache Samoa

2016-11-07 Thread Sandesh Hegde
Good work Bhupesh.

On Mon, Nov 7, 2016 at 11:17 AM David Yan  wrote:

> It took perseverance to get this merged, Good work Bhupesh!
>
> On Mon, Nov 7, 2016 at 1:25 AM, Bhupesh Chawda 
> wrote:
>
> > Hi All,
> >
> > The PR for making Apex a runner for SAMOA has been merged.
> >
> > Apache SAMOA now has an additional runner with Apache Apex -
> > https://github.com/apache/incubator-samoa/tree/master/samoa-apex
> >
> > Thanks.
> >
> > ~ Bhupesh
> >
> > On Mon, Mar 28, 2016 at 11:05 AM, Bhupesh Chawda <
> bhup...@datatorrent.com>
> > wrote:
> >
> > > Hi All,
> > >
> > > Here is the status of the integration, since the last update:
> > >
> > >- Launch process cleaned up. Launch still happening through DTCli
> > >calls. Once APEXCORE-405 is implemented, this will become a lot better.
> > >- Some issues still causing problems in running Samoa apps on Apex.
> > >Containers getting killed. This may be due to low memory. This is
> > >work in progress.
> > >
> > > ~Bhupesh
> > >
> > > On Wed, Mar 2, 2016 at 10:14 AM, Bhupesh Chawda <
> bhup...@datatorrent.com
> > >
> > > wrote:
> > >
> > >> Hi David,
> > >>
> > >> Here is the working branch you can look at:
> > >> https://github.com/bhupeshchawda/incubator-samoa/tree/samoa-apex
> > >>
> > >> As I mentioned, the launch part needs to be worked on. Currently I
> have
> > a
> > >> few hacks in my local environment.
> > >> You can use the test cases though to get an idea.
> > >>
> > >> -Bhupesh
> > >>
> > >> On Wed, Mar 2, 2016 at 1:01 AM, David Yan 
> > wrote:
> > >>
> > >>> Hi Bhupesh,
> > >>>
> > >>> That's good progress.  Can you send us a link to the code you did for
> > >>> this?  Or maybe a review-only PR?
> > >>>
> > >>> David
> > >>>
> > >>> On Mon, Feb 29, 2016 at 10:15 PM, Bhupesh Chawda <
> > >>> bhup...@datatorrent.com>
> > >>> wrote:
> > >>>
> > >>> > Hi All,
> > >>> >
> > >>> > Here is the status of integration of Apache Apex into Apache Samoa.
> > >>> >
> > >>> >- Samoa API implemented and able to convert Samoa topology into
> > Apex
> > >>> > Dag.
> > >>> >- Implemented partitioning support using parallelism hints from
> > >>> Samoa
> > >>> >API.
> > >>> >- Implemented stream multiplexing:
> > >>> >   - Added All-based partitioner. Upstream tuples go to all
> > >>> downstream
> > >>> >   partitions
> > >>> >   - Stream codec for Key based partitioning
> > >>> >   - Stream codec for Random partitioning
> > >>> >- Able to launch a Samoa task on the cluster. This has to be
> > worked
> > >>> on.
> > >>> >Currently some hacks are used by calling DTCli explicitly from
> the
> > >>> main
> > >>> >entry point in Samoa code. Also jars are needed to be manually
> > >>> bundled.
> > >>> >This will be worked on in this sprint.
> > >>> >- Tested the following algorithms on local cluster:
> > >>> >   - Prequential Evaluation using Vertical Hoeffding Tree
> > >>> classifier.
> > >>> >   This is a decision tree based classifier.
> > >>> >   - Clustering using CluStream algorithm.
> > >>> >- I have asked clarifications on some more details of these
> > >>> algorithms
> > >>> >as well as serialization issues with Samoa classes. I am waiting
> > for
> > >>> > some
> > >>> >response from the Samoa community.
> > >>> >- Although Samoa does not have many algorithms currently (it is
> a
> > >>> >framework for developing algorithms), more algorithms are
> expected
> > >>> as a
> > >>> >part of their roadmap:
> > >>> >https://cwiki.apache.org/confluence/display/SAMOA/Roadmap
> > >>> >
> > >>> > Thanks,
> > >>> >
> > >>> > Bhupesh
> > >>> >
> > >>>
> > >>
> > >>
> > >
> >
>


Re: Malhar release 3.6

2016-10-27 Thread Sandesh Hegde
Here is one more jira for 3.6; I need a few more days to open the PR.
https://issues.apache.org/jira/browse/APEXMALHAR-2298?filter=-1

Thanks


On Thu, Oct 27, 2016 at 10:28 AM Vlad Rozov  wrote:

+1. It will be nice to have
https://issues.apache.org/jira/browse/APEXMALHAR-2178 fixed in 3.6.

Thank you,

Vlad

On 10/27/16 08:48, Amol Kekre wrote:
> +1
>
> Thks
> Amol
>
>
> On Wed, Oct 26, 2016 at 11:22 PM, Milind Barve  wrote:
>
>> +1
>>
>> On Thu, Oct 27, 2016 at 11:21 AM, Chinmay Kolhatkar 
>> wrote:
>>
>>> +1.
>>>
>>> On Thu, Oct 27, 2016 at 1:41 AM, Thomas Weise  wrote:
>>>
 Hi,

 I'm proposing another release of Malhar in November. There are 49
>> issues
 marked for the release, including important bug fixes, new
>> documentation,
 SQL support and the work for windowed operator state management:

 https://issues.apache.org/jira/issues/?jql=fixVersion%
 20%3D%203.6.0%20AND%20project%20%3D%20APEXMALHAR%20ORDER%
 20BY%20status%20ASC

 Currently there is at least one blocker, the join operator is broken
>>> after
 change in managed state. It also affects the SQL feature.

 Thanks,
 Thomas

>>
>>
>> --
>> ~Milind bee at gee mail dot com
>>


Better error message for java version mismatch

2016-10-05 Thread Sandesh Hegde
Hi All,

When an app package is compiled with Java 1.8 and Apex is using 1.7, we see
the following behaviour with the Apex CLI:

1. launch 
"No applications in Application Package"; ideally it should point out the
exact error

2. get-app-package-operators 
Throws the exception java.lang.UnsupportedClassVersionError

Should we add an extra message indicating that a Java version mismatch
has happened?

Thanks
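One way to produce the clearer message suggested here: the class file format
records the compiler's target version in bytes 6 and 7 of every .class file
(Java 7 = major version 51, Java 8 = 52), so the CLI could compare that
against the running JVM before failing with UnsupportedClassVersionError. A
rough sketch, not actual Apex CLI code (the class and method names are
invented):

```java
public class ClassVersionCheck {

  /** Reads the class-file major version (Java 7 = 51, Java 8 = 52). */
  static int classMajorVersion(byte[] b) {
    // Magic 0xCAFEBABE in bytes 0-3, minor version in 4-5, major in 6-7.
    if (b.length < 8 || (b[0] & 0xFF) != 0xCA || (b[1] & 0xFF) != 0xFE
        || (b[2] & 0xFF) != 0xBA || (b[3] & 0xFF) != 0xBE) {
      throw new IllegalArgumentException("Not a class file");
    }
    return ((b[6] & 0xFF) << 8) | (b[7] & 0xFF);
  }

  /** A friendlier message than a bare UnsupportedClassVersionError. */
  static String checkAgainstRuntime(int classMajor, int runtimeMajor) {
    if (classMajor > runtimeMajor) {
      return "App package compiled for class file version " + classMajor
          + " but this JVM supports up to " + runtimeMajor
          + "; recompile with a lower -target or upgrade the JVM.";
    }
    return "OK";
  }

  public static void main(String[] args) {
    // First 8 bytes of a class compiled with -target 1.8 (major version 52).
    byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
    // A JVM running Java 7 has runtime major version 51.
    System.out.println(checkAgainstRuntime(classMajorVersion(header), 51));
  }
}
```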


Re: [VOTE] Hadoop upgrade

2016-10-03 Thread Sandesh Hegde
+1 for 2.6

On Mon, Oct 3, 2016, 2:06 PM Sasha Parfenov  wrote:

> +1 for Hadoop 2.6 upgrade.
>
> Thanks,
> Sasha
>
> On Monday, October 3, 2016, Thomas Weise  wrote:
>
> > +1 for 2.6 upgrade
> >
> >
> > On Mon, Oct 3, 2016 at 1:47 PM, David Yan  > > wrote:
> >
> > > Hi all,
> > >
> > > Thomas created this ticket for upgrading our Hadoop dependency version
> a
> > > couple weeks ago:
> > >
> > > https://issues.apache.org/jira/browse/APEXCORE-536
> > >
> > > We'd like to get the ball rolling and would like to take a vote from
> the
> > > community which version we would like to upgrade to. We have these
> > choices:
> > >
> > > 2.2.0 (no upgrade)
> > > 2.4.x
> > > 2.5.x
> > > 2.6.x
> > >
> > > We are not considering 2.7.x because we already know that many Apex
> users
> > > are using Hadoop distros that are based on 2.6.
> > >
> > > Please note that Apex works with all versions of Hadoop higher than or
> > > equal to the Hadoop version Apex depends on, as long as it's 2.x.x. We are
> > > not considering Hadoop 3.0.0-alpha at this time.
> > >
> > > When voting, please keep these in mind:
> > >
> > > - The features that are added in 2.4.x, 2.5.x, and 2.6.x respectively,
> > and
> > > how useful those features are for Apache Apex
> > > - The Hadoop versions the major distros (Cloudera, Hortonworks, MapR,
> > EMR,
> > > etc) are supporting
> > > - The Hadoop versions that typical Apex users are using
> > >
> > > Thanks,
> > >
> > > David
> > >
> >
>


Re: Apex Core PR

2016-10-01 Thread Sandesh Hegde
Travis is a lot faster than Apache Jenkins; should we just stick with
that?

On Sat, Oct 1, 2016 at 9:46 AM Chinmay Kolhatkar 
wrote:

> Hi Vlad,
>
> In one of the PRs the Travis build is passing but the Jenkins build is failing.
> https://github.com/apache/apex-malhar/pull/409
>
> Is it observed somewhere else too?
> Any idea why that is the case?
>
> Thanks,
> Chinmay.
>
>
> On Sat, Oct 1, 2016 at 2:13 AM, Vlad Rozov 
> wrote:
>
> > Apex Malhar PR validation is enabled for Apache Jenkins as well.
> >
> > Vlad
> >
> >
> > On 9/7/16 14:51, Vlad Rozov wrote:
> >
> >> Please note that I added Apex Core PR validation to the Apache Jenkins:
> >> https://builds.apache.org/job/Apex_Core_PR/. I'll add Apex Malhar later
> >> unless somebody volunteers to pick this item.
> >>
> >> Vlad
> >>
> >
> >
>


Re: Automated changes git author

2016-09-28 Thread Sandesh Hegde
BigFoot?

On Wed, Sep 28, 2016 at 1:20 PM Thomas Weise  wrote:

> What about the name "CI Support"? Does not look like best fit either. Any
> better ideas or keep it?
>
> I will document the outcome in the contributor guidelines.
>
> On Wed, Sep 28, 2016 at 11:13 AM, Pramod Immaneni 
> wrote:
>
> > What, trustworthy Jenkins no more? Kidding aside, +1
> >
> > On Wed, Sep 28, 2016 at 11:34 AM, Thomas Weise  wrote:
> >
> > > Hi,
> > >
> > > For changes made by scripts, there has been an undocumented convention
> to
> > > use the following author information (example):
> > >
> > > commit 763d14fca6b84fdda1b6853235e5d4b71ca87fca
> > > Author: CI Support 
> > > Date:   Mon Sep 26 20:36:22 2016 -0700
> > >
> > > Fix trailing whitespace.
> > >
> > > I would suggest we discontinue use of jenk...@datatorrent.com and
> start
> > > using dev@apex.apache.org instead?
> > >
> > > Thanks,
> > > Thomas
> > >
> >
>


Re: checkpoint statistics

2016-09-25 Thread Sandesh Hegde
Say it takes x MB and y seconds to do the checkpoint. What does the
user do with that information?

On Sun, Sep 25, 2016, 6:51 AM Tushar Gosavi  wrote:

> +1
>
> -Tushar
>
> On Sun, Sep 25, 2016, 8:54 AM Sanjay Pujare 
> wrote:
>
> > +1
> >
> > Sanjay
> >
> >
> > On Sun, Sep 25, 2016 at 7:06 AM, Devendra Tagare <
> > devend...@datatorrent.com>
> > wrote:
> >
> > > +1
> > >
> > > Thanks,
> > > Dev
> > >
> > > On Sep 25, 2016 1:17 AM, "Pramod Immaneni" 
> > wrote:
> > >
> > > > +1
> > > >
> > > > > On Sep 24, 2016, at 10:01 AM, Vlad Rozov 
> > > > wrote:
> > > > >
> > > > > IMO, it may be useful to provide checkpoint statistics for example,
> > > > total size of checkpoint for particular window or average size of
> > > > checkpoints for a particular operator. Also, how long it takes to
> write
> > > > checkpoints to storage.
> > > > >
> > > > > Thank you,
> > > > >
> > > > > Vlad
> > > >
> > >
> >
>
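To Sandesh's question, one concrete use for the numbers Vlad proposes: a user
(or a StatsListener) could flag operators whose average checkpoint write time
is creeping up toward the checkpoint interval, or whose checkpoint size
suggests a repartition. A hypothetical aggregator sketch, not Apex code (the
class, method names, and the 50% threshold are all assumptions):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CheckpointStats {
  // operatorId -> list of {bytes, millis} samples
  private final Map<Integer, List<long[]>> samples = new HashMap<>();

  public void record(int operatorId, long bytes, long millis) {
    samples.computeIfAbsent(operatorId, k -> new ArrayList<>())
           .add(new long[]{bytes, millis});
  }

  /** Average checkpoint size for one operator, in bytes. */
  public double avgBytes(int operatorId) {
    return samples.get(operatorId).stream().mapToLong(s -> s[0]).average().orElse(0);
  }

  /** Example policy: flag the operator if the average checkpoint write time
   *  exceeds half of the checkpoint interval. */
  public boolean atRisk(int operatorId, long checkpointIntervalMillis) {
    double avgMillis =
        samples.get(operatorId).stream().mapToLong(s -> s[1]).average().orElse(0);
    return avgMillis > 0.5 * checkpointIntervalMillis;
  }
}
```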


Re: Improving Apex relaunch time.

2016-09-21 Thread Sandesh Hegde
Relaunching from the same location can be one of the options.

On Tue, Sep 20, 2016, 10:17 PM Tushar Gosavi <tus...@datatorrent.com> wrote:

> In case of application failure, we would like the ability to quickly
> restart the application while keeping the old state for failure
> analysis. Also, the problem remains the same when we want to start from
> a savepoint, where we will need to copy state from the
> savepoint to the application.
>
> -Tushar.
>
>
>
> On Tue, Sep 20, 2016 at 8:34 PM, Sandesh Hegde <sand...@datatorrent.com>
> wrote:
> > How about re-launching the app from the same location?
> >
> > If at all they want to store the state, we can provide a savepoint feature.
> >
> > On Tue, Sep 20, 2016 at 4:39 AM Tushar Gosavi <tus...@datatorrent.com>
> > wrote:
> >
> >> We have observed that application relaunch takes a long time.
> >> The major reason for the delay in application startup during relaunch
> >> is the time taken to copy the state of the existing application to the
> >> new application. This state could grow to GBs, and the copy is performed
> >> in a single thread before the new application is submitted to YARN.
> >>
> >> The state of the previous application consists of:
> >> - jars
> >> - stram checkpoint/recovery file.
> >> - events
> >> - container file
> >> - stats recording if enabled.
> >> - operator checkpoints
> >> - operator data.
> >>
> >> We could avoid copying debugging data such as stats recordings, which
> >> could run into TBs for a long-running application and are not required
> >> for the functioning of the new application.
> >>
> >> Similarly, operator checkpoints could be read in parallel when operators
> >> are launched for the first time. This also means only the required
> >> checkpoints are copied, and the copy is done in parallel by multiple
> >> containers/threads.
> >>
> >> For operator data stored in the application directory, we could copy it
> >> completely for now, but in the future we could provide a callback that
> >> allows an operator partition to read only the required state from the
> >> previous location.
> >>
> >> Let me know your thoughts on this.
> >>
> >> Regards,
> >> - Tushar.
> >>
>


Re: Improving Apex relaunch time.

2016-09-20 Thread Sandesh Hegde
How about re-launching the app from the same location?

If at all they want to store the state, we can provide a savepoint feature.

On Tue, Sep 20, 2016 at 4:39 AM Tushar Gosavi 
wrote:

> We have observed that application relaunch takes a long time.
> The major reason for the delay in application startup during relaunch
> is the time taken to copy the state of the existing application to the new
> application. This state could grow to GBs, and the copy is performed in a
> single thread before the new application is submitted to YARN.
>
> The state of the previous application consists of:
> - jars
> - stram checkpoint/recovery file.
> - events
> - container file
> - stats recording if enabled.
> - operator checkpoints
> - operator data.
>
> We could avoid copying debugging data such as stats recordings, which could
> run into TBs for a long-running application and are not required for the
> functioning of the new application.
>
> Similarly, operator checkpoints could be read in parallel when operators
> are launched for the first time. This also means only the required
> checkpoints are copied, and the copy is done in parallel by multiple
> containers/threads.
>
> For operator data stored in the application directory, we could copy it
> completely for now, but in the future we could provide a callback that
> allows an operator partition to read only the required state from the
> previous location.
>
> Let me know your thoughts on this.
>
> Regards,
> - Tushar.
>
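The parallel-copy idea in Tushar's mail can be sketched with a plain thread
pool; the real fix would copy HDFS checkpoint files, which is elided here
(the class name, paths, and the copy action are stand-ins, not Apex APIs):

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelCheckpointCopy {
  /** Copies each operator checkpoint on its own worker thread instead of
   *  the single-threaded copy described above. */
  public static Set<String> copyAll(List<String> checkpointPaths, int threads)
      throws InterruptedException {
    Set<String> copied = ConcurrentHashMap.newKeySet();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    for (String path : checkpointPaths) {
      pool.submit(() -> {
        // In the real implementation this would be an HDFS copy of one
        // checkpoint file from the old app directory to the new one.
        copied.add(path);
      });
    }
    pool.shutdown();
    pool.awaitTermination(1, TimeUnit.MINUTES);
    return copied;
  }
}
```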


Re: [proposal] Application tags

2016-09-13 Thread Sandesh Hegde
One point to add to my previous mail: YARN tags are supported from the Hadoop
2.4 release onwards, while Apex supports Hadoop 2.2; that is why the YARN
feature cannot be leveraged.

On Tue, Sep 13, 2016 at 1:24 PM Sandesh Hegde <sand...@datatorrent.com>
wrote:

> Hi All,
>
> I am proposing a new attribute, “Tags”, along similar lines to YARN
> application tags (https://issues.apache.org/jira/browse/YARN-1461).
>
> This is useful in a case where an admin wants to attach extra
> information to Apex applications launched by various users/departments in
> the company.
>
> We can use the existing attribute, “APP_PACKAGE_SOURCE”, which is not used
> by the platform, but the name of that attribute won’t reflect the purpose.
>
> Even though we say attributes are for platform use, we need to open
> this up to admins for storing extra information.
>
> Let me know your thoughts on this.
>
> Thanks
>
>


[proposal] Application tags

2016-09-13 Thread Sandesh Hegde
Hi All,

I am proposing a new attribute, “Tags”, along similar lines to YARN
application tags (https://issues.apache.org/jira/browse/YARN-1461).

This is useful in a case where an admin wants to attach extra
information to Apex applications launched by various users/departments in
the company.

We can use the existing attribute, “APP_PACKAGE_SOURCE”, which is not used
by the platform, but the name of that attribute won’t reflect the purpose.

Even though we say attributes are for platform use, we need to open
this up to admins for storing extra information.

Let me know your thoughts on this.

Thanks


Re: [ANNOUNCE] New Apache Apex PMC Member: Chandni Singh

2016-09-12 Thread Sandesh Hegde
Congratulations Chandni

On Mon, Sep 12, 2016 at 10:46 AM Shubham Pathak 
wrote:

> Congratulations Chandni !!
>
> Thanks,
> Shubham
>
> On Mon, Sep 12, 2016 at 10:29 AM, Amol Kekre  wrote:
>
> > Chandni,
> > Congrats
> >
> > Thks
> > Amol
> >
> > On Mon, Sep 12, 2016 at 10:22 AM, Chanchal Singh <
> > chanchal.apex...@gmail.com
> > > wrote:
> >
> > > Congratulations Chandni .. All the best
> > >
> > > On Sep 12, 2016 10:03 PM, "Thomas Weise"  wrote:
> > >
> > > > The Apache Apex PMC is pleased to announce that Chandni Singh is now
> a
> > > PMC
> > > > member. We appreciate all her contributions to the project so far,
> and
> > > are
> > > > looking forward to more.
> > > >
> > > > Congrats Chandni!
> > > > Thomas, for the Apache Apex PMC.
> > > >
> > >
> >
>


Re: [VOTE] Apache Apex Malhar Release 3.5.0 (RC2)

2016-08-31 Thread Sandesh Hegde
+1

Did the tests mentioned below
http://apex.apache.org/verification.html



On Wed, Aug 31, 2016 at 4:06 PM David Yan  wrote:

> +1 (binding)
>
> Downloaded the source, built with "mvn clean apache-rat:check verify
> -Dlicense.skip=false -Pall-modules install" successfully
> Checked LICENSE, NOTICE, README.md and CHANGELOG.md
> Ran pi-demo for more than 5 minutes now in a single-node cluster with no
> issues
>
> David
>
> On Tue, Aug 30, 2016 at 11:37 PM, Thomas Weise  wrote:
>
> > Dear Community,
> >
> > Please vote on the following Apache Apex Malhar 3.5.0 release candidate.
> >
> > RC2 fixes the copyright related issues with some test data files.
> >
> > This is a source release with binary artifacts published to Maven.
> >
> > This release is based on Apex Core 3.4 and comes with 61 resolved issues.
> >
> > The release advances the high level stream API to support stateful
> > transformations with Beam style windowing semantics. The demo package has
> > examples for usage of the API. There are also important improvements to
> > underlying operator state management components, which are functional
> first
> > cut and will be enhanced in upcoming releases, such as WindowOperator,
> > spillable collections and incremental state saving.
> >
> > The release also adds several new operators.
> >
> > List of all issues fixed: https://s.apache.org/5vQi
> >
> > Staging directory:
> >
> https://dist.apache.org/repos/dist/dev/apex/apache-apex-malhar-3.5.0-RC2/
> > Source zip:
> > https://dist.apache.org/repos/dist/dev/apex/apache-apex-
> > malhar-3.5.0-RC2/apache-apex-malhar-3.5.0-source-release.zip
> > Source tar.gz:
> > https://dist.apache.org/repos/dist/dev/apex/apache-apex-
> > malhar-3.5.0-RC2/apache-apex-malhar-3.5.0-source-release.tar.gz
> > Maven staging repository:
> > https://repository.apache.org/content/repositories/orgapacheapex-1017/
> >
> > Git source:
> > https://git-wip-us.apache.org/repos/asf?p=apex-malhar.git;a=
> > commit;h=refs/tags/v3.5.0-RC2
> >  (commit: f4c975a1ba9e7e4c68cd02924e526e612906b3e7)
> >
> > PGP key:
> > http://pgp.mit.edu:11371/pks/lookup?op=vindex=t...@apache.org
> > KEYS file:
> > https://dist.apache.org/repos/dist/release/apex/KEYS
> >
> > More information at:
> > http://apex.apache.org
> >
> > Please try the release and vote; vote will be open for at least 72 hours.
> >
> > [ ] +1 approve (and what verification was done)
> > [ ] -1 disapprove (and reason why)
> >
> > http://www.apache.org/foundation/voting.html
> >
> > How to verify release candidate:
> >
> > http://apex.apache.org/verification.html
> >
> > Thanks,
> > Thomas
> >
>


Re: Malhar 3.5.0 release

2016-08-26 Thread Sandesh Hegde
We can have categorized highlights; individuals can update them under the
proper categories.

New Operators
New Features
Major Bug Fixes
Miscellaneous



On Fri, Aug 26, 2016 at 2:04 PM Thomas Weise  wrote:

> There are 2 issues left which, unless they get resolved today, I suggest we
> move to 3.6.
>
> There is a lot of good stuff going into this release that users may want to
> know about. Can everyone please have a look at resolved issues and reply if
> you see items that you think we should highlight in the release
> announcement.
>
>
> https://issues.apache.org/jira/issues/?jql=fixVersion%20%3D%203.5.0%20AND%20project%20%3D%20APEXMALHAR
>
>
>
> On Fri, Aug 26, 2016 at 7:18 AM, Thomas Weise 
> wrote:
>
> > See comments on PR.
> >
> > On Fri, Aug 26, 2016 at 7:16 AM, Yogi Devendra <
> > devendra.vyavah...@gmail.com> wrote:
> >
> >> Fix for APEXMALHAR-2206 has been reviewed and merged into asf master.
> >>
> >> ~ Yogi
> >>
> >> On 26 August 2016 at 19:04, Yogi Devendra  >
> >> wrote:
> >>
> >> > We noticed that lc.runAsync() is called without a corresponding
> >> > lc.shutdown() at a couple of places in unit tests.
> >> >
> >> > This caused:
> >> > 1. Async thread continuing.
> >> > 2. Build not terminating.
> >> > 3. Target dir size increasing indefinitely.
> >> >
> >> > @All reviewers
> >> > Please make sure to check for lc.shutdown() for every lc.runAsync().
> >> >
> >> > ~ Yogi
> >> >
> >> > On 26 August 2016 at 17:07, Bhupesh Chawda 
> >> > wrote:
> >> >
> >> >> Hi Thomas,
> >> >>
> >> >> https://issues.apache.org/jira/browse/APEXMALHAR-2206 is one small
> bug
> >> >> which Yogi pointed out. It was intermittently taking too long to
> >> complete
> >> >> the travis builds. We thought it would be a good idea to get it in
> for
> >> the
> >> >> release.
> >> >>
> >> >> I have just opened a PR: https://github.com/apache/apex
> >> -malhar/pull/383
> >> >>
> >> >> ~ Bhupesh
> >> >>
> >> >>
> >> >> On Fri, Aug 26, 2016 at 9:45 AM, Yogi Devendra <
> >> >> devendra.vyavah...@gmail.com
> >> >> > wrote:
> >> >>
> >> >> > https://github.com/apache/apex-malhar/pull/372 is open for
> >> >> > APEXMALHAR-2195.
> >> >> >
> >> >> > I have incorporated the review comments from @gauravgopi123,
> @tweise
> >> >> > I have updated fix version to 3.5.0 to reflect this ticket in the
> >> query.
> >> >> >
> >> >> > Can someone please review and merge this for 3.5.0?
> >> >> >
> >> >> > ~ Yogi
> >> >> >
> >> >> > On 25 August 2016 at 21:41, Chaitanya Chebolu <
> >> >> chaita...@datatorrent.com>
> >> >> > wrote:
> >> >> >
> >> >> > > Hi Thomas,
> >> >> > >
> >> >> > >I will fix APEXMALHAR-2134.
> >> >> > >
> >> >> > > Regards,
> >> >> > > Chaitanya
> >> >> > >
> >> >> > > On Thu, Aug 25, 2016 at 9:13 PM, Thomas Weise <
> >> tho...@datatorrent.com
> >> >> >
> >> >> > > wrote:
> >> >> > >
> >> >> > > > We are almost ready to get the RC out.
> >> >> > > >
> >> >> > > > Can someone fix APEXMALHAR-2134 for the release?
> >> >> > > >
> >> >> > > >
> >> >> > > > On Wed, Aug 17, 2016 at 12:56 PM, Thomas Weise <
> >> >> tho...@datatorrent.com
> >> >> > >
> >> >> > > > wrote:
> >> >> > > >
> >> >> > > > > Friendly reminder.. There are still a few unresolved tickets
> >> for
> >> >> 3.5:
> >> >> > > > >
> >> >> > > > > https://issues.apache.org/jira/issues/?jql=fixVersion%
> >> >> > > > > 20%3D%203.5.0%20AND%20project%20%3D%20APEXMALHAR%20AND%
> >> >> > > > 20resolution%20%3D%
> >> >> > > > > 20Unresolved%20ORDER%20BY%20priority%20DESC
> >> >> > > > >
> >> >> > > > > Please remove the fix version unless you expect to complete
> the
> >> >> work
> >> >> > in
> >> >> > > > > the next few days.
> >> >> > > > >
> >> >> > > > > And we need to complete the PR for high level API:
> >> >> > > > >
> >> >> > > > > https://issues.apache.org/jira/browse/APEXMALHAR-2142
> >> >> > > > >
> >> >> > > > > Thanks,
> >> >> > > > > Thomas
> >> >> > > > >
> >> >> > > > >
> >> >> > > > > On Fri, Aug 5, 2016 at 2:12 AM, Chaitanya Chebolu <
> >> >> > > > > chaita...@datatorrent.com> wrote:
> >> >> > > > >
> >> >> > > > >> APEXMALHAR-2100. PR is open: https://github.com/apache/apex
> >> >> > > > >> -malhar/pull/330
> >> >> > > > >>
> >> >> > > > >> Regards,
> >> >> > > > >> Chaitanya
> >> >> > > > >>
> >> >> > > > >> On Fri, Aug 5, 2016 at 1:10 PM, Priyanka Gugale <
> >> >> > > > priya...@datatorrent.com
> >> >> > > > >> >
> >> >> > > > >> wrote:
> >> >> > > > >>
> >> >> > > > >> > APEXMALHAR-2171
> >> >> > > > >> > PR is open:
> https://github.com/apache/apex-malhar/pull/358
> >> >> > > > >> >
> >> >> > > > >> > -Priyanka
> >> >> > > > >> >
> >> >> > > > >> > On Fri, Aug 5, 2016 at 12:32 PM, Yogi Devendra <
> >> >> > > > >> > devendra.vyavah...@gmail.com
> >> >> > > > >> > > wrote:
> >> >> > > > >> >
> >> >> > > > >> > > Fix for APEXMALHAR-2176
> >> >> > > > >> > > PR open: https://github.com/apache/apex-malhar/pull/361
> >> >> > > > >> > >
> >> >> > > > >> > > ~ 

Re: [Proposal] DAG listener

2016-08-15 Thread Sandesh Hegde
@Sanjay
Here is the issue I mentioned
https://issues.apache.org/jira/browse/APEXCORE-496

Instead of exposing one piece of information at a time, why not publish all
the information?
To start with, it could just be the physical DAG, with updates to the
subscribers after partitioning.

Thanks


On Wed, Aug 10, 2016 at 10:40 AM Sanjay Pujare <san...@datatorrent.com>
wrote:

> Looks like a good idea but would like more details about the use case
> (“example ... to access the operator name while using Batched
> StatsListener”).
>
> Do you propose very fine grained subscription model? Also what “major
> activities in the DAG” will be available thru this listener?
>
>
>
> On 8/10/16, 8:20 AM, "Sandesh Hegde" <sand...@datatorrent.com> wrote:
>
> Any operators can subscribe to Stram Events affecting the DAG.
>
> Implementation will most probably use heartbeat.
>
> On Tue, Aug 9, 2016 at 11:43 PM Tushar Gosavi <tus...@datatorrent.com>
> wrote:
>
> > Hi Sandesh,
> >
> > Dag changes are handled in Stram, and operators are running in
> > different containers.
> > Are you suggesting an RPC interface or operator request for sending
> > this information
> > from Stram to all partitions of the interested operator?
> >
> > - Tushar.
> >
> >
> >
> > On Wed, Aug 10, 2016 at 11:28 AM, Sandesh Hegde <
> sand...@datatorrent.com>
> > wrote:
> > > Hi All,
> > >
> > > As we add more features to support batch use cases, there will be
> a need
> > to
> > > access more information about the DAG from an operator. One
> example is
> > the need
> > > to access the operator name while using Batched StatsListener.
> > >
> > > The idea here is to implement DAG Listener ( similar to
> StatsListener ),
> > > which can potentially expose all the information present in the Stram.
> > > Operators will also have visibility into all the major activities
> in the
> > > DAG.
> > >
> > > Thoughts?
> > >
> > > Thanks
> >
>
>
>
>


Re: [Proposal] DAG listener

2016-08-10 Thread Sandesh Hegde
Any operators can subscribe to Stram Events affecting the DAG.

Implementation will most probably use heartbeat.

On Tue, Aug 9, 2016 at 11:43 PM Tushar Gosavi <tus...@datatorrent.com>
wrote:

> Hi Sandesh,
>
> Dag changes are handled in Stram, and operators are running in
> different containers.
> Are you suggesting an RPC interface or operator request for sending
> this information
> from Stram to all partitions of the interested operator?
>
> - Tushar.
>
>
>
> On Wed, Aug 10, 2016 at 11:28 AM, Sandesh Hegde <sand...@datatorrent.com>
> wrote:
> > Hi All,
> >
> > As we add more features to support batch use cases, there will be a need
> to
> > access more information about the DAG from an operator. One example is
> the need
> > to access the operator name while using Batched StatsListener.
> >
> > The idea here is to implement DAG Listener ( similar to StatsListener ),
> > which can potentially expose all the information present in the Stram.
> > Operators will also have visibility into all the major activities in the
> > DAG.
> >
> > Thoughts?
> >
> > Thanks
>


[Proposal] DAG listener

2016-08-09 Thread Sandesh Hegde
Hi All,

As we add more features to support batch use cases, there will be a need to
access more information about the DAG from an operator. One example is the need
to access the operator name while using Batched StatsListener.

The idea here is to implement a DAG Listener (similar to StatsListener),
which can potentially expose all the information present in the Stram.
Operators will also have visibility into all the major activities in the
DAG.

Thoughts?

Thanks
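For illustration, a DAG listener contract along the lines proposed could look
like the sketch below; the interface, event names, and bus are all invented
for this example (they are not Apex APIs), mirroring how a StatsListener
receives stats today, with delivery piggybacked on the heartbeat as suggested
in this thread:

```java
import java.util.ArrayList;
import java.util.List;

// The single-method interface lets subscribers be written as lambdas.
interface DagListener {
  void onDagEvent(DagEvent event);
}

class DagEvent {
  final String type;         // e.g. "PARTITIONED", "OPERATOR_DEPLOYED"
  final int operatorId;
  final String operatorName; // the information missing from BatchedOperatorStats

  DagEvent(String type, int operatorId, String operatorName) {
    this.type = type;
    this.operatorId = operatorId;
    this.operatorName = operatorName;
  }
}

// The Stram side would publish events to subscribed operators.
class DagEventBus {
  private final List<DagListener> subscribers = new ArrayList<>();

  void subscribe(DagListener l) {
    subscribers.add(l);
  }

  void publish(DagEvent e) {
    for (DagListener l : subscribers) {
      l.onDagEvent(e);
    }
  }
}
```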


Re: [Proposal] Named Checkpoints

2016-08-08 Thread Sandesh Hegde
The idea here was to create a recovery/committed window on demand, but
there is always one (except before the first) recovery window for the DAG.
Instead of using/modifying the Checkpoint tuple, I am planning to reuse
the existing recovery window state, which simplifies the implementation.

Proposed API:

ApexCli> savepoint  
ApexCli> launch -savepoint 

first prototype:
https://github.com/sandeshh/apex-core/commit/8ec7e837318c2b33289251cda78ece0024a3f895

Thanks

On Thu, Aug 4, 2016 at 11:54 AM Amol Kekre <a...@datatorrent.com> wrote:

> hmm! actually it may be a good debugging tool too. Keep the named
> checkpoints around. The feature is to keep checkpoints around, which can be
> done by giving a feature to not delete checkpoints, but then naming them
> makes it more operational. Send a command from cli->get checkpoint -> know
> it is the one you need as the file name has your string you send with the
> command -> debug. This is different than querying state, as this gives the
> entire app checkpoint to debug with.
>
> Thks
> Amol
>
>
> On Thu, Aug 4, 2016 at 11:41 AM, Venkatesh Kottapalli <
> venkat...@datatorrent.com> wrote:
>
> > + 1 for the idea.
> >
> > It might be helpful to developers as well, when dealing with a variety of
> > data in large volumes, if this can help them run from the checkpointed
> > state rather than rerunning the application altogether in case of issues.
> >
> > I have seen cases where the application runs for more than 10 hours and
> > some partitions fail because of the variety of data being dealt with. In
> > such cases the application has to be restarted, and a feature of this kind
> > will be helpful to developers.
> >
> >  The ease of enabling/disabling this feature to run the app will also be
> > important.
> >
> > -Venkatesh.
> >
> >
> > > On Aug 4, 2016, at 10:29 AM, Amol Kekre <a...@datatorrent.com> wrote:
> > >
> > > We had a user who wanted rollback and restart for audit purposes. At that
> > > time we did not have timed windows. Named checkpoints would have helped a
> > > little bit.
> > >
> > > Problem statement: Auditors ask for rerun of yesterday's computations
> for
> > > verification. Assume that these computations depend on previous state
> > (i.e
> > > data from day before yesterday).
> > >
> > > Solution
> > > 1. Have named checkpoints at 12 in the night (an input adapter triggers
> > it)
> > > every day
> > > 2. The app spools raw logs into hdfs along with window ids and event
> > times
> > > 3. The re-run is a separate app that starts off on a named checkpoint
> (12
> > > night yesterday)
> > >
> > > Technically the solution will not be as simple, and the "new audit app"
> > > will need a lot of other checks (dedups, dropping events not in
> > > yesterday's window, waiting for late arrivals, ...), but a named
> > > checkpoint helps.
> > >
> > > I do agree with Pramod that replay within the same running app is not
> > > viable within a data-in-motion architecture, but it helps somewhat in a
> > > new audit app. Named checkpoints help data-in-motion architectures handle
> > > batch apps better. In the above case, #2 (spooling done with event
> > > timestamp plus state) suffices. The state part comes from the named
> > > checkpoint.
> > >
> > > Thks,
> > > Amol
> > >
> > >
> > >
> > >
> > > On Thu, Aug 4, 2016 at 10:12 AM, Sanjay Pujare <san...@datatorrent.com
> >
> > > wrote:
> > >
> > >> I agree. A specific use-case will be useful to support this feature.
> > Also
> > >> the ability to replay from the named checkpoint will be limited
> because
> > of
> > >> various factors, isn’t it?
> > >>
> > >> On 8/4/16, 9:00 AM, "Pramod Immaneni" <pra...@datatorrent.com> wrote:
> > >>
> > >>There is a problem here, keeping old checkpoints and recovering
> from
> > >> them
> > >>means preserving the old input data along with the state. This is
> > more
> > >> than
> > >>the mechanism of actually creating named checkpoints, it means
> having
> > >> the
> > >>ability for operators to move forward (a.k.a committed and dropping
> > >>    committed states and buffer data) while still having the ability to
> > >> replay
> > >>from that point from the input source 

Re: [Proposal] Named Checkpoints

2016-08-04 Thread Sandesh Hegde
@Chinmay
We can enhance the existing checkpoint tuple, but it is used far more
frequently than this feature, so why burden the Checkpoint tuple with
an extra field?

@Aniruddha
It is better to leave the scheduling to the users; they can use any tool
that they are already familiar with.

On Thu, Aug 4, 2016 at 7:40 AM Aniruddha Thombare <anirud...@datatorrent.com>
wrote:

> +1 On the idea, it would be awesome to have.
>
> Question: Can we further develop this brilliant idea into
> scheduled checkpoints (saved as dynamically named checkpoints)?
> This would be along the lines of logrotate / general backup strategies.
>
>
> Thanks,
>
> A
>
> _
> Sent with difficulty, I mean handheld ;)
> On 4 Aug 2016 8:03 pm, "Munagala Ramanath" <r...@datatorrent.com> wrote:
>
> > +1
> >
> > Ram
> >
> > On Thu, Aug 4, 2016 at 12:10 AM, Sandesh Hegde <sand...@datatorrent.com>
> > wrote:
> >
> > > Hello Team,
> > >
> > > This thread is to discuss the Named Checkpoint feature for Apex. (
> > > https://issues.apache.org/jira/browse/APEXCORE-498)
> > >
> > > Named checkpoints allow following workflow,
> > >
> > > 1. Users can trigger a checkpoint and give it a name
> > > 2. Relaunch the application from the named checkpoint.
> > > 3. These checkpoints survive the "purge of old checkpoints".
> > >
> > > The current idea is to add a new control tuple, NamedCheckPointTuple,
> > > which contains the user-specified name; it traverses the DAG, and the
> > > necessary actions are taken along the way.
> > >
> > > Please let me know your thoughts on this.
> > >
> > > Thanks
> > >
> >
>
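The NamedCheckPointTuple proposed above can be pictured as a simple value object carrying the user-specified name together with the window id it marks. The sketch below is plain Java for illustration only; the class name and fields are assumptions, not the actual Apex implementation.

```java
// Illustrative sketch only -- not the actual Apex control-tuple class.
// A named-checkpoint tuple carries the user-specified name alongside the
// window id, so the checkpoint it marks can be exempted from the purge.
public class NamedCheckpointTuple {
  private final String name;    // user-specified checkpoint name
  private final long windowId;  // window at which the checkpoint is taken

  public NamedCheckpointTuple(String name, long windowId) {
    if (name == null || name.isEmpty()) {
      throw new IllegalArgumentException("checkpoint name must be non-empty");
    }
    this.name = name;
    this.windowId = windowId;
  }

  public String getName() {
    return name;
  }

  public long getWindowId() {
    return windowId;
  }
}
```

As the tuple traverses the DAG, each operator would take its checkpoint action and forward the tuple downstream unchanged.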


Re: Operator name in BatchedOperatorStats interface.

2016-07-25 Thread Sandesh Hegde
How about making the mapping of "OperatorId" to "OperatorName" (and other
extra information) a part of the DAG context?


On Mon, Jul 18, 2016 at 10:53 PM Tushar Gosavi 
wrote:

> Hi All,
>
> We support shared stats listeners, but the user does not have any way to
> identify the operator for which the stats listener is called.
> BatchedOperatorStats only provides the operatorId, and there is no API
> available to get operator information from the operatorId.
>
> Can we include the operator name as part of BatchedOperatorStats? This
> will allow users to take decisions based on which operator's stats are
> being processed.
>
> Regards,
> -Tushar.
>
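The mapping suggested above can be sketched as a small registry: the platform would populate it from the logical plan, and a shared stats listener would use it to resolve the operatorId it receives. This is a plain-Java illustration; all names here are hypothetical, not the Apex API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: an operatorId -> operatorName mapping (e.g. kept in
// the DAG context) that a shared stats listener can consult when it is
// handed only an operator id via BatchedOperatorStats.
public class OperatorNameRegistry {
  private final Map<Integer, String> idToName = new HashMap<>();

  public void register(int operatorId, String operatorName) {
    idToName.put(operatorId, operatorName);
  }

  // Resolve an operator id to its logical name, with a readable fallback.
  public String nameOf(int operatorId) {
    return idToName.getOrDefault(operatorId, "unknown-operator-" + operatorId);
  }
}
```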


Re: [Proposal] Support storing apps in a Configuration Package

2016-07-21 Thread Sandesh Hegde
Configuration packages will contain JSON apps.
During an app launch, users can choose to see and launch only the apps
present in the config package.

It is like a hub-and-spoke model: a single AppPackage and multiple custom
views.

Initial work is here,
https://github.com/apache/apex-core/pull/360

On Thu, Jul 21, 2016 at 5:02 PM Sasha Parfenov <sas...@apache.org> wrote:

> Sounds promising.  Perhaps you can elaborate: does it mean that we're adding
> JSON apps to the Configuration Packages spec?  Or that we're providing
> support to link Configuration Packages to existing App Package Apps?  Or
> something else?
>
> Thanks,
> Sasha
>
>
>
> On Tue, Jul 19, 2016 at 5:37 PM, Sandesh Hegde <sand...@datatorrent.com>
> wrote:
>
> > Hi All,
> >
> > Apex supports configuration packages, which separate the application
> > package from the actual configuration.
> > (http://docs.datatorrent.com/configuration_packages/)
> >
> > We want to enhance the configuration package by adding support to "add
> > Apps" (json format).
> >
> > Use case: Multiple users sharing the same app package, but having a
> > different view of the golden copy of the app package.
> >
> > Note: This feature is requested by an Apex user.
> >
> > Thanks
> >
>


Re: Bleeding edge branch ?

2016-07-20 Thread Sandesh Hegde
@Amol

EOL is important for the master branch. To start work on the next version of
Hadoop on a different branch (let us call it master++), we should not worry
about EOL. Eventually, master++ becomes master and continues on the later
version of Hadoop.



On Wed, Jul 20, 2016 at 10:30 AM Siyuan Hua <siy...@datatorrent.com> wrote:

> Ok, whether branches or forks, I still think we should have at least some
> materialized version of Malhar/Core for the big influencers like Java,
> Hadoop, or even Kafka. Java 8, for example, is actually not new. We don't
> have to be aggressive in trying out new features from those right now. But
> we can at least have some CI run build/test periodically to make sure our
> current code is future-proof and avoid some soon-to-be-deprecated code when
> we add new features. Also, if people ask for it, we can have a link to
> point them to. BTW, the high-level API can definitely benefit from Java 8. :)
>
> Regards,
> Siyuan
>
> On Wed, Jul 20, 2016 at 8:30 AM, Sandesh Hegde <sand...@datatorrent.com>
> wrote:
>
> > Our current model of supporting the oldest supported Hadoop penalizes
> > the users of the latest Hadoop versions by favoring the slow movers.
> > Also, we won't benefit from the increased maturity of the Hadoop
> > platform, as we will be working on a many-years-old version of Hadoop.
> > We also need to incentivize our customers to upgrade their Hadoop
> > version by making use of new features.
> >
> > My vote goes to starting the work on Hadoop 2.6 (or any other version)
> > in a different branch, without waiting for the EOL policies.
> >
> > On Tue, Jul 12, 2016 at 1:16 AM Thomas Weise <tho...@datatorrent.com>
> > wrote:
> >
> > > -0
> > >
> > > I read the thread twice; it is not clear to me what benefit Apex users
> > > derive from this exercise. A branch normally contains development work
> > > that is eventually brought back to the main line and into a release.
> > > Here, the suggestion seems to be an open-ended effort to play with the
> > > latest tech; isn't that something anyone (including a group of folks)
> > > can do in a fork? I don't see value in a permanent branch for that;
> > > who is going to maintain such code, and who will ever use it?
> > >
> > > There was a point that we can find out about potential problems with
> > later
> > > versions. The way to find such issues is to take the releases and run
> > them
> > > on these later versions (that's what users do), not by changing the
> code!
> > >
> > > Regarding Java version: Our users don't use Apex in a vacuum. Please
> > > have a look at ASF Hadoop and the distros' EOL policies. That will
> > > answer the question of what Java version is appropriate. I would be
> > > surprised if something that works on Java 7 falls flat on its face
> > > with Java 8, as a lot of diligence goes into backward compatibility.
> > > Again, the way to test this is to run verification with existing Apex
> > > releases on a Java 8 based stack.
> > >
> > > Regarding Hadoop version: This has been discussed off record several
> > times
> > > and there are actual JIRA tickets marked accordingly so that the work
> is
> > > done when we move. It is a separate discussion, no need to mix Java
> > > versions and branching with it. I agree with what David said, if
> someone
> > > can show that we can move up to 2.6 based on EOL policies and what
> known
> > > Apex users have in production, then we should work on that upgrade. The
> > way
> > > I imagine it would work is that we have a Hadoop-2.6 (or whatever
> > version)
> > > branch, make all the upgrade related changes there (which should be a
> > list
> > > of JIRAs) and then merge it back to master when we are satisfied. After
> > > that, the branch can be deleted.
> > >
> > > Thomas
> > >
> > >
> > >
> > > On Tue, Jul 12, 2016 at 8:36 AM, Chinmay Kolhatkar <
> > > chin...@datatorrent.com>
> > > wrote:
> > >
> > > > I'm -0 on this idea.
> > > >
> > > > Here is the reason:
> > > > Unless we see a real case where users want to see everything on
> latest,
> > > > this branch might quickly become low-hanging fruit and eventually get
> > > > obsolete because it's anyway a "no guarantee" branch.
> > > >
> > > > We have a bunch of dependencies which we'll

Re: Bleeding edge branch ?

2016-07-20 Thread Sandesh Hegde
Our current model of supporting the oldest supported Hadoop penalizes the
users of the latest Hadoop versions by favoring the slow movers.
Also, we won't benefit from the increased maturity of the Hadoop platform,
as we will be working on a many-years-old version of Hadoop.
We also need to incentivize our customers to upgrade their Hadoop version
by making use of new features.

My vote goes to starting the work on Hadoop 2.6 (or any other version)
in a different branch, without waiting for the EOL policies.

On Tue, Jul 12, 2016 at 1:16 AM Thomas Weise  wrote:

> -0
>
> I read the thread twice; it is not clear to me what benefit Apex users
> derive from this exercise. A branch normally contains development work that
> is eventually brought back to the main line and into a release. Here, the
> suggestion seems to be an open-ended effort to play with the latest tech;
> isn't that something anyone (including a group of folks) can do in a fork?
> I don't see value in a permanent branch for that; who is going to maintain
> such code, and who will ever use it?
>
> There was a point that we can find out about potential problems with later
> versions. The way to find such issues is to take the releases and run them
> on these later versions (that's what users do), not by changing the code!
>
> Regarding Java version: Our users don't use Apex in a vacuum. Please have a
> look at ASF Hadoop and the distros' EOL policies. That will answer the
> question of what Java version is appropriate. I would be surprised if
> something that works on Java 7 falls flat on its face with Java 8, as a lot
> of diligence goes into backward compatibility. Again, the way to test this
> is to run verification with existing Apex releases on a Java 8 based stack.
>
> Regarding Hadoop version: This has been discussed off record several times
> and there are actual JIRA tickets marked accordingly so that the work is
> done when we move. It is a separate discussion, no need to mix Java
> versions and branching with it. I agree with what David said, if someone
> can show that we can move up to 2.6 based on EOL policies and what known
> Apex users have in production, then we should work on that upgrade. The way
> I imagine it would work is that we have a Hadoop-2.6 (or whatever version)
> branch, make all the upgrade related changes there (which should be a list
> of JIRAs) and then merge it back to master when we are satisfied. After
> that, the branch can be deleted.
>
> Thomas
>
>
>
> On Tue, Jul 12, 2016 at 8:36 AM, Chinmay Kolhatkar <
> chin...@datatorrent.com>
> wrote:
>
> > I'm -0 on this idea.
> >
> > Here is the reason:
> > Unless we see a real case where users want to see everything on latest,
> > this branch might quickly become low-hanging fruit and eventually get
> > obsolete because it's anyway a "no guarantee" branch.
> >
> > We have a bunch of dependencies which we'll have to take care of to
> > really make it bleeding edge. Especially for Malhar, it's a long list.
> > That looks like quite significant work.
> > Moreover, if this branch is going to be in a "may or may not work"
> > state, I, as a user or developer, would bank on what certainly works.
> >
> > I also think that if it's going to be "no guarantee", then it's worth
> > spending time on contributions towards master rather than the
> > bleeding-edge branch.
> >
> > If a question of "should we upgrade?" comes, the community is mature to
> > take that call then and work accordingly.
> >
> > -Chinmay.
> >
> >
> >
> > On Tue, Jul 12, 2016 at 11:42 AM, Priyanka Gugale 
> > wrote:
> >
> > > +1 for creating such a branch.
> > > One of us will have to rebase it onto the master branch at intervals. I
> > > don't think everyone will cherry-pick their commits here. We can make
> > > it a once-a-month activity. Are we considering updating all dependency
> > > library versions as well?
> > >
> > > -Priyanka
> > >
> > > On Tue, Jul 12, 2016 at 2:34 AM, Munagala Ramanath <
> r...@datatorrent.com>
> > > wrote:
> > >
> > > > Following up on some comments, I wanted to clarify what I have in
> > > > mind for this branch:
> > > >
> > > > 1. The main goal is to stay up-to-date with new releases, so if a
> > > > question of the form "A new release of X is available, should we
> > > > upgrade?" comes up, the answer is *always* an *emphatic* yes;
> > > > otherwise it doesn't bleed enough (:-) as Sanjay points out.
> > > > 2. Pull requests are submitted as always; there is no requirement to
> > > > generate an additional pull request against this branch. It may get
> > > > merged/cherry-picked depending on who has the time and inclination
> > > > to do it.
> > > > 3. There is no expectation of dedication of any additional
> > > > resources, so people work on it as and when time is available. ("No
> > > > guarantee" means exactly that.) So there is no
> > > > question of 

Re: Dynamic partition is not working in Kafka Input Operator

2016-07-18 Thread Sandesh Hegde
Was this resolved?

My understanding is that the Kafka input operator doesn't support changes
to Kafka partitions after the initial launch.

On Mon, Jul 18, 2016 at 1:54 AM Chaitanya Chebolu 
wrote:

> Hi All,
>
>I am facing dynamic partition issues in the 0.8 version of the Kafka
> Input Operator. My application has the following DAG:
>
>KafkaSinglePortStringInputOperator(Input) ->
> ConsoleOutputOperator(Output)
>
>I launched the application with below configuration:
> Kafka topic created with single partition and replication factor as 1.
> Partition Strategy: ONE_TO_ONE
>
>Launched the application successfully. After some time, I increased the
> topic partitions to 2. After re-partition, the window of the downstream
> operator is not moving. Looking at the app's physical DAG, it looks like
> there is an issue in the construction of the physical DAG after re-partition.
>
> Please let me know if anyone has observed the same behavior. Do we have a
> JIRA for tracking this issue?
> I am attaching some of the screenshots of this application.
>
> Regards,
> Chaitanya
>
>


Re: Bleeding edge branch ?

2016-07-11 Thread Sandesh Hegde
+1 with some variation

Support the next Hadoop version after the one the Apex mainline supports,
instead of the latest Hadoop. This makes moving the Apex mainline to the
next Hadoop version easy.



On Mon, Jul 11, 2016 at 10:33 AM Sanjay Pujare 
wrote:

> strong +1 (will be nice to have some dedicated resource to maintain this
> branch)
>
> On Mon, Jul 11, 2016 at 8:59 AM, Munagala Ramanath 
> wrote:
>
> > We've had a number of issues recently related to dependencies on old
> > versions
> > of various packages/libraries such as Hadoop itself, Google guava,
> > HTTPClient,
> > mbassador, etc.
> >
> > How about we create a "bleeding-edge" branch in both Core and Malhar
> which
> > will use the latest versions of these various dependencies, upgrade to
> Java
> > 8 so
> > we can use the new Java features, etc. ?
> >
> > This will give us an opportunity to discover these sorts of problems
> early
> > and,
> > when we are ready to pull the trigger for a major version, we have a
> branch
> > ready
> > for merge with, hopefully, minimal additional effort.
> >
> > There will be no guarantees w.r.t. this branch so people using it use it
> at
> > their own
> > risk.
> >
> > Ram
> >
>


Re: [Proposal] Make the default Unifier ThreadLocal with the downstream operator

2016-07-08 Thread Sandesh Hegde
Created a jira for this issue,
https://issues.apache.org/jira/browse/APEXCORE-482


On Thu, Jul 7, 2016 at 9:22 PM Amol Kekre <a...@datatorrent.com> wrote:

> +1. Makes sense. We do need to allow users to override if they want.
>
> Thks
> Amol
>
>
> On Thu, Jul 7, 2016 at 6:53 PM, Sandesh Hegde <sand...@datatorrent.com>
> wrote:
>
> > Hi All,
> >
> > Unifiers are deployed as CONTAINER_LOCAL with the downstream operator
> > (except in the corner case of Mx1). Default unifiers essentially do a
> > buffer-to-buffer copy, so they should instead be THREAD_LOCAL to improve
> > performance.
> >
> > Let me know your thoughts on this.
> >
> > Thanks
> >
>


[Proposal] Make the default Unifier ThreadLocal with the downstream operator

2016-07-07 Thread Sandesh Hegde
Hi All,

Unifiers are deployed as CONTAINER_LOCAL with the downstream operator
(except in the corner case of Mx1). Default unifiers essentially do a
buffer-to-buffer copy, so they should instead be THREAD_LOCAL to improve
performance.

Let me know your thoughts on this.

Thanks
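To see why the default unifier is a candidate for THREAD_LOCAL deployment, note that it only forwards tuples. The plain-Java sketch below is illustrative only (not the Apex Unifier API; the names are assumptions): a pass-through unifier performs no computation, so running it in the downstream operator's thread removes a buffer-to-buffer copy between threads in the same container.

```java
import java.util.function.Consumer;

// Illustrative pass-through "default unifier": every tuple from the upstream
// partitions is forwarded to the downstream operator unchanged, so deploying
// it in the downstream operator's thread adds no per-tuple work.
public class PassThroughUnifier<T> {
  private final Consumer<T> downstream;

  public PassThroughUnifier(Consumer<T> downstream) {
    this.downstream = downstream;
  }

  public void process(T tuple) {
    downstream.accept(tuple); // no transformation: copy-free forwarding
  }
}
```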


Re: Getting various statistics from StreamingContainers

2016-06-27 Thread Sandesh Hegde
Looked around to see how others are implementing this and found the feature
implemented in Storm.

Storm supports a whitelist of commands to run heap dumps, stack traces, and
a few others.

Looked at their implementation; they are also launching a new process with
the JDK tools to get the information.
https://issues.apache.org/jira/browse/STORM-1157


On Mon, May 23, 2016 at 11:31 AM Sandesh Hegde <sand...@datatorrent.com>
wrote:

> After connecting to the app, the user will run the following command.
>
> Users will select the container id, the JDK tool, and the arguments to the
> tool.
>
> Apex CLI API:
> run-jdkTools "Container-id" "Tool-name" "Arguments"
>
> The output of the command is interpreted by the user.
>
>
>
> On Mon, May 23, 2016 at 11:31 AM Thomas Weise <tho...@datatorrent.com>
> wrote:
>
>> I think it is appropriate to collect the information that the JVM provides
>> using the available API instead of running external processes.
>>
>> For other information, how do you suggest that will be provided to the
>> user?
>>
>> Thanks,
>> Thomas
>>
>>
>>
>> On Mon, May 23, 2016 at 11:27 AM, Sandesh Hegde <sand...@datatorrent.com>
>> wrote:
>>
>> > Users can pass the arguments to the JDK tools, so it exposes all the
>> > power of those tools. If we have to write the code, we are duplicating
>> > work.
>> > Also, it doesn't evolve with the new features of the JVM, but the tools
>> > will, and we just have to change the arguments that we pass.
>> >
>> > On Mon, May 23, 2016 at 11:15 AM Vlad Rozov <v.ro...@datatorrent.com>
>> > wrote:
>> >
>> > > What is the purpose of the new process? Why that information can't be
>> > > collected directly from JVM and passed to app master using heartbeat?
>> > >
>> > > Thank you,
>> > > Vlad
>> > >
>> > > On 5/23/16 10:57, Sandesh Hegde wrote:
>> > > > Hello All,
>> > > >
>> > > > Getting various information from the StreamingContainers is a
>> > > > useful feature to have.
>> > > > As StreamingContainers are JVMs, various JDK tools can be used to
>> > > > get the information.
>> > > >
>> > > > So the idea is to spawn a new process from the streaming containers
>> > > > and return the information via the Stram.
>> > > >
>> > > > Recently we added the feature to get a stack trace; I have modified
>> > > > that to show the idea I am talking about.
>> > > >
>> > > > Here is the pull request, the purpose of that is to show the idea,
>> let
>> > me
>> > > > know your thoughts.
>> > > > https://github.com/apache/incubator-apex-core/pull/340
>> > > >
>> > > > I have not created a jira yet, wanted to check the viability of the
>> > idea.
>> > > >
>> > > > Thanks
>> > > > Sandesh
>> > > >
>> > >
>> > >
>> >
>>
>
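The alternative raised in this thread -- collecting the information directly from the JVM instead of launching an external tool -- can be sketched with the standard management API. This is a minimal illustration, not the Apex implementation; a container could ship the resulting string to the app master over the existing heartbeat.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Minimal in-process stack dump using the JVM's own management API,
// avoiding the cost and permissions of spawning an external JDK tool.
public class InProcessStackDump {
  public static String dumpAllStacks() {
    ThreadMXBean bean = ManagementFactory.getThreadMXBean();
    StringBuilder sb = new StringBuilder();
    // false, false: skip lock/monitor details to keep the dump cheap
    for (ThreadInfo info : bean.dumpAllThreads(false, false)) {
      sb.append(info.toString());
    }
    return sb.toString();
  }
}
```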


Re: [DISCUSSION] Custom Control Tuples

2016-06-25 Thread Sandesh Hegde
Why restrict the control tuples to input operators?

On Sat, Jun 25, 2016 at 9:07 AM Amol Kekre  wrote:

> David,
> We should avoid control tuples within the window by simply restricting them
> through the API. This can be done by calling something like
> "sendControlTuple" between windows, notably in input operators.
>
> Thks
> Amol
>
>
> On Sat, Jun 25, 2016 at 7:32 AM, Munagala Ramanath 
> wrote:
>
> > What would the API look like for option 1? Another operator callback
> > called controlTuple(), or does the operator code have to check each
> > incoming tuple to see if it was data or control?
> >
> > Ram
> >
> > On Fri, Jun 24, 2016 at 11:42 PM, David Yan 
> wrote:
> >
> > > It looks like option 1 is preferred by the community. But let me
> > > elaborate on why I brought up the option of piggybacking on
> > > BEGIN_WINDOW and END_WINDOW.
> > >
> > > Option 2 implicitly enforces that the operations related to the custom
> > > control tuple be done at the streaming window boundary.
> > >
> > > For most operations, it makes sense to have that enforcement. Option 1
> > > opens the door to the possibility of sending and handling control
> tuples
> > > within a window, thus imposing a challenge of ensuring idempotency. In
> > > fact, allowing that would make idempotency extremely difficult to
> > achieve.
> > >
> > > David
> > >
> > > On Fri, Jun 24, 2016 at 4:38 PM, Vlad Rozov 
> > > wrote:
> > >
> > > > +1 for option 1.
> > > >
> > > > Thank you,
> > > >
> > > > Vlad
> > > >
> > > >
> > > > On 6/24/16 14:35, Bright Chen wrote:
> > > >
> > > >> +1
> > > >> It can also help to shut down the application gracefully.
> > > >> Bright
> > > >>
> > > >> On Jun 24, 2016, at 1:35 PM, Siyuan Hua 
> > wrote:
> > > >>>
> > > >>> +1
> > > >>>
> > > >>> I think it's good to have custom control tuples, and I prefer
> > > >>> option 1.
> > > >>>
> > > >>> Also, I think we should think about a couple of different
> > > >>> callbacks; they could be operator level (triggered when an
> > > >>> operator receives a control tuple) or DAG level (triggered when
> > > >>> the control tuple has flowed over the whole DAG).
> > > >>>
> > > >>> Regards,
> > > >>> Siyuan
> > > >>>
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Fri, Jun 24, 2016 at 12:42 PM, David Yan  >
> > > >>> wrote:
> > > >>>
> > > >>> My initial thinking is that the custom control tuples, just like
> the
> > >  existing control tuples, will only be generated from the input
> > > operators
> > >  and will be propagated downstream to all operators in the DAG. So
> > the
> > >  NxM
> > >  partitioning scenario works just like how other control tuples
> work,
> > >  i.e.
> > >  the callback will not be called unless all ports have received the
> > >  control
> > >  tuple for a particular window. This creates a little bit of
> > > complication
> > >  with multiple input operators though.
> > > 
> > >  David
> > > 
> > > 
> > >  On Fri, Jun 24, 2016 at 12:03 PM, Tushar Gosavi <
> > > tus...@datatorrent.com
> > >  >
> > >  wrote:
> > > 
> > >  +1 for the feature
> > > >
> > > > I am in favor of option 1, but we may need a helper method to avoid
> > > > a compiler error on typed ports, as calling port.emit(controlTuple)
> > > > will be an error if the types of the control tuple and the port do
> > > > not match; or a new method on the outputPort object,
> > > > emitControlTuple(ControlTuple).
> > > >
> > > > Can you give an example of piggybacking a tuple on the current
> > > > BEGIN_WINDOW and END_WINDOW control tuples?
> > > >
> > > > In the case of NxM partitioning, each downstream operator will
> > > > receive N control tuples. Will it call the user handler N times for
> > > > each downstream operator or just once?
> > > >
> > > > Regards,
> > > > - Tushar.
> > > >
> > > >
> > > >
> > > > On Fri, Jun 24, 2016 at 11:52 PM, David Yan <
> da...@datatorrent.com
> > >
> > > >
> > >  wrote:
> > > 
> > > > Hi all,
> > > >>
> > > >> I would like to propose a new feature to the Apex core engine --
> > the
> > > >> support of custom control tuples. Currently, we have control
> > tuples
> > > >>
> > > > such
> > > 
> > > > as
> > > >
> > > >> BEGIN_WINDOW, END_WINDOW, CHECKPOINT, and so on, but we don't
> have
> > > the
> > > >> support for applications to insert their own control tuples. The
> > way
> > > >> currently to get around this is to use data tuples and have a
> > > separate
> > > >>
> > > > port
> > > >
> > > >> for such tuples that sends tuples to all partitions of the
> > > downstream
> > > >> operators, which is not exactly developer friendly.
> > > >>
> > > >> We have already seen a number of use cases that can use this
> > > feature:
> > > 
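The delivery rule discussed in this thread -- the callback fires only after all input ports have received the control tuple for a particular window -- can be sketched as a small barrier. This is a plain-Java illustration with hypothetical names, not the Apex engine code.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of per-window control-tuple aggregation: with N upstream partitions
// feeding one downstream input, the user handler fires exactly once per
// window, after the control tuple has arrived from every partition.
public class ControlTupleBarrier {
  private final int upstreamCount;
  private final Map<Long, Integer> arrivals = new HashMap<>();

  public ControlTupleBarrier(int upstreamCount) {
    this.upstreamCount = upstreamCount;
  }

  // Returns true when this window's control tuple has arrived on every input.
  public boolean onControlTuple(long windowId) {
    int seen = arrivals.merge(windowId, 1, Integer::sum);
    if (seen == upstreamCount) {
      arrivals.remove(windowId); // fire the user handler exactly once
      return true;
    }
    return false;
  }
}
```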

Re: APEXCORE-408 : Ability to schedule Sub-DAG from running application

2016-06-21 Thread Sandesh Hegde
For use case 1, is it possible to avoid changing the Context? Can we
have something along the lines of "StramToNodeRequest"?

On Tue, Jun 21, 2016 at 11:09 AM Tushar Gosavi 
wrote:

> Hi All,
>
> We have seen a few use cases in the field which require Apex application
> scheduling based on some condition. This has also come up as part of
> Batch Support in Apex previously
> (
> http://mail-archives.apache.org/mod_mbox/apex-dev/201602.mbox/%3CCAKJfLDPXNsG1kEQs4T_2e%3DweRJLs4VeL%2B1-7LxO_bjovD9%2B-Rw%40mail.gmail.com%3E
> )
> . I am proposing the following functionality in Apex to help with
> scheduling and better resource utilization for batch jobs. Please provide
> your comments.
>
> Use case 1 - Dynamic DAG modification.
>
> Each operator in the DAG consumes YARN resources; sometimes it is
> desirable to return the resources to YARN when no data is available
> for processing, and deploy the whole DAG once data starts to appear. For
> this to happen automatically, we will need some data-monitoring
> operators running in the DAG to trigger restart and shutdown of the
> operators in the DAG.
>
> Apex already has such an API to dynamically change the running DAG
> through the CLI. We could provide a similar API to operators, which
> will trigger DAG modification at runtime. This information can be
> passed to the master using the heartbeat RPC, and the master will make
> the required changes to the DAG. Let me know what you think about it;
> something like below.
> Context.beginDagChange();
> context.addOperator("o1") <== launch operator from previous check-pointed
> state.
> context.addOperator("o2", new Operator2()) <== create new operator
> context.addStream("s1", "reader.output", "o1.input");
> context.shutdown("o3"); <== delete this and downstream operators from the
> DAG.
> context.apply();  <== dag changes will be send to master, and master
> will apply these changes.
>
> Similarly API for other functionalities such as locality settings
> needs to be provided.
>
>
> Use case 2 - Classic Batch Scheduling.
>
> Provide an API callable from an operator to launch a DAG. The operator
> will prepare a DAG object and submit it to YARN; the DAG will be
> scheduled as a new application. This way complex schedulers can be
> written as operators.
>
> public class SchedulerOperator implements Operator {
>   void handleIdleTime() {
>     // check for conditions to start a job (for example, enough files are
>     // available, enough items are available in Kafka, or the time has come)
>     Dag dag = context.createDAG();
>     dag.addOperator();
>     dag.addOperator();
>     LaunchOptions lOptions = new LaunchOptions();
>     lOptions.oldId = ""; // start from this checkpoint.
>     DagHandler dagHandler = context.submit(dag, lOptions);
>   }
> }
>
> DagHandler will have methods to monitor the final state of the
> application, or to kill the DAG:
> dagHandler.waitForCompletion() <== wait till the DAG terminates
> dagHandler.status()  <== get the status of application.
> dagHandler.kill() <== kill the running application.
> dagHandler.shutdown() <== shutdown the application.
>
> More complex scheduler operators could be written to manage
> workflows, i.e., DAGs of DAGs, using these APIs.
>
> Regards,
> -Tushar.
>


Re: Proposal : DAG - SetOperatorAttribute

2016-06-20 Thread Sandesh Hegde
Hi All,

This change has been merged and will be part of 3.5.

New API in the DAG: setOperatorAttribute(Operator operator, Attribute<T> key,
T value).
Another API was deprecated: "setAttribute(Operator ...)".

Thanks
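The value of a dedicated setOperatorAttribute comes from the typed-attribute pattern: the Attribute key carries its value type, so assignments are checked at compile time. Below is a minimal self-contained sketch of that pattern; the class names are illustrative only, not the Apex classes.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal typed-attribute map: the key's type parameter ties the value type
// to the key, so setOperatorAttribute(key, value) is compile-time checked.
public class AttributeMap {
  public static final class Attribute<T> {
    public final String name;

    public Attribute(String name) {
      this.name = name;
    }
  }

  private final Map<Attribute<?>, Object> values = new HashMap<>();

  public <T> void setOperatorAttribute(Attribute<T> key, T value) {
    values.put(key, value);
  }

  @SuppressWarnings("unchecked")
  public <T> T get(Attribute<T> key) {
    return (T) values.get(key);
  }
}
```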



On Tue, Jun 7, 2016 at 1:08 PM Sandesh Hegde <sand...@datatorrent.com>
wrote:

> I have created a Jira, will start working on this.
>
> https://issues.apache.org/jira/browse/APEXCORE-470
>
>
>
> On Tue, Jun 7, 2016 at 12:21 PM Munagala Ramanath <r...@datatorrent.com>
> wrote:
>
>> +1
>>
>> Since we have *setInputPortAttribute* and *setOutputPortAttribute*, it
>> seems reasonable
>> to add *setOperatorAttribute*.
>>
>> Ram
>>
>> On Mon, Jun 6, 2016 at 1:39 PM, Sandesh Hegde <sand...@datatorrent.com>
>> wrote:
>>
>> > Currently, *setAttribute* is used to set the operator attributes. The
>> > other 2 attribute-setting APIs are specific to input ports
>> > (*setInputPortAttributes*) and output ports (*setOutputPortsAttributes*).
>> >
>> > The proposal is to have a *setOperatorAttribute*
>> > API, which will clearly indicate that the user wants to set attributes
>> > on the operator.
>> > ( setOperatorAttribute(Operator operator, Attribute<T> key, T value) )
>> >
>> > The following will be the roles for the APIs:
>> > *setAttribute* --> for setting attributes for the whole DAG (
>> > setAttribute(Operator operator, Attribute<T> key, T value) - can be
>> > deprecated )
>> > *setOperatorAttribute* --> for setting attributes for the operator
>> >
>> > Let me know your thoughts.
>> >
>> > Thanks
>> >
>>
>


Proposal : Last Window Store for Output Operators

2016-06-17 Thread Sandesh Hegde
Hi All,

Our current design pattern for using the WindowDataManager is to purge the
window data in the committed-window callback.

In the case of output operators, all we care about is the last window
snapshot; we have already written the tuples before that snapshot, so those
can be safely ignored. So there is no need to keep multiple checkpoints
around.

This behaviour can be achieved with the WindowDataManager, but having a
separate interface makes it easy for developers.

So the proposed interface, for the pattern mentioned, is as below:

public interface LastStoredState<T> {
  long getWindowId();
  Map<Integer, T> load();
  void save(Map<Integer, T> state, long windowId); // Overwrite the previous window.
  void delete(); // Cleanup
}

Please let me know your thoughts.

Thanks
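A minimal in-memory sketch of the proposed store, assuming the snapshot map is keyed by operator id (the key type, class name, and method signatures here are assumptions for illustration): save() overwrites the single retained snapshot, so only the last saved window survives.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory implementation of the last-window store: only one
// snapshot is retained, and each save() overwrites the previous window.
public class InMemoryLastStoredState<T> {
  private long windowId = -1;
  private final Map<Integer, T> state = new HashMap<>();

  public long getWindowId() {
    return windowId;
  }

  public Map<Integer, T> load() {
    return new HashMap<>(state); // defensive copy of the last snapshot
  }

  public void save(Map<Integer, T> snapshot, long windowId) {
    state.clear(); // overwrite the previous window
    state.putAll(snapshot);
    this.windowId = windowId;
  }

  public void delete() { // cleanup
    state.clear();
    windowId = -1;
  }
}
```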


Re: [ANNOUNCE] New Apache Apex PMC Member: Siyuan Hua

2016-06-16 Thread Sandesh Hegde
Congratulations!!!

On Thu, Jun 16, 2016 at 10:06 AM Bright Chen  wrote:

> Siyuan, Congratulations!
>
> thanks
> Bright
>
> > On Jun 16, 2016, at 9:27 AM, Pradeep A. Dalvi  wrote:
> >
> > Congrats Siyuan!
> >
> > -prad
> >
> > On Thursday, June 16, 2016, Dongming Liang 
> wrote:
> >
> >> Congrats, Siyuan!
> >>
> >> Thanks,
> >> - Dongming
> >>
> >> Dongming LIANG
> >> 
> >> dongming.li...@gmail.com 
> >>
> >> On Thu, Jun 16, 2016 at 12:19 AM, Shubham Pathak <
> shub...@datatorrent.com
> >> >
> >> wrote:
> >>
> >>> Congrats Siyuan !
> >>>
> >>> Thanks,
> >>> Shubham
> >>>
> >>> On Thu, Jun 16, 2016 at 11:17 AM, Ashish Tadose <
> ashishtad...@gmail.com
> >> >
> >>> wrote:
> >>>
>  Congratulations Siyuan.
> 
>  Ashish
> 
> > On 16-Jun-2016, at 10:49 AM, Aniruddha Thombare <
>  anirud...@datatorrent.com > wrote:
> >
> > Congratulations!!!
> >
> > Thanks,
> >
> > A
> >
> > _
> > Sent with difficulty, I mean handheld ;)
> > On 16 Jun 2016 10:47 am, "Devendra Tagare" <
> >> devend...@datatorrent.com >
> > wrote:
> >
> >> Congratulations Siyuan
> >>
> >> Cheers,
> >> Dev
> >> On Jun 15, 2016 10:13 PM, "Chinmay Kolhatkar" <
> >>> chin...@datatorrent.com >
> >> wrote:
> >>
> >>> Congrats Siyuan :)
> >>>
> >>> On Wed, Jun 15, 2016 at 10:05 PM, Priyanka Gugale <
> >> pri...@apache.org 
> 
> >>> wrote:
> >>>
>  Congrats Siyuan :)
> 
>  -Priyanka
> 
>  On Thu, Jun 16, 2016 at 10:19 AM, Pradeep Kumbhar <
> >>> prad...@datatorrent.com 
> >
>  wrote:
> 
> > Congratulations Siyuan!!
> >
> > On Thu, Jun 16, 2016 at 10:17 AM, Teddy Rusli <
> >>> te...@datatorrent.com 
> >>>
> > wrote:
> >
> >> Congrats Siyuan!
> >>
> >> On Wed, Jun 15, 2016 at 9:28 PM, Ashwin Chandra Putta <
> >> ashwinchand...@gmail.com > wrote:
> >>
> >>> Congratulations Siyuan!!
> >>> On Jun 15, 2016 9:26 PM, "Thomas Weise" <
> >> tho...@datatorrent.com >
> > wrote:
> >>>
>  The Apache Apex PMC is pleased to announce that Siyuan Hua is
> >>> now a
> > PMC
>  member. We appreciate all his contributions to the project so
> >>> far,
> > and
> >>> are
>  looking forward to more.
> 
>  Welcome, Siyuan, and congratulations!
>  Thomas, for the Apache Apex PMC.
> 
> >>>
> >>
> >>
> >>
> >> --
> >> Regards,
> >>
> >> Teddy Rusli
> >>
> >
> >
> >
> > --
> > *regards,*
> > *~pradeep*
> >
> 
> >>>
> >>
> 
> 
> >>>
> >>
>
>


Purging the checkpoints from the StreamingContainers

2016-06-15 Thread Sandesh Hegde
Hello Team,

Purging of the checkpoints is done in the Stram. Why not do that from the
StreamingContainers?

Committed-window information is already available in the StreamingContainers,
and purging there would also distribute the computation across the containers.

Corner cases can still be handled in the Stram. Example: dynamic partitioning.

Thanks


Re: Proposal : DAG - SetOperatorAttribute

2016-06-07 Thread Sandesh Hegde
I have created a Jira, will start working on this.

https://issues.apache.org/jira/browse/APEXCORE-470



On Tue, Jun 7, 2016 at 12:21 PM Munagala Ramanath <r...@datatorrent.com>
wrote:

> +1
>
> Since we have *setInputPortAttribute* and *setOutputPortAttribute*, it
> seems reasonable
> to add *setOperatorAttribute*.
>
> Ram
>
> On Mon, Jun 6, 2016 at 1:39 PM, Sandesh Hegde <sand...@datatorrent.com>
> wrote:
>
> > Currently, *setAttribute* is used to set the operator attributes. The
> > other 2 attribute-setting APIs are specific to input ports
> > (*setInputPortAttributes*) and output ports (*setOutputPortsAttributes*).
> >
> > The proposal is to have a *setOperatorAttribute*
> > API, which will clearly indicate that the user wants to set attributes
> > on the operator.
> > ( setOperatorAttribute(Operator operator, Attribute<T> key, T value) )
> >
> > The following will be the roles for the APIs:
> > *setAttribute* --> for setting attributes for the whole DAG (
> > setAttribute(Operator operator, Attribute<T> key, T value) - can be
> > deprecated )
> > *setOperatorAttribute* --> for setting attributes for the operator
> >
> > Let me know your thoughts.
> >
> > Thanks
> >
>
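The proposed split might look like the sketch below. The `Attribute`, `Operator`, and `Dag` types here are minimal stand-ins, not the real Apex API; only the method names mirror the proposal.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the attribute-setter split proposed in
// APEXCORE-470; all types are simplified stand-ins for illustration.
class Attribute<T> {
    final String name;
    Attribute(String name) { this.name = name; }
}

interface Operator {}

class Dag {
    private final Map<String, Object> dagAttributes = new HashMap<>();
    private final Map<Operator, Map<String, Object>> operatorAttributes = new HashMap<>();

    // DAG-level attributes only; no longer overloaded for operators.
    <T> void setAttribute(Attribute<T> key, T value) {
        dagAttributes.put(key.name, value);
    }

    // Proposed: an explicit operator-level setter, naming-consistent with
    // setInputPortAttribute / setOutputPortAttribute.
    <T> void setOperatorAttribute(Operator operator, Attribute<T> key, T value) {
        operatorAttributes.computeIfAbsent(operator, o -> new HashMap<>())
                          .put(key.name, value);
    }

    Object getOperatorAttribute(Operator operator, Attribute<?> key) {
        Map<String, Object> attrs = operatorAttributes.get(operator);
        return attrs == null ? null : attrs.get(key.name);
    }
}
```

The benefit is purely readability: the call site says whether an attribute targets the DAG, an operator, or a port, without the reader having to inspect the overload being resolved.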


Re: A proposal for Malhar

2016-05-27 Thread Sandesh Hegde
+1 for removing the unused operators.

So we are creating a process for operator writers who don't want to
understand the platform, yet want to contribute? How big is that set?
If we tell the app user "here is code that has not passed the full
checklist," will they be ready to use it in production?

This thread has two conflicting forces: reducing the number of operators
and making it easy to add more operators.



On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni 
wrote:

> On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta 
> wrote:
>
> > Pramod,
> >
> > By that logic I would say let's put all partitionable operators into one
> > folder, non-partitionable operators in another and so on...
> >
>
> Remember the original goal of making it easier for new members to
> contribute and managing those contributions to maturity. It is not a
> functional level separation.
>
>
> > When I look at hadoop code I see these annotations being used at class
> > level and not at package/folder level.
>
>
> I had a typo in my email, I meant to say "think of this like a folder..."
> as an analogy and not literally.
>
> Thanks
>
>
> > Thanks
> >
> > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni  >
> > wrote:
> >
> > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> gaurav.gopi...@gmail.com>
> > > wrote:
> > >
> > > > Can the same goal not be achieved by
> > > > using org.apache.hadoop.classification.InterfaceStability.Evolving /
> > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> > annotation?
> > > >
> > >
> > > I think it is important to localize the additions in one place so that
> it
> > > becomes clearer to users about the maturity level of these, easier for
> > > developers to track them towards the path to maturity and also
> provides a
> > > clearer directive for committers and contributors on acceptance of new
> > > submissions. Relying on the annotations alone makes them spread all
> over
> > > the place and adds an additional layer of difficulty in identification
> > not
> > > just for users but also for developers who want to find such operators
> > and
> > > improve them. This of this like a folder level annotation where
> > everything
> > > under this folder is unstable or evolving.
> > >
> > > Thanks
> > >
> > >
> > > >
> > > > On Fri, May 27, 2016 at 12:35 PM, David Yan 
> > > wrote:
> > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > Malhar in its current state, has way too many operators that
> > fall
> > > > in
> > > > > > the
> > > > > > > > "non-production quality" category. We should make it obvious
> to
> > > > users
> > > > > > > that
> > > > > > > > which operators are up to par, and which operators are not,
> and
> > > > maybe
> > > > > > > even
> > > > > > > > remove those that are likely not ever used in a real use
> case.
> > > > > > > >
> > > > > > >
> > > > > > > I am ambivalent about revisiting older operators and doing this
> > > > > exercise
> > > > > > as
> > > > > > > this can cause unnecessary tensions. My original intent is for
> > > > > > > contributions going forward.
> > > > > > >
> > > > > > >
> > > > > > IMO it is important to address this as well. Operators outside
> the
> > > play
> > > > > > area should be of well known quality.
> > > > > >
> > > > > >
> > > > > I think this is important, and I don't anticipate much tension if
> we
> > > > > establish clear criteria.
> > > > > It's not helpful if we let the old subpar operators stay while
> > > > > raising the bar for new operators.
> > > > >
> > > > > David
> > > > >
> > > >
> > >
> >
>
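The class-level annotation approach mentioned in the thread looks roughly like the following. The annotation types here are minimal local stand-ins; in Malhar they would come from `org.apache.hadoop.classification.InterfaceStability` in the hadoop-annotations artifact.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Minimal stand-ins for Hadoop's InterfaceStability annotations, defined
// locally so this sketch is self-contained.
@Documented @Retention(RetentionPolicy.RUNTIME)
@interface Evolving {}

@Documented @Retention(RetentionPolicy.RUNTIME)
@interface Stable {}

// Class-level marking, as Hadoop does it: users and tooling can inspect
// the annotation to learn an operator's maturity level.
@Evolving
class ExperimentalOperator {}

@Stable
class MatureOperator {}
```

The argument in the thread is that annotations alone scatter this information across packages, whereas a dedicated location (the "folder level annotation" analogy) makes immature operators easy to find both for users avoiding them and for developers graduating them.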


Re: Kafka Exactly once output operator

2016-05-26 Thread Sandesh Hegde
Current design:

Window Data Manager - stores the Kafka partition offsets.
Kafka key - used by the operator = AppID#OperatorId

During recovery, the partially written window is re-created using the
following approach:

Tuples between the largest recovered offset and the current offset are
checked. Based on the key, tuples written by other entities are
discarded.

Only tuples which are not in the recovered set are emitted.

Here is the first cut of the design:
https://github.com/apache/incubator-apex-malhar/pull/298

Please give your feedback on the design.

@Bright,
Recovery data needs to be present in the key to distinguish tuples
coming from different instances of the output operator or from external
applications.

Thanks
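The recovery filtering described above can be sketched as follows: replay the messages between the recovered offset and the head of the partition, keep only those whose key matches this operator instance, and use that set to skip tuples that were already written before the failure. The class, the `Message` type, and the method names are illustrative assumptions, not the Malhar operator's actual API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the exactly-once recovery dedup described in the
// design above; not the real Malhar Kafka output operator.
class ExactlyOnceRecovery {
    static class Message {
        final String key;   // writer identity, e.g. appId + "#" + operatorId
        final String value; // the user tuple
        Message(String key, String value) { this.key = key; this.value = value; }
    }

    final String myKey; // e.g. "app-1#2"

    ExactlyOnceRecovery(String appId, int operatorId) {
        this.myKey = appId + "#" + operatorId;
    }

    // Build the set of tuples this instance already wrote in the partially
    // written window, discarding messages from other writers by key.
    Set<String> recoverWrittenTuples(List<Message> replayed) {
        Set<String> written = new HashSet<>();
        for (Message m : replayed) {
            if (myKey.equals(m.key)) {
                written.add(m.value);
            }
        }
        return written;
    }

    // Emit only the tuples not already present in Kafka.
    List<String> tuplesToEmit(List<String> windowTuples, Set<String> alreadyWritten) {
        List<String> out = new ArrayList<>();
        for (String t : windowTuples) {
            if (!alreadyWritten.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }
}
```

Note this assumes tuple values are unique within a window; a real implementation would key the recovered set on per-window message ids rather than raw values.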

On Fri, May 13, 2016 at 2:14 PM Bright Chen <bri...@datatorrent.com> wrote:

> Hi Sandesh,
> I think it may be better to keep this discussion in the Jira.
>
> Do you mean keeping the key in another Kafka topic, or is the key in fact
> the key of the Kafka message that represents the user tuple?
> If it is a separate key, how do you keep the relation between key and
> value? If it is the key of the Kafka message, it will change the expected
> data. As I understand it, the key here is just used for recovery; it is
> not data the user requires. And the data written to Kafka probably needs
> to be decided by the customer's logic.
>
> Think about a customer who builds two applications with our operator: the
> first application writes data to Kafka, the second one reads data from
> Kafka. Suppose the first application was initially implemented with a
> non-exactly-once output operator and later changed to the exactly-once
> operator. The customer doesn't expect to have to change the second
> application, but it would have to change if its logic depended on the key.
>
> thanks
> Bright
>
> > On May 13, 2016, at 12:37 PM, Sandesh Hegde <sand...@datatorrent.com>
> wrote:
> >
> > Hi All,
> >
> > I am working on the Kafka 0.9 output operator, and one of the
> > requirements is to implement an exactly-once output operator. Here is
> > one possible idea; please give your feedback or suggest a new design.
> >
> >
> -
> >
> > Use *Key* to store meta information which is used during recovery.
> >
> > Operator users will use *Value* to store their key-value pair and
> implement
> > the Kafka partitioning accordingly.
> >
> > Format of the *Key* is as specified below:
> >
> >
> >
> > Key = 1. OperatorName#ApexPartitionId#WindowId#Message#MessageId ( during
> > a message write )
> >
> > 2. OperatorName#ApexPartitionId#WindowId#CheckPoint ( during end
> > window )
> >
> > During the end window, a checkpoint marker is written to all the Kafka
> > partitions of the topic.
> >
> > Every message is given a message id (the counter resets every window)
> > and is then written to Kafka.
> >
> > During recovery, the Kafka partitions are read until the last checkpoint
> > message from this operator is reached, and the partially written window
> > is constructed.
> >
> >
> 
> >
> > Note: Existing Kafka exactly once operator, ( Kafka 0.8 ) also needs to
> be
> > re-written.
> >
> > Thanks
>
>