[Feature Proposal] Add Dropwizard metrics like gauges, meters, histograms etc. to Apex Platform

2018-05-08 Thread Deepak Narkhede
Hi Community,

I want to propose the addition of metrics like gauges, meters, counters and
histograms for the following components.
1) Addition of metrics for Container stats.
2) Addition of metrics for Operator stats.
3) Addition of metrics for Stram Application Master stats.
4) Addition of metrics for JVM related stats for all containers.

To implement them, I would use the Dropwizard Metrics APIs. (
http://metrics.dropwizard.io/)
Use cases:
1) Metrics can be pushed directly to external visualisation systems like Graphite.
2) Metrics can be viewed in tools like VisualVM through JMX.
3) Metrics can be written to the console.
4) It is also possible to push the metrics to a custom sink.

We will also need to write sinks and reporters if custom sinks are required.

Design/Implementation approach:
Way #1:
1) Create new annotations like @MetricTypeGauge, @MetricTypeMeter,
@MetricTypeCounter, @MetricTypeHistogram. They can be applied to both fields
and methods.
2) Add them to the respective methods or fields of classes like
StreamingContainer and StreamingAppMasterService to extract the relevant
metrics.
3) During Node creation (InputNode/GenericNode/OiONode), we create and
initialise the metrics registry depending on the component.
4) During the collectMetrics() part of the operator runner thread
(InputNode.run/GenericNode.run), we invoke the annotated methods and collect
the different types of metrics.
5) We can have a sink which pushes the metrics to reporters like Console,
JMX etc.
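
To make steps 1, 2 and 4 of Way #1 concrete, here is a minimal sketch using
only the JDK (no Dropwizard dependency). The annotation names follow the
proposal, but they do not exist in Apex today. SampleContainerStats, its
members and MetricCollectorSketch are hypothetical names invented for
illustration, and only two of the four proposed annotation types are shown.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.AnnotatedElement;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

// Proposed annotations (assumption: retained at runtime, usable on fields and methods).
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD})
@interface MetricTypeGauge {}

@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.FIELD, ElementType.METHOD})
@interface MetricTypeCounter {}

// Hypothetical stand-in for a component such as StreamingContainer.
class SampleContainerStats {
  @MetricTypeCounter
  long tuplesProcessed = 42;

  @MetricTypeGauge
  int queueDepth() {
    return 7;
  }
}

class MetricCollectorSketch {
  // Reads every annotated field and zero-argument method of the target object.
  static Map<String, Object> collect(Object target) {
    Map<String, Object> values = new TreeMap<>();
    try {
      for (Field field : target.getClass().getDeclaredFields()) {
        String kind = kindOf(field);
        if (kind != null) {
          field.setAccessible(true);
          values.put(kind + "." + field.getName(), field.get(target));
        }
      }
      for (Method method : target.getClass().getDeclaredMethods()) {
        String kind = kindOf(method);
        if (kind != null && method.getParameterCount() == 0) {
          method.setAccessible(true);
          values.put(kind + "." + method.getName(), method.invoke(target));
        }
      }
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("metric collection failed", e);
    }
    return values;
  }

  // Maps an annotated element to a metric kind, or null if it is not annotated.
  private static String kindOf(AnnotatedElement element) {
    if (element.isAnnotationPresent(MetricTypeGauge.class)) {
      return "gauge";
    }
    if (element.isAnnotationPresent(MetricTypeCounter.class)) {
      return "counter";
    }
    return null;
  }
}
```

Calling MetricCollectorSketch.collect(new SampleContainerStats()) returns a
map with one counter entry and one gauge entry. In the real implementation
the collected values would presumably be fed into a Dropwizard registry
rather than a plain map.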

Way #2:
Use the existing AutoMetrics annotations and convert some metrics to different
types like gauge, counter etc. But this cannot be done generically as we
don't know the types. This approach is still being investigated.

I would prefer the first way.

Note: There are some complications if two operators are deployed in the same
JVM container. But I think this can be resolved by creating two different
metrics registries, each with an id that is unique within the JVM.
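
One possible shape for that, as a sketch only: keep one registry per operator
inside the container, keyed by an id. It assumes the operator id is the unique
key, which the note above does not specify. OperatorRegistrySketch and
ContainerRegistriesSketch are hypothetical names, and the registry here is a
plain JDK stand-in rather than a Dropwizard MetricRegistry.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Minimal stand-in for a metrics registry: a named set of counters.
class OperatorRegistrySketch {
  private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

  void increment(String name, long delta) {
    counters.computeIfAbsent(name, key -> new LongAdder()).add(delta);
  }

  long count(String name) {
    LongAdder adder = counters.get(name);
    return adder == null ? 0L : adder.sum();
  }
}

// One registry per operator id, so operators sharing a JVM do not share metric state.
class ContainerRegistriesSketch {
  private final Map<Integer, OperatorRegistrySketch> byOperatorId = new ConcurrentHashMap<>();

  // Returns the registry for the operator, creating it on first use.
  OperatorRegistrySketch forOperator(int operatorId) {
    return byOperatorId.computeIfAbsent(operatorId, id -> new OperatorRegistrySketch());
  }
}
```

With this layout, two operators can both report a metric called "tuples"
without overwriting each other, because each lookup goes through
forOperator(id) first.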

Let me know your thoughts on this.

Thanks,
Deepak


REST server for Apex application management

2018-05-08 Thread Thomas Weise
Hi,

I wanted to make you aware that there is a base implementation for a REST
service to manage Apex applications here:
https://github.com/atrato/atrato-server

It is ASL-licensed and everyone is welcome to use it or contribute to it. If
there is sufficient interest, we can also discuss adding it to apex-core in
the future.

Thanks,
Thomas


Re: [RESULT] [VOTE] Apache Apex Core Release 3.7.0 (RC1)

2018-05-08 Thread Thomas Weise
The md5 related change needs to be reflected on
http://apex.apache.org/release.html

Chinmay, are you going to create the 3.7.0 sandbox release in
https://github.com/chinmaykolhatkar/docker-pool ?

Also, after addition of the binary build the release instructions will need
more updating...



On Tue, Apr 24, 2018 at 9:51 AM, Pramod Immaneni 
wrote:

> Yes Thomas, will complete it in a day or two.
>
> Thanks
>
> On Tue, Apr 24, 2018 at 9:15 AM, Thomas Weise  wrote:
>
> > Hi Pramod,
> >
> > Are you going to complete the release?
> >
> > Thanks
> >
> >
> > On Wed, Apr 18, 2018 at 9:31 PM, Pramod Immaneni
> > wrote:
> >
> > > The vote concludes and passes.
> > >
> > > binding +1 (3)
> > >
> > > Justin Mclean
> > > Vlad Rozov
> > > Thomas Weise
> > >
> > > Will complete rest of the release activities.
> > >
> > > Thanks for verifying and voting in the release.
> > >
> > > Pramod.
> > >
> > > On Tue, Apr 17, 2018 at 7:07 PM, Thomas Weise  wrote:
> > >
> > > > +1 (binding)
> > > >
> > > > - verified signatures
> > > > - build from source archive
> > > > - tests pass
> > > > - run pi demo on YARN 2.7.1
> > > >
> > > > When updating the download page after the release, please also remove
> > > > the md5 links from it.
> > > >
> > > > Thanks,
> > > > Thomas
> > > >
> > > >
> > > > On Sat, Apr 14, 2018 at 11:48 AM, Pramod Immaneni <pra...@datatorrent.com>
> > > > wrote:
> > > >
> > > > > Dear Community,
> > > > >
> > > > > Please vote on the following Apache Apex Core 3.7.0 release
> > > > > candidate.
> > > > >
> > > > > This is a source release with binary artifacts published to Maven.
> > > > >
> > > > > List of all issues fixed:  https://s.apache.org/fWT8
> > > > > User documentation: https://apex.apache.org/docs/apex-3.7/
> > > > >
> > > > > Staging directory:
> > > > > https://dist.apache.org/repos/dist/dev/apex/apache-apex-core-3.7.0-RC1/
> > > > > Source zip:
> > > > > https://dist.apache.org/repos/dist/dev/apex/apache-apex-core-3.7.0-RC1/apache-apex-core-3.7.0-source-release.zip
> > > > > Source tar.gz:
> > > > > https://dist.apache.org/repos/dist/dev/apex/apache-apex-core-3.7.0-RC1/apache-apex-core-3.7.0-source-release.tar.gz
> > > > > Maven staging repository:
> > > > > https://repository.apache.org/content/repositories/orgapacheapex-1033
> > > > >
> > > > > Git source:
> > > > > https://github.com/apache/apex-core/tree/v3.7.0-RC1
> > > > > (commit:cd0b0d9f31b3a198425440b66c52802d1e592b4e)
> > > > >
> > > > > PGP key:
> > > > > http://pgp.mit.edu:11371/pks/lookup?op=vindex&search=pra...@apache.org
> > > > > (Key: 239E728D)
> > > > > KEYS file:
> > > > > https://dist.apache.org/repos/dist/release/apex/KEYS
> > > > >
> > > > > More information at:
> > > > > http://apex.apache.org
> > > > >
> > > > > Please try the release and vote; vote will be open for 72 hours.
> > > > >
> > > > > [ ] +1 approve (and what verification was done)
> > > > > [ ] -1 disapprove (and reason why)
> > > > >
> > > > > http://www.apache.org/foundation/voting.html
> > > > >
> > > > > How to verify release candidate:
> > > > >
> > > > > http://apex.apache.org/verification.html
> > > > >
> > > > > Thanks,
> > > > > Pramod
> > > > >
> > > >
> > >
> >
>


Re: Apex-core build/release steps improvements proposal

2018-05-08 Thread Thomas Weise
Thanks for bringing that up. Docker in this context is only a convenient
way to create a sandbox. There is other work that would need to happen
to package applications as Docker images and deploy them on platforms such
as Kubernetes.

Thanks


On Thu, May 3, 2018 at 8:12 PM, Ananth G  wrote:

> +1 to all 3 considering we are trying to centralise the code.
>
> 2 should be redone eventually as part of
> https://issues.apache.org/jira/browse/APEXCORE-796 ? But the design for
> this needs to be seen in the broader context of some of the points
> mentioned below:
>
> Regarding 3, I agree that the current image is tightly coupled to bigtop.
> While making it independent of bigtop is a starting step, I believe we
> might need to revisit our thinking on how we would like to implement
> containerisation for Apex in the first place.
>
>
> There are multiple design items to be resolved for Apex containerisation:
>
> 1. The Apex community needs to evaluate both Hadoop-based and Hadoop-free
> architectures. For non-Hadoop architectures, we need to solve for DFS
> alternatives as well as resource manager alternatives. Tickets like
> https://issues.apache.org/jira/browse/APEXCORE-724 will bring out this
> design issue in more detail, I believe.
>
> 2. Consider how Apex applications will be built as part of a build
> process that results in a docker image of the Apex application (which would
> contain application code, malhar operators etc.)
>
> 3. Consider how we would like to make use of Hadoop 3 support for Docker:
> https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/DockerContainers.html
>
>
>
>
> Just curious about the docker implementation: Is the end goal of the docker
> image to
>
> 1. Provide a sandbox for evaluating Apex, or
> 2. Make the Apex binary installable as an image, or
> 3. Make Apex applications aligned with a docker build process (e.g. Python
> libraries installed on the image as part of the application code)?
>
> The reason I raise these questions is that it does not make much sense to
> bundle a cluster in a box with any distribution (dockerizing a Hadoop
> cluster is non-trivial and I have not heard good success stories around
> this approach so far that can be enabled for production). The docker image
> that embeds a Hadoop binary is thus only useful for evaluation wherein
> everything is contained in the same image and nothing more.
>
> My suspicion is that we would revisit this approach anyway if our
> goals are 2 and/or 3 as well. Perhaps we will address these questions as
> part of https://issues.apache.org/jira/browse/APEXCORE-724 and
> https://issues.apache.org/jira/browse/APEXCORE-796.
>
> Regards,
> Ananth
>
> On Fri, May 4, 2018 at 10:31 AM, Vlad Rozov  wrote:
>
> > +1 to all 3.
> >
> > Thank you,
> >
> > Vlad
> >
> >
> > On 5/3/18 07:03, Thomas Weise wrote:
> >
> >> +1 to all of this
> >>
> >> There are existing JIRAs that you can assign / add to:
> >>
> >> https://issues.apache.org/jira/browse/APEXCORE-727
> >>
> >> Thanks!
> >>
> >>
> >>
> >> On Thu, May 3, 2018 at 4:26 AM, Chinmay Kolhatkar 
> >> wrote:
> >>
> >>> Hello Community,
> >>>
> >>> I want to propose the following improvements for the apex-core build and
> >>> related steps:
> >>>
> >>> 1. Most open source projects (probably all of them) have a binary release
> >>> package of the software and not just the source release package.
> >>> Currently we have only a source package. Luckily there are a few places
> >>> (outside of Apache Apex) where binary packages of Apex have been created
> >>> for different purposes: https://github.com/atrato/apex-cli-package and
> >>> https://github.com/apache/bigtop.
> >>>
> >>> The proposal here is to generate this binary release package as part of
> >>> the apex-core build process.
> >>>
> >>>
> >>> 2. Currently, the docker build for Apex is built in one of my personal
> >>> repositories (https://github.com/chinmaykolhatkar/docker-pool).
> >>> While I don't mind hosting the content (Dockerfile etc.) in my
> >>> repository, I believe it makes sense to host this in the apex-core
> >>> repository. This way, there is a possibility of using docker github
> >>> triggers for building the docker image from release branches.
> >>>
> >>>
> >>> 3. Currently the docker build uses hadoop- and apex-specific packages
> >>> from the bigtop deb repo & CI. (See
> >>> https://github.com/chinmaykolhatkar/docker-pool/blob/master/apex/ubuntu/app/setup.sh
> >>> for more details.)
> >>> While use of hadoop packages from the bigtop repo is fine, we also need to
> >>> rely on a bigtop contribution to update the apex component and then a
> >>> build from bigtop CI to get the apex.deb package. Basically, our docker
> >>> image generation process gets blocked on a bigtop source update to
> >>> generate the updated apex deb.
> >>> As we technically don't need to depend on bigtop to generate the apex
> >>> binary, the proposal here is to generate binary

[jira] [Commented] (APEXCORE-813) Update docker build steps to use apex bin dist package

2018-05-08 Thread Thomas Weise (JIRA)

[ 
https://issues.apache.org/jira/browse/APEXCORE-813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467492#comment-16467492
 ] 

Thomas Weise commented on APEXCORE-813:
---

I would prefer the ability to create images from the master snapshot. That
could be accomplished by publishing the snapshot dist build to the Maven repo
and then picking it up from there during the container build (for master/latest).

Once the release branch is created, the Docker build could be pointed to the 
release binaries and triggered in dockerhub.

 

> Update docker build steps to use apex bin dist package
> --
>
> Key: APEXCORE-813
> URL: https://issues.apache.org/jira/browse/APEXCORE-813
> Project: Apache Apex Core
> Issue Type: Task
> Reporter: Chinmay Kolhatkar
> Assignee: Chinmay Kolhatkar
> Priority: Major
>
> Currently the docker build steps pick up the apex deb from bigtop CI for
> installing in the apex docker image.
> Because of this, docker image generation becomes dependent on an update to
> bigtop and then on a CI build being generated from that.
> Instead, the binary package of apex can be used in the docker image.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)