Re: [DISCUSS] NPM / Node Problems

2017-11-26 Thread RaghuMitra Kandikonda
Looking at some of the build failure emails and past experience, I
would suggest having a node & npm version check in our build scripts
and moving dependency management to yarn.

We need not restrict the build to a specific version of node & npm but
we can surely suggest a minimum version required to build the UI successfully.
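
A minimal sketch of such a check, assuming Python 3.7+ is available on the
build machine. The minimum versions below are placeholders, not values the
project has agreed on, and the helper names (`parse_version`, `check_tool`,
`main`) are hypothetical:

```python
import re
import shutil
import subprocess

# Placeholder minimums (assumption): the thread does not state real values.
MIN_VERSIONS = {"node": (6, 0, 0), "npm": (3, 0, 0)}


def parse_version(text):
    """Extract (major, minor, patch) from output such as 'v8.9.1' or '5.5.1'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version found in %r" % text)
    return tuple(int(part) for part in match.groups())


def meets_minimum(found, minimum):
    """Tuples compare element by element, so (8, 9, 1) >= (6, 0, 0)."""
    return found >= minimum


def check_tool(name, minimum):
    """Return an error message, or None when the tool is present and new enough."""
    path = shutil.which(name)
    if path is None:
        return "%s was not found on the PATH" % name
    result = subprocess.run([path, "--version"],
                            capture_output=True, text=True, check=False)
    try:
        found = parse_version(result.stdout)
    except ValueError:
        return "could not read a version from `%s --version`" % name
    if not meets_minimum(found, minimum):
        return "%s %s is older than the required %s" % (
            name, ".".join(map(str, found)), ".".join(map(str, minimum)))
    return None


def main():
    """Print every problem found and return a non-zero code if there is any."""
    problems = [msg for msg in
                (check_tool(name, minimum) for name, minimum in MIN_VERSIONS.items())
                if msg]
    for msg in problems:
        print(msg)
    return 1 if problems else 0
```

Calling `sys.exit(main())` from a build step would fail the build early with
a readable message instead of a compiler error deep inside an npm install.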

-Raghu



On Fri, Nov 24, 2017 at 10:21 PM, Simon Elliston Ball
 wrote:
> Agreeing with Nick, it seems like the main reason people are building 
> themselves, and hitting all these environmental issues, is that we do not as 
> a project produce binary release artefacts (the rpms which users could just 
> install) and instead leave that for the commercial distributors to do.
>
> Yarn may help with some of the dependency version issues we’re having, but 
> not afaik with the core missing library headers / build tools / node and npm 
> version issue, those would seem to fit a documentation fix and improvements 
> to platform-info to flag the problems, so this can then be a pre-flight check 
> tool as well as a diagnostic tool.
>
> Another option I would put on the table is to standardise our build 
> environment, so that the non-java bits are run in a standard docker image or 
> something of the sort, that way we can take control of all the environmental 
> and OS dependent pieces, much as we do right now with the rpm build sections 
> of the mpack build.
>
> The challenge here will be adding the relevant maven support. At the moment 
> we’re relying on the maven npm and node build plugins; going down this route 
> would likely mean replacing them with something custom, which would be a 
> challenge to support.
>
> Perhaps the real answer here is to push people who are just kicking the tyres 
> towards a binary distribution, or at least rpm artefacts as part of the 
> Apache release to give them a head start for a happy path on a known good OS 
> environment.
>
> Simon
>
>> On 24 Nov 2017, at 16:01, Nick Allen  wrote:
>>
>> Yes, it is a problem.  I think you've identified a couple important things
>> that we could address in parallel.  I see these as challenges we need to
>> solve for the dev community.
>>
>> (1) NPM is causing us some major headaches.  Which version do we require?
>> How do I install that version (on Mac, Windows, Linux)?  Does YARN help
>> here at all?
>>
>> (2) Can we automate the prerequisite checks that we currently do manually
>> with `platform-info.sh`?  An automated check could run and fail as part of
>> the build or deployment process.
>>
>>
>>
>> More important, though, is that users should not have to build Metron at
>> all.  They should not have to worry about installing NPM and the rest of
>> the development tooling.   Here are some options that are not mutually
>> exclusive.
>>
>>
>> (1) Create an image in Atlas that has Metron fully installed.  A new user
>> could run single node Metron on their laptop with a single command and the
>> only prereqs would be Vagrant and Virtualbox.  We could cut new images for
>> each Metron release.  Or selectively cut new dev images from master as we
>> see fit.
>>
>> (2) Distribute the Metron RPMs (and the MPack tarball?) so that users can
>> install Metron on a cluster without having to build it.
>>
>>
>>
>>
>>
>>
>> On Fri, Nov 24, 2017 at 10:11 AM, Otto Fowler 
>> wrote:
>>
>>> It seems like it is getting *very* common for people to have trouble
>>> building recently. Errors with NPM and Node seem common, with fixes ranging
>>> from updating c/c++ libs to the version of npm/node.
>>>
>>> There has to be a better way to do this.
>>>
>>>   - Are we out of date or missing requirements in our documentation?
>>>   - Does our documentation need to be updated for building?
>>>   - Is there a better way in maven to check the versions required for some
>>>     of these things and fail faster with a better message?
>>>   - Are we building correctly or are we asking for trouble?
>>>
>>> The ability to build metron is pretty important, and it seems that people
>>> are having a lot of trouble related to the new technologies in alerts and
>>> config ui.
>>>
>


Re: Using Storm Resource Aware Scheduler

2017-11-26 Thread Ali Nazemian
Sounds great, Simon. We will work on refactoring our design to be aligned
with Metadata feature. As long as we can use the same parser, there is no
technical reason that we cannot use the same feed to handle it. However, I
need to check it for more details to understand how complex it would be to
merge different tenants at this moment. Hopefully, it shouldn't be too
complex.

BTW, I haven't had any permission to close this ticket, so I have just
created a duplicate link to the main ticket as you mentioned.

Cheers,
Ali

On Mon, Nov 27, 2017 at 9:06 AM, Simon Elliston Ball <
si...@simonellistonball.com> wrote:

> The multi-tenancy through meta-data method mentioned is designed to solve
> exactly that problem and has been in the project for some time now. The
> goal would be to have one topology per data schema and use the key to
> communicate tenant meta-data. See
> https://archive.apache.org/dist/metron/0.4.1/site-book/metron-platform/metron-parsers/index.html#Metadata
> for details.
>
> The storm issue you mention is something for the storm project to look at,
> so we can’t really comment on their behalf here, but yeah, it will be nice
> to have storm do some of the tuning for us at some point.
>
> Note that the UI already has the tuning parameters you’re talking about in
> the latest version, so there is no need for the new JIRA
> (https://issues.apache.org/jira/browse/METRON-1330). It should be closed
> as a duplicate of https://issues.apache.org/jira/browse/METRON-1161.
>
> Simon
>
> > On 26 Nov 2017, at 02:15, Ali Nazemian  wrote:
> >
> > Oops, I didn't know that. Happy Thanksgiving.
> >
> > Thanks, Otto and Simon.
> >
> > As you are aware of our use cases, with the current limitations of
> > multi-tenancy support, we are creating a feed per tenant per device.
> > Sometimes the amount of traffic we are receiving per each tenant and per
> > each device is way less than dedicating one storm slot for it. Therefore, I
> > was hoping to make it at least theoretically possible to tune resources
> > more wisely, but it is not going to be easy at all. This is probably a use
> > case that storm auto-scaling mechanism would be very nice to have.
> >
> > https://issues.apache.org/jira/browse/STORM-594
> >
> > On the other side, I can recall there was a PR to address multi-tenancy by
> > adding meta-data to Kafka topic. However, I lost track of that feature, so
> > maybe this situation can be tackled at another level by merging different
> > parsers.
> >
> > I will create a Jira ticket to add an ability in UI to tune Metron parser
> > feeds at Storm level. Right now it is a little hard to maintain tuning
> > configurations per each parser, and as soon as somebody restarts them from
> > Management-UI/Ambari, it will be overwritten.
> >
> >
> > Cheers,
> > Ali
> >
> > On Sat, Nov 25, 2017 at 3:36 AM, Simon Elliston Ball <
> > si...@simonellistonball.com> wrote:
> >
> >> Implementing the resource aware scheduler would be decidedly non-trivial.
> >> Every topology will need additional configuration to tune for things like
> >> memory sizes, which is not going to buy you much change. So, at the
> >> micro-tuning level of parser this doesn’t make a lot of sense.
> >>
> >> However, it may be relevant to consider separate tuning for parsers in
> >> general vs the core enrichment and indexing topologies (potentially also
> >> for separate indexing topologies when this comes in) and the resource
> >> scheduler could provide a theoretical benefit there.
> >>
> >> Specifying resource requirements per parser topology might sound like a
> >> good idea, but if your parsers are working the way they should, they should
> >> be using a small amount of memory as their default size, and achieving
> >> additional resource use by multiplying workers and executors (to get higher
> >> usage per slot) and balance the load that way. To be honest, the only
> >> difference you’re going to get from the RAS is to add a bunch of tuning
> >> parameters which allow slightly different granularity of units for things
> >> like memory.
> >>
> >> The other RAS feature which might be a good add is prioritisation of
> >> different parser topologies, but again, this is probably not something you
> >> want to push hard on unless you are severely limited in resources (in which
> >> case, why not just add another node, it will be cheaper than spending all
> >> that time micro-tuning the resource requirements for each data feed).
> >>
> >> Right now we do allow a lot of micro tuning of parallelism around things
> >> like the count of executor threads, which achieves roughly the
> >> equivalent of the cpu based limits in the RAS.
> >>
> >> TL;DR:
> >>
> >> If you’re not using resource 

Re: [DISCUSS] Upcoming Release

2017-11-26 Thread Matt Foley
Hope everyone (at least in the U.S.) had a great Thanksgiving holiday.
Regarding status of the release effort, still pending METRON-1252, so not 
making the release branch yet.

Regards,
--Matt

On 11/17/17, 1:32 PM, "Matt Foley"  wrote:

(With release manager hat on)

The community has proposed a release of Metron in the near future, focusing 
on Meta-alerts running in Elasticsearch.
Congrats on getting so many of the below already done.  At this point, only 
METRON-1252, and the discussion of how to handle joint release of the Metron 
bro plugin, remain as gating items for the release.  I project these will be 
resolved next week, so let’s propose the following:

Sometime next week, after the last bits are done, I’ll start the release 
process and create the release branch.

The proposed new version will be 0.4.2, unless there are backward 
incompatible changes that support making it 0.5.0.
Currently there are NO included Jiras labeled ‘backward-incompatible’, nor 
having Docs Text indicating so.
***If anyone knows that some of the commits included since 0.4.1 introduce 
backward incompatibility, please say so now on this thread, and mark the Jira 
as such.***

The 90 or so jiras/commits already in master branch since 0.4.1 are listed 
below.
Thanks,
--Matt

METRON-1301 Alerts UI - Sorting on Triage Score Unexpectedly Filters 
Some Records (nickwallen) closes apache/metron#832
METRON-1294 IP addresses are not formatted correctly in facet and group 
results (merrimanr) closes apache/metron#827
METRON-1291 Kafka produce REST endpoint does not work in a Kerberized 
cluster (merrimanr) closes apache/metron#826
METRON-1290 Only first 10 alerts are update when a MetaAlert status is 
changed to inactive (justinleet) closes apache/metron#842
METRON-1311 Service Check Should Check Elasticsearch Index Templates 
(nickwallen) closes apache/metron#839
METRON-1289 Alert fields are lost when a MetaAlert is created 
(merrimanr) closes apache/metron#824
METRON-1309 Change metron-deployment to pull the plugin from 
apache/metron-bro-plugin-kafka (JonZeolla) closes apache/metron#837
METRON-1310 Template Delete Action Deletes Search Indices (nickwallen) 
closes apache/metron#838
METRON-1275: Fix Metron Documentation closes apache/incubator-metron#833
METRON-1295 Unable to Configure Logging for REST API (nickwallen) 
closes apache/metron#828
METRON-1307 Force install of java8 since java9 does not appear to work 
with the scripts (brianhurley via ottobackwards) closes apache/metron#835
METRON-1296 Full Dev Fails to Deploy Index Templates (nickwallen via 
cestella) closes apache/incubator-metron#829
METRON-1281 Remove hard-coded indices from the Alerts UI (merrimanr) 
closes apache/metron#821
METRON-1287 Full Dev Fails When Installing EPEL Repository (nickwallen) 
closes apache/metron#820
METRON-1267 Alerts UI returns a 404 when refreshing the alerts-list 
page (iraghumitra via merrimanr) closes apache/metron#819
METRON-1283 Install Elasticsearch template as a part of the mpack 
startup scripts (anandsubbu via nickwallen) closes apache/metron#817
METRON-1254: Conditionals as map keys do not function in Stellar closes 
apache/incubator-metron#801
METRON-1261 Apply bro security patch (JonZeolla via ottobackwards) 
closes apache/metron#805
METRON-1284 Remove extraneous dead query in ElasticsearchDao 
(justinleet) closes apache/metron#818
METRON-1270: fix for warnings missing @return tag argument in 
metron-analytics/metron-profiler-common and metron-profiler-client closes 
apache/incubator-metron#810
METRON-1272 Hide child alerts from searches and grouping if they belong 
to meta alerts (justinleet) closes apache/metron#811
METRON-1224 Add time range selection to search control (iraghumitra via 
james-sirota) closes apache/metron#796
METRON-1280 0.4.1 -> 0.4.2 missed a couple of projects (cestella via 
justinleet) closes apache/metron#816
METRON-1243: Add a REST endpoint which allows us to get a list of all 
indice closes apache/incubator-metron#797
METRON-1196 Increment master version number to 0.4.2 for on-going 
development (mattf-horton) closes apache/metron#767
METRON-1278 Strip Build Status widget from root README.md 
in site-book build (mattf-horton) closes apache/metron#815
METRON-1274 Master has failure in StormControllerIntegrationTest 
(merrimanr) closes apache/metron#813
METRON-1266 Profiler - SASL Authentication Failed (nickwallen) closes 
apache/metron#809
METRON-1260 Include Alerts UI in Ambari Service Check (nickwallen) 
closes apache/metron#804
METRON-1251: Typo and formatting fixes for metron-rest README closes 
apache/incubator-metron#800
METRON-1241: Enable the REST API to use a cache for 

Re: Using Storm Resource Aware Scheduler

2017-11-26 Thread Simon Elliston Ball
The multi-tenancy through meta-data method mentioned is designed to solve 
exactly that problem and has been in the project for some time now. The goal 
would be to have one topology per data schema and use the key to communicate 
tenant meta-data. See 
https://archive.apache.org/dist/metron/0.4.1/site-book/metron-platform/metron-parsers/index.html#Metadata
for details.
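
To make the key-based approach concrete, here is a rough sketch (in Python,
purely for illustration; Metron itself is Java) of what merging tenant
metadata from the Kafka key into a parsed message amounts to. The
`metron.metadata` prefix, the `tenant` field and the function name are
assumptions made for this example; the linked page describes the real
behaviour and configuration.

```python
import json

# Assumed prefix for merged metadata fields; verify against the parser docs.
METADATA_PREFIX = "metron.metadata"


def merge_metadata(kafka_key, message):
    """Return a copy of the parsed message with key metadata merged in.

    kafka_key is expected to be a JSON map (as a string), or empty/None
    when the producer sent no metadata.
    """
    merged = dict(message)
    if not kafka_key:
        return merged
    metadata = json.loads(kafka_key)
    for field, value in metadata.items():
        # Prefixing keeps metadata from colliding with parsed fields.
        merged["%s.%s" % (METADATA_PREFIX, field)] = value
    return merged
```

With this shape, one parser topology per data schema can serve many tenants:
the producer puts something like `{"tenant": "acme"}` in the key, and
downstream enrichment or indexing can filter on the merged field.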

The storm issue you mention is something for the storm project to look at, so 
we can’t really comment on their behalf here, but yeah, it will be nice to have 
storm do some of the tuning for us at some point. 

Note that the UI already has the tuning parameters you’re talking about in the 
latest version, so there is no need for the new JIRA 
(https://issues.apache.org/jira/browse/METRON-1330). It should be closed as a 
duplicate of https://issues.apache.org/jira/browse/METRON-1161.

Simon

> On 26 Nov 2017, at 02:15, Ali Nazemian  wrote:
> 
> Oops, I didn't know that. Happy Thanksgiving.
> 
> Thanks, Otto and Simon.
> 
> As you are aware of our use cases, with the current limitations of
> multi-tenancy support, we are creating a feed per tenant per device.
> Sometimes the amount of traffic we are receiving per each tenant and per
> each device is way less than dedicating one storm slot for it. Therefore, I
> was hoping to make it at least theoretically possible to tune resources
> more wisely, but it is not going to be easy at all. This is probably a use
> case that storm auto-scaling mechanism would be very nice to have.
> 
> https://issues.apache.org/jira/browse/STORM-594
> 
> On the other side, I can recall there was a PR to address multi-tenancy by
> adding meta-data to Kafka topic. However, I lost track of that feature, so
> maybe this situation can be tackled at another level by merging different
> parsers.
> 
> I will create a Jira ticket to add an ability in UI to tune Metron parser
> feeds at Storm level. Right now it is a little hard to maintain tuning
> configurations per each parser, and as soon as somebody restarts them from
> Management-UI/Ambari, it will be overwritten.
> 
> 
> Cheers,
> Ali
> 
> On Sat, Nov 25, 2017 at 3:36 AM, Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
> 
>> Implementing the resource aware scheduler would be decidedly non-trivial.
>> Every topology will need additional configuration to tune for things like
>> memory sizes, which is not going to buy you much change. So, at the
>> micro-tuning level of parser this doesn’t make a lot of sense.
>> 
>> However, it may be relevant to consider separate tuning for parsers in
>> general vs the core enrichment and indexing topologies (potentially also
>> for separate indexing topologies when this comes in) and the resource
>> scheduler could provide a theoretical benefit there.
>> 
>> Specifying resource requirements per parser topology might sound like a
>> good idea, but if your parsers are working the way they should, they should
>> be using a small amount of memory as their default size, and achieving
>> additional resource use by multiplying workers and executors (to get higher
>> usage per slot) and balance the load that way. To be honest, the only
>> difference you’re going to get from the RAS is to add a bunch of tuning
>> parameters which allow slightly different granularity of units for things
>> like memory.
>> 
>> The other RAS feature which might be a good add is prioritisation of
>> different parser topologies, but again, this is probably not something you
>> want to push hard on unless you are severely limited in resources (in which
>> case, why not just add another node, it will be cheaper than spending all
>> that time micro-tuning the resource requirements for each data feed).
>> 
>> Right now we do allow a lot of micro tuning of parallelism around things
>> like the count of executor threads, which achieves roughly the
>> equivalent of the cpu based limits in the RAS.
>> 
>> TL;DR:
>> 
>> If you’re not using resource pools for different users and using the idea
>> that prioritisation can lead to arbitrary kills, all you’re getting is a
>> slightly different way of tuning knobs that already exist, but you would
>> get a slightly different granularity. Also, we would have to rewrite all
>> the topology code to add the config endpoints for CPU and memory estimates.
>> 
>> Simon
>> 
>>> On 24 Nov 2017, at 07:56, Ali Nazemian  wrote:
>>> 
>>> Any help regarding this question would be appreciated.
>>> 
>>> 
>>> On Thu, Nov 23, 2017 at 8:57 AM, Ali Nazemian 
>> wrote:
>>> 
 30 mins average of CPU load by checking Ambari.
 
 On 23 Nov. 2017 00:51, "Otto Fowler"  wrote:
 
 How are you measuring the