Re: Using Storm Resource Aware Scheduler

Simon Elliston Ball Sun, 26 Nov 2017 14:07:54 -0800

The multi-tenancy through meta-data method mentioned is designed to solve 
exactly that problem and has been in the project for some time now. The goal 
would be to have one topology per data schema and use the key to communicate 
tenant meta-data. See 
https://archive.apache.org/dist/metron/0.4.1/site-book/metron-platform/metron-parsers/index.html#Metadata
for details.
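
To make the idea concrete, here is a rough Python sketch of what "use the key to communicate tenant meta-data" amounts to. It is not Metron's actual code: the function name and the sample field values are made up for illustration, and the `metron.metadata` prefix is an assumption based on my reading of the linked docs, so check it against your version.

```python
import json

# Assumed prefix for merged metadata fields; verify against the linked docs.
METADATA_PREFIX = "metron.metadata"


def merge_tenant_metadata(kafka_key, parsed_message, prefix=METADATA_PREFIX):
    """Return a copy of parsed_message with the JSON map from the Kafka key
    merged in under `prefix`. Hypothetical helper, for illustration only."""
    merged = dict(parsed_message)
    if not kafka_key:
        # No key sent, so there is no metadata to merge.
        return merged
    metadata = json.loads(kafka_key)
    if not isinstance(metadata, dict):
        raise ValueError("metadata key must be a JSON map")
    for name, value in metadata.items():
        merged[f"{prefix}.{name}"] = value
    return merged


# One shared parser topology per schema; the tenant travels in the key.
key = b'{"customer_id": "tenant_a"}'
message = {"ip_src_addr": "10.0.0.1", "source.type": "squid"}
enriched = merge_tenant_metadata(key, message)
```

The point is that tenants sharing a schema can share one topology (and its Storm slots), with the tenant identity carried alongside each message instead of being implied by a dedicated feed.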


The Storm issue you mention is something for the Storm project to look at, so 
we can’t really comment on their behalf here, but yeah, it will be nice to have 
Storm do some of the tuning for us at some point. 

Note that the UI already has the tuning parameters you’re talking about in the 
latest version, so there is no need for the new JIRA 
(https://issues.apache.org/jira/browse/METRON-1330). It should be closed as a 
duplicate of https://issues.apache.org/jira/browse/METRON-1161. 

Simon

> On 26 Nov 2017, at 02:15, Ali Nazemian <alinazem...@gmail.com> wrote:
> 
> Oops, I didn't know that. Happy Thanksgiving.
> 
> Thanks, Otto and Simon.
> 
> As you are aware from our use cases, with the current limitations of
> multi-tenancy support, we are creating a feed per tenant per device.
> Sometimes the traffic we receive for a given tenant and device is far
> less than what would justify dedicating one Storm slot to it. Therefore, I
> was hoping to make it at least theoretically possible to tune resources
> more wisely, but it is not going to be easy at all. This is probably a use
> case where a Storm auto-scaling mechanism would be very nice to have.
> 
> https://issues.apache.org/jira/browse/STORM-594
> 
> On the other side, I can recall there was a PR to address multi-tenancy by
> adding meta-data to Kafka topic. However, I lost track of that feature, so
> maybe this situation can be tackled at another level by merging different
> parsers.
> 
> I will create a Jira ticket to add the ability to tune Metron parser
> feeds at the Storm level from the UI. Right now it is a little hard to
> maintain tuning configurations for each parser, and as soon as somebody
> restarts them from Management-UI/Ambari, the settings are overwritten.
> 
> 
> Cheers,
> Ali
> 
> On Sat, Nov 25, 2017 at 3:36 AM, Simon Elliston Ball <
> si...@simonellistonball.com> wrote:
> 
>> Implementing the resource aware scheduler would be decidedly non-trivial.
>> Every topology will need additional configuration to tune for things like
>> memory sizes, which is not going to buy you much change. So, at the
>> micro-tuning level of parser this doesn’t make a lot of sense.
>> 
>> However, it may be relevant to consider separate tuning for parsers in
>> general vs the core enrichment and indexing topologies (potentially also
>> for separate indexing topologies when this comes in) and the resource
>> scheduler could provide a theoretical benefit there.
>> 
>> Specifying resource requirements per parser topology might sound like a
>> good idea, but if your parsers are working the way they should, they should
>> be using a small amount of memory as their default size, and achieving
>> additional resource use by multiplying workers and executors (to get higher
>> usage per slot) and balancing the load that way. To be honest, the only
>> difference you’re going to get from the RAS is to add a bunch of tuning
>> parameters which allow slightly different granularity of units for things
>> like memory.
>> 
>> The other RAS feature which might be a good add is prioritisation of
>> different parser topologies, but again, this is probably not something you
>> want to push hard on unless you are severely limited in resources (in which
>> case, why not just add another node, it will be cheaper than spending all
>> that time micro-tuning the resource requirements for each data feed).
>> 
>> Right now we do allow a lot of micro tuning of parallelism around things
>> like the count of executor threads, which achieves roughly the
>> equivalent of the CPU-based limits in the RAS.
>> 
>> TL;DR:
>> 
>> If you’re not using resource pools for different users and using the idea
>> that prioritisation can lead to arbitrary kills, all you’re getting is a
>> slightly different way of tuning knobs that already exist, but you would
>> get a slightly different granularity. Also, we would have to rewrite all
>> the topology code to add the config endpoints for CPU and memory estimates.
>> 
>> Simon
>> 
>>> On 24 Nov 2017, at 07:56, Ali Nazemian <alinazem...@gmail.com> wrote:
>>> 
>>> Any help regarding this question would be appreciated.
>>> 
>>> 
>>> On Thu, Nov 23, 2017 at 8:57 AM, Ali Nazemian <alinazem...@gmail.com> wrote:
>>> 
>>>> 30 mins average of CPU load by checking Ambari.
>>>> 
>>>> On 23 Nov. 2017 00:51, "Otto Fowler" <ottobackwa...@gmail.com> wrote:
>>>> 
>>>> How are you measuring the utilization?
>>>> 
>>>> 
>>>> On November 22, 2017 at 08:12:51, Ali Nazemian (alinazem...@gmail.com)
>>>> wrote:
>>>> 
>>>> Hi all,
>>>> 
>>>> 
>>>> One of the issues that we are dealing with is the fact that not all of
>>>> the Metron feeds have the same type of resource requirements. For
>>>> example, we have some feeds for which even a single Storm slot is way
>>>> more than they need. We thought we could improve overall utilisation by
>>>> at least limiting the heap space available to each feed's parser
>>>> topology worker. However, since the Storm scheduler relies on available
>>>> slots, it is very hard, almost impossible, to utilise the cluster well
>>>> when lots of different topologies with different requirements are
>>>> running at the same time. Therefore, on a daily basis, we can see that,
>>>> for example, one of the Storm hosts is 120% utilised and another is 20%
>>>> utilised! I was wondering whether we can address this situation by
>>>> using the Storm Resource Aware Scheduler.
>>>> 
>>>> P.S: it would be very nice to have a functionality to tune Storm
>>>> topology-related parameters per feed in the GUI (for example in the
>>>> Management UI).
>>>> 
>>>> 
>>>> Regards,
>>>> Ali
>>>> 
>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> A.Nazemian
>> 
>> 
> 
> 
> -- 
> A.Nazemian
