Re: Problem in Flink 1.3.2 with Mesos task managers offers

2017-09-20 Thread Francisco Gonzalez Barea
Hello Eron,

Thank you for your reply, we will take a look at this.

Regards






Re: Problem in Flink 1.3.2 with Mesos task managers offers

2017-09-19 Thread Eron Wright
Hello, the current behavior is that Flink holds onto received offers for up
to two minutes while it attempts to provision the TaskManagers (TMs). Flink
can combine small offers to form a single TM, to combat the fragmentation
that develops over time in a Mesos cluster. Are you saying that unused
offers aren't being released after two minutes?

There's a log entry you should see in the JobManager (JM) log whenever an
offer is released:
LOG.info(s"Declined offer ${lease.getId} from ${lease.hostname()} "
  + s"of ${lease.memoryMB()} MB, ${lease.cpuCores()} cpus.")

The timeout value isn't configurable at the moment, but if you're willing
to experiment by building Flink from source, you may lower the two-minute
timeout as follows. In the `MesosFlinkResourceManager` class, edit the
`createOptimizer` method to call `withLeaseOfferExpirySecs` on the
`TaskScheduler.Builder` object.

Let us know if that helps and we'll make the timeout configurable.
-Eron
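
To make the offer-holding behavior described in this reply easier to picture, here is a toy model in Python. It is only a sketch under stated assumptions, not Flink's or Fenzo's actual code: the names `Offer`, `OfferPool`, `try_launch`, and `release_expired` are hypothetical, only memory is tracked, and a launch is assumed to consume every held offer from the chosen agent.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """A simplified resource offer (hypothetical type, memory only)."""
    offer_id: str
    host: str
    mem_mb: int
    received_at: float  # seconds on an arbitrary clock


class OfferPool:
    """Holds offers until they are used or older than expiry_secs."""

    def __init__(self, expiry_secs=120.0):
        # 120 s mirrors the two-minute hold time mentioned in the thread.
        self.expiry_secs = expiry_secs
        self.held = []

    def add(self, offer):
        self.held.append(offer)

    def try_launch(self, host, task_mem_mb):
        """Combine the held offers from one agent to fit a single task.

        Simplification: on success every held offer from that agent is
        consumed, with no leftover resources handed back.
        """
        same_host = [o for o in self.held if o.host == host]
        if sum(o.mem_mb for o in same_host) < task_mem_mb:
            return False
        self.held = [o for o in self.held if o.host != host]
        return True

    def release_expired(self, now):
        """Return (decline) offers held for expiry_secs or longer."""
        expired = [o for o in self.held
                   if now - o.received_at >= self.expiry_secs]
        self.held = [o for o in self.held
                     if now - o.received_at < self.expiry_secs]
        return expired


pool = OfferPool(expiry_secs=120.0)
pool.add(Offer("o1", "agent-a", 2048, received_at=0.0))
pool.add(Offer("o2", "agent-a", 2048, received_at=10.0))
pool.add(Offer("o3", "agent-b", 1024, received_at=10.0))

# Two small offers on agent-a together cover one 4096 MB task.
launched = pool.try_launch("agent-a", 4096)

# The unused agent-b offer is handed back once it has been held 120 s.
declined = pool.release_expired(now=130.0)
```

In this model, a smaller `expiry_secs` returns unused offers to other frameworks sooner, at the cost of less time to accumulate enough small offers for one task. That is the trade-off behind lowering the timeout.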



Problem in Flink 1.3.2 with Mesos task managers offers

2017-09-19 Thread Francisco Gonzalez Barea
Hello guys,

We have a Flink 1.3.2 session deployed to Mesos from a Marathon JSON
definition, with some of the following parameters set as environment variables:


"flink_mesos.initial-tasks": "8",
"flink_mesos.resourcemanager.tasks.mem": "4096",

There are other environment variables as well, including ZooKeeper settings, etc.

The Mesos cluster is shared by different applications (Kafka, ad-hoc...), and
the resources on its agents are fragmented. Our problem is that the Flink
session takes every offer, even small ones. When there are not enough offers
to satisfy that configuration, it still takes all of them, so no resources or
offers are left free for other applications.

So the question is: what is the right configuration in this case to keep a
single Flink session from using all of the resources?

Thanks in advance.
Regards

This message is private and confidential. If you have received this message in 
error, please notify the sender or serviced...@piksel.com and remove it from 
your system.

Piksel Inc is a company registered in the United States, 2100 Powers Ferry Road 
SE, Suite 400, Atlanta, GA 30339