RE: [E] writeBufferLowWaterMark cannot be greater than writeBufferHighWaterMark error

2016-12-27 Thread kanagaraj . vengidasamy
Hi Fabian,

In my case, since it is stream processing, after that error the task manager
gets stuck and does not take any new messages.
Can you let me know how many task managers I can run on an 8-core x 32 GB
machine? I am using a heap size of 4096 MB for each task manager.

Thanks


Kanagaraj Vengidasamy
RTCI

7701 E Telecom PKWY
Temple Terrace, FL 33637

O 813.978.4372 | M 813.455.9757



Re: [E] writeBufferLowWaterMark cannot be greater than writeBufferHighWaterMark error

2016-12-27 Thread Fabian Hueske
Hi,

I reproduced the issue with Flink 1.1.4 and the 1.2.0 release branch.

The WARN log statement and the IllegalArgumentException are thrown by Netty.
Not sure what the implications are. My batch jobs finished successfully, so
maybe the bad configuration options are just ignored.
Would be good to check this though.

Thanks,
Fabian


Re: Monitoring REST API

2016-12-27 Thread Shannon Carey
Although Flink exposes some metrics in the API/UI, it probably only does that 
because it was easy to do and convenient for users. However, I don't think 
Flink is intended to be a complete monitoring solution for your cluster.

Instead, you should take a look at collectd https://collectd.org/ which is 
meant for monitoring OS-level metrics and has, for example, a Graphite plugin 
which you can use to write to a Graphite server or statsd instance… or you can 
integrate it some other way depending on what you have & what you want.

-Shannon

From: Lydia Ickler
Date: Wednesday, December 21, 2016 at 12:55 PM
Subject: Monitoring REST API

Hi all,

I have a question regarding the Monitoring REST API;

I want to analyze the behavior of my program with regards to I/O MiB/s, Network 
MiB/s and CPU % as the authors of this paper did. 
(https://hal.inria.fr/hal-01347638v2/document)
From the JSON file at http://master:8081/jobs/<jobid>/ I get a summary including
the information of read/write records and read/write bytes.
Unfortunately the entries of Network or CPU are either (unknown) or 0.0. I am 
running my program on a cluster with up to 32 nodes.

Where can I find the values for e.g. CPU or Network?

Thanks in advance!
Lydia
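The per-job summary Lydia mentions does expose the read/write byte and record
counters, and those can be pulled out of the JSON response with a short helper.
A sketch in plain Java (no JSON library); the field names and the sample payload
are assumptions modeled on the 1.x web monitor, not a verified response:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class JobSummarySketch {

    /** Pulls a numeric field such as "read-bytes" out of a JSON job summary. */
    static long extractLong(String json, String field) {
        Matcher m = Pattern
                .compile("\"" + Pattern.quote(field) + "\"\\s*:\\s*(\\d+)")
                .matcher(json);
        if (!m.find()) {
            throw new IllegalArgumentException("field not found: " + field);
        }
        return Long.parseLong(m.group(1));
    }

    public static void main(String[] args) {
        // Invented sample with the rough shape of http://master:8081/jobs/<jobid>/
        String sample = "{\"vertices\":[{\"name\":\"source\",\"metrics\":{"
                + "\"read-bytes\":0,\"write-bytes\":1048576,"
                + "\"read-records\":0,\"write-records\":4096}}]}";
        System.out.println(extractLong(sample, "write-bytes")); // prints 1048576
    }
}
```

CPU and network utilization are not in this summary, which matches Shannon's
point: those are OS-level metrics, better collected by something like collectd.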



Re: Reading worker-local input files

2016-12-27 Thread Fabian Hueske
Hi Robert,

this is indeed a bit tricky to do. The problem is mostly with the
generation of the input splits, setup of Flink, and the scheduling of tasks.

1) you have to ensure that on each worker at least one DataSource task is
scheduled. The easiest way to do this is to have a bare metal setup (no
YARN) and a single TaskManager per worker. Each TM should have the same
number of slots and the DataSource should have a parallelism of #TMs *
slots. This will ensure that the same number of DataSource tasks is started
on each worker.

2) you need to tweak the input split generation. Flink's FileInputFormat
assumes that it can access all files to be read via a distributed file
system. Your InputFormat should have a parameter for the list of
TaskManagers (hostnames or IP addresses) and the number of slots per TM. The
InputFormat.createInputSplits() method should create one input split for each
parallel task. Each split should carry a (hostname, local index) pair.

3) you need to tweak the input split assignment. You need to provide a
custom input split provider that ensures that splits are only assigned to
the correct task manager. Otherwise it might happen that a TaskManager
processes the split of another TM and some data is read twice while other
data is not read at all.

4) once a split is assigned to a task the InputFormat.open() method is
called. Based on the local index, the task should decide which files (or
parts of files) it needs to read. This decision must be deterministic (only
depend on local index) and ensure that all data (files / parts of files)
are read exactly once (you'll need the number of slots per host for that).

You see, this is not trivial. Moreover, such a setup is not flexible, quite
fragile, and not fault tolerant.
Since you need to read local files that are not available anywhere else, your
job will fail if a TM goes down.

If possible, I would recommend to move the data into a distributed file
system.

Best,
Fabian
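Steps 2) and 3) above can be sketched in plain Java. This is a simplified model,
not Flink's actual InputFormat / input split assigner API; the LocalSplit class
and both helper methods are hypothetical names for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LocalSplitSketch {

    /** Hypothetical split descriptor: the worker host it belongs to and its local index there. */
    static final class LocalSplit {
        final String hostname;
        final int localIndex;

        LocalSplit(String hostname, int localIndex) {
            this.hostname = hostname;
            this.localIndex = localIndex;
        }
    }

    /** Step 2: one split per parallel task, i.e. #TMs * slotsPerTm splits in total. */
    static List<LocalSplit> createInputSplits(List<String> tmHostnames, int slotsPerTm) {
        List<LocalSplit> splits = new ArrayList<>();
        for (String host : tmHostnames) {
            for (int slot = 0; slot < slotsPerTm; slot++) {
                splits.add(new LocalSplit(host, slot));
            }
        }
        return splits;
    }

    /** Step 3: hand out a split only if it belongs to the requesting host. */
    static LocalSplit nextSplitFor(String requestingHost, List<LocalSplit> remaining) {
        for (LocalSplit split : remaining) {
            if (split.hostname.equals(requestingHost)) {
                remaining.remove(split);
                return split;
            }
        }
        return null; // this host has already consumed all of its local splits
    }

    public static void main(String[] args) {
        List<LocalSplit> splits = createInputSplits(Arrays.asList("worker1", "worker2"), 8);
        System.out.println(splits.size()); // 2 TMs * 8 slots = prints 16
        LocalSplit s = nextSplitFor("worker2", splits);
        System.out.println(s.hostname + "/" + s.localIndex); // prints worker2/0
    }
}
```

In step 4), open() would then combine (hostname, localIndex) with the number of
slots per host to deterministically claim its share of the files in /local, e.g.
every slotsPerTm-th file.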



RE: [E] writeBufferLowWaterMark cannot be greater than writeBufferHighWaterMark error

2016-12-27 Thread kanagaraj . vengidasamy
Thanks Fabian,

With the default 65536 I also got the same error; that is why I
increased it, to see whether it helps.

Thanks



Re: Is there some way to use broadcastSet in streaming ?

2016-12-27 Thread Fabian Hueske
Hi,

no, broadcast sets are not available in the DataStream API.
There might be other ways to achieve similar functionality, but the optimal
solution depends on the use case.
If you give a few details about what you would like to do, we might be able
to suggest alternatives.

Best, Fabian

2016-12-27 3:38 GMT+01:00 刘喆 :

> hi, everyone
> Is there some way to use broadcastSet in streaming ?  It seams that
> broadcastSet is only usable in batch.
>


Re: [E] writeBufferLowWaterMark cannot be greater than writeBufferHighWaterMark error

2016-12-27 Thread Fabian Hueske
Hi Kanagaraj,

I would assume that the issue is caused by this configuration parameter:

taskmanager.memory.segment-size: 131072

I think the maximum possible value given Netty's "writeBufferHighWaterMark"
parameter is 65536.
There might be a way to tune Netty's parameters but I don't know how to do
that. Maybe Ufuk (in CC) knows better.

Is there any particular reason why you want to increase the network buffer size
instead of keeping the default?
Did you find that the default parameter gives insufficient performance?

Best, Fabian
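For what it's worth, the exception in the quoted log comes from a sanity check
inside Netty: the low watermark may never exceed the high watermark, and the
rejected value 131073 is exactly the configured segment size plus one, while the
high watermark is still Netty's 64 KB default. A minimal model of that check
(not Netty's actual code; applySegmentSize and the 2x factor for the high mark
are assumptions for illustration) shows why the order in which the two values
are raised matters:

```java
public class WatermarkCheckSketch {

    // Netty's defaults: 64 KB high, 32 KB low water mark.
    private int highWaterMark = 65536;
    private int lowWaterMark = 32768;

    /** Mirrors the invariant that Netty's setWriteBufferLowWaterMark enforces. */
    void setLowWaterMark(int value) {
        if (value > highWaterMark) {
            throw new IllegalArgumentException(
                    "writeBufferLowWaterMark cannot be greater than writeBufferHighWaterMark ("
                            + highWaterMark + "): " + value);
        }
        lowWaterMark = value;
    }

    void setHighWaterMark(int value) {
        highWaterMark = value;
    }

    int getLowWaterMark() {
        return lowWaterMark;
    }

    /**
     * Raising the high watermark before the low one keeps the invariant intact
     * even for a 128 KB segment size. The segmentSize + 1 value matches the
     * rejected 131073 in the log; the 2x factor is a hypothetical choice.
     */
    void applySegmentSize(int segmentSize) {
        setHighWaterMark(2 * segmentSize);
        setLowWaterMark(segmentSize + 1);
    }
}
```

With the default 64 KB high watermark, any segment size of 64 KB or more
triggers the exception, which matches Kanagaraj's observation that the default
65536 segment size failed too (65537 > 65536).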

2016-12-26 20:52 GMT+01:00 :

>
> Hi,
>
> I am getting the "writeBufferLowWaterMark cannot be greater than
> writeBufferHighWaterMark" error frequently, and those task managers stop
> processing messages after that error.
> What could be wrong in my configuration? What do I need to do to avoid this
> error?
>
> We have 8-core x 32 GB VMs - 8 machines (running 35 task managers, each with
> 8 slots)
> taskmanager.heap.mb: 4096
> taskmanager.numberOfTaskSlots: 8
> taskmanager.network.numberOfBuffers: 22000
> taskmanager.memory.segment-size: 131072
>
>
> 2016-12-26 14:26:06,548 WARN  io.netty.bootstrap.ServerBootstrap  - Failed to set a channel option: [id: 0xf1ef59e6, /138.83.31.4:60812 => /138.83.31.9:41304]
> java.lang.IllegalArgumentException: writeBufferLowWaterMark cannot be greater than writeBufferHighWaterMark (65536): 131073
>     at io.netty.channel.DefaultChannelConfig.setWriteBufferLowWaterMark(DefaultChannelConfig.java:334)
>     at io.netty.channel.socket.DefaultSocketChannelConfig.setWriteBufferLowWaterMark(DefaultSocketChannelConfig.java:332)
>     at io.netty.channel.socket.DefaultSocketChannelConfig.setWriteBufferLowWaterMark(DefaultSocketChannelConfig.java:35)
>     at io.netty.channel.DefaultChannelConfig.setOption(DefaultChannelConfig.java:183)
>     at io.netty.channel.socket.DefaultSocketChannelConfig.setOption(DefaultSocketChannelConfig.java:121)
>     at io.netty.bootstrap.ServerBootstrap$ServerBootstrapAcceptor.channelRead(ServerBootstrap.java:238)
>     at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:339)
>     at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:324)
>     at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:847)
>     at io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:93)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>     at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>     at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>     at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>     at java.lang.Thread.run(Thread.java:745)
>
>
> Thanks
> Kanagaraj Vengidasamy
> RTCI


Re: Compiling Flink for Scala 2.11

2016-12-27 Thread Fabian Hueske
Hi Markus,

thanks for reporting this issue. This bug was introduced when the opt.xml
file was added to the repository a few days ago.
There are two open JIRAs, FLINK-5392 and FLINK-5396, each one with a pull
request to fix the problem.

Best, Fabian

2016-12-26 2:16 GMT+01:00 M. Dale :

> I cloned the Apache Flink source code from https://github.com/apache/flink
> and
> want to build the 1.2-SNAPSHOT with Scala 2.11.
>
> git clone g...@github.com:apache/flink.git
> cd flink
> git checkout remotes/origin/release-1.2
>
> Following instructions from the 1.2 docs at
> https://ci.apache.org/projects/flink/flink-docs-release-1.2/
> setup/building.html
> I then:
>
> tools/change-scala-version.sh 2.11
> mvn clean install -DskipTests (I am using Maven 3.3.9)
>
> This gets to flink-dist and then:
> [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-assembly-plugin:2.4:single (opt)
> on project flink-dist_2.11: Failed to create assembly: Error adding file
> to archive:
> //flink/flink-dist/../flink-libraries/flink-cep/target/flink-cep_2.10-1.2-SNAPSHOT.jar
> isn't a file.
>
> It seems that flink/flink-dist/src/main/assemblies/opt.xml does not get
> changed
> by the change-scala-version.sh script and still has some 2.10 dependencies.
>
> Did I miss a step? There are 2.11 artifacts in the Maven repos. I see the
> parent pom has a scala.version variable but mostly
> the scala version is just hardcoded in the poms and controlled via the
> tools
> script. Changing the _2.10 references in opt.xml allows for successful
> compile.
> Please advise whether I should file a Jira and pull request to add opt.xml
> to
> the change-scala-version.sh script? Or is there a new/different way to
> build for 2.11?
>
> Thanks,
> Markus
>
>


Reading worker-local input files

2016-12-27 Thread Robert Schmidtke
Hi everyone,

I'm using Flink and/or Hadoop on my cluster, and I'm having them generate
log data in each worker node's /local folder (regular mount point). Now I
would like to process these files using Flink, but I'm not quite sure how I
could tell Flink to use each worker node's /local folder as input path,
because I'd expect Flink to look in the /local folder of the submitting
node only. Do I have to put these files into HDFS or is there a way to tell
Flink the file:///local file URI refers to worker-local data? Thanks in
advance for any hints and best

Robert

-- 
My GPG Key ID: 336E2680