Re: Task and Operator Monitoring via JMX / naming

2016-10-15 Thread Chesnay Schepler
Hello Philipp, there is certainly something very wrong here. What you _should_ see is 6 entries, 1 for each operator; 2-3 more for the tasks the operators are executed in and the taskmanager stuff. Usually, operator metrics use the name that you configured, like "TokenMapStream", whereas

Re: Task and Operator Monitoring via JMX / naming

2016-10-15 Thread Philipp Bussche
Thanks Chesnay, this is on Flink 1.1.3. Please also note that, e.g., the first item in the list, which has the custom metric attached to it, starts with a leading "(". It might be that the parsing of the names is not working quite as expected. I was trying to find out where these names come from but
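
For anyone following this thread who wants to look at the raw names without a GUI client, here is a minimal sketch that dumps the registered MBean names using only the standard JMX API. The class name `MBeanNameLister` and the `listNames` helper are made up for illustration and are not part of Flink. The sketch only sees MBeans registered in the JVM it runs in, so it would have to run inside the TaskManager process or be adapted to a remote JMX connection. The default pattern `*:*` matches every domain, because the exact domain used by the Flink JMX reporter is not stated in this thread.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper, not part of Flink: lists MBean names matching a pattern.
class MBeanNameLister {

    // Returns the canonical names of all MBeans matching the given
    // ObjectName pattern (e.g. "*:*" for everything), sorted alphabetically.
    static List<String> listNames(MBeanServer server, String pattern)
            throws MalformedObjectNameException {
        List<String> names = new ArrayList<>();
        for (ObjectName name : server.queryNames(new ObjectName(pattern), null)) {
            names.add(name.getCanonicalName());
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: no pattern given means "show all domains".
        String pattern = args.length > 0 ? args[0] : "*:*";
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        for (String name : listNames(server, pattern)) {
            System.out.println(name);
        }
    }
}
```

Printing the canonical names one per line makes it easier to see whether an entry really starts with "(" or whether that is just how the client displays it.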

Re: Task and Operator Monitoring via JMX / naming

2016-10-15 Thread Chesnay Schepler
Hello Philipp, the relevant names are stored in the OperatorMetricGroup/TaskMetricGroup classes in flink-runtime. The name for a task is extracted directly from the TaskDeploymentDescriptor in TaskManagerJobMetricGroup#addTask(). The name for a streaming operator that the metric system uses

"java.net.SocketException: No buffer space available (maximum connections reached?)" when reading from HDFS

2016-10-15 Thread Yassine MARZOUGUI
Hi all, I'm reading a large number of small files from HDFS in batch mode (about 20 directories, each directory contains about 3000 files, using recursive.file.enumeration=true), and each time, at about 200 GB of received data, my job fails with the following exception: java.io.IOException: