Re: Where is Flink 0.10.1 documentation?

2016-01-08 Thread Stephan Ewen
Hi!

I think we missed updating the variable "version" in the "docs/_config.yml"
for the 0.10.1 release.

Would be good to update it and push a new version of the docs.
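
Presumably it is just a one-line bump in that file, along these lines (a
sketch; the actual file contains more keys):

version: "0.10.1"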

Greetings,
Stephan

On Fri, Jan 8, 2016 at 6:51 AM, Chiwan Park  wrote:

> Hi squirrels,
>
> I connected to
> https://ci.apache.org/projects/flink/flink-docs-release-0.10/ to read
> Flink 0.10.1 documentation. But the documentation version is still 0.10.0.
> Although there are only a few (or no) differences between 0.10.0 and 0.10.1, I
> think the documentation version should be updated to announce the latest
> stable version to newcomers.
>
> Is there any problem with updating the docs?
>
> Regards,
> Chiwan Park
>
>
>


Working with storm compatibility layer

2016-01-08 Thread Shinhyung Yang
Howdies to everyone,

I'm trying to use the Storm compatibility layer on Flink 0.10.1. The
original Storm topology works fine on Storm 0.9.5, and I have
incorporated the FlinkLocalCluster, FlinkTopologyBuilder, and
FlinkTopology classes according to the programming guide
(https://ci.apache.org/projects/flink/flink-docs-release-0.10/apis/storm_compatibility.html).
I'm running it on Oracle Java 8 (1.8.0_66-b17) on CentOS 7 (7.2.1511).
What happens is that execution seems to get all the way to the
submitTopology method without any problem; however, the open method of
my Spout class is never invoked, the declareOutputFields method is
called multiple times, and the program exits silently. Do you have any
idea what's going on here, or any suggestions? I'm happy to provide
more information if needed.
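
For reference, this is roughly how I wire things up, following the guide
(MySpout and MyBolt stand in for my actual components, so take this as a
sketch rather than my exact code):

import backtype.storm.Config;
import org.apache.flink.storm.api.FlinkLocalCluster;
import org.apache.flink.storm.api.FlinkTopologyBuilder;

// Build the topology with the Flink drop-in replacement for Storm's
// TopologyBuilder, as described in the compatibility guide.
FlinkTopologyBuilder builder = new FlinkTopologyBuilder();
builder.setSpout("source", new MySpout());
builder.setBolt("splitter", new MyBolt()).shuffleGrouping("source");

// Submit to an in-process Flink cluster instead of Storm's LocalCluster.
Config conf = new Config();
FlinkLocalCluster cluster = FlinkLocalCluster.getLocalCluster();
cluster.submitTopology("myTopology", conf, builder.createTopology());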

Thank you for reading.
With best regards,
Shinhyung Yang


Re: Kinesis Connector

2016-01-08 Thread Tzu-Li (Gordon) Tai
Hi Giancarlo,

Since it has been a while since the last post and no JIRA ticket has been
opened for the Kinesis connector yet, I'm wondering how you are doing with
the Kinesis connector and hope to help out with this feature :)

I've opened a JIRA (https://issues.apache.org/jira/browse/FLINK-3211),
finished the Kinesis sink, and am halfway through the Kinesis consumer.
Would you like to merge our current efforts so that we can complete this
feature ASAP for the AWS user community?

Thankfully,
Gordon





Re: Where is Flink 0.10.1 documentation?

2016-01-08 Thread Robert Metzger
I updated the version in the "release-0.10" branch; that should fix the
issue: http://git-wip-us.apache.org/repos/asf/flink/commit/1d05dbe5

On Fri, Jan 8, 2016 at 10:25 AM, Stephan Ewen  wrote:

> Hi!
>
> I think we missed updating the variable "version" in the
> "docs/_config.yml" for the 0.10.1 release.
>
> Would be good to update it and push a new version of the docs.
>
> Greetings,
> Stephan
>
> On Fri, Jan 8, 2016 at 6:51 AM, Chiwan Park  wrote:
>
>> Hi squirrels,
>>
>> I connected to
>> https://ci.apache.org/projects/flink/flink-docs-release-0.10/ to read
>> Flink 0.10.1 documentation. But the documentation version is still 0.10.0.
>> Although there are only a few (or no) differences between 0.10.0 and 0.10.1, I
>> think the documentation version should be updated to announce the latest
>> stable version to newcomers.
>>
>> Is there any problem with updating the docs?
>>
>> Regards,
>> Chiwan Park
>>
>>
>>
>


Re: Where is Flink 0.10.1 documentation?

2016-01-08 Thread Chiwan Park
Great, thanks. :)

> On Jan 8, 2016, at 6:44 PM, Robert Metzger  wrote:
> 
> I updated the version in the "release-0.10" branch; that should fix the 
> issue: http://git-wip-us.apache.org/repos/asf/flink/commit/1d05dbe5
> 
> On Fri, Jan 8, 2016 at 10:25 AM, Stephan Ewen  wrote:
> Hi!
> 
> I think we missed updating the variable "version" in the "docs/_config.yml" 
> for the 0.10.1 release.
> 
> Would be good to update it and push a new version of the docs.
> 
> Greetings,
> Stephan
> 
> On Fri, Jan 8, 2016 at 6:51 AM, Chiwan Park  wrote:
> Hi squirrels,
> 
> I connected to https://ci.apache.org/projects/flink/flink-docs-release-0.10/ 
> to read Flink 0.10.1 documentation. But the documentation version is still 
> 0.10.0. Although there are only a few (or no) differences between 0.10.0 and 
> 0.10.1, I think the documentation version should be updated to announce the 
> latest stable version to newcomers.
> 
> Is there any problem with updating the docs?
> 
> Regards,
> Chiwan Park

Regards,
Chiwan Park




Security in Flink

2016-01-08 Thread Sourav Mazumder
Hi,

Can anyone point me to any documentation on support for security in Flink?

The type of information I'm looking for is:

1. How do I do user-level authentication to ensure that a job is
submitted/deleted/modified by the right user? Is it possible through the
web client?
2. Authentication across the multiple slave nodes (where the task managers
are running) and the driver program, so that they can communicate with
each other.
3. Support for SSL/encryption for the data exchange happening across the
slave nodes.
4. Support for pluggable authentication with existing solutions such as
LDAP.

If these are not available today, is there a roadmap for these security
features?

Regards,
Sourav


Re: Problem to show logs in task managers

2016-01-08 Thread Ana M. Martinez
Thanks for the tip, Robert! It was a good idea to rule out other possible
causes, but I am afraid that is not the problem. If we stick to the
WordCountExample (for simplicity), the exception is thrown when placed in
the flatMap function.

I will restate my problem and all the settings below:

When I try to aggregate all logs:
 $yarn logs -applicationId application_1452250761414_0005

the following message is retrieved:
16/01/08 13:32:37 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-33-221.us-west-2.compute.internal/172.31.33.221:8032
/var/log/hadoop-yarn/apps/hadoop/logs/application_1452250761414_0005 does not exist.
Log aggregation has not completed or is not enabled.

(I tried the same command a few minutes later and got the same message, so
could it be that log aggregation is not properly enabled?)

Below I carefully enumerate all the steps I have followed (and my settings),
in case someone can identify why the logger messages from the CORE nodes (in
an Amazon cluster) are not shown.

1) Set the yarn.log-aggregation-enable property to true in
/etc/alternatives/hadoop-conf/yarn-site.xml.
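
For completeness, the property in yarn-site.xml now looks like this (only
the property I changed is shown):

<property>
  <name>yarn.log-aggregation-enable</name>
  <value>true</value>
</property>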

2) Include log messages in my WordCountExample as follows:

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.util.Collector;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;


public class WordCountExample {
    static Logger logger = LoggerFactory.getLogger(WordCountExample.class);

    public static void main(String[] args) throws Exception {
        final ExecutionEnvironment env =
                ExecutionEnvironment.getExecutionEnvironment();

        logger.info("Entering application.");

        DataSet<String> text = env.fromElements(
                "Who's there?",
                "I think I hear them. Stand, ho! Who's there?");

        List<Integer> elements = new ArrayList<Integer>();
        elements.add(0);

        DataSet<TestClass> set = env.fromElements(new TestClass(elements));

        DataSet<Tuple2<String, Integer>> wordCounts = text
                .flatMap(new LineSplitter())
                .withBroadcastSet(set, "set")
                .groupBy(0)
                .sum(1);

        wordCounts.writeAsText("output.txt", FileSystem.WriteMode.OVERWRITE);
    }

    public static class LineSplitter implements
            FlatMapFunction<String, Tuple2<String, Integer>> {

        static Logger loggerLineSplitter =
                LoggerFactory.getLogger(LineSplitter.class);

        @Override
        public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
            loggerLineSplitter.info("Logger in LineSplitter.flatMap");

            for (String word : line.split(" ")) {
                out.collect(new Tuple2<String, Integer>(word, 1));
                //throw new RuntimeException("LineSplitter class called");
            }
        }
    }

    public static class TestClass implements Serializable {
        private static final long serialVersionUID = -2932037991574118651L;

        static Logger loggerTestClass =
                LoggerFactory.getLogger("TestClass.class");

        List<Integer> integerList;

        public TestClass(List<Integer> integerList) {
            this.integerList = integerList;
            loggerTestClass.info("Logger in TestClass");
        }
    }
}

3) Start a yarn-cluster and execute my program with the following command:

$./bin/flink run -m yarn-cluster -yn 1 -ys 4 -yjm 1024 -ytm 1024 -c 
eu.amidst.flinklink.examples.WordCountExample ../flinklink.jar


4) The output in the log folder is as follows:

13:31:04,945 INFO  org.apache.flink.client.CliFrontend - --------------------------------------------------------------------------------
13:31:04,947 INFO  org.apache.flink.client.CliFrontend -  Starting Command Line Client (Version: 0.10.0, Rev:ab2cca4, Date:10.11.2015 @ 13:50:14 UTC)
13:31:04,947 INFO  org.apache.flink.client.CliFrontend -  Current user: hadoop
13:31:04,947 INFO  org.apache.flink.client.CliFrontend -  JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.65-b01
13:31:04,947 INFO  org.apache.flink.client.CliFrontend -  Maximum heap size: 3344 MiBytes
13:31:04,947 INFO  org.apache.flink.client.CliFrontend -  JAVA_HOME: /etc/alternatives/jre
13:31:04,950 INFO  org.apache.flink.client.CliFrontend -  Hadoop version: 2.6.0
13:31:04,950 INFO  org.apache.flink.client.CliFrontend -  JVM Options:
13:31:04,950 INFO  org.apache.flink.client.CliFrontend - --------------------------------------------------------------------------------

Re: Problem to show logs in task managers

2016-01-08 Thread Till Rohrmann
You’re right that the log statements of the LineSplitter are in the logs of
the cluster nodes, because that’s where the LineSplitter code is executed.
In contrast, you create a TestClass on the client when you submit the
program. Therefore, you see the logging statement “Logger in TestClass” on
the command line or in the CLI log file.

So I would assume that the problem is Yarn’s log aggregation. Either your
configuration is not correct, or there are still some Yarn containers
running after the Flink job has finished; Yarn will only show you the logs
after all containers have terminated. Maybe you could check that.
Alternatively, you can try to retrieve the taskmanager logs manually by
going to the machine where your Yarn container was executed. Then, under
hadoop/logs/userlogs, you should find the logs.
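
For example, something along these lines on that machine (the exact paths
and file names vary with the Hadoop setup, so this is only a sketch):

 $ls ~/hadoop/logs/userlogs/application_1452250761414_0005/
 $cat ~/hadoop/logs/userlogs/application_1452250761414_0005/container_*/taskmanager.log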

Cheers,
Till

On Fri, Jan 8, 2016 at 3:50 PM, Ana M. Martinez  wrote:

> Thanks for the tip, Robert! It was a good idea to rule out other possible
> causes, but I am afraid that is not the problem. If we stick to the
> WordCountExample (for simplicity), the exception is thrown when placed in
> the flatMap function.
>
> I will restate my problem and all the settings below:
>
> When I try to aggregate all logs:
>  $yarn logs -applicationId application_1452250761414_0005
>
> the following message is retrieved:
> 16/01/08 13:32:37 INFO client.RMProxy: Connecting to ResourceManager at
> ip-172-31-33-221.us-west-2.compute.internal/172.31.33.221:8032
> /var/log/hadoop-yarn/apps/hadoop/logs/application_1452250761414_0005 does
> not exist.
> Log aggregation has not completed or is not enabled.
>
> (I tried the same command a few minutes later and got the same message, so
> could it be that log aggregation is not properly enabled?)
>
> Below I carefully enumerate all the steps I have followed (and my settings),
> in case someone can identify why the logger messages from the CORE nodes (in
> an Amazon cluster) are not shown.
>
> 1) Set the yarn.log-aggregation-enable property to true
> in /etc/alternatives/hadoop-conf/yarn-site.xml.
>
> 2) Include log messages in my WordCountExample as follows:
>
> import org.apache.flink.api.common.functions.FlatMapFunction;
> import org.apache.flink.api.java.DataSet;
> import org.apache.flink.api.java.ExecutionEnvironment;
> import org.apache.flink.api.java.tuple.Tuple2;
> import org.apache.flink.core.fs.FileSystem;
> import org.apache.flink.util.Collector;
> import org.slf4j.Logger;
> import org.slf4j.LoggerFactory;
>
> import java.io.Serializable;
> import java.util.ArrayList;
> import java.util.List;
>
>
> public class WordCountExample {
>     static Logger logger = LoggerFactory.getLogger(WordCountExample.class);
>
>     public static void main(String[] args) throws Exception {
>         final ExecutionEnvironment env =
>                 ExecutionEnvironment.getExecutionEnvironment();
>
>         logger.info("Entering application.");
>
>         DataSet<String> text = env.fromElements(
>                 "Who's there?",
>                 "I think I hear them. Stand, ho! Who's there?");
>
>         List<Integer> elements = new ArrayList<Integer>();
>         elements.add(0);
>
>         DataSet<TestClass> set = env.fromElements(new TestClass(elements));
>
>         DataSet<Tuple2<String, Integer>> wordCounts = text
>                 .flatMap(new LineSplitter())
>                 .withBroadcastSet(set, "set")
>                 .groupBy(0)
>                 .sum(1);
>
>         wordCounts.writeAsText("output.txt", FileSystem.WriteMode.OVERWRITE);
>     }
>
>     public static class LineSplitter implements
>             FlatMapFunction<String, Tuple2<String, Integer>> {
>
>         static Logger loggerLineSplitter =
>                 LoggerFactory.getLogger(LineSplitter.class);
>
>         @Override
>         public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
>             loggerLineSplitter.info("Logger in LineSplitter.flatMap");
>
>             for (String word : line.split(" ")) {
>                 out.collect(new Tuple2<String, Integer>(word, 1));
>                 //throw new RuntimeException("LineSplitter class called");
>             }
>         }
>     }
>
>     public static class TestClass implements Serializable {
>         private static final long serialVersionUID = -2932037991574118651L;
>
>         static Logger loggerTestClass =
>                 LoggerFactory.getLogger("TestClass.class");
>
>         List<Integer> integerList;
>
>         public TestClass(List<Integer> integerList) {
>             this.integerList = integerList;
>             loggerTestClass.info("Logger in TestClass");
>         }
>     }
> }
>
> 3) Start a yarn-cluster and execute my program with the following command:
>
> $./bin/flink run -m yarn-cluster -yn 1 -ys 4 -yjm 1024 -ytm 1024 -c
> eu.amidst.flinklink.examples.WordCountExample ../flinklink.jar
>
>
> 4) The output in the log folder is as follows:
>
> 13:31:04,945 INFO  org.apache.flink.client.CliFrontend -
>