Re: Flink plus Elastic Search plus Kibana

2016-01-18 Thread HungChang
Found the answer here

http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-Elasticsearch-connector-support-for-elasticsearch-2-0-td3910.html



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-plus-Elastic-Search-plus-Kibana-tp4340p4343.html
Sent from the Apache Flink User Mailing List archive. mailing list archive at 
Nabble.com.


Re: InvalidTypesException - Input mismatch: Basic type 'Integer' expected but was 'Long'

2016-01-18 Thread Biplob Biswas
Hi Till,

I am using Flink 0.10.1 and, if I am not wrong, it corresponds to the
1.0-SNAPSHOT you mentioned.

[image: Inline image 1]

If I am wrong, please suggest what I should do to fix it.

Thanks & Regards
Biplob Biswas

On Mon, Jan 18, 2016 at 11:23 AM, Till Rohrmann 
wrote:

> Hi Biplob,
>
> which version of Flink are you using? With version 1.0-SNAPSHOT, I cannot
> reproduce your problem.
>
> Cheers,
> Till
> ​
>
> On Sun, Jan 17, 2016 at 4:56 PM, Biplob Biswas 
> wrote:
>
>> Hi,
>>
>> I am getting the following exception when I am using the map function
>>
>> Exception in thread "main"
>>> org.apache.flink.api.common.functions.InvalidTypesException: The return
>>> type of function 'computeWeightedDistribution(GraphWeighted.java:73)' could
>>> not be determined automatically, due to type erasure. You can give type
>>> information hints by using the returns(...) method on the result of the
>>> transformation call, or by letting your function implement the
>>> 'ResultTypeQueryable' interface.
>>> at org.apache.flink.api.java.DataSet.getType(DataSet.java:176)
>>> at org.apache.flink.api.java.DataSet.groupBy(DataSet.java:692)
>>> at aim3.GraphWeighted.computeWeightedDistribution(GraphWeighted.java:74)
>>> at aim3.SlashdotZooInDegree.main(SlashdotZooInDegree.java:39)
>>> Caused by: org.apache.flink.api.common.functions.InvalidTypesException:
>>> Input mismatch: Basic type 'Integer' expected but was 'Long'.
>>> at
>>> org.apache.flink.api.java.typeutils.TypeExtractor.validateInputType(TypeExtractor.java:767)
>>> at
>>> org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:276)
>>> at
>>> org.apache.flink.api.java.typeutils.TypeExtractor.getMapReturnTypes(TypeExtractor.java:110)
>>> at org.apache.flink.api.java.DataSet.map(DataSet.java:213)
>>> at aim3.GraphWeighted.computeWeightedDistribution(GraphWeighted.java:73)
>>> ... 1 more
>>
>>
>>
>> This is the part of the code which I am trying to run :
>>
>> DataSet> distinctVertex = sourceVertex
>>>  .union(destinationVertex)
>>>  .groupBy(0)
>>>  .aggregate(Aggregations.SUM, 1);
>>> // Compute the degrees (degree, count)
>>>
>>>  DataSet> degreeCount = distinctVertex
>>>  .map(new DegreeMapper())
>>>  .groupBy(0)
>>>  .aggregate(Aggregations.SUM, 1);
>>
>>
>>
>> and the error I am getting is at this line *.map(new DegreeMapper())*
>>
>> Also, the degree mapper is a simple map function which emits the second
>> column and 1 as follows:
>>
>>>
>>> public static class DegreeMapper implements
>>> MapFunction, Tuple2> {
>>> private static final long serialVersionUID = 1L;
>>> public Tuple2 map(Tuple2 input) throws
>>> Exception {
>>> return new Tuple2(input.f1, 1);
>>> }
>>> }
>>
>>
>>
>> Now I am lost as to what I did wrong and why I am getting that error, any
>> help would be appreciated.
>>
>> Thanks a lot.
>>
>> Thanks & Regards
>> Biplob Biswas
>>
>
>
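The exception quoted above already names the workaround: when type erasure hides the
MapFunction's generic parameters, an explicit hint via returns(...) (or implementing
ResultTypeQueryable) tells Flink the output type. Below is a minimal, self-contained
sketch of that hint on the DataSet API; the Tuple2 field types (Long keys, Integer
counts) are assumptions made for illustration, since the generic parameters did not
survive in the archived message.

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.aggregation.Aggregations;
import org.apache.flink.api.java.tuple.Tuple2;

public class DegreeCountSketch {

    // Emits the second column together with a count of 1 (field types assumed).
    public static class DegreeMapper
            implements MapFunction<Tuple2<Long, Long>, Tuple2<Long, Integer>> {
        private static final long serialVersionUID = 1L;

        @Override
        public Tuple2<Long, Integer> map(Tuple2<Long, Long> input) {
            return new Tuple2<>(input.f1, 1);
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<Long, Long>> distinctVertex =
                env.fromElements(new Tuple2<>(1L, 2L), new Tuple2<>(3L, 2L));

        DataSet<Tuple2<Long, Integer>> degreeCount = distinctVertex
                .map(new DegreeMapper())
                // Explicit type hint: works around erasure of the mapper's generics.
                .returns("Tuple2<Long, Integer>")
                .groupBy(0)
                .aggregate(Aggregations.SUM, 1);

        degreeCount.print();
    }
}

If the mapper's declared input type and the actual element type of distinctVertex
disagree (as the "Basic type 'Integer' expected but was 'Long'" message suggests),
aligning those two types is the real fix; the returns(...) hint only addresses the
erasure part of the error.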


Re: Flink plus Elastic Search plus Kibana

2016-01-18 Thread Maximilian Michels
Hi Sendoh,

At the time the article was created, Elasticsearch 2.0 was only in the
making and by the time of publishing it had just been released. That's
why we used version 1.7.3. There is currently no 2.X version of the
Flink adapter but that will change very soon. There is an issue and a
pending pull request: https://issues.apache.org/jira/browse/FLINK-3115

Cheers,
Max

On Mon, Jan 18, 2016 at 12:56 PM, HungChang  wrote:
> Hi,
>
> Recently I read this post about Flink+Elastic Search+Kibana.
> https://www.elastic.co/blog/building-real-time-dashboard-applications-with-apache-flink-elasticsearch-and-kibana
>
> Can I ask why Elasticsearch version 1.7.3 was selected? What would be
> the potential issues with the newer versions?
>
> Best,
>
> Sendoh
>
>
>
> --
> View this message in context: 
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-plus-Elastic-Search-plus-Kibana-tp4340.html
> Sent from the Apache Flink User Mailing List archive. mailing list archive at 
> Nabble.com.


Re: InvalidTypesException - Input mismatch: Basic type 'Integer' expected but was 'Long'

2016-01-18 Thread Till Rohrmann
Hi Biplob,

No, version 0.10.1 and 1.0-SNAPSHOT are different. Could you bump your Flink
version to the latter and check whether you can still reproduce your problem?

Cheers,
Till
​

On Mon, Jan 18, 2016 at 2:24 PM, Biplob Biswas 
wrote:

> Hi Till,
>
> I am using Flink 0.10.1 and, if I am not wrong, it corresponds to the
> 1.0-SNAPSHOT you mentioned.
>
> [image: Inline image 1]
>
> If I am wrong, please suggest what I should do to fix it.
>
> Thanks & Regards
> Biplob Biswas
>
> On Mon, Jan 18, 2016 at 11:23 AM, Till Rohrmann 
> wrote:
>
>> Hi Biplob,
>>
>> which version of Flink are you using? With version 1.0-SNAPSHOT, I
>> cannot reproduce your problem.
>>
>> Cheers,
>> Till
>> ​
>>
>> On Sun, Jan 17, 2016 at 4:56 PM, Biplob Biswas 
>> wrote:
>>
>>> Hi,
>>>
>>> I am getting the following exception when I am using the map function
>>>
>>> Exception in thread "main"
 org.apache.flink.api.common.functions.InvalidTypesException: The return
 type of function 'computeWeightedDistribution(GraphWeighted.java:73)' could
 not be determined automatically, due to type erasure. You can give type
 information hints by using the returns(...) method on the result of the
 transformation call, or by letting your function implement the
 'ResultTypeQueryable' interface.
 at org.apache.flink.api.java.DataSet.getType(DataSet.java:176)
 at org.apache.flink.api.java.DataSet.groupBy(DataSet.java:692)
 at aim3.GraphWeighted.computeWeightedDistribution(GraphWeighted.java:74)
 at aim3.SlashdotZooInDegree.main(SlashdotZooInDegree.java:39)
 Caused by: org.apache.flink.api.common.functions.InvalidTypesException:
 Input mismatch: Basic type 'Integer' expected but was 'Long'.
 at
 org.apache.flink.api.java.typeutils.TypeExtractor.validateInputType(TypeExtractor.java:767)
 at
 org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:276)
 at
 org.apache.flink.api.java.typeutils.TypeExtractor.getMapReturnTypes(TypeExtractor.java:110)
 at org.apache.flink.api.java.DataSet.map(DataSet.java:213)
 at aim3.GraphWeighted.computeWeightedDistribution(GraphWeighted.java:73)
 ... 1 more
>>>
>>>
>>>
>>> This is the part of the code which I am trying to run :
>>>
>>> DataSet> distinctVertex = sourceVertex
  .union(destinationVertex)
  .groupBy(0)
  .aggregate(Aggregations.SUM, 1);
 // Compute the degrees (degree, count)

  DataSet> degreeCount = distinctVertex
  .map(new DegreeMapper())
  .groupBy(0)
  .aggregate(Aggregations.SUM, 1);
>>>
>>>
>>>
>>> and the error I am getting is at this line *.map(new DegreeMapper())*
>>>
>>> Also, the degree mapper is a simple map function which emits the second
>>> column and 1 as follows:
>>>

 public static class DegreeMapper implements
 MapFunction, Tuple2> {
 private static final long serialVersionUID = 1L;
 public Tuple2 map(Tuple2 input) throws
 Exception {
 return new Tuple2(input.f1, 1);
 }
 }
>>>
>>>
>>>
>>> Now I am lost as to what I did wrong and why I am getting that error,
>>> any help would be appreciated.
>>>
>>> Thanks a lot.
>>>
>>> Thanks & Regards
>>> Biplob Biswas
>>>
>>
>>
>


Re: Accessing configuration in RichFunction

2016-01-18 Thread Christian Kreutzfeldt
Hi Robert,

using the constructor is actually the approach I chose. Using the existing
lifecycle method was just an idea to integrate it more closely with the
existing framework design ;-)

Best
  Christian

2016-01-18 13:38 GMT+01:00 Robert Metzger :

> Hi Christian,
>
> I think the DataStream API does not allow you to pass any parameter to the
> open(Configuration) method.
> That method is only used in the DataSet (Batch) API, and its use is
> discouraged.
>
> A much better option to pass a Configuration into your function is as
> follows:
>
>
> Configuration mapConf = new Configuration();
> mapConf.setDouble("somthing", 1.2);
>
> DataStream<Tuple2<String, Integer>> counts =
> // split up the lines in pairs (2-tuples) containing: (word,1)
>
> text.flatMap(new Tokenizer(mapConf))
> // group by the tuple field "0" and sum up tuple field "1"
>   .keyBy(0).sum(1);
>
>
> And in the Tokenizer:
>
> public static final class Tokenizer implements FlatMapFunction<String, Tuple2<String, Integer>> {
>private final Configuration mapConf;
>
>public Tokenizer(Configuration mapConf) {
>   this.mapConf = mapConf;
>}
>
>
> This works as long as the type you're passing is serializable.
>
>
>
> On Mon, Jan 18, 2016 at 12:43 PM, Christian Kreutzfeldt 
> wrote:
>
>> Hi Max,
>>
>> maybe I explained it a bit mistakable ;-)
>>
>> I have a stream-based application which contains a RichFilterFunction
>> implementation. The parent provides a lifecycle method open
>> (open(Configuration)) which receives a Configuration object as input. I
>> would like to use this call to pass options into the operator instance.
>>
>> Unfortunately, I found no hint where and how to provide the information
>> such that I receive it in the described method. Actually, I am accessing
>> the surrounding runtime context to retrieve the global job parameters, from
>> which I extract the desired information. But for some reasons I do not want
>> the operator to receive its setup information from the provided
>> Configuration instance ;-)
>>
>> That's why I am looking for the place where the configuration object is
>> created and passed into the rich filter function. I would like to insert
>> dedicated information for a dedicated filter instance.
>>
>> Best
>>   Christian
>>
>>
>> 2016-01-18 12:30 GMT+01:00 Maximilian Michels :
>>
>>> Hi Christian,
>>>
>>> For your implementation, would it suffice to pass a Configuration with
>>> your RichFilterFunction? You said the global job parameters are not
>>> passed on to your user function? Can you confirm this is a bug?
>>>
>>> Cheers,
>>> Max
>>>
>>> On Wed, Jan 13, 2016 at 10:59 AM, Christian Kreutzfeldt
>>>  wrote:
>>> > Hi Fabian,
>>> >
>>> > thanks for your quick response. I just figured out that I forgot to
>>> mention
>>> > a small but probably relevant detail: I am working with the streaming
>>> api.
>>> >
>>> > Although there is a way to access the overall job settings, I need a
>>> > solution to "reduce" the view on configuration options available on
>>> operator
>>> > level.
>>> > For example, I would like to pass instance specific settings like an
>>> > operator identifier but there might be different operators in the
>>> overall
>>> > program.
>>> >
>>> > Best
>>> >   Christian
>>> >
>>> > 2016-01-13 10:52 GMT+01:00 Fabian Hueske :
>>> >>
>>> >> Hi Christian,
>>> >>
>>> >> the open method is called by the Flink workers when the parallel
>>> tasks are
>>> >> initialized.
>>> >> The configuration parameter is the configuration object of the
>>> operator.
>>> >> You can set parameters in the operator config as follows:
>>> >>
>>> >> DataSet text = ...
>>> >> DataSet wc = text.flatMap(new
>>> >> Tokenizer()).getParameters().setString("myKey", "myVal");
>>> >>
>>> >> Best, Fabian
>>> >>
>>> >>
>>> >> 2016-01-13 10:29 GMT+01:00 Christian Kreutzfeldt :
>>> >>>
>>> >>> Hi
>>> >>>
>>> >>> While working on a RichFilterFunction implementation I was
>>> wondering, if
>>> >>> there is a much better way to access configuration
>>> >>> options read from file during startup. Actually, I am using
>>> >>> getRuntimeContext().getExecutionConfig().getGlobalJobParameters()
>>> >>> to get access to my settings.
>>> >>>
>>> >>> Reason for that is, that the Configuration parameter provided to the
>>> open
>>> >>> function does not carry my settings. That is probably
>>> >>> the case as I use
>>> >>> this.executionEnvironment.getConfig().setGlobalJobParameters(cfg) to
>>> pass my
>>> >>> configuration into the environment
>>> >>> which in turn is not passed on as part of the open call - I found no
>>> >>> other way to handle configuration ;-)
>>> >>>
>>> >>> My question is: who is responsible for calling the open function,
>>> where
>>> >>> does the configuration parameter have its origins, aka where
>>> >>> is its content taken from and is it possible to define somewhere in
>>> the
>>> 
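For reference, here is a minimal end-to-end sketch of the constructor-based approach
Robert describes above. Only the pattern is taken from the thread; the option name
("separator"), the sample input and the job name are assumptions made for
illustration.

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class ConstructorConfigSketch {

    // The Configuration travels with the serialized function instance,
    // so each parallel Tokenizer gets its own copy of the settings.
    public static final class Tokenizer
            implements FlatMapFunction<String, Tuple2<String, Integer>> {
        private static final long serialVersionUID = 1L;

        private final Configuration mapConf;

        public Tokenizer(Configuration mapConf) {
            this.mapConf = mapConf;
        }

        @Override
        public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
            // Read an option that was set on the client before job submission.
            String separator = mapConf.getString("separator", "\\W+");
            for (String word : line.toLowerCase().split(separator)) {
                if (!word.isEmpty()) {
                    out.collect(new Tuple2<>(word, 1));
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        Configuration mapConf = new Configuration();
        mapConf.setString("separator", "\\W+");

        DataStream<Tuple2<String, Integer>> counts = env
                .fromElements("to be or not to be")
                .flatMap(new Tokenizer(mapConf))
                .keyBy(0)
                .sum(1);

        counts.print();
        env.execute("constructor-config sketch");
    }
}

The alternative mentioned in this thread, registering the settings globally via
setGlobalJobParameters(...) and reading them back through
getRuntimeContext().getExecutionConfig().getGlobalJobParameters() in a rich
function, also works; the constructor route simply scopes the options to one
operator instance.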

Re: Compile fails with scala 2.11.4

2016-01-18 Thread Ritesh Kumar Singh
Thanks for the replies.

@Chiwan, I am switching back to scala_2.10.4 for the time being. I was
using scala_2.11.4 as this is the version I've compiled Spark with. But
anyway, I can wait for the bug to be resolved.

@Robert, the commands were as follows:
$tools/change-scala-version.sh 2.11
$mvn clean install -DskipTests -Dscala.version=2.11.4

I hope I'm doing it right ?

Thanks,

*Ritesh Kumar Singh,*
*https://riteshtoday.wordpress.com/* 

On Mon, Jan 18, 2016 at 12:03 PM, Robert Metzger 
wrote:

> How did you start the Flink for Scala 2.11 compilation?
>
> On Mon, Jan 18, 2016 at 11:41 AM, Chiwan Park 
> wrote:
>
>> Hi Ritesh,
>>
>> This problem seems to have been reported already [1]. The Flink community is
>> investigating this issue. I think that if you don’t need Scala 2.11, you should
>> use Scala 2.10 until the issue is solved.
>>
>> [1]:
>> http://mail-archives.apache.org/mod_mbox/flink-user/201601.mbox/%3CCAB6CeiZ_2snN-piXzd3gHnyQePu_PA0Ro7qXUF8%3DVTxoyL0YyA%40mail.gmail.com%3E
>>
>> > On Jan 18, 2016, at 7:24 PM, Ritesh Kumar Singh <
>> riteshoneinamill...@gmail.com> wrote:
>> >
>> > [ERROR]
>> /home/flink/flink-runtime/src/test/scala/org/apache/flink/runtime/jobmanager/JobManagerITCase.scala:703:
>> error: can't expand macros compiled by previous versions of Scala
>> > [ERROR]   assert(cachedGraph2.isArchived)
>> > [ERROR]   ^
>> > [ERROR] one error found
>> > [INFO]
>> 
>> > [INFO] Reactor Summary:
>> > [INFO]
>> > [INFO] flink .. SUCCESS [
>> 24.820 s]
>> > [INFO] flink-annotations .. SUCCESS [
>> 2.755 s]
>> > [INFO] flink-shaded-hadoop  SUCCESS [
>> 0.208 s]
>> > [INFO] flink-shaded-hadoop2 ... SUCCESS [
>> 15.627 s]
>> > [INFO] flink-shaded-include-yarn-tests  SUCCESS [
>> 17.076 s]
>> > [INFO] flink-shaded-curator ... SUCCESS [
>> 0.200 s]
>> > [INFO] flink-shaded-curator-recipes ... SUCCESS [
>> 2.751 s]
>> > [INFO] flink-shaded-curator-test .. SUCCESS [
>> 0.355 s]
>> > [INFO] flink-core . SUCCESS [
>> 33.052 s]
>> > [INFO] flink-java . SUCCESS [
>> 10.224 s]
>> > [INFO] flink-runtime .. FAILURE
>> [01:23 min]
>> > [INFO] flink-optimizer  SKIPPED
>> >
>> >
>> > Any workaround for scala_2.11.4 or do I have to switch back to
>> scala_2.10.4 ?
>> >
>> > Thanks,
>> > Ritesh Kumar Singh,
>> > https://riteshtoday.wordpress.com/
>> >
>>
>> Regards,
>> Chiwan Park
>>
>>
>


Re: InvalidTypesException - Input mismatch: Basic type 'Integer' expected but was 'Long'

2016-01-18 Thread Till Rohrmann
Hi Biplob,

which version of Flink are you using? With version 1.0-SNAPSHOT, I cannot
reproduce your problem.

Cheers,
Till
​

On Sun, Jan 17, 2016 at 4:56 PM, Biplob Biswas 
wrote:

> Hi,
>
> I am getting the following exception when I am using the map function
>
> Exception in thread "main"
>> org.apache.flink.api.common.functions.InvalidTypesException: The return
>> type of function 'computeWeightedDistribution(GraphWeighted.java:73)' could
>> not be determined automatically, due to type erasure. You can give type
>> information hints by using the returns(...) method on the result of the
>> transformation call, or by letting your function implement the
>> 'ResultTypeQueryable' interface.
>> at org.apache.flink.api.java.DataSet.getType(DataSet.java:176)
>> at org.apache.flink.api.java.DataSet.groupBy(DataSet.java:692)
>> at aim3.GraphWeighted.computeWeightedDistribution(GraphWeighted.java:74)
>> at aim3.SlashdotZooInDegree.main(SlashdotZooInDegree.java:39)
>> Caused by: org.apache.flink.api.common.functions.InvalidTypesException:
>> Input mismatch: Basic type 'Integer' expected but was 'Long'.
>> at
>> org.apache.flink.api.java.typeutils.TypeExtractor.validateInputType(TypeExtractor.java:767)
>> at
>> org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:276)
>> at
>> org.apache.flink.api.java.typeutils.TypeExtractor.getMapReturnTypes(TypeExtractor.java:110)
>> at org.apache.flink.api.java.DataSet.map(DataSet.java:213)
>> at aim3.GraphWeighted.computeWeightedDistribution(GraphWeighted.java:73)
>> ... 1 more
>
>
>
> This is the part of the code which I am trying to run :
>
> DataSet> distinctVertex = sourceVertex
>>  .union(destinationVertex)
>>  .groupBy(0)
>>  .aggregate(Aggregations.SUM, 1);
>> // Compute the degrees (degree, count)
>>
>>  DataSet> degreeCount = distinctVertex
>>  .map(new DegreeMapper())
>>  .groupBy(0)
>>  .aggregate(Aggregations.SUM, 1);
>
>
>
> and the error I am getting is at this line *.map(new DegreeMapper())*
>
> Also, the degree mapper is a simple map function which emits the second
> column and 1 as follows:
>
>>
>> public static class DegreeMapper implements
>> MapFunction, Tuple2> {
>> private static final long serialVersionUID = 1L;
>> public Tuple2 map(Tuple2 input) throws
>> Exception {
>> return new Tuple2(input.f1, 1);
>> }
>> }
>
>
>
> Now I am lost as to what I did wrong and why I am getting that error, any
> help would be appreciated.
>
> Thanks a lot.
>
> Thanks & Regards
> Biplob Biswas
>


Re: Compile fails with scala 2.11.4

2016-01-18 Thread Chiwan Park
Hi Ritesh,

This problem seems to have been reported already [1]. The Flink community is
investigating this issue. I think that if you don’t need Scala 2.11, you should
use Scala 2.10 until the issue is solved.

[1]: 
http://mail-archives.apache.org/mod_mbox/flink-user/201601.mbox/%3CCAB6CeiZ_2snN-piXzd3gHnyQePu_PA0Ro7qXUF8%3DVTxoyL0YyA%40mail.gmail.com%3E

> On Jan 18, 2016, at 7:24 PM, Ritesh Kumar Singh 
>  wrote:
> 
> [ERROR] 
> /home/flink/flink-runtime/src/test/scala/org/apache/flink/runtime/jobmanager/JobManagerITCase.scala:703:
>  error: can't expand macros compiled by previous versions of Scala
> [ERROR]   assert(cachedGraph2.isArchived)
> [ERROR]   ^
> [ERROR] one error found
> [INFO] 
> 
> [INFO] Reactor Summary:
> [INFO] 
> [INFO] flink .. SUCCESS [ 24.820 
> s]
> [INFO] flink-annotations .. SUCCESS [  2.755 
> s]
> [INFO] flink-shaded-hadoop  SUCCESS [  0.208 
> s]
> [INFO] flink-shaded-hadoop2 ... SUCCESS [ 15.627 
> s]
> [INFO] flink-shaded-include-yarn-tests  SUCCESS [ 17.076 
> s]
> [INFO] flink-shaded-curator ... SUCCESS [  0.200 
> s]
> [INFO] flink-shaded-curator-recipes ... SUCCESS [  2.751 
> s]
> [INFO] flink-shaded-curator-test .. SUCCESS [  0.355 
> s]
> [INFO] flink-core . SUCCESS [ 33.052 
> s]
> [INFO] flink-java . SUCCESS [ 10.224 
> s]
> [INFO] flink-runtime .. FAILURE [01:23 
> min]
> [INFO] flink-optimizer  SKIPPED
> 
> 
> Any workaround for scala_2.11.4 or do I have to switch back to scala_2.10.4 ?
> 
> Thanks,
> Ritesh Kumar Singh,
> https://riteshtoday.wordpress.com/
> 

Regards,
Chiwan Park



Flink Stream: collect in an array all records within a window

2016-01-18 Thread Saiph Kappa
Hi,

After performing a windowAll() on a DataStream[String], is there any method
to collect and return an array with all Strings within a window (similar to
.collect in Spark)?

I basically want to ship all strings in a window to a remote server through
a socket, and want to use the same socket connection for all strings that I
send. The method .addSink iterates over all records, but does the provided
function run on the Flink client or on the server?

Thanks.
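One way to approach this (a sketch only, against a streaming API where
windowAll/timeWindowAll and an AllWindowFunction with an Iterable input are
available): collect the window's records into a single array inside an
AllWindowFunction, then write that array out through a RichSinkFunction that opens
its socket once in open() and reuses it for every window. Sink functions execute on
the TaskManagers running the job, not on the client that submitted it. The host
name, port and window size below are placeholders.

import java.io.PrintWriter;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
import org.apache.flink.streaming.api.functions.windowing.AllWindowFunction;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

public class WindowToSocketSketch {

    // One socket per parallel sink instance, opened once and reused.
    public static class SocketBatchSink extends RichSinkFunction<String[]> {
        private static final long serialVersionUID = 1L;

        private final String host;
        private final int port;
        private transient Socket socket;
        private transient PrintWriter writer;

        public SocketBatchSink(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public void open(Configuration parameters) throws Exception {
            socket = new Socket(host, port);
            writer = new PrintWriter(socket.getOutputStream(), true);
        }

        @Override
        public void invoke(String[] windowContents) {
            for (String s : windowContents) {
                writer.println(s);
            }
        }

        @Override
        public void close() throws Exception {
            if (writer != null) writer.close();
            if (socket != null) socket.close();
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        DataStream<String> lines = env.socketTextStream("localhost", 9000);

        lines.timeWindowAll(Time.seconds(5))
                // Gather every record of the window into one array.
                .apply(new AllWindowFunction<String, String[], TimeWindow>() {
                    @Override
                    public void apply(TimeWindow window, Iterable<String> values,
                                      Collector<String[]> out) {
                        List<String> batch = new ArrayList<>();
                        for (String v : values) {
                            batch.add(v);
                        }
                        out.collect(batch.toArray(new String[0]));
                    }
                })
                .addSink(new SocketBatchSink("remote-host", 4444));

        env.execute("window-to-socket sketch");
    }
}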


Re: Redeployements and state

2016-01-18 Thread Maximilian Michels
The documentation layout changed on the master branch. The new URL:
https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/savepoints.html

On Thu, Jan 14, 2016 at 2:21 PM, Niels Basjes  wrote:
> Yes, that is exactly the type of solution I was looking for.
>
> I'll dive into this.
> Thanks guys!
>
> Niels
>
> On Thu, Jan 14, 2016 at 11:55 AM, Ufuk Celebi  wrote:
>>
>> Hey Niels,
>>
>> as Gabor wrote, this feature has been merged to the master branch
>> recently.
>>
>> The docs are online here:
>> https://ci.apache.org/projects/flink/flink-docs-master/apis/savepoints.html
>>
>> Feel free to report back your experience with it if you give it a try.
>>
>> – Ufuk
>>
>> > On 14 Jan 2016, at 11:09, Gábor Gévay  wrote:
>> >
>> > Hello,
>> >
>> > You are probably looking for this feature:
>> > https://issues.apache.org/jira/browse/FLINK-2976
>> >
>> > Best,
>> > Gábor
>> >
>> >
>> >
>> >
>> > 2016-01-14 11:05 GMT+01:00 Niels Basjes :
>> >> Hi,
>> >>
>> >> I'm working on a streaming application using Flink.
>> >> Several steps in the processing are stateful (I use custom Windows
>> >> and
>> >> stateful operators).
>> >>
>> >> Now if during a normal run a worker fails, the checkpointing system
>> >> will be
>> >> used to recover.
>> >>
>> >> But what if the entire application is stopped (deliberately) or
>> >> stops/fails
>> >> because of a problem?
>> >>
>> >> At this moment I have three main reasons/causes for doing this:
>> >> 1) The application just dies because of a bug on my side or a problem
>> >> like
>> >> for example this (which I'm actually confronted with):  Failed to
>> >> Update
>> >> HDFS Delegation Token for long running application in HA mode
>> >> https://issues.apache.org/jira/browse/HDFS-9276
>> >> 2) I need to rebalance my application (i.e. stop, change parallelism,
>> >> start)
>> >> 3) I need a new version of my software to be deployed. (i.e. I fixed a
>> >> bug,
>> >> changed the topology and need to continue)
>> >>
>> >> I assume the solution will in some part be specific to my
>> >> application.
>> >> The question is what features exist in Flink to support such a clean
>> >> "continue where I left off" scenario?
>> >>
>> >> --
>> >> Best regards / Met vriendelijke groeten,
>> >>
>> >> Niels Basjes
>>
>
>
>
> --
> Best regards / Met vriendelijke groeten,
>
> Niels Basjes


Re: Security in Flink

2016-01-18 Thread Maximilian Michels
Hi Welly,

There is no fixed timeline yet but we plan to make progress in terms
of authentication and encryption after the 1.0.0 release.

Cheers,
Max

On Wed, Jan 13, 2016 at 8:34 AM, Welly Tambunan  wrote:
> Hi Stephan,
>
> Thanks a lot for the explanation.
>
> Is there any timeline on when this will be released? I guess this one will
> be important for our case if we want Flink to be deployed in production.
>
> Cheers
>
> On Tue, Jan 12, 2016 at 6:19 PM, Stephan Ewen  wrote:
>>
>> Hi Sourav!
>>
>> If you want to use Flink in a cluster where neither Hadoop/YARN (nor, soon,
>> Mesos) is available, then I assume you have installed Flink in standalone
>> mode on the cluster already.
>>
>> There is currently no support in Flink to manage user authentication. A few
>> thoughts on how that may evolve:
>>
>> 1) It should be not too hard to add authentication to the web dashboard.
>> That way, if the cluster is otherwise blocked off (the master's RPC ports
>> are firewalled), one would have restricted job starts.
>>
>> 2) We plan to add authenticated / encrypted connections soon. With that,
>> the client that submits the program would need to have access to the
>> keystore or key and the corresponding password to connect.
>>
>> Greetings,
>> Stephan
>>
>>
>>
>> On Mon, Jan 11, 2016 at 3:46 PM, Sourav Mazumder
>>  wrote:
>>>
>>> Thanks Stephan for your detailed response. Things are clearer to me now.
>>>
>>> A follow-up question:
>>> It looks like most of the security support depends on Hadoop. What happens
>>> if anyone wants to use Flink without Hadoop (in a cluster where Hadoop is not
>>> there)?
>>>
>>> Regards,
>>> Sourav
>>>
>>> On Sun, Jan 10, 2016 at 12:41 PM, Stephan Ewen  wrote:

 Hi Sourav!

 There is user-authentication support in Flink via the Hadoop / Kerberos
 infrastructure. If you run Flink on YARN, it should seamlessly work: Flink
 acquires the Kerberos tokens of the user that submits programs and
 authenticates itself at YARN, HDFS, and HBase with them.

 If you run Flink standalone, Flink can still authenticate at HDFS/HBase
 via Kerberos, with a bit of manual help by the user (running kinit on the
 workers).

 With Kafka 0.9 and Flink's upcoming connector
 (https://github.com/apache/flink/pull/1489), streaming programs can
 authenticate themselves at the stream brokers via SSL (and read via encrypted
 connections).


 What we have on the roadmap for the coming months is the following:
   - Encrypt in-flight data streams that are exchanged between worker
 nodes (TaskManagers).
   - Encrypt the coordination messages between client/master/workers.
 Note that these refer to encryption between Flink's own components only,
 which would use transient keys generated just for a specific job or session
 (hence would not need any user involvement).


 Let us know if that answers your questions, and if that meets your
 requirements.

 Greetings,
 Stephan


 On Fri, Jan 8, 2016 at 3:23 PM, Sourav Mazumder
  wrote:
>
> Hi,
>
> Can anyone point me to any documentation on support for security in
> Flink?
>
> The type of information I'm looking for are -
>
> 1. How do I do user-level authentication to ensure that a job is
> submitted/deleted/modified by the right user? Is it possible through the 
> web
> client?
> 2. Authentication across multiple slave nodes (where the task managers
> are running) and driver program so that they can communicate with each 
> other
> 3. Support for SSL/encryption for data exchange happening across the
> slave nodes
> 4. Support for pluggable authentication with existing solution like
> LDAP
>
> If not there today, is there a roadmap for these security features?
>
> Regards,
> Sourav


>>>
>>
>
>
>
> --
> Welly Tambunan
> Triplelands
>
> http://weltam.wordpress.com
> http://www.triplelands.com


Re: Cancel Job

2016-01-18 Thread Matthias J. Sax
Hi,

currently, messages in flight will be dropped if a streaming job gets
canceled.

There is already WIP to add a STOP signal which allows for a clean
shutdown of a streaming job. This should get merged soon and will be
available in Flink 1.0.

You can follow the JIRA and PR here:
https://issues.apache.org/jira/browse/FLINK-2111
https://github.com/apache/flink/pull/750

-Matthias


On 01/18/2016 08:26 PM, Don Frascuchon wrote:
> Hi,
> 
> When a streaming job is manually canceled, what happens to the messages
> still in process? Does Flink's engine wait for tasks to finish processing the
> messages they hold (somewhat like Apache Storm)? If not, is there a safe way
> to stop streaming jobs?
> 
> Thanks in advance!
> Best regards





Re: Results of testing Flink quickstart against 0.10-SNAPSHOT and 1.0-SNAPSHOT (re. Dependency on non-existent org.scalamacros:quasiquotes_2.11:)

2016-01-18 Thread Prez Cannady
One correction: line 3 of the 1.0-SNAPSHOT source build should read “checked out 
master branch (snapshot version 1.0-SNAPSHOT)."

Prez Cannady  
p: 617 500 3378  
e: revp...@opencorrelate.org   
GH: https://github.com/opencorrelate   
LI: https://www.linkedin.com/in/revprez   









> On Jan 19, 2016, at 12:41 AM, Prez Cannady  wrote:
> 
> Sent this to d...@flink.apache.org , but that 
> might not be the appropriate forum for it.
> 
> Finally got a chance to sit down and run through a few tests. Upfront, I have 
> been able to resolve my issue sufficiently to move forward, but seems there’s 
> an issue with the current bits for both 1.0-SNAPSHOT and 0.10-SNAPSHOT in the 
> remote Maven repos.
> 
> Notes
> 
> wordcount-processing  is 
> a customized version of the Flink quickstart archetype I’m using to test 
> Flink integration with Spring Boot. It is instrumented for Maven and Gradle 
> build and execution.
> I’m targeting Scala 2.11 and Flink 0.10.
> 0.10-SNAPSHOT source build
> 
> Steps
> 
> Checked out release–0.10 branch (snapshot version 0.10-SNAPSHOT).
> Built with mvn clean install -DskipTests=true -Dmaven.javadoc.skip=true 
> -Dscala.version=2.11.7 -Dscala.binary.version=2.11.
> Ran wordcount-process with mvn clean spring-boot:run 
> -Drun.arguments=“localhost,”.
> Ran wordcount-process with gradle bootRun -Drun.arguments=“localhost ”.
> Result
> 
> Maven execution of test succeeds without incident.
> Gradle execution of test succeeds without incident.
> 0.10-SNAPSHOT with fetched binary dependencies
> 
> Steps
> 
> Cleaned out local maven repository with rm -rf 
> $HOME/.m2/repo/org/apache/flink && rm -rf $HOME/.m2/repo/org/scala*.
> Cleaned out local gradle repository with rm -rf 
> $HOME/.gradle/caches/modules-2/files-2.1/org.apache.flink && rm -rf 
> $HOME/.gradle/caches/modules-2/files-2.1/org.scala*.
> Ran wordcount-process with mvn clean spring-boot:run 
> -Drun.arguments=“localhost,”.
> Ran wordcount-process with gradle bootRun -Drun.arguments=“localhost ”.
> Result
> 
> Maven build completed without incident. Maven execution error’d out with 
> issue supposedly resolved with pull request 1511 
> .
> Gradle execution of test succeeds without incident.
> 1.0-SNAPSHOT source build
> 
> Steps
> 
> Cleaned out local maven repository with rm -rf 
> $HOME/.m2/repo/org/apache/flink && rm -rf $HOME/.m2/repo/org/scala*.
> Cleaned out local gradle repository with rm -rf 
> $HOME/.gradle/caches/modules-2/files-2.1/org.apache.flink && rm -rf 
> $HOME/.gradle/caches/modules-2/files-2.1/org.scala*.
> Checked out release–0.10 branch (snapshot version 0.10-SNAPSHOT).
> Built with mvn clean install -DskipTests=true -Dmaven.javadoc.skip=true 
> -Dscala.version=2.11.7 -Dscala.binary.version=2.11.
> Ran wordcount-process with mvn clean spring-boot:run 
> -Drun.arguments=“localhost,”.
> Ran wordcount-process with gradle bootRun -Drun.arguments=“localhost ”.
> Result
> 
> Maven build completed without incident. Maven execution error’d out with 
> issue supposedly resolved with pull request 1511 
> .
> Gradle execution of test succeeds without incident.
> 1.0-SNAPSHOT with fetched binary dependencies
> 
> Steps
> 
> Cleaned out local maven repository with rm -rf 
> $HOME/.m2/repo/org/apache/flink && rm -rf $HOME/.m2/repo/org/scala*.
> Cleaned out local gradle repository with rm -rf 
> $HOME/.gradle/caches/modules-2/files-2.1/org.apache.flink && rm -rf 
> $HOME/.gradle/caches/modules-2/files-2.1/org.scala*.
> Ran wordcount-process with mvn clean spring-boot:run 
> -Drun.arguments=“localhost,”.
> Ran wordcount-process with gradle bootRun -Drun.arguments=“localhost ”.
> Result
> 
> Maven build error’d out with
> could not find implicit value for evidence parameter of type 
> org.apache.flink.api.common.typeinfo.TypeInformation[String], and
> can't expand macros compiled by previous versions of Scala[ERROR] val text = 
> env.fromElements("To be, or not to be,--that is the question:—“.
> (Gist ).
> Gradle build error’d out with java.lang.NoClassDefFoundError: 
> scala/reflect/macros/Context (gist 
> ).
> Discussion
> 
> As it stands, I can move forward with building 0.10-SNAPSHOT from source for 
> my own purposes.
> I’m guessing pull request 1511  
> hasn’t made it into the upstream snapshot repos yet.
> I’m not sure as to why 1.0-SNAPSHOT is suffering from a macros incident and 
> not 0.10-SNAPSHOT. Both are bringing in dependencies (quasiquotes, 
> scala-library, scala-reflect, and a bunch of akka-* stuff) compiled for 2.10, 
> but 

Results of testing Flink quickstart against 0.10-SNAPSHOT and 1.0-SNAPSHOT (re. Dependency on non-existent org.scalamacros:quasiquotes_2.11:)

2016-01-18 Thread Prez Cannady
Sent this to d...@flink.apache.org , but that 
might not be the appropriate forum for it.

Finally got a chance to sit down and run through a few tests. Upfront, I have 
been able to resolve my issue sufficiently to move forward, but seems there’s 
an issue with the current bits for both 1.0-SNAPSHOT and 0.10-SNAPSHOT in the 
remote Maven repos.

Notes

wordcount-processing  is a 
customized version of the Flink quickstart archetype I’m using to test Flink 
integration with Spring Boot. It is instrumented for Maven and Gradle build and 
execution.
I’m targeting Scala 2.11 and Flink 0.10.
0.10-SNAPSHOT source build

Steps

Checked out release–0.10 branch (snapshot version 0.10-SNAPSHOT).
Built with mvn clean install -DskipTests=true -Dmaven.javadoc.skip=true 
-Dscala.version=2.11.7 -Dscala.binary.version=2.11.
Ran wordcount-process with mvn clean spring-boot:run 
-Drun.arguments=“localhost,”.
Ran wordcount-process with gradle bootRun -Drun.arguments=“localhost ”.
Result

Maven execution of test succeeds without incident.
Gradle execution of test succeeds without incident.
0.10-SNAPSHOT with fetched binary dependencies

Steps

Cleaned out local maven repository with rm -rf $HOME/.m2/repo/org/apache/flink 
&& rm -rf $HOME/.m2/repo/org/scala*.
Cleaned out local gradle repository with rm -rf 
$HOME/.gradle/caches/modules-2/files-2.1/org.apache.flink && rm -rf 
$HOME/.gradle/caches/modules-2/files-2.1/org.scala*.
Ran wordcount-process with mvn clean spring-boot:run 
-Drun.arguments=“localhost,”.
Ran wordcount-process with gradle bootRun -Drun.arguments=“localhost ”.
Result

Maven build completed without incident. Maven execution error’d out with issue 
supposedly resolved with pull request 1511 
.
Gradle execution of test succeeds without incident.
1.0-SNAPSHOT source build

Steps

Cleaned out local maven repository with rm -rf $HOME/.m2/repo/org/apache/flink 
&& rm -rf $HOME/.m2/repo/org/scala*.
Cleaned out local gradle repository with rm -rf 
$HOME/.gradle/caches/modules-2/files-2.1/org.apache.flink && rm -rf 
$HOME/.gradle/caches/modules-2/files-2.1/org.scala*.
Checked out release–0.10 branch (snapshot version 0.10-SNAPSHOT).
Built with mvn clean install -DskipTests=true -Dmaven.javadoc.skip=true 
-Dscala.version=2.11.7 -Dscala.binary.version=2.11.
Ran wordcount-process with mvn clean spring-boot:run 
-Drun.arguments=“localhost,”.
Ran wordcount-process with gradle bootRun -Drun.arguments=“localhost ”.
Result

Maven build completed without incident. Maven execution error’d out with issue 
supposedly resolved with pull request 1511 
.
Gradle execution of test succeeds without incident.
1.0-SNAPSHOT with fetched binary dependencies

Steps

Cleaned out local maven repository with rm -rf $HOME/.m2/repo/org/apache/flink 
&& rm -rf $HOME/.m2/repo/org/scala*.
Cleaned out local gradle repository with rm -rf 
$HOME/.gradle/caches/modules-2/files-2.1/org.apache.flink && rm -rf 
$HOME/.gradle/caches/modules-2/files-2.1/org.scala*.
Ran wordcount-process with mvn clean spring-boot:run 
-Drun.arguments=“localhost,”.
Ran wordcount-process with gradle bootRun -Drun.arguments=“localhost ”.
Result

Maven build error’d out with
could not find implicit value for evidence parameter of type 
org.apache.flink.api.common.typeinfo.TypeInformation[String], and
can't expand macros compiled by previous versions of Scala[ERROR] val text = 
env.fromElements("To be, or not to be,--that is the question:—“.
(Gist ).
Gradle build error’d out with java.lang.NoClassDefFoundError: 
scala/reflect/macros/Context (gist 
).
Discussion

As it stands, I can move forward with building 0.10-SNAPSHOT from source for my 
own purposes.
I’m guessing pull request 1511  
hasn’t made it into the upstream snapshot repos yet.
I’m not sure as to why 1.0-SNAPSHOT is suffering from a macros incident and not 
0.10-SNAPSHOT. Both are bringing in dependencies (quasiquotes, scala-library, 
scala-reflect, and a bunch of akka-* stuff) compiled for 2.10, but 
0.10-SNAPSHOT does not invite the macro-related errors some of us have seen 
when building against 1.0-SNAPSHOT dependencies.

Prez Cannady  
p: 617 500 3378  
e: revp...@opencorrelate.org   
GH: https://github.com/opencorrelate   
LI: https://www.linkedin.com/in/revprez   











Re: Cancel Job

2016-01-18 Thread Don Frascuchon
Thanks Matthias !

On Mon, Jan 18, 2016 at 20:51, Matthias J. Sax ()
wrote:

> Hi,
>
> currently, messaged in flight will be dropped if a streaming job gets
> canceled.
>
> There is already WIP to add a STOP signal which allows for a clean
> shutdown of a streaming job. This should get merged soon and will be
> available in Flink 1.0.
>
> You can follow the JIRA an PR here:
> https://issues.apache.org/jira/browse/FLINK-2111
> https://github.com/apache/flink/pull/750
>
> -Matthias
>
>
> On 01/18/2016 08:26 PM, Don Frascuchon wrote:
> > Hi,
> >
> > When a streaming job is manually canceled, what happens to the messages
> > still in process? Does Flink's engine wait for tasks to finish processing the
> > messages they hold (somewhat like Apache Storm)? If not, is there a safe way
> > to stop streaming jobs?
> >
> > Thanks in advance!
> > Best regards
>
>