Re: [DISCUSS] Secure Flink clusters

2016-05-17 Thread Henry Saputra
Eron, could you please also loop me in on the early discussions, since we are interested in deploying Flink as standalone to access secure data via Kerberized access. I was also talking to Owen from HDFS at Apache Big Data, and there could be some work we can ask to be done in the Hadoop com

[jira] [Created] (FLINK-3921) StringParser not specifying encoding to use

2016-05-17 Thread Tatu Saloranta (JIRA)
Tatu Saloranta created FLINK-3921: - Summary: StringParser not specifying encoding to use Key: FLINK-3921 URL: https://issues.apache.org/jira/browse/FLINK-3921 Project: Flink Issue Type: Impro

Re: Using secure cluster resources without authentication

2016-05-17 Thread Wright, Eron
I believe that really protecting the cluster from unauthorized use requires that the cluster endpoints (notably Akka) perform an authorization check. The 'secure flink' design doc outlines various measures to achieve that. Stefano, I’ll reach out to have a sync-up meeting and to incorporate your

Re: [DISCUSS] Secure Flink clusters

2016-05-17 Thread Wright, Eron
Thanks to all who reviewed the document. It appears we have a good plan and I'm filing JIRA issues accordingly. Robert, I'm in touch with Max, Stephan, and Stefano. I’ll update the thread when we have a better sense of the timing. The work will clearly span a couple of releases. Eron

Re: remote debugging

2016-05-17 Thread Flavio Pompermaier
Done ;) On Tue, May 17, 2016 at 5:37 PM, Robert Metzger wrote: > Okay, I gave you permissions. > > On Tue, May 17, 2016 at 5:22 PM, Flavio Pompermaier > wrote: > > > I've just signed up as f.pompermaier > > > > Thanks! > > > > On Tue, May 17, 2016 at 5:04 PM, Robert Metzger > > wrote: > > > >

Re: remote debugging

2016-05-17 Thread Robert Metzger
Okay, I gave you permissions. On Tue, May 17, 2016 at 5:22 PM, Flavio Pompermaier wrote: > I've just signed up as f.pompermaier > > Thanks! > > On Tue, May 17, 2016 at 5:04 PM, Robert Metzger > wrote: > > > Can you give me your wiki user id, then I can give you permissions. > > > > On Tue, May

Re: Using secure cluster resources without authentication

2016-05-17 Thread Robert Metzger
I'm not sure if doing the check in the CliFrontend is really effective. A "hacker" could just create a custom flink build without that check and still submit a job to the job manager. On Thu, May 5, 2016 at 2:51 PM, Stefano Baghino < stefano.bagh...@radicalbit.io> wrote: > Apologies for being t

Re: [DISCUSS] Secure Flink clusters

2016-05-17 Thread Robert Metzger
Hi Eron, thanks a lot for putting so much effort into the design document. You've probably spent a lot of time coming up with it! I have to admit that I'm not that familiar with the topic, so I probably need to re-read it to digest it completely. What are your plans for implementing the pr

Re: remote debugging

2016-05-17 Thread Flavio Pompermaier
I've just signed up as f.pompermaier Thanks! On Tue, May 17, 2016 at 5:04 PM, Robert Metzger wrote: > Can you give me your wiki user id, then I can give you permissions. > > On Tue, May 17, 2016 at 3:56 PM, Flavio Pompermaier > wrote: > > > No I can't edit that page :( > > > > On Tue, May 17,

Re: remote debugging

2016-05-17 Thread Robert Metzger
Can you give me your wiki user id, then I can give you permissions. On Tue, May 17, 2016 at 3:56 PM, Flavio Pompermaier wrote: > No I can't edit that page :( > > On Tue, May 17, 2016 at 3:46 PM, Stefano Baghino < > stefano.bagh...@radicalbit.io> wrote: > > > Thanks Flavio, > > > > perhaps it wou

Re: remote debugging

2016-05-17 Thread Flavio Pompermaier
No I can't edit that page :( On Tue, May 17, 2016 at 3:46 PM, Stefano Baghino < stefano.bagh...@radicalbit.io> wrote: > Thanks Flavio, > > perhaps it would be a nice addition to the Wiki page, would you care to > contribute your suggestion? :) > > On Tue, May 17, 2016 at 3:22 PM, Flavio Pompermai

Re: remote debugging

2016-05-17 Thread Stefano Baghino
Thanks Flavio, perhaps it would be a nice addition to the Wiki page, would you care to contribute your suggestion? :) On Tue, May 17, 2016 at 3:22 PM, Flavio Pompermaier wrote: > Hi to all, > > for debugging Flink from Eclipse this is what you have to do: > >1. go to 'Run' -> 'Debug configu

Re: remote debugging

2016-05-17 Thread Flavio Pompermaier
Hi to all, for debugging Flink from Eclipse this is what you have to do: 1. go to 'Run' -> 'Debug configurations...' 2. Create a new 'Remote Java Application' 3. In the 'Connect' tab choose: 1. the project to debug 2. Connection type 'Standard (Socket Attach)' 3. Connec

Re: Performance and accuracy of Flink iterations

2016-05-17 Thread Vasiliki Kalavri
Hi Greg, I think there is confusion between what delta means in the "delta iteration operator" of Flink and the "delta approximate implementation" of an algorithm, such as in PageRank. Assuming that we have a graph with a set of vertices and an iterative fixpoint algorithm that updates the vertex

Re: Partition problem

2016-05-17 Thread Till Rohrmann
Hi Andrew, I think in the end it boils down to counting the number of rows/finding the maximum index in the set of rows if you want to partition your matrix into blocks where the row indices are monotonically increasing. Without this information none of the described methods (range partition or cu

Re: Performance and accuracy of Flink iterations

2016-05-17 Thread Till Rohrmann
Hi Greg, as far as I know there has not been an exhaustive comparison to what extent the delta iterations can achieve the same accuracy as bulk iterations or how much accuracy you'll lose. I think it strongly depends on the problem. For example, graph algorithms such as connected components should
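To make the bulk versus delta distinction in this thread concrete, here is a minimal sketch in plain Python (not the Flink API) of connected components by minimum-label propagation. The function names and the tiny example graph are hypothetical, chosen only for illustration. The bulk variant recomputes every vertex in each round; the delta variant keeps a workset of vertices whose label changed and only propagates from those. On this small graph both reach the same labels, which says nothing about algorithms that use approximate deltas.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def bulk_components(vertices, edges):
    """Bulk style: recompute the label of every vertex in every round."""
    adjacency = build_adjacency(edges)
    labels = {v: v for v in vertices}
    while True:
        new_labels = {
            v: min([labels[v]] + [labels[n] for n in adjacency[v]])
            for v in vertices
        }
        if new_labels == labels:  # fixpoint reached, nothing changed
            return labels
        labels = new_labels


def delta_components(vertices, edges):
    """Delta style: only vertices whose label changed stay in the workset."""
    adjacency = build_adjacency(edges)
    labels = {v: v for v in vertices}
    workset = set(vertices)
    while workset:
        next_workset = set()
        for v in workset:
            for n in adjacency[v]:
                if labels[v] < labels[n]:
                    labels[n] = labels[v]  # update the solution in place
                    next_workset.add(n)    # n must propagate its new label
        workset = next_workset
    return labels


if __name__ == "__main__":
    vertices = [1, 2, 3, 4, 5, 6]
    edges = [(1, 2), (2, 3), (4, 5)]
    print(bulk_components(vertices, edges))
    print(delta_components(vertices, edges))
```

The workset shrinks as parts of the graph converge, which is where the delta variant saves work compared to recomputing all vertices each round.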

[jira] [Created] (FLINK-3920) Distributed Linear Algebra: block-based matrix

2016-05-17 Thread Simone Robutti (JIRA)
Simone Robutti created FLINK-3920: - Summary: Distributed Linear Algebra: block-based matrix Key: FLINK-3920 URL: https://issues.apache.org/jira/browse/FLINK-3920 Project: Flink Issue Type: Ne

[jira] [Created] (FLINK-3919) Distributed Linear Algebra: row-based matrix

2016-05-17 Thread Simone Robutti (JIRA)
Simone Robutti created FLINK-3919: - Summary: Distributed Linear Algebra: row-based matrix Key: FLINK-3919 URL: https://issues.apache.org/jira/browse/FLINK-3919 Project: Flink Issue Type: New

Re: remote debugging

2016-05-17 Thread Stefano Baghino
That would be pretty neat. In fact, by modifying bin/flink-daemon.sh you'd end up having all managers listening for a remote debugger, which would be a problem if you want the JVM to suspend on start. On Tue, May 17, 2016 at 2:22 PM, Greg Hogan wrote: > I also just modify the startup scripts but

Re: remote debugging

2016-05-17 Thread Greg Hogan
I also just modify the startup scripts but would it be better to have variants of env.java.opts specific to the JobManager, TaskManager, client, etc.? On Tue, May 17, 2016 at 5:24 AM, Stephan Ewen wrote: > Hey Stefano! > > I think that question is bound to come up again. I created a page in the
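As a rough illustration of the per-component idea discussed here, the following Python sketch builds the standard JVM JDWP agent option that one would place in env.java.opts or a startup script. The helper name, the component names, and the port numbers are assumptions made for this example, not settings taken from the thread. Giving each component its own port avoids several JVMs on one host competing for the same debug port, and keeping suspend off by default avoids every process blocking at startup while waiting for a debugger.

```python
def jdwp_option(port, suspend=False):
    """Return the JVM option that opens a JDWP socket for a remote debugger.

    port    -- TCP port the JVM listens on (hypothetical values below)
    suspend -- if True, the JVM waits for a debugger before starting
    """
    flag = "y" if suspend else "n"
    return (
        "-agentlib:jdwp=transport=dt_socket,"
        f"server=y,suspend={flag},address={port}"
    )


# Hypothetical per-component ports, one per JVM type on the same host.
DEBUG_PORTS = {"jobmanager": 5005, "taskmanager": 5006, "client": 5007}


def options_per_component(suspend_component=None):
    """Build one option string per component; suspend at most one of them."""
    return {
        name: jdwp_option(port, suspend=(name == suspend_component))
        for name, port in DEBUG_PORTS.items()
    }


if __name__ == "__main__":
    for name, option in options_per_component("taskmanager").items():
        print(f"{name}: {option}")
```

In the IDE, the remote debug configuration would then attach to the host and port of the one component under investigation.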

Re: [PROPOSAL] Structure the Flink Open Source Development

2016-05-17 Thread Stephan Ewen
Hi! Thanks for all the comments, and the positive response! Looks like so far all are in favor. I would next add a section to the Wiki and the "How to Contribute" Guide on this structure, incorporating the component split of Optimizer and Client. After that, let's get started with gathering can

Re: remote debugging

2016-05-17 Thread Maximilian Michels
Thanks for documenting, Stefano! I remember, this came in very handy when debugging the Yarn integration and tests. On Tue, May 17, 2016 at 12:32 PM, Stefano Baghino wrote: > > It's there, I've left the Eclipse paragraph empty, unfortunately I have no > experience with remote debugging using it.

Re: remote debugging

2016-05-17 Thread Stefano Baghino
It's there. I've left the Eclipse paragraph empty; unfortunately I have no experience with remote debugging using it. On Tue, May 17, 2016 at 12:29 PM, Stephan Ewen wrote: > Super, thanks! > > On Tue, May 17, 2016 at 11:32 AM, Stefano Baghino < > stefano.bagh...@radicalbit.io> wrote: > > > +1, g

Re: remote debugging

2016-05-17 Thread Stephan Ewen
Super, thanks! On Tue, May 17, 2016 at 11:32 AM, Stefano Baghino < stefano.bagh...@radicalbit.io> wrote: > +1, great idea, I should've had it myself. :) > > I'll do it today, thanks for creating the page. > > On Tue, May 17, 2016 at 11:24 AM, Stephan Ewen wrote: > > > Hey Stefano! > > > > I thin

[jira] [Created] (FLINK-3918) Not configuring any time characteristic leads to a ClassCastException

2016-05-17 Thread Robert Metzger (JIRA)
Robert Metzger created FLINK-3918: - Summary: Not configuring any time characteristic leads to a ClassCastException Key: FLINK-3918 URL: https://issues.apache.org/jira/browse/FLINK-3918 Project: Flink

Re: remote debugging

2016-05-17 Thread Stefano Baghino
+1, great idea, I should've had it myself. :) I'll do it today, thanks for creating the page. On Tue, May 17, 2016 at 11:24 AM, Stephan Ewen wrote: > Hey Stefano! > > I think that question is bound to come up again. I created a page in the > Flink Wiki to document this. > > If you have a few mo

Re: remote debugging

2016-05-17 Thread Stephan Ewen
Hey Stefano! I think that question is bound to come up again. I created a page in the Flink Wiki to document this. If you have a few moments, would be great if you could add your description there: https://cwiki.apache.org/confluence/display/FLINK/Remote+Debugging+of+Flink+Clusters (it is linked

[jira] [Created] (FLINK-3917) Remove mouse focus from plan visualizer

2016-05-17 Thread Flavio Pompermaier (JIRA)
Flavio Pompermaier created FLINK-3917: - Summary: Remove mouse focus from plan visualizer Key: FLINK-3917 URL: https://issues.apache.org/jira/browse/FLINK-3917 Project: Flink Issue Type: I

[jira] [Created] (FLINK-3916) Allow generic types passing the Table API

2016-05-17 Thread Timo Walther (JIRA)
Timo Walther created FLINK-3916: --- Summary: Allow generic types passing the Table API Key: FLINK-3916 URL: https://issues.apache.org/jira/browse/FLINK-3916 Project: Flink Issue Type: Improvement

Re: Partition problem

2016-05-17 Thread Fabian Hueske
Hi Andrew, I am not sure that I fully understand your requirements. Please correct me if some of my assumptions are not correct. Requirements: - Rows must be partitioned in partitions of consecutive row ids, i.e., rows 0 to 10 in partition 0, rows 11 to 20 in partition 1, etc. - Rows of both i
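The first requirement above (consecutive row ids per partition) can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not Flink code: the function names are hypothetical, row ids are assumed to start at 0 and be dense, and the total row count is assumed to be known up front, which is exactly the information the thread notes must be computed first.

```python
import math


def block_size(total_rows, num_partitions):
    """Number of consecutive rows per partition (the last one may be smaller)."""
    return math.ceil(total_rows / num_partitions)


def partition_of(row_id, total_rows, num_partitions):
    """Map a row id to the partition holding its block of consecutive rows."""
    if not 0 <= row_id < total_rows:
        raise ValueError(f"row id {row_id} is outside 0..{total_rows - 1}")
    return row_id // block_size(total_rows, num_partitions)


if __name__ == "__main__":
    # 21 rows over 2 partitions: rows 0..10 and rows 11..20.
    for row in (0, 10, 11, 20):
        print(row, partition_of(row, total_rows=21, num_partitions=2))
```

Because the mapping depends only on the row id and the total count, applying the same function to two matrices with the same row count sends matching rows to the same partition index.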