netty channel for data transfer

2015-08-11 Thread wangzhijiang999
As far as I know, each TaskManager has a NettyConnectionManager component, and all the tasks in the TaskManager use it to transfer data. In the PartitionRequestClientFactory, the nettyClient makes a connection based on the connectionId, and the connectionId consists of socketAddress and

Re: On some GUI tools for building Flink Streaming data flows...

2015-08-11 Thread mnxfst
Hi Gyula, hi Slim, being the core developer of SPQR, I am very glad to see the framework referenced on the Apache Flink dev list ;-) Following the same approach as Flink - using real data streaming instead of mini-batching - SPQR adds some ad hoc'ness features to introduce more flexibility into

[jira] [Created] (FLINK-2507) Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector

2015-08-11 Thread fangfengbin (JIRA)
fangfengbin created FLINK-2507. Summary: Rename the function tansformAndEmit in org.apache.flink.stormcompatibility.wrappers.AbstractStormCollector Key: FLINK-2507 URL:

Re: netty channel for data transfer

2015-08-11 Thread Stephan Ewen
Hi! You can think of it the following way: When you execute a join, the data shuffled from node A to node B for the left side of the join will have a separate TCP connection than the data shuffled from node A to node B for the right side of the join. That is currently important to avoid

[jira] [Created] (FLINK-2506) HBase table that is distributed over more than 1 region server

2015-08-11 Thread Lydia Ickler (JIRA)
Lydia Ickler created FLINK-2506. Summary: HBase table that is distributed over more than 1 region server Key: FLINK-2506 URL: https://issues.apache.org/jira/browse/FLINK-2506 Project: Flink

Re: On some GUI tools for building Flink Streaming data flows...

2015-08-11 Thread Márton Balassi
Hey Christian, Thanks for the insider view on SPQR. I have to agree with Gyula that dynamic topology build is not the highest priority for Flink currently, but certainly a very interesting feature and one that has already been requested by a couple of users. As for none of the open source

Re: Some problems about Flink applications

2015-08-11 Thread huangwei (G)
Hi Stephan and Matthias, Sorry for replying late. I've double-checked that the class StormSpoutWrapper really exists in my jar file. And I got the same trouble when I ran the corresponding word-count-storm from flink-storm-compatibility-example. The way I built my Throughput application was

Re: [Proposal] Addition to Gelly

2015-08-11 Thread Paris Carbone
Hi Andra and nice to meet you btw :) It sounds like a very fancy way to deal with skew, I like the idea even though I am not a graph analytics expert. Have you run any experiments or benchmarks to see when this is preferable? Users should be aware when they will get benefits by using it since node

[jira] [Created] (FLINK-2508) Confusing sharing of StreamExecutionEnvironment

2015-08-11 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2508. Summary: Confusing sharing of StreamExecutionEnvironment Key: FLINK-2508 URL: https://issues.apache.org/jira/browse/FLINK-2508 Project: Flink Issue Type: Bug

Re: On some GUI tools for building Flink Streaming data flows...

2015-08-11 Thread Stephan Ewen
Hi Christian! Sounds like a very cool proposal, and it seems that Flink and SPQR could complement each other very well. If you are interested to see this on top of Flink, let's have a chat to see how to best get started with this - what would be the easiest points of integration. As said, Flink

Re: [Proposal] Addition to Gelly

2015-08-11 Thread Vasiliki Kalavri
Hi Andra, thanks for offering to add this work to Gelly and for starting the discussion! What do you think this would look like from an API point of view? Is it easy to make it transparent to the application? Could you give us a simple example of what you have in mind? Apart from usability, we

Re: [Proposal] Addition to Gelly

2015-08-11 Thread Andra Lungu
Hi Vasia, I shall polish the functions a bit, but this is more or less what I had in mind: GSA Jaccard [what we have in Gelly right now]: https://github.com/andralungu/gelly-partitioning/blob/master/src/main/java/example/GSAJaccardSimilarityMeasure.java The same version with node split:

[jira] [Created] (FLINK-2509) Improve error messages when user code classes are not found

2015-08-11 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2509. Summary: Improve error messages when user code classes are not found Key: FLINK-2509 URL: https://issues.apache.org/jira/browse/FLINK-2509 Project: Flink

Re: Re: Some problems about Flink applications

2015-08-11 Thread Matthias J. Sax
Three comments 1) If StormSpoutWrapper is in your jar, is it located in the correct directory (must be same as package name)? 2) If you are using FlinkTopologyBuilder, you need to package as shown in StormWordCountRemoteBySubmitter example, using an additional assembly file. (The first examples

Re: [Proposal] Addition to Gelly

2015-08-11 Thread Andra Lungu
Hi Paris, Nice to virtually meet you too :) Maybe it makes sense to share my freshest chart: https://drive.google.com/file/d/0BwnaKJcSLc43Qm9fZV9RUE5zT1E/view?usp=sharing This is for the Community Detection algorithm [1] in which you basically find communities by continuously rescoring

Re: Re: Some problems about Flink applications

2015-08-11 Thread Stephan Ewen
We are seeing these class loader issues a lot as of late. Seems that packaging the classes is trickier than anticipated. Here is a pull request to add some diagnostics info on a ClassNotFoundException: https://github.com/apache/flink/pull/1008 On Tue, Aug 11, 2015 at 3:29 PM, Matthias J. Sax

Re: [Proposal] Addition to Gelly

2015-08-11 Thread Samia Khalid
Dear Andra, The idea seems pretty nice. I wonder how you decide the threshold to separate the high degree vertices from the low degree vertices. Regards, Samia On Tue, Aug 11, 2015 at 3:41 PM, Andra Lungu lungu.an...@gmail.com wrote: Hi Paris, Nice to virtually meet you too :) Maybe it

Re: [Proposal] Addition to Gelly

2015-08-11 Thread Andra Lungu
Hi Samia, A good method to statistically determine skewed vertices was beyond the scope of my thesis. Unfortunately, the statistical methods that fit a power law distribution don't do a good job. So what I do is plot the degree distribution and then visually determine the threshold. That
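
The manual procedure described in this message (compute vertex degrees, look at the degree distribution, pick a cutoff) could be sketched roughly as follows. This is only an illustration in plain Java, not code from the thesis or from Gelly: the class name `DegreeThresholdSketch`, the edge-array input format, and the threshold value 3 are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Gelly or of the thesis code.
class DegreeThresholdSketch {

    /** Counts the degree of every vertex in an undirected edge list. */
    static Map<Integer, Integer> degrees(int[][] edges) {
        Map<Integer, Integer> deg = new HashMap<>();
        for (int[] e : edges) {
            deg.merge(e[0], 1, Integer::sum);
            deg.merge(e[1], 1, Integer::sum);
        }
        return deg;
    }

    /** Degree distribution: degree -> number of vertices with that degree. */
    static TreeMap<Integer, Integer> distribution(Map<Integer, Integer> deg) {
        TreeMap<Integer, Integer> dist = new TreeMap<>();
        for (int d : deg.values()) {
            dist.merge(d, 1, Integer::sum);
        }
        return dist;
    }

    /** Vertices whose degree is strictly above the manually chosen threshold. */
    static Set<Integer> highDegree(Map<Integer, Integer> deg, int threshold) {
        Set<Integer> out = new TreeSet<>();
        for (Map.Entry<Integer, Integer> en : deg.entrySet()) {
            if (en.getValue() > threshold) {
                out.add(en.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Small star-like graph: vertex 1 is the only skewed vertex.
        int[][] edges = {{1, 2}, {1, 3}, {1, 4}, {1, 5}, {2, 3}};
        Map<Integer, Integer> deg = degrees(edges);
        System.out.println(distribution(deg));   // data one would plot
        System.out.println(highDegree(deg, 3));  // threshold 3 is an assumed value
    }
}
```

In practice the distribution would be plotted and the threshold read off the plot, as the message says; the code only automates the bookkeeping around that choice.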

[jira] [Created] (FLINK-2510) KafkaConnector should access partition metadata from master/cluster

2015-08-11 Thread Stephan Ewen (JIRA)
Stephan Ewen created FLINK-2510. Summary: KafkaConnector should access partition metadata from master/cluster Key: FLINK-2510 URL: https://issues.apache.org/jira/browse/FLINK-2510 Project: Flink

[jira] [Created] (FLINK-2511) Potential resource leak due to unclosed InputStream in FlinkZooKeeperQuorumPeer.java

2015-08-11 Thread Ted Yu (JIRA)
Ted Yu created FLINK-2511. Summary: Potential resource leak due to unclosed InputStream in FlinkZooKeeperQuorumPeer.java Key: FLINK-2511 URL: https://issues.apache.org/jira/browse/FLINK-2511 Project: Flink
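
For readers unfamiliar with this kind of leak, the usual Java remedy is try-with-resources, which closes the stream even when reading fails. The sketch below is a generic illustration and not the code from FlinkZooKeeperQuorumPeer.java: the class `StreamCloseSketch`, the `TrackingStream` helper, and the `clientPort` property are assumptions chosen only to make the example self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical example class; names are not taken from the Flink code base.
class StreamCloseSketch {

    /** In-memory stream that records whether close() was called. */
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(String content) {
            super(content.getBytes(StandardCharsets.ISO_8859_1));
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Loads properties and always closes the stream, on success and on failure. */
    static Properties loadProperties(InputStream source) {
        Properties props = new Properties();
        // Properties.load leaves the stream open, so the caller must close it.
        try (InputStream in = source) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        TrackingStream stream = new TrackingStream("clientPort=2181\n");
        Properties props = loadProperties(stream);
        System.out.println(props.getProperty("clientPort") + " closed=" + stream.closed);
    }
}
```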

Multiple control flows in a program

2015-08-11 Thread Sachin Goel
I'm writing a utility to split a data set randomly into several parts and return an Array of data sets. However, whenever I operate on any of these *subsets*, the program basically starts from the original data set, and the split is performed again. To ensure that these subsets are mutually
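
One general way to keep such splits stable under recomputation is to derive each element's split from a fixed seed and the element itself, so that re-running the split always yields the same assignment. The sketch below shows that idea in plain Java, without the Flink API. It is not the solution reached in the thread; the class `DeterministicSplitSketch`, the mixing constant, and the seed are assumptions made for illustration. Note that equal elements always land in the same split, which may or may not be acceptable.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical utility; it does not use or mirror any Flink API.
class DeterministicSplitSketch {

    /** Maps an element to a split index in [0, numSplits) from its hash and a fixed seed. */
    static int splitIndex(Object element, long seed, int numSplits) {
        // Mix hash and seed so that different seeds give different assignments.
        long h = element.hashCode() * 0x9E3779B97F4A7C15L + seed;
        h ^= (h >>> 32);
        return (int) Math.floorMod(h, (long) numSplits);
    }

    /** Splits the data into numSplits disjoint lists; the same seed gives the same result. */
    static <T> List<List<T>> split(List<T> data, long seed, int numSplits) {
        List<List<T>> parts = new ArrayList<>();
        for (int i = 0; i < numSplits; i++) {
            parts.add(new ArrayList<>());
        }
        for (T element : data) {
            parts.get(splitIndex(element, seed, numSplits)).add(element);
        }
        return parts;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            data.add(i);
        }
        // Running the split twice with the same seed gives identical subsets.
        System.out.println(split(data, 42L, 3).equals(split(data, 42L, 3)));
    }
}
```

Because the assignment depends only on the element and the seed, every subset can be recomputed independently and the subsets remain mutually exclusive.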

[jira] [Created] (FLINK-2512) Add client.close() before throw RuntimeException

2015-08-11 Thread fangfengbin (JIRA)
fangfengbin created FLINK-2512. Summary: Add client.close() before throw RuntimeException Key: FLINK-2512 URL: https://issues.apache.org/jira/browse/FLINK-2512 Project: Flink Issue Type: Bug