Like you said, it depends on the use case. The GroupReduceFunction is a
generalization of the traditional reduce. Thus, it is more powerful.
However, it is also executed differently; a GroupReduceFunction requires
the whole group to be materialized and passed at once. If your program doesn't
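The contract difference can be sketched in plain Java (illustrative only, not Flink's actual API): a reduce combines two values at a time and can be applied incrementally, while a group-reduce receives the whole group as an Iterable and may emit several results.

```java
import java.util.Arrays;
import java.util.List;

class ReduceSemantics {
    // Reduce-style: combine two values at a time; the runtime may apply
    // this incrementally (e.g. in a combiner) as records arrive.
    static int reduce(List<Integer> group) {
        int acc = group.get(0);
        for (int i = 1; i < group.size(); i++) {
            acc = acc + group.get(i); // reduce(acc, next)
        }
        return acc;
    }

    // GroupReduce-style: the whole group is handed over as one Iterable,
    // and the function may emit zero, one, or many results.
    static List<Integer> groupReduce(Iterable<Integer> group) {
        int sum = 0, count = 0;
        for (int v : group) { sum += v; count++; }
        // Emitting both the sum and the count is impossible with a plain reduce.
        return Arrays.asList(sum, count);
    }
}
```

The second function is strictly more general, which is exactly why the runtime cannot always evaluate it as cheaply as the pairwise form.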
Hi,
I want to know: what is the difference between Flink and Hadoop?
--
Fawzya Ramadan Sayed,
Teaching Assistant,
Computer Science Department,
Faculty of Computers and Information,
Fayoum University
Hi Matthias,
Thank you for taking the time to analyze Flink's invocation behavior. I
like your proposal. I'm not sure whether it is a good idea to scan the
entire JAR for main methods. Sometimes, main methods are added solely for
testing purposes and don't really serve any practical use. However,
Hi.
Hadoop is a framework for reliable, scalable, distributed computing. It has
many components for this purpose, such as HDFS, YARN, and Hadoop MapReduce.
Flink is an alternative to the Hadoop MapReduce component. It also has tools
for writing map-reduce programs and extends the model to support
Pardon, what I said is not completely right. Both functions are
incrementally constructed. This seems obvious for the reduce function but
is also true for the GroupReduce because it receives the values as an
Iterable which, under the hood, can be constructed incrementally as well.
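How an Iterable can hand out values incrementally rather than materializing them up front can be shown in plain Java (an illustration of the mechanism, not Flink's internal code):

```java
import java.util.Iterator;

class LazyGroup {
    // An Iterable whose values are produced on demand, one at a time;
    // the consumer never sees a fully materialized collection.
    static Iterable<Integer> countUpTo(int n) {
        return () -> new Iterator<Integer>() {
            int next = 1;
            public boolean hasNext() { return next <= n; }
            public Integer next() { return next++; }
        };
    }

    static int sum(Iterable<Integer> values) {
        int total = 0;
        for (int v : values) {
            total += v; // each value arrives only as the iterator is advanced
        }
        return total;
    }
}
```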
One other
Thank you for working on this.
My responses are inline below:
(Flavio)
My suggestion is to create a specific Flink interface to also get a
description of a job and to standardize parameter passing.
I've recently merged the ParameterTool, which solves the problem of
standardizing parameter passing
Makes sense to me. :)
One more thing: what about extending the ProgramDescription interface
to have multiple methods, as Flavio suggested (with the config(...)
method being handled by the ParameterTool)?
public interface FlinkJob {
/** The name to display in the job submission UI or
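To make the shape of the proposal concrete, a hypothetical completion might look like the following; the method names are assumptions for illustration, not an actual Flink API:

```java
// Hypothetical sketch only; method names are invented for illustration.
interface FlinkJob {
    /** The name to display in the job submission UI. */
    String getName();

    /** A short, human-readable description of what the job computes. */
    String getDescription();
}

// Example of what a user-supplied job class might provide.
class WordCountJob implements FlinkJob {
    public String getName() { return "WordCount"; }
    public String getDescription() { return "Counts word occurrences in a text corpus."; }
}
```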
Thanks for your feedback.
I agree on the main method problem. Scanning and listing everything
that is found is fine.
The tricky question is the automatic invocation mechanism when the -c flag
is not used and no program-class or Main-Class manifest entry is found.
If multiple classes implement
Performance-wise, a GroupReduceFunction with a combiner should right now be
slightly faster than the ReduceFunction, but not by much.
Long term, the ReduceFunction may become faster, because it will use hash
aggregation under the hood.
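Hash aggregation can be sketched in plain Java (an illustration of the technique, not Flink internals): each record updates its group's partial aggregate in a hash table, so grouping needs no sort of the input.

```java
import java.util.HashMap;
import java.util.Map;

class HashAggregation {
    // Each incoming (key, value) record updates the key's running sum in
    // a hash table; grouping happens without sorting the input.
    static Map<String, Integer> sumByKey(String[] keys, int[] values) {
        Map<String, Integer> table = new HashMap<>();
        for (int i = 0; i < keys.length; i++) {
            table.merge(keys[i], values[i], Integer::sum);
        }
        return table;
    }
}
```

This only works for a pairwise-combinable function, which is why it fits the ReduceFunction but not the general GroupReduceFunction.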
On Fri, May 22, 2015 at 11:58 AM, santosh_rajaguru
Hi,
streaming currently does not use any memory manager. All state is kept
in Java Objects on the Java Heap, for example an ArrayList for the
window buffer.
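As an illustration only (not Flink's actual implementation), a count-based window buffer kept as a plain ArrayList on the heap might look like this:

```java
import java.util.ArrayList;
import java.util.List;

class WindowBuffer {
    // Window state lives in ordinary Java objects on the JVM heap,
    // outside any managed-memory pool.
    private final List<Long> buffer = new ArrayList<>();
    private final int size;

    WindowBuffer(int size) { this.size = size; }

    // Buffer an element; when the window is full, emit its sum and reset.
    Long add(long value) {
        buffer.add(value);
        if (buffer.size() < size) {
            return null; // window not yet complete
        }
        long sum = 0;
        for (long v : buffer) sum += v;
        buffer.clear();
        return sum;
    }
}
```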
On Thu, May 21, 2015 at 11:56 PM, Henry Saputra henry.sapu...@gmail.com wrote:
Hi Stephan, Gyula, Paris,
How does streaming currently
Aljoscha Krettek created FLINK-2081:
---
Summary: Change order of restore state and open for Streaming
Operators
Key: FLINK-2081
URL: https://issues.apache.org/jira/browse/FLINK-2081
Project: Flink
Aljoscha is right. There are plans to migrate the streaming state to the
MemoryManager as well, but streaming state is not managed at this point.
What is managed in streaming jobs is the data buffered and cached in the
network stack. But that is a different memory pool than the memory manager.
We
Thanks Maximilian.
My use case is similar to the example given in the graph analysis.
In graph analysis, the reduce function used is a normal reduce function.
I executed it in both scenarios and your justification is right: the
normal reduce function has a combiner before sorting, unlike the
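The effect of a combiner can be sketched in plain Java (illustrative only): partial aggregates are computed per partition first, so only one record per partition reaches the final reduce.

```java
import java.util.List;

class CombinerSketch {
    // Local pre-aggregation: collapse one partition to a single partial sum.
    static int combine(List<Integer> partition) {
        int sum = 0;
        for (int v : partition) sum += v;
        return sum;
    }

    // The final reduce sees only one partial result per partition, so far
    // less data crosses the network than if every raw record were shipped.
    static int reduceAll(List<List<Integer>> partitions) {
        int total = 0;
        for (List<Integer> p : partitions) total += combine(p);
        return total;
    }
}
```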
Hi,
two more thoughts to this discussion:
1) looking at the commit history of CliFrontend, I found the
following closed issue and the closing pull request
* https://issues.apache.org/jira/browse/FLINK-1095
* https://github.com/apache/flink/pull/238
It stands in opposition to Flavio's