Flink is a distributed software for clusters. You need something like a
distributed file system. So that input file and output files can be
accessed from all nodes.
Each TM has a log directory where the execution logs are stored.
You can set additional properties to your output format by importing the
code in your IDE.
Am 07.08.17 um 16:03 schrieb P. Ramanjaneya Reddy:
Hi Timo,
Problem is resolved after copy input file to all tasks managers.
and where should generate outputfile? Is it in jobmanager or task manager?
Where can i see the execution logs to understand how word count done each
task manager?
By the way any option to overwride...?
08/07/2017 19:27:00 Keyed Aggregation -> Sink: Unnamed(1/1) switched to
FAILED
java.io.IOException: File or directory already exists. Existing files and
directories are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to
overwrite existing files and directories.
at
org.apache.flink.core.fs.FileSystem.initOutPathLocalFS(FileSystem.java:763)
at
org.apache.flink.core.fs.SafetyNetWrapperFileSystem.initOutPathLocalFS(SafetyNetWrapperFileSystem.java:135)
at
org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:231)
at
org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:78)
at
org.apache.flink.streaming.api.functions.sink.OutputFormatSinkFunction.open(OutputFormatSinkFunction.java:61)
at
org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
at
org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:111)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:376)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
at java.lang.Thread.run(Thread.java:745)
On Mon, Aug 7, 2017 at 6:49 PM, Timo Walther <twal...@apache.org> wrote:
Make sure that the file exists and is accessible from all Flink tasks
managers.
Am 07.08.17 um 14:35 schrieb P. Ramanjaneya Reddy:
Thank you Timo.
root1@root1-HP-EliteBook-840-G2:~/NAI/Tools/BEAM/Flink_Clust
er/rama/flink$
*./bin/flink
run ./examples/streaming/WordCount.jar --input
file:///home/root1/hamlet.txt --output file:///home/root1/wordcount_out*
Execution of worcountjar gives error...
08/07/2017 18:03:16 Source: Custom File Source(1/1) switched to FAILED
java.io.FileNotFoundException: The provided file path
file:/home/root1/hamlet.txt does not exist.
at
org.apache.flink.streaming.api.functions.source.ContinuousFi
leMonitoringFunction.run(ContinuousFileMonitoringFunction.java:192)
at
org.apache.flink.streaming.api.operators.StreamSource.run(
StreamSource.java:87)
at
org.apache.flink.streaming.api.operators.StreamSource.run(
StreamSource.java:55)
at
org.apache.flink.streaming.runtime.tasks.SourceStreamTask.
run(SourceStreamTask.java:95)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(
StreamTask.java:263)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
at java.lang.Thread.run(Thread.java:748)
On Mon, Aug 7, 2017 at 5:56 PM, Timo Walther <twal...@apache.org> wrote:
Hi Ramanji,
you can find the source code of the examples here:
https://github.com/apache/flink/blob/master/flink-examples/
flink-examples-streaming/src/main/java/org/apache/flink/
streaming/examples/wordcount/WordCount.java
A general introduction how the cluster execution works can be found here:
https://ci.apache.org/projects/flink/flink-docs-release-1.4/
concepts/programming-model.html#programs-and-dataflows
https://ci.apache.org/projects/flink/flink-docs-release-1.4/
concepts/runtime.html
It might also be helpful to have a look at the web interface which can
show you a nice graph of the job.
I hope this helps. Feel free to ask further questions.
Regards,
Timo
Am 07.08.17 um 14:00 schrieb P. Ramanjaneya Reddy:
Hello Everyone,
I have followed the steps specified below link to Install & Run Apache
Flink on Multi-node Cluster.
http://data-flair.training/blogs/install-run-deploy-flink-
multi-node-cluster/
used flink-1.3.2-bin-hadoop27-scala_2.10.tgz for install
using the command
" bin/flink run
/home/root1/NAI/Tools/BEAM/Flink_Cluster/rama/flink/examples
/streaming/WordCount.jar"
able to run wordcount, but where can i see which input consider and
output
generated?
and how can i specify the input and output paths?
I'm trying to understand how the wordcount will work using Multi-node
Cluster.?
any suggestions will help me further understanding?
Thanks & Regards,
Ramanji.