Great to hear that you solved the problem!

Cheers,
Till

On Fri, Nov 13, 2020 at 4:56 PM Marco Villalobos <mvillalo...@kineteque.com>
wrote:

> Hi Till,
>
> Thank you for following up.
>
> We were trying to set up S3 file sinks and RocksDB with S3 checkpointing.
> We upgraded to Flink 1.11 and attempted to run the job on EMR.
>
> On startup, the logs showed an error that flink-conf.yaml could not be
> found. I tried to troubleshoot the command-line parameters, but the
> documentation was very confusing.
>
> My co-worker fixed the issue. It turns out that the Hadoop configuration
> files in EMR are not set up to work with the s3a protocol out of the box.
> Once we placed the correct values in the Hadoop configuration file,
> everything worked.
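>
> For anyone hitting the same issue, here is a minimal sketch of the kind of
> settings involved (the key names come from the standard Flink and Hadoop
> s3a documentation; the bucket path and values are placeholders, not our
> exact configuration):
>
>     # flink-conf.yaml: RocksDB state backend with checkpoints on S3 via s3a
>     state.backend: rocksdb
>     state.checkpoints.dir: s3a://my-bucket/flink/checkpoints
>
>     <!-- core-site.xml: map the s3a scheme to the S3A filesystem -->
>     <property>
>       <name>fs.s3a.impl</name>
>       <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
>     </property>
>
> On EMR, credentials are typically picked up from the instance profile, so
> access keys usually do not need to be hard-coded.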
>
> Marco A. Villalobos
>
>
>
> On Nov 13, 2020, at 7:32 AM, Till Rohrmann <trohrm...@apache.org> wrote:
>
> Hi Marco,
>
> As Klou said, -m yarn-cluster should try to deploy a Yarn per-job cluster
> on your Yarn cluster. Could you maybe share a few more details about what
> is going wrong? E.g. the CLI logs could be helpful to pinpoint the problem.
>
> I've tested that both `bin/flink run -m yarn-cluster
> examples/streaming/WindowJoin.jar` and `bin/flink run -t
> yarn-per-job examples/streaming/WindowJoin.jar` start a Flink per-job
> cluster.
>
> What was -yna supposed to do? -ynm should set the custom name of the Yarn
> application.
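>
> For example, the application name could be set like this (the name
> "MyWindowJoin" is just an illustrative placeholder):
>
>     bin/flink run -m yarn-cluster -ynm MyWindowJoin \
>         examples/streaming/WindowJoin.jar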
>
> @kkloudas <kklou...@apache.org> should we maybe improve the existing
> documentation to better reflect the usage of -t/--target? The CLI
> documentation [1] does not include a single example where we use the target
> option. Moreover, we could think about retiring -m yarn-cluster in favour
> of -t yarn-per-job. And should we document somewhere which
> `execution.target` values are supported? What do you think?
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-stable/ops/cli.html#job-submission-examples
>
> Cheers,
> Till
>
> On Tue, Nov 10, 2020 at 4:00 PM Kostas Kloudas <kklou...@gmail.com> wrote:
>
>> Hi Marco,
>>
>> I agree with you that the -m help message is misleading, but I do not
>> think it has changed between releases.
>> You can specify the address of the JobManager or, for example, you can
>> pass "-m yarn-cluster" and, depending on your environment setup, Flink
>> will pick up a session cluster or create a per-job cluster.
>> This has always been the case.
>>
>> As for -t and -e: -e was deprecated (although it still works) in favour
>> of -t, but it has the same meaning.
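>>
>> Roughly, the two forms should be interchangeable, e.g. (using the
>> WindowJoin example jar shipped with the distribution as a stand-in):
>>
>>     bin/flink run -e yarn-per-job examples/streaming/WindowJoin.jar
>>     bin/flink run -t yarn-per-job examples/streaming/WindowJoin.jar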
>>
>> Finally, regarding how to run Flink on EMR, I am not an expert, so I will
>> pull in Till, who may have some input.
>>
>> Cheers,
>> Kostas
>>
>> On Mon, Nov 9, 2020 at 10:46 PM Marco Villalobos
>> <mvillalo...@kineteque.com> wrote:
>> >
>> > The Flink CLI documentation says that the -m option is used to specify
>> > the JobManager.
>> >
>> > But the examples pass in an execution target. I am quite confused by
>> > this.
>> >
>> > ./bin/flink run -m yarn-cluster \
>> >     ./examples/batch/WordCount.jar \
>> >     --input hdfs:///user/hamlet.txt \
>> >     --output hdfs:///user/wordcount_out
>> >
>> >
>> > So what is it?
>> >
>> > I am trying to run Flink in EMR 6.1.0 but I have failed.
>> >
>> > It appears as though some of the command-line parameters changed from
>> > version 1.10 to 1.11.
>> >
>> > For example, -yna is now -ynm.
>> >
>> > -e is now -t.
>> >
>> > But I am still confused by the -m option in both versions of the
>> > documentation.
>> >
>> > Can somebody please explain?
>> >
>>
>
>
