Yeah I saw the 1.9 and tried it initially but the cluster flink version is
actually updated and in the 2.0 image flink==1.12 (I confirmed by looking
at the flink dashboard from the YARN ResourceManager web interface).

I checked out beam==2.30.0 and built
:runners:flink:1.12:job-server:shadowJar and we've made some progress...
there is no longer a jackson issue but we're now back to the docker
container issue :/

Where would I find the docker logs? I've attached the log with the stdout
from the main workflow invocation, though it looks pretty similar to the
log I sent you a couple emails back mistakenly. (I also included the
invocation at the top of the file).

Thanks again for helping me debug this, this is far more support than I
expected and I really appreciate it!

On Mon, Jul 12, 2021 at 2:29 PM Kyle Weaver <[email protected]> wrote:

> You weren't getting this error before, so I might recommend building the
> Flink job server jar from a release branch rather than from head. e.g. "git
> checkout upstream/release-2.29.0" and then build using the same command as
> before.
>
> Another thing I just noticed is that the Dataproc example says to set
> pipeline option "--flink_version=1.9", so I'm assuming 1.9 is the Flink
> version Dataproc installs by default. So you may also need to change your
> job server build command to use Flink version 1.9: "./gradlew
> :runners:flink:1.9:job-server:shadowJar". The wrinkle here is that support
> for Flink 1.9 was dropped, so 2.29.0 is the last release that includes
> support. So you will have to build from that branch or earlier.
>
> On Mon, Jul 12, 2021 at 11:13 AM Joey Tran <[email protected]>
> wrote:
>
>> Ah so sorry Kyle, I attached the wrong logs... This is the correct log
>>
>> On Mon, Jul 12, 2021 at 2:10 PM Kyle Weaver <[email protected]> wrote:
>>
>>> But got a new error jackson error having to do with trying to convert a
>>>> null into a float? I've attached the log. Thanks again for all your help.
>>>>
>>>
>>> I don't see the Jackson error in the logs. Instead there's this:
>>>
>>> IllegalStateException: No container running for id
>>> 5e882860579e1e4b9eac46418e4f21434d242316badb759298a9f510593fba34
>>>
>>> Which indicates the Python SDK container failed to start for some
>>> reason. Unfortunately when it tries to recover container logs that also
>>> fails with "Error response from daemon: configured logging driver does not
>>> support reading". I'm not sure how Docker is set up on Dataproc, but it may
>>> be difficult to debug this further unless we can get those container logs.
>>>
>>> On Mon, Jul 12, 2021 at 7:25 AM Joey Tran <[email protected]>
>>> wrote:
>>>
>>>> Ah I see what you mean now. Okay I just tried that:
>>>>
>>>> python wordcount.py --input kinglear.txt --output my_counts --runner
>>>> FlinkRunner --flink_master $FLINK_MASTER_URL --environment_type DOCKER
>>>> --flink_job_server_jar
>>>> ~/beam/runners/flink/1.13/job-server/build/libs/beam-runners-flink-1.13-job-server-2.32.0-SNAPSHOT.jar
>>>>
>>>> But got a new error jackson error having to do with trying to convert a
>>>> null into a float? I've attached the log. Thanks again for all your help.
>>>>
>>>> On Fri, Jul 9, 2021 at 6:35 PM Kyle Weaver <[email protected]> wrote:
>>>>
>>>>> The important thing is it needs to be the job server jar (has
>>>>> job-server in the path).
>>>>>
>>>>> On Fri, Jul 9, 2021 at 3:31 PM Joey Tran <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi kyle,
>>>>>>
>>>>>> I noticed the one you linked didnt work when i copied and pasted it
>>>>>> so chose the only one in the directory and that’s how when i got the 
>>>>>> error
>>>>>> message
>>>>>>
>>>>>> Thanks!
>>>>>> Joey
>>>>>>
>>>>>> On Fri, Jul 9, 2021 at 4:50 PM Kyle Weaver <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Joey, my mistake, I picked the wrong jar. The correct jar should
>>>>>>> be
>>>>>>> "runners/flink/1.13/job-server/build/libs/beam-runners-flink-1.13-job-server-2.31.0-SNAPSHOT.jar"
>>>>>>> (or similar depending on your Beam/Flink version choices).
>>>>>>>
>>>>>>> On Fri, Jul 9, 2021 at 1:43 PM Joey Tran <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> Thank you very much for the responses!
>>>>>>>>
>>>>>>>> I feel a bit better about using dataproc since it's not in beta
>>>>>>>> like flink on GKE.
>>>>>>>>
>>>>>>>> I rebuilt the flinkrunner as you specified but I still get an
>>>>>>>> error. I've attached the stdout from trying to run with the patched 
>>>>>>>> flink
>>>>>>>> runner.
>>>>>>>>
>>>>>>>> Here's instructions to get a cluster started and to the state right
>>>>>>>> before I run your patched flink runner instructions:
>>>>>>>>
>>>>>>>>
>>>>>>>> gcloud dataproc clusters create my-flink-cluster
>>>>>>>> --optional-components=FLINK,DOCKER --region=us-central1 
>>>>>>>> --image-version=2.0
>>>>>>>> --enable-component-gateway
>>>>>>>> gcloud compute ssh my-flink-cluster-m
>>>>>>>> curl
>>>>>>>> https://raw.githubusercontent.com/cs109/2015/master/Lectures/Lecture15b/sparklect/shakes/kinglear.txt
>>>>>>>> > kinglear.txt
>>>>>>>> curl
>>>>>>>> https://raw.githubusercontent.com/apache/beam/master/sdks/python/apache_beam/examples/wordcount.py
>>>>>>>> > wordcount.py
>>>>>>>> pip install apache_beam apache_beam[gcp]
>>>>>>>> . /usr/bin/flink-yarn-daemon
>>>>>>>> python wordcount.py --input kinglear.txt --output my_counts
>>>>>>>> --runner FlinkRunner --flink_master $FLINK_MASTER_URL 
>>>>>>>> --environment_type
>>>>>>>> DOCKER --flink_job_server_jar
>>>>>>>> beam/runners/flink/1.13/build/libs/beam-runners-flink-1.13-2.32.0-SNAPSHOT.jar
>>>>>>>>
>>>>>>>>
>>>>>>>> Let me know if anything stands out to you. Thanks again for the
>>>>>>>> support! Sorry if I'm missing something silly
>>>>>>>>
>>>>>>>> On Fri, Jul 9, 2021 at 3:20 PM Kyle Weaver <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> That's for Java only. Joey was asking about the portable (Python)
>>>>>>>>> example.
>>>>>>>>>
>>>>>>>>> On Fri, Jul 9, 2021 at 12:18 PM Tianzi Cai <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Kyle so much for forwarding.
>>>>>>>>>>
>>>>>>>>>> I was literally just trying this myself and got stuck too (b/
>>>>>>>>>> <http://b/193180649>193180649 <http://b/193180649>). I finally
>>>>>>>>>> got it all to work. Please feel free to share with the customer. I 
>>>>>>>>>> can
>>>>>>>>>> give them repo.reader permission if needed.
>>>>>>>>>>
>>>>>>>>>>    1. Run this command to generate the canonical word count
>>>>>>>>>>    example.
>>>>>>>>>>    mvn archetype:generate \
>>>>>>>>>>        -DarchetypeGroupId=org.apache.beam \
>>>>>>>>>>
>>>>>>>>>>    -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
>>>>>>>>>>        -DarchetypeVersion=2.30.0 \
>>>>>>>>>>        -DgroupId=org.example \
>>>>>>>>>>        -DartifactId=word-count-beam \
>>>>>>>>>>        -Dversion="0.1" \
>>>>>>>>>>        -Dpackage=org.apache.beam.examples \
>>>>>>>>>>        -DinteractiveMode=false
>>>>>>>>>>    2. Make a few code changes (here
>>>>>>>>>>    
>>>>>>>>>> <https://source.cloud.google.com/tz-playground-bigdata/word-count-example/+/gcp:>
>>>>>>>>>>  are
>>>>>>>>>>    mine) then make sure that the code works with mvn compile
>>>>>>>>>>    exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount and
>>>>>>>>>>    you can see the aggregated results printed out.
>>>>>>>>>>    3. Run mvn package -Pflink-runner to get the packaged JARs.
>>>>>>>>>>    4. Upload the uber jar word-count-beam-bundled-0.1.jar to a
>>>>>>>>>>    Cloud Storage bucket. SSH into my Dataproc master node. Download 
>>>>>>>>>> the uber
>>>>>>>>>>    jar.
>>>>>>>>>>    5. flink run -c org.apache.beam.examples.WordCount
>>>>>>>>>>    word-count-beam-bundled-0.1.jar --runner=FlinkRunner
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Jul 9, 2021 at 12:11 PM Kyle Weaver <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> If you're not committed to Dataproc, you may also want to try
>>>>>>>>>>> running it on GKE, which AFAIK doesn't have these issues.
>>>>>>>>>>> https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/blob/master/docs/beam_guide.md
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 9, 2021 at 12:08 PM Kyle Weaver <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Joey,
>>>>>>>>>>>>
>>>>>>>>>>>> Jackson dependency issues are likely
>>>>>>>>>>>> https://issues.apache.org/jira/browse/BEAM-10430. You will
>>>>>>>>>>>> have to manually patch it until a fix is available in an upcoming 
>>>>>>>>>>>> Beam
>>>>>>>>>>>> release.
>>>>>>>>>>>>
>>>>>>>>>>>> 1. Download Beam source from Github
>>>>>>>>>>>> 2. Check out a patch for the issue, such as
>>>>>>>>>>>> https://github.com/apache/beam/pull/14953
>>>>>>>>>>>> 3. Build the Flink runner using command "./gradlew
>>>>>>>>>>>> :runners:flink:1.13:job-server:shadowJar"
>>>>>>>>>>>> 4. Use the outputted Flink runner jar in your Python pipeline
>>>>>>>>>>>> options
>>>>>>>>>>>> "--flink_job_server_jar=runners/flink/1.13/build/libs/beam-runners-flink-1.13-2.31.0-SNAPSHOT.jar"
>>>>>>>>>>>>
>>>>>>>>>>>> For the "No container id" issue, can you share the full logs?
>>>>>>>>>>>>
>>>>>>>>>>>> +Tianzi Cai <[email protected]> +Anthony Mancuso
>>>>>>>>>>>> <[email protected]>
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Kyle
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jul 9, 2021 at 8:47 AM Ahmet Altay <[email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> /cc @Kyle Weaver <[email protected]>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jul 9, 2021 at 5:24 AM Joey Tran <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm trying to just demo Beam/Flink and I tried following the
>>>>>>>>>>>>>> instructions with Google's Dataproc but I get a bunch of errors 
>>>>>>>>>>>>>> ranging
>>>>>>>>>>>>>> from jackson dependency issues to some issue about "No container 
>>>>>>>>>>>>>> id".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Does anyone know if these dataproc instructions[1] are
>>>>>>>>>>>>>> complete? I ran through it pretty much word for word and can't 
>>>>>>>>>>>>>> get a simple
>>>>>>>>>>>>>> wordcount going, I'm not sure if I'm somehow messing something 
>>>>>>>>>>>>>> up or
>>>>>>>>>>>>>> there's more necessary than just this doc instructs? FWIW I've 
>>>>>>>>>>>>>> been able to
>>>>>>>>>>>>>> run the java wordcount example fine, it seems like I only run 
>>>>>>>>>>>>>> into issues
>>>>>>>>>>>>>> when trying to follow the portable runner instructions
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks so much in advance for your help1 I'm not very
>>>>>>>>>>>>>> experienced with deploying these kinds of things but I wanted to 
>>>>>>>>>>>>>> do a demo
>>>>>>>>>>>>>> to show that Beam+Flink is a better solution than writing a 
>>>>>>>>>>>>>> framework myself
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>> https://cloud.google.com/dataproc/docs/concepts/components/flink#portable_beam_jobs
>>>>>>>>>>>>>>
>>>>>>>>>>>>>

Attachment: no_container.log
Description: Binary data

Reply via email to