Yeah I saw the 1.9 and tried it initially but the cluster flink version is actually updated and in the 2.0 image flink==1.12 (I confirmed by looking at the flink dashboard from the YARN ResourceManager web interface).
I checked out beam==2.30.0 and built :runners:flink:1.12:job-server:shadowJar and we've made some progress... there is no longer a jackson issue but we're now back to the docker container issue :/ Where would I find the docker logs? I've attached the log with the stdout from the main workflow invocation, though it looks pretty similar to the log I sent you a couple emails back mistakenly. (I also included the invocation at the top of the file). Thanks again for helping me debug this, this is far more support than I expected and I really appreciate it! On Mon, Jul 12, 2021 at 2:29 PM Kyle Weaver <[email protected]> wrote: > You weren't getting this error before, so I might recommend building the > Flink job server jar from a release branch rather than from head. e.g. "git > checkout upstream/release-2.29.0" and then build using the same command as > before. > > Another thing I just noticed is that the Dataproc example says to set > pipeline option "--flink_version=1.9", so I'm assuming 1.9 is the Flink > version Dataproc installs by default. So you may also need to change your > job server build command to use Flink version 1.9: "./gradlew > :runners:flink:1.9:job-server:shadowJar". The wrinkle here is that support > for Flink 1.9 was dropped, so 2.29.0 is the last release that includes > support. So you will have to build from that branch or earlier. > > On Mon, Jul 12, 2021 at 11:13 AM Joey Tran <[email protected]> > wrote: > >> Ah so sorry Kyle, I attached the wrong logs... This is the correct log >> >> On Mon, Jul 12, 2021 at 2:10 PM Kyle Weaver <[email protected]> wrote: >> >>> But got a new error jackson error having to do with trying to convert a >>>> null into a float? I've attached the log. Thanks again for all your help. >>>> >>> >>> I don't see the Jackson error in the logs. Instead there's this: >>> >>> IllegalStateException: No container running for id >>> 5e882860579e1e4b9eac46418e4f21434d242316badb759298a9f510593fba34 >>> >>> Which indicates the Python SDK container failed to start for some >>> reason. Unfortunately when it tries to recover container logs that also >>> fails with "Error response from daemon: configured logging driver does not >>> support reading". I'm not sure how Docker is set up on Dataproc, but it may >>> be difficult to debug this further unless we can get those container logs. >>> >>> On Mon, Jul 12, 2021 at 7:25 AM Joey Tran <[email protected]> >>> wrote: >>> >>>> Ah I see what you mean now. Okay I just tried that: >>>> >>>> python wordcount.py --input kinglear.txt --output my_counts --runner >>>> FlinkRunner --flink_master $FLINK_MASTER_URL --environment_type DOCKER >>>> --flink_job_server_jar >>>> ~/beam/runners/flink/1.13/job-server/build/libs/beam-runners-flink-1.13-job-server-2.32.0-SNAPSHOT.jar >>>> >>>> But got a new error jackson error having to do with trying to convert a >>>> null into a float? I've attached the log. Thanks again for all your help. >>>> >>>> On Fri, Jul 9, 2021 at 6:35 PM Kyle Weaver <[email protected]> wrote: >>>> >>>>> The important thing is it needs to be the job server jar (has >>>>> job-server in the path). >>>>> >>>>> On Fri, Jul 9, 2021 at 3:31 PM Joey Tran <[email protected]> >>>>> wrote: >>>>> >>>>>> Hi kyle, >>>>>> >>>>>> I noticed the one you linked didnt work when i copied and pasted it >>>>>> so chose the only one in the directory and that’s how when i got the >>>>>> error >>>>>> message >>>>>> >>>>>> Thanks! >>>>>> Joey >>>>>> >>>>>> On Fri, Jul 9, 2021 at 4:50 PM Kyle Weaver <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> Hi Joey, my mistake, I picked the wrong jar. The correct jar should >>>>>>> be >>>>>>> "runners/flink/1.13/job-server/build/libs/beam-runners-flink-1.13-job-server-2.31.0-SNAPSHOT.jar" >>>>>>> (or similar depending on your Beam/Flink version choices). >>>>>>> >>>>>>> On Fri, Jul 9, 2021 at 1:43 PM Joey Tran <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>>> Hi all, >>>>>>>> >>>>>>>> Thank you very much for the responses! >>>>>>>> >>>>>>>> I feel a bit better about using dataproc since it's not in beta >>>>>>>> like flink on GKE. >>>>>>>> >>>>>>>> I rebuilt the flinkrunner as you specified but I still get an >>>>>>>> error. I've attached the stdout from trying to run with the patched >>>>>>>> flink >>>>>>>> runner. >>>>>>>> >>>>>>>> Here's instructions to get a cluster started and to the state right >>>>>>>> before I run your patched flink runner instructions: >>>>>>>> >>>>>>>> >>>>>>>> gcloud dataproc clusters create my-flink-cluster >>>>>>>> --optional-components=FLINK,DOCKER --region=us-central1 >>>>>>>> --image-version=2.0 >>>>>>>> --enable-component-gateway >>>>>>>> gcloud compute ssh my-flink-cluster-m >>>>>>>> curl >>>>>>>> https://raw.githubusercontent.com/cs109/2015/master/Lectures/Lecture15b/sparklect/shakes/kinglear.txt >>>>>>>> > kinglear.txt >>>>>>>> curl >>>>>>>> https://raw.githubusercontent.com/apache/beam/master/sdks/python/apache_beam/examples/wordcount.py >>>>>>>> > wordcount.py >>>>>>>> pip install apache_beam apache_beam[gcp] >>>>>>>> . /usr/bin/flink-yarn-daemon >>>>>>>> python wordcount.py --input kinglear.txt --output my_counts >>>>>>>> --runner FlinkRunner --flink_master $FLINK_MASTER_URL >>>>>>>> --environment_type >>>>>>>> DOCKER --flink_job_server_jar >>>>>>>> beam/runners/flink/1.13/build/libs/beam-runners-flink-1.13-2.32.0-SNAPSHOT.jar >>>>>>>> >>>>>>>> >>>>>>>> Let me know if anything stands out to you. Thanks again for the >>>>>>>> support! Sorry if I'm missing something silly >>>>>>>> >>>>>>>> On Fri, Jul 9, 2021 at 3:20 PM Kyle Weaver <[email protected]> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> That's for Java only. Joey was asking about the portable (Python) >>>>>>>>> example. >>>>>>>>> >>>>>>>>> On Fri, Jul 9, 2021 at 12:18 PM Tianzi Cai <[email protected]> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Thanks Kyle so much for forwarding. >>>>>>>>>> >>>>>>>>>> I was literally just trying this myself and got stuck too (b/ >>>>>>>>>> <http://b/193180649>193180649 <http://b/193180649>). I finally >>>>>>>>>> got it all to work. Please feel free to share with the customer. I >>>>>>>>>> can >>>>>>>>>> give them repo.reader permission if needed. >>>>>>>>>> >>>>>>>>>> 1. Run this command to generate the canonical word count >>>>>>>>>> example. >>>>>>>>>> mvn archetype:generate \ >>>>>>>>>> -DarchetypeGroupId=org.apache.beam \ >>>>>>>>>> >>>>>>>>>> -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \ >>>>>>>>>> -DarchetypeVersion=2.30.0 \ >>>>>>>>>> -DgroupId=org.example \ >>>>>>>>>> -DartifactId=word-count-beam \ >>>>>>>>>> -Dversion="0.1" \ >>>>>>>>>> -Dpackage=org.apache.beam.examples \ >>>>>>>>>> -DinteractiveMode=false >>>>>>>>>> 2. Make a few code changes (here >>>>>>>>>> >>>>>>>>>> <https://source.cloud.google.com/tz-playground-bigdata/word-count-example/+/gcp:> >>>>>>>>>> are >>>>>>>>>> mine) then make sure that the code works with mvn compile >>>>>>>>>> exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount and >>>>>>>>>> you can see the aggregated results printed out. >>>>>>>>>> 3. Run mvn package -Pflink-runner to get the packaged JARs. >>>>>>>>>> 4. Upload the uber jar word-count-beam-bundled-0.1.jar to a >>>>>>>>>> Cloud Storage bucket. SSH into my Dataproc master node. Download >>>>>>>>>> the uber >>>>>>>>>> jar. >>>>>>>>>> 5. flink run -c org.apache.beam.examples.WordCount >>>>>>>>>> word-count-beam-bundled-0.1.jar --runner=FlinkRunner >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Fri, Jul 9, 2021 at 12:11 PM Kyle Weaver <[email protected]> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> If you're not committed to Dataproc, you may also want to try >>>>>>>>>>> running it on GKE, which AFAIK doesn't have these issues. >>>>>>>>>>> https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/blob/master/docs/beam_guide.md >>>>>>>>>>> >>>>>>>>>>> On Fri, Jul 9, 2021 at 12:08 PM Kyle Weaver <[email protected]> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Joey, >>>>>>>>>>>> >>>>>>>>>>>> Jackson dependency issues are likely >>>>>>>>>>>> https://issues.apache.org/jira/browse/BEAM-10430. You will >>>>>>>>>>>> have to manually patch it until a fix is available in an upcoming >>>>>>>>>>>> Beam >>>>>>>>>>>> release. >>>>>>>>>>>> >>>>>>>>>>>> 1. Download Beam source from Github >>>>>>>>>>>> 2. Check out a patch for the issue, such as >>>>>>>>>>>> https://github.com/apache/beam/pull/14953 >>>>>>>>>>>> 3. Build the Flink runner using command "./gradlew >>>>>>>>>>>> :runners:flink:1.13:job-server:shadowJar" >>>>>>>>>>>> 4. Use the outputted Flink runner jar in your Python pipeline >>>>>>>>>>>> options >>>>>>>>>>>> "--flink_job_server_jar=runners/flink/1.13/build/libs/beam-runners-flink-1.13-2.31.0-SNAPSHOT.jar" >>>>>>>>>>>> >>>>>>>>>>>> For the "No container id" issue, can you share the full logs? >>>>>>>>>>>> >>>>>>>>>>>> +Tianzi Cai <[email protected]> +Anthony Mancuso >>>>>>>>>>>> <[email protected]> >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> Kyle >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Jul 9, 2021 at 8:47 AM Ahmet Altay <[email protected]> >>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> /cc @Kyle Weaver <[email protected]> >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, Jul 9, 2021 at 5:24 AM Joey Tran < >>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Hello! >>>>>>>>>>>>>> >>>>>>>>>>>>>> I'm trying to just demo Beam/Flink and I tried following the >>>>>>>>>>>>>> instructions with Google's Dataproc but I get a bunch of errors >>>>>>>>>>>>>> ranging >>>>>>>>>>>>>> from jackson dependency issues to some issue about "No container >>>>>>>>>>>>>> id". >>>>>>>>>>>>>> >>>>>>>>>>>>>> Does anyone know if these dataproc instructions[1] are >>>>>>>>>>>>>> complete? I ran through it pretty much word for word and can't >>>>>>>>>>>>>> get a simple >>>>>>>>>>>>>> wordcount going, I'm not sure if I'm somehow messing something >>>>>>>>>>>>>> up or >>>>>>>>>>>>>> there's more necessary than just this doc instructs? FWIW I've >>>>>>>>>>>>>> been able to >>>>>>>>>>>>>> run the java wordcount example fine, it seems like I only run >>>>>>>>>>>>>> into issues >>>>>>>>>>>>>> when trying to follow the portable runner instructions >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks so much in advance for your help1 I'm not very >>>>>>>>>>>>>> experienced with deploying these kinds of things but I wanted to >>>>>>>>>>>>>> do a demo >>>>>>>>>>>>>> to show that Beam+Flink is a better solution than writing a >>>>>>>>>>>>>> framework myself >>>>>>>>>>>>>> >>>>>>>>>>>>>> [1] >>>>>>>>>>>>>> https://cloud.google.com/dataproc/docs/concepts/components/flink#portable_beam_jobs >>>>>>>>>>>>>> >>>>>>>>>>>>>
no_container.log
Description: Binary data
