Where are those?

Yeah I saw Dataflow but unfortunately we have a use case that requires
custom image and higher scale than the dataflow quota allows

On Tue, Jul 13, 2021 at 2:00 PM Kyle Weaver <[email protected]> wrote:

> You can check the task manager logs as well to see if there is any
> additional information.
>
> Beam+Flink+Dataproc isn't unheard of, but Java is definitely more common
> than Python (and simpler to operate). And overall Dataflow is usually the
> preferred way to run Beam on GCP.
>
> On Tue, Jul 13, 2021 at 7:27 AM Joey Tran <[email protected]>
> wrote:
>
>> I found the jobmanager log from the yarn web interface [jobmanager_log
>> (1).txt].
>>
>> I didn't see any errors about malconfigured logging drivers this time.
>>
>> Is running flink/beam on dataproc a rare use case?
>>
>> On Mon, Jul 12, 2021 at 6:55 PM Kyle Weaver <[email protected]> wrote:
>>
>>> no_container.log is from the Python driver process, so it may be
>>> abbreviated. You should be able to find the unabridged log in either the
>>> Beam job server or Flink task manager. When container startup fails, Beam
>>> attempts to get the container logs. In your previous email
>>> (jobmanager_log.txt) there was a line "Error response from daemon:
>>> configured logging driver does not support reading". I'm guessing that is
>>> still the case now, but we should confirm. If it is the case, it's
>>> a Dataproc-Docker setup issue and unfortunately falls outside of my
>>> wheelhouse. It should be possible to configure logging drivers correctly,
>>> but I don't know how (or why it doesn't work as expected in the first
>>> place). https://docs.docker.com/config/containers/logging/configure/
>>>
>>> On Mon, Jul 12, 2021 at 12:36 PM Joey Tran <[email protected]>
>>> wrote:
>>>
>>>> Yeah I saw the 1.9 and tried it initially but the cluster flink version
>>>> is actually updated and in the 2.0 image flink==1.12 (I confirmed by
>>>> looking at the flink dashboard from the YARN ResourceManager web 
>>>> interface).
>>>>
>>>> I checked out beam==2.30.0 and built
>>>> :runners:flink:1.12:job-server:shadowJar and we've made some progress...
>>>> there is no longer a jackson issue but we're now back to the docker
>>>> container issue :/
>>>>
>>>> Where would I find the docker logs? I've attached the log with the
>>>> stdout from the main workflow invocation, though it looks pretty similar to
>>>> the log I sent you a couple emails back mistakenly. (I also included the
>>>> invocation at the top of the file).
>>>>
>>>> Thanks again for helping me debug this, this is far more support than I
>>>> expected and I really appreciate it!
>>>>
>>>> On Mon, Jul 12, 2021 at 2:29 PM Kyle Weaver <[email protected]>
>>>> wrote:
>>>>
>>>>> You weren't getting this error before, so I might recommend building
>>>>> the Flink job server jar from a release branch rather than from head. e.g.
>>>>> "git checkout upstream/release-2.29.0" and then build using the same
>>>>> command as before.
>>>>>
>>>>> Another thing I just noticed is that the Dataproc example says to set
>>>>> pipeline option "--flink_version=1.9", so I'm assuming 1.9 is the Flink
>>>>> version Dataproc installs by default. So you may also need to change your
>>>>> job server build command to use Flink version 1.9: "./gradlew
>>>>> :runners:flink:1.9:job-server:shadowJar". The wrinkle here is that support
>>>>> for Flink 1.9 was dropped, so 2.29.0 is the last release that includes
>>>>> support. So you will have to build from that branch or earlier.
>>>>>
>>>>> On Mon, Jul 12, 2021 at 11:13 AM Joey Tran <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Ah so sorry Kyle, I attached the wrong logs... This is the correct log
>>>>>>
>>>>>> On Mon, Jul 12, 2021 at 2:10 PM Kyle Weaver <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> But got a new error jackson error having to do with trying to
>>>>>>>> convert a null into a float? I've attached the log. Thanks again for 
>>>>>>>> all
>>>>>>>> your help.
>>>>>>>>
>>>>>>>
>>>>>>> I don't see the Jackson error in the logs. Instead there's this:
>>>>>>>
>>>>>>> IllegalStateException: No container running for id
>>>>>>> 5e882860579e1e4b9eac46418e4f21434d242316badb759298a9f510593fba34
>>>>>>>
>>>>>>> Which indicates the Python SDK container failed to start for some
>>>>>>> reason. Unfortunately when it tries to recover container logs that also
>>>>>>> fails with "Error response from daemon: configured logging driver does 
>>>>>>> not
>>>>>>> support reading". I'm not sure how Docker is set up on Dataproc, but it 
>>>>>>> may
>>>>>>> be difficult to debug this further unless we can get those container 
>>>>>>> logs.
>>>>>>>
>>>>>>> On Mon, Jul 12, 2021 at 7:25 AM Joey Tran <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Ah I see what you mean now. Okay I just tried that:
>>>>>>>>
>>>>>>>> python wordcount.py --input kinglear.txt --output my_counts
>>>>>>>> --runner FlinkRunner --flink_master $FLINK_MASTER_URL 
>>>>>>>> --environment_type
>>>>>>>> DOCKER --flink_job_server_jar
>>>>>>>> ~/beam/runners/flink/1.13/job-server/build/libs/beam-runners-flink-1.13-job-server-2.32.0-SNAPSHOT.jar
>>>>>>>>
>>>>>>>> But got a new error jackson error having to do with trying to
>>>>>>>> convert a null into a float? I've attached the log. Thanks again for 
>>>>>>>> all
>>>>>>>> your help.
>>>>>>>>
>>>>>>>> On Fri, Jul 9, 2021 at 6:35 PM Kyle Weaver <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> The important thing is it needs to be the job server jar (has
>>>>>>>>> job-server in the path).
>>>>>>>>>
>>>>>>>>> On Fri, Jul 9, 2021 at 3:31 PM Joey Tran <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi kyle,
>>>>>>>>>>
>>>>>>>>>> I noticed the one you linked didnt work when i copied and pasted
>>>>>>>>>> it so chose the only one in the directory and that’s how when i got 
>>>>>>>>>> the
>>>>>>>>>> error message
>>>>>>>>>>
>>>>>>>>>> Thanks!
>>>>>>>>>> Joey
>>>>>>>>>>
>>>>>>>>>> On Fri, Jul 9, 2021 at 4:50 PM Kyle Weaver <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Joey, my mistake, I picked the wrong jar. The correct jar
>>>>>>>>>>> should be
>>>>>>>>>>> "runners/flink/1.13/job-server/build/libs/beam-runners-flink-1.13-job-server-2.31.0-SNAPSHOT.jar"
>>>>>>>>>>> (or similar depending on your Beam/Flink version choices).
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Jul 9, 2021 at 1:43 PM Joey Tran <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you very much for the responses!
>>>>>>>>>>>>
>>>>>>>>>>>> I feel a bit better about using dataproc since it's not in beta
>>>>>>>>>>>> like flink on GKE.
>>>>>>>>>>>>
>>>>>>>>>>>> I rebuilt the flinkrunner as you specified but I still get an
>>>>>>>>>>>> error. I've attached the stdout from trying to run with the 
>>>>>>>>>>>> patched flink
>>>>>>>>>>>> runner.
>>>>>>>>>>>>
>>>>>>>>>>>> Here's instructions to get a cluster started and to the state
>>>>>>>>>>>> right before I run your patched flink runner instructions:
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> gcloud dataproc clusters create my-flink-cluster
>>>>>>>>>>>> --optional-components=FLINK,DOCKER --region=us-central1 
>>>>>>>>>>>> --image-version=2.0
>>>>>>>>>>>> --enable-component-gateway
>>>>>>>>>>>> gcloud compute ssh my-flink-cluster-m
>>>>>>>>>>>> curl
>>>>>>>>>>>> https://raw.githubusercontent.com/cs109/2015/master/Lectures/Lecture15b/sparklect/shakes/kinglear.txt
>>>>>>>>>>>> > kinglear.txt
>>>>>>>>>>>> curl
>>>>>>>>>>>> https://raw.githubusercontent.com/apache/beam/master/sdks/python/apache_beam/examples/wordcount.py
>>>>>>>>>>>> > wordcount.py
>>>>>>>>>>>> pip install apache_beam apache_beam[gcp]
>>>>>>>>>>>> . /usr/bin/flink-yarn-daemon
>>>>>>>>>>>> python wordcount.py --input kinglear.txt --output my_counts
>>>>>>>>>>>> --runner FlinkRunner --flink_master $FLINK_MASTER_URL 
>>>>>>>>>>>> --environment_type
>>>>>>>>>>>> DOCKER --flink_job_server_jar
>>>>>>>>>>>> beam/runners/flink/1.13/build/libs/beam-runners-flink-1.13-2.32.0-SNAPSHOT.jar
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Let me know if anything stands out to you. Thanks again for the
>>>>>>>>>>>> support! Sorry if I'm missing something silly
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Jul 9, 2021 at 3:20 PM Kyle Weaver <[email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> That's for Java only. Joey was asking about the portable
>>>>>>>>>>>>> (Python) example.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Jul 9, 2021 at 12:18 PM Tianzi Cai <[email protected]>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks Kyle so much for forwarding.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I was literally just trying this myself and got stuck too (b/
>>>>>>>>>>>>>> <http://b/193180649>193180649 <http://b/193180649>). I
>>>>>>>>>>>>>> finally got it all to work. Please feel free to share with the 
>>>>>>>>>>>>>> customer. I can
>>>>>>>>>>>>>> give them repo.reader permission if needed.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    1. Run this command to generate the canonical word count
>>>>>>>>>>>>>>    example.
>>>>>>>>>>>>>>    mvn archetype:generate \
>>>>>>>>>>>>>>        -DarchetypeGroupId=org.apache.beam \
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>    
>>>>>>>>>>>>>> -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
>>>>>>>>>>>>>>        -DarchetypeVersion=2.30.0 \
>>>>>>>>>>>>>>        -DgroupId=org.example \
>>>>>>>>>>>>>>        -DartifactId=word-count-beam \
>>>>>>>>>>>>>>        -Dversion="0.1" \
>>>>>>>>>>>>>>        -Dpackage=org.apache.beam.examples \
>>>>>>>>>>>>>>        -DinteractiveMode=false
>>>>>>>>>>>>>>    2. Make a few code changes (here
>>>>>>>>>>>>>>    
>>>>>>>>>>>>>> <https://source.cloud.google.com/tz-playground-bigdata/word-count-example/+/gcp:>
>>>>>>>>>>>>>>  are
>>>>>>>>>>>>>>    mine) then make sure that the code works with mvn compile
>>>>>>>>>>>>>>    exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount 
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>    you can see the aggregated results printed out.
>>>>>>>>>>>>>>    3. Run mvn package -Pflink-runner to get the packaged
>>>>>>>>>>>>>>    JARs.
>>>>>>>>>>>>>>    4. Upload the uber jar word-count-beam-bundled-0.1.jar to
>>>>>>>>>>>>>>    a Cloud Storage bucket. SSH into my Dataproc master node. 
>>>>>>>>>>>>>> Download the uber
>>>>>>>>>>>>>>    jar.
>>>>>>>>>>>>>>    5. flink run -c org.apache.beam.examples.WordCount
>>>>>>>>>>>>>>    word-count-beam-bundled-0.1.jar --runner=FlinkRunner
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Jul 9, 2021 at 12:11 PM Kyle Weaver <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you're not committed to Dataproc, you may also want to
>>>>>>>>>>>>>>> try running it on GKE, which AFAIK doesn't have these issues.
>>>>>>>>>>>>>>> https://github.com/GoogleCloudPlatform/flink-on-k8s-operator/blob/master/docs/beam_guide.md
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Jul 9, 2021 at 12:08 PM Kyle Weaver <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi Joey,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Jackson dependency issues are likely
>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/BEAM-10430. You will
>>>>>>>>>>>>>>>> have to manually patch it until a fix is available in an 
>>>>>>>>>>>>>>>> upcoming Beam
>>>>>>>>>>>>>>>> release.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. Download Beam source from Github
>>>>>>>>>>>>>>>> 2. Check out a patch for the issue, such as
>>>>>>>>>>>>>>>> https://github.com/apache/beam/pull/14953
>>>>>>>>>>>>>>>> 3. Build the Flink runner using command "./gradlew
>>>>>>>>>>>>>>>> :runners:flink:1.13:job-server:shadowJar"
>>>>>>>>>>>>>>>> 4. Use the outputted Flink runner jar in your Python
>>>>>>>>>>>>>>>> pipeline options
>>>>>>>>>>>>>>>> "--flink_job_server_jar=runners/flink/1.13/build/libs/beam-runners-flink-1.13-2.31.0-SNAPSHOT.jar"
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For the "No container id" issue, can you share the full
>>>>>>>>>>>>>>>> logs?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> +Tianzi Cai <[email protected]> +Anthony Mancuso
>>>>>>>>>>>>>>>> <[email protected]>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Kyle
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Jul 9, 2021 at 8:47 AM Ahmet Altay <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> /cc @Kyle Weaver <[email protected]>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Jul 9, 2021 at 5:24 AM Joey Tran <
>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hello!
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm trying to just demo Beam/Flink and I tried following
>>>>>>>>>>>>>>>>>> the instructions with Google's Dataproc but I get a bunch of 
>>>>>>>>>>>>>>>>>> errors ranging
>>>>>>>>>>>>>>>>>> from jackson dependency issues to some issue about "No 
>>>>>>>>>>>>>>>>>> container id".
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Does anyone know if these dataproc instructions[1] are
>>>>>>>>>>>>>>>>>> complete? I ran through it pretty much word for word and 
>>>>>>>>>>>>>>>>>> can't get a simple
>>>>>>>>>>>>>>>>>> wordcount going, I'm not sure if I'm somehow messing 
>>>>>>>>>>>>>>>>>> something up or
>>>>>>>>>>>>>>>>>> there's more necessary than just this doc instructs? FWIW 
>>>>>>>>>>>>>>>>>> I've been able to
>>>>>>>>>>>>>>>>>> run the java wordcount example fine, it seems like I only 
>>>>>>>>>>>>>>>>>> run into issues
>>>>>>>>>>>>>>>>>> when trying to follow the portable runner instructions
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks so much in advance for your help1 I'm not very
>>>>>>>>>>>>>>>>>> experienced with deploying these kinds of things but I 
>>>>>>>>>>>>>>>>>> wanted to do a demo
>>>>>>>>>>>>>>>>>> to show that Beam+Flink is a better solution than writing a 
>>>>>>>>>>>>>>>>>> framework myself
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>> https://cloud.google.com/dataproc/docs/concepts/components/flink#portable_beam_jobs
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>

Reply via email to