********** Jacob Kroeze VPAFA jkro...@usc.edu 213-740-3487
________________________________ From: Romain Manni-Bucau <rmannibu...@gmail.com> Sent: Monday, August 20, 2018 8:57:36 AM To: user@beam.apache.org Subject: Re: Launching a subprocess in DoFn It is surprising eclipse doesn't run with your user but yes, perms seem to be the key here. I don't recall the details but in old packages of docker you needed to add it to sudoers or manually add yourself in a docker group. Romain Manni-Bucau @rmannibucau<https://urldefense.proofpoint.com/v2/url?u=https-3A__twitter.com_rmannibucau&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=UFXzsV0prdyxWCSMPsd1IUPxNKXM3IK-CPJA8yQH0gQ&e=> | Blog<https://urldefense.proofpoint.com/v2/url?u=https-3A__rmannibucau.metawerx.net_&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=pnzx75jQLwYrL6ZLZtLT2FBDxi6BcpRfhET_jITDvxg&e=> | Old Blog<https://urldefense.proofpoint.com/v2/url?u=http-3A__rmannibucau.wordpress.com&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=emCqpxr37vNGLZW6p6oF-aK9_yAHSfQWS3JGHUFi4aw&e=> | Github<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rmannibucau&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=WjwiQD5RYlrwzlqPiw5roFoG1keVCVZn8K8nCj7c8Vc&e=> | LinkedIn<https://urldefense.proofpoint.com/v2/url?u=https-3A__www.linkedin.com_in_rmannibucau&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=eG32PSAWMT2e1l3tjAc6FGQu0xLk4K3Ff9PGlQ3Ycck&e=> | Book<https://urldefense.proofpoint.com/v2/url?u=https-3A__www.packtpub.com_application-2Ddevelopment_java-2Dee-2D8-2Dhigh-2Dperformance&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=cEsnsn3olYbPULZfVTsTE2liXWke-A1UbfiYBU4jQqM&e=> Le lun. 20 août 2018 à 17:24, Mahesh Vangala <vangalamahe...@gmail.com<mailto:vangalamahe...@gmail.com>> a écrit : Hi Romain - OKay, so I was running it through eclipse. I believe the jobs were running as eclipse user (?) as oppose to me (authorized to launch docker). Once, I ran project on command line (irrespective of sudo or not), containers executed just fine. Let me know if that makes sense. Thanks again. Regards, Mahesh -- Mahesh Vangala (Ph) 443-326-1957 (web) mvangala.com<https://urldefense.proofpoint.com/v2/url?u=http-3A__mvangala.com&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=A1RdM8bs32-TAh0Sj4lvX7JZVpbgNrJtVQZdrzdtIhE&e=> On Mon, Aug 20, 2018 at 11:18 AM Mahesh Vangala <vangalamahe...@gmail.com<mailto:vangalamahe...@gmail.com>> wrote: Hi Romain - Got it. Apparently, I needed to run beam as sudo to launch the docker containers. I still need to figure out why that's the case. Thank you. -- Mahesh Vangala (Ph) 443-326-1957 (web) mvangala.com<https://urldefense.proofpoint.com/v2/url?u=http-3A__mvangala.com&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=A1RdM8bs32-TAh0Sj4lvX7JZVpbgNrJtVQZdrzdtIhE&e=> On Mon, Aug 20, 2018 at 11:05 AM Mahesh Vangala <vangalamahe...@gmail.com<mailto:vangalamahe...@gmail.com>> wrote: Hello Romain - So, I have added pb.inheritIO().start().waitFor(); and now I have an error /bin/bash: docker: command not found. But, I have docker installed on the system. /usr/local/bin/docker Any ideas why I'm seeing this error when launched from within DoFn? Thank you so much for your help. -- Mahesh Vangala (Ph) 443-326-1957 (web) mvangala.com<https://urldefense.proofpoint.com/v2/url?u=http-3A__mvangala.com&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=A1RdM8bs32-TAh0Sj4lvX7JZVpbgNrJtVQZdrzdtIhE&e=> On Mon, Aug 20, 2018 at 10:58 AM Romain Manni-Bucau <rmannibu...@gmail.com<mailto:rmannibu...@gmail.com>> wrote: Weird, this code works: https://gist.github.com/rmannibucau/4703f321bb1962d1303f8eccbd05df0e<https://urldefense.proofpoint.com/v2/url?u=https-3A__gist.github.com_rmannibucau_4703f321bb1962d1303f8eccbd05df0e&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=ZVEPUIkP1v7BLPIIGU4gEkYj8cQNXN2vNa7FdNpRHRI&e=> Are you sure your test_in.csv has some data (otherwise no DoFn processing will be triggered)? Romain Manni-Bucau @rmannibucau<https://urldefense.proofpoint.com/v2/url?u=https-3A__twitter.com_rmannibucau&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=UFXzsV0prdyxWCSMPsd1IUPxNKXM3IK-CPJA8yQH0gQ&e=> | Blog<https://urldefense.proofpoint.com/v2/url?u=https-3A__rmannibucau.metawerx.net_&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=pnzx75jQLwYrL6ZLZtLT2FBDxi6BcpRfhET_jITDvxg&e=> | Old Blog<https://urldefense.proofpoint.com/v2/url?u=http-3A__rmannibucau.wordpress.com&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=emCqpxr37vNGLZW6p6oF-aK9_yAHSfQWS3JGHUFi4aw&e=> | Github<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rmannibucau&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=WjwiQD5RYlrwzlqPiw5roFoG1keVCVZn8K8nCj7c8Vc&e=> | LinkedIn<https://urldefense.proofpoint.com/v2/url?u=https-3A__www.linkedin.com_in_rmannibucau&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=eG32PSAWMT2e1l3tjAc6FGQu0xLk4K3Ff9PGlQ3Ycck&e=> | Book<https://urldefense.proofpoint.com/v2/url?u=https-3A__www.packtpub.com_application-2Ddevelopment_java-2Dee-2D8-2Dhigh-2Dperformance&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=cEsnsn3olYbPULZfVTsTE2liXWke-A1UbfiYBU4jQqM&e=> Le lun. 20 août 2018 à 16:33, Mahesh Vangala <vangalamahe...@gmail.com<mailto:vangalamahe...@gmail.com>> a écrit : Hi Romain - I don't see any errors when I used waitFor(). However, I don't see those processes being executed either since "docker ps -a" doesn't list any processes. This is quite unrelated to beam itself normally. If your engine (spark, dataflow etc) doesn't have a security manager active I am using DirectRunner though. Let me know. Thank you! -- Mahesh Vangala (Ph) 443-326-1957 (web) mvangala.com<https://urldefense.proofpoint.com/v2/url?u=http-3A__mvangala.com&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=A1RdM8bs32-TAh0Sj4lvX7JZVpbgNrJtVQZdrzdtIhE&e=> On Mon, Aug 20, 2018 at 10:28 AM Romain Manni-Bucau <rmannibu...@gmail.com<mailto:rmannibu...@gmail.com>> wrote: Hi Mahesh, Did you get the same error? This is quite unrelated to beam itself normally. If your engine (spark, dataflow etc) doesn't have a security manager active it should be enough, if it has you can be forbidden to use that. Romain Manni-Bucau @rmannibucau<https://urldefense.proofpoint.com/v2/url?u=https-3A__twitter.com_rmannibucau&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=UFXzsV0prdyxWCSMPsd1IUPxNKXM3IK-CPJA8yQH0gQ&e=> | Blog<https://urldefense.proofpoint.com/v2/url?u=https-3A__rmannibucau.metawerx.net_&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=pnzx75jQLwYrL6ZLZtLT2FBDxi6BcpRfhET_jITDvxg&e=> | Old Blog<https://urldefense.proofpoint.com/v2/url?u=http-3A__rmannibucau.wordpress.com&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=emCqpxr37vNGLZW6p6oF-aK9_yAHSfQWS3JGHUFi4aw&e=> | Github<https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_rmannibucau&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=WjwiQD5RYlrwzlqPiw5roFoG1keVCVZn8K8nCj7c8Vc&e=> | LinkedIn<https://urldefense.proofpoint.com/v2/url?u=https-3A__www.linkedin.com_in_rmannibucau&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=eG32PSAWMT2e1l3tjAc6FGQu0xLk4K3Ff9PGlQ3Ycck&e=> | Book<https://urldefense.proofpoint.com/v2/url?u=https-3A__www.packtpub.com_application-2Ddevelopment_java-2Dee-2D8-2Dhigh-2Dperformance&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=cEsnsn3olYbPULZfVTsTE2liXWke-A1UbfiYBU4jQqM&e=> Le lun. 20 août 2018 à 16:08, Mahesh Vangala <vangalamahe...@gmail.com<mailto:vangalamahe...@gmail.com>> a écrit : Hello Romain - I did try that, still no luck. Also, when I put the process start logic into separate Test script, I do notice successful docker container when I do "docker ps". However, no such luck implementing that logic with in DoFn. Any thoughts? Thank you. Regards, Mahesh -- Mahesh Vangala (Ph) 443-326-1957 (web) mvangala.com<https://urldefense.proofpoint.com/v2/url?u=http-3A__mvangala.com&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=A1RdM8bs32-TAh0Sj4lvX7JZVpbgNrJtVQZdrzdtIhE&e=> On Sun, Aug 19, 2018 at 3:53 AM Romain Manni-Bucau <rmannibu...@gmail.com<mailto:rmannibu...@gmail.com>> wrote: waitFor and not java wait primitive? Le dim. 19 août 2018 04:35, Mahesh Vangala <vangalamahe...@gmail.com<mailto:vangalamahe...@gmail.com>> a écrit : Hello Beamers - I am trying to pull a POC - launch docker image per element in Input PCollection and then return some data to Output Pcollection. Here is my code: public class VariantCaller { public static void main( String[] args ) { PipelineOptions opts = PipelineOptionsFactory.fromArgs(args).create(); Pipeline p = Pipeline.create(opts); PCollection<String> lines = p.apply(TextIO.read().from("test_in.csv")); PCollection<String> outLines = lines.apply(ParDo.of(new LaunchDocker.LaunchJobs())); PCollection<String> mergedLines = outLines.apply(Combine.globally(new AddLines())); mergedLines.apply(TextIO.write().to("test_out.csv")); p.run(); } } My LaunchDocker Code: public class LaunchDocker { public static class LaunchJobs extends DoFn<String, String> { private static final long serialVersionUID = 1L; private static final Logger LOG = LoggerFactory.getLogger(AddLines.class); @ProcessElement public void processElement(ProcessContext c) throws Exception { // Get the input element from ProcessContext. String word = c.element().split(",")[0]; LOG.info(word); ProcessBuilder pb = new ProcessBuilder("/bin/bash", "-c", "docker run --rm ubuntu:16.04 sleep 20"); pb.start().wait(); // Use ProcessContext.output to emit the output element. if (!word.isEmpty()) c.output(word + "\n"); } } } However, this fails with error: Aug 18, 2018 10:30:23 PM org.apache.beam.sdk.io.FileBasedSource getEstimatedSizeBytes INFO: Filepattern test_in.csv matched 1 files with total size 36 Aug 18, 2018 10:30:23 PM org.apache.beam.sdk.io.FileBasedSource split INFO: Splitting filepattern test_in.csv into bundles of size 4 took 1 ms and produced 1 files and 9 bundles Aug 18, 2018 10:30:23 PM pipelines.variant_caller.LaunchDocker$LaunchJobs processElement INFO: sample1 Aug 18, 2018 10:30:23 PM pipelines.variant_caller.LaunchDocker$LaunchJobs processElement INFO: 4 Aug 18, 2018 10:30:23 PM pipelines.variant_caller.LaunchDocker$LaunchJobs processElement INFO: 1 Exception in thread "main" org.apache.beam.sdk.Pipeline$PipelineExecutionException: java.lang.IllegalMonitorStateException at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:332) at org.apache.beam.runners.direct.DirectRunner$DirectPipelineResult.waitUntilFinish(DirectRunner.java:302) at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:197) at org.apache.beam.runners.direct.DirectRunner.run(DirectRunner.java:64) at org.apache.beam.sdk.Pipeline.run(Pipeline.java:313) at org.apache.beam.sdk.Pipeline.run(Pipeline.java:299) at pipelines.variant_caller.VariantCaller.main(VariantCaller.java:29) Caused by: java.lang.IllegalMonitorStateException at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:502) at pipelines.variant_caller.LaunchDocker$LaunchJobs.processElement(LaunchDocker.java:19) Can you share your ideas what's the best way of achieving this? Thank you for your help! Sincerely, Mahesh -- Mahesh Vangala (Ph) 443-326-1957 (web) mvangala.com<https://urldefense.proofpoint.com/v2/url?u=http-3A__mvangala.com&d=DwMFaQ&c=clK7kQUTWtAVEOVIgvi0NU5BOUHhpN0H8p7CSfnc_gI&r=wTsPzbQ0in7NK_XbpYAHtTvkoDRTrotzoKIAZi5EZsk&m=lcW9ZEJKOdFOy9yaVGVkIDnYFAwsjrqOmslKItV17Eg&s=A1RdM8bs32-TAh0Sj4lvX7JZVpbgNrJtVQZdrzdtIhE&e=>