Michael,

Currently it is not possible to make two actions run on the same node.
Eventually, when Oozie starts leveraging YARN capabilities, this could become
possible. Today you can work around this limitation by using HDFS as the
shared filesystem where one action leaves its data and the next action picks
it up.

Thanks.

On Wed, Oct 30, 2013 at 10:15 AM, <[email protected]> wrote:

> Using the distributed cache is a good idea for MR-based tasks, but not all
> tasks are MR-based.
>
> For example, I might need to run a shell script action followed by a Java
> action, neither of which does anything with MR and both of which need to
> work on files on the local filesystem. It would be useful to have a
> "compound action" that can run a shell action and a Java action on the same
> node consecutively. I was hoping this is what a sub-workflow is for.
>
> One could argue that "compound things" just need to be managed via your own
> shell action, but I like the Java action because it sets up your classpath
> (including the Hadoop jars in your path). I'm not sure how to do this in
> my own shell script to launch a Java program. So it is more convenient to
> run a shell action that runs some bash stuff and then launch a Java program
> to do more stuff with it before putting the final result into HDFS.
>
> Any other ideas on ways to do this?
> -Michael
>
>
> On Wed, Oct 30, 2013 at 12:20 PM, Serega Sheypak
> <[email protected]> wrote:
>
> > It's MapReduce's duty to select which TT node to use to run a task.
> > Try to put your local stuff into HDFS and use the distributed cache.
> > On 30.10.2013 at 19:22, <[email protected]> wrote:
> >
> > > I have two actions that need to run on the same datanode (due to stuff
> > > on the local filesystem). Is there any way to ensure this in Oozie?
> > >
> > > For instance, if I put them into the same sub-workflow, will that work?
> > > Does a subworkflow run two or more actions at the same node?
> > >
> > > Thanks,
> > > -Michael
> > >
> >
>

--
Alejandro
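
One way to approximate the "compound action" discussed in the thread might be to invert the approach: keep a single Java action (so the classpath is still set up for you) and launch the shell step from inside it as a child process. Both steps then share that one node's local filesystem, and only the final result needs to go to HDFS. The following is a minimal sketch under that assumption, not a tested Oozie recipe. The class name CompoundAction, the helper methods, and the file name staged.txt are all hypothetical, and the HDFS upload is left as a comment because it depends on the Hadoop client jars.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical main class for a single Oozie Java action that runs a shell
// step and a Java step back to back in the same process tree, hence on the
// same node.
public class CompoundAction {

    // Runs a shell snippet with workDir as its working directory and fails
    // loudly if the snippet exits with a non-zero status.
    static void runShellStep(String script, Path workDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder("sh", "-c", script)
                .directory(workDir.toFile())
                .inheritIO() // forward the script's stdout/stderr to this process
                .start();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("shell step failed with exit code " + exitCode);
        }
    }

    // Placeholder for the Java step: reads the file the shell step left on
    // the local filesystem and returns its line count.
    static long countLines(Path file) throws IOException {
        return Files.readAllLines(file, StandardCharsets.UTF_8).size();
    }

    public static void main(String[] args) throws Exception {
        // Local scratch directory shared by both steps.
        Path workDir = Files.createTempDirectory("compound-action");

        // Step 1: the "bash stuff" (a stand-in command that writes three lines).
        runShellStep("printf 'a\\nb\\nc\\n' > staged.txt", workDir);

        // Step 2: the Java processing on the locally staged file.
        long lines = countLines(workDir.resolve("staged.txt"));
        System.out.println("lines=" + lines);

        // Step 3 (not shown): copy the final result into HDFS, for example
        // with the Hadoop FileSystem API available on the action's classpath.
    }
}
```

If the two steps really must stay separate actions, the HDFS handoff described in the reply above remains the option that the thread confirms works today.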
