Thanks Geoff. I've had informal feedback from others in favour of the list approach, and in my trials it is working nice. I will apply the changes in the docs and PRs.
Two other things: * Geoff's comment made me wonder about having a "function" step, where within a workflow one could define a local function. I'm thinking *not* for just now, but FYI it would not be too hard to add. * The semantics of referencing a sensor which isn't yet "ready" is ambiguous. Do we wait, or return null, or give an error. By default it blocks but this feels wrong. I think we should (1) add a new `wait` step which allows waiting for a value, and (2) give an error when used elsewhere (which is also the behaviour if you reference a non-existent key in a map), and (3) support as part of local variables (only) i.e. `let` some limited evaluation, including the `??` nullish operator which forgives an error on the LHS (and basic maths) A bit more detail on that last, it would allow: - let x = ${entity.sensor.DoesNotExist} ?? 0 - let x = x + 1 But it would give an error if you reference it in log or other commands: - log the value is ${entity.sensor.DoesNotExist} Best Alex On Wed, 21 Sept 2022 at 20:35, Geoff Macartney <geom...@apache.org> wrote: > Hi Alex, Mykola, > > By the way I should mention that I'm very busy in the evenings this week so > might not get to look at the latest PR for a while. By all means go ahead > and merge it if Mykola and/or others are happy with it, no need to wait for > me. > > Cheers > Geoff > > > On Tue, 20 Sept 2022, 22:20 Geoff Macartney, <geom...@apache.org> wrote: > > > Hi Alex, > > > > +1 This updated proposal looks good - I do think the list based > > approach will be simpler and less error prone, and the fact that you > > will support an optional `id` anyway, if that is desired, means it > > retains much of the flexibility of the map based approach. The custom > > workflow step looks a little like the "functions" that we discussed > > previously. Putting this all together will be pretty powerful. > > > > Will try to get a look at the latest PR if I can. > > > > Cheers > > Geoff > > > > > > On Mon, 19 Sept 2022 at 17:31, Alex Heneveld <a...@cloudsoft.io> wrote: > > > > > > Geoff- Thanks. Comments addressed in #1361 along with a major > addition > > to > > > support variables -- inputs/outputs/etc. > > > > > > All- One of the points Geoff makes concerns how steps are defined. I > > > think along with other comments that tips the balance in favour of > > > revisiting how steps are defined. > > > > > > I propose we switch from the OLD proposed approach -- the map of > ordered > > > IDs -- to a NEW LIST-BASED approach. There's a lot of detail below but > > > in-short it's shifting from: > > > > > > steps: > > > 1-say-hi: log hi > > > 2-step-two: log step 2 > > > > > > To: > > > > > > steps: > > > - log hi > > > - log step 2 > > > > > > > > > Specifically, based on feedback and more hands-on experience, I > propose: > > > > > > * steps now be supplied as a list (now a map) > > > * users are no longer required to supply an ID for each step (in the > old > > > approach, the ID was required as the key for every step) > > > * users can if they wish supply an ID for any step (now as an explicit > > `id: > > > <ID>` rule) > > > * the default order, if no `next: <ID>` instruction is supplied, is the > > > order of the list (in the old approach the order was based on the ID) > > > > > > Also, the shorthand idea has evolved a little bit; instead of a > "<type>: > > > <type-specific-shorthand-template>" single-key map, we've suggested: > > > > > > * it be a string "<type> <type-specific-shorthand-template>" > > > * shorthand can also be supplied in a map using the key "s" or the key > > > "shorthand" (to allow shorthand along with other step key values) > > > * custom steps can define custom shorthand templates (e.g. "${key} "=" > > > ${value}") > > > * (there is also some evolution in how custom steps are defined) > > > > > > > > > To illustrate: > > > > > > The OLD EXAMPLE: > > > > > > steps: > > > 1: > > > type: container > > > image: my/google-cloud > > > command: gcloud dataproc jobs submit spark > --BUCKET=gs://${BUCKET} > > > env: > > > BUCKET: $brooklyn:config("bucket") > > > on-error: retry > > > 2: > > > set-sensor: spark-output=${1.stdout} > > > > > > Would become in the NEW proposal: > > > > > > steps: > > > - type: container > > > image: my/google-cloud > > > command: gcloud dataproc jobs submit spark > --BUCKET=gs://${BUCKET} > > > env: > > > BUCKET: $brooklyn:config("bucket") > > > on-error: retry > > > - set-sensor spark-output = ${1.stdout} > > > > > > If we wanted to attach an `id` to the second step (e.g. for use with > > > "next") we could write it either as: > > > > > > # full long-hand map > > > - type: set-sensor > > > input: > > > sensor: spark-output > > > value: ${1.stdout} > > > id: set-spark-output > > > > > > # mixed "s" shorthand key and other fields > > > - s: set-sensor spark-output = ${1.stdout} > > > id: set-spark-output > > > > > > To explain the reasoning: > > > > > > The advantages of steps: > > > > > > * Slightly less verbose when no ID is needed on a step > > > * Easier to read and understand flow > > > * Avoids hassle of renumbering when introducing step > > > * Avoids risk of error where same key defined multiple time > > > > > > The advantages of OLD map-based scheme (implied disadvantages of the > new > > > steps process): > > > > > > * Easier user-facing correlation on steps (e.g. in UI) by always having > > an > > > explicit ID for easier correlation > > > * Easier to extend a workflow by inserting or overriding explicit steps > > > > > > After some initial usage of the workflow, it seems these advantages of > > the > > > old approach are outweighed by the advantages of the list approach. In > > > particular the "correlation" can be done in other ways, and extending a > > > workflow is probably not so useful, whereas supplying and maintaining > an > > ID > > > is a hassle, error-prone, and harder to understand. > > > > > > Finally to explain the custom steps idea, it works out nicely in the > code > > > and we think for users to add a "compound-step" to the catalog e.g. as > > > follows for the workflow shown above: > > > > > > id: retryable-gcloud-dataproc-with-bucket-and-sensor > > > item: > > > type: custom-workflow-step > > > parameters: > > > bucket: > > > type: string > > > sensor_name: > > > type: string > > > default: spark-output > > > shorthand_definition: [ " bucket " ${bucket} ] [ " sensor " > > > ${sensor_name} ] > > > steps: > > > - type: container > > > image: my/google-cloud > > > command: gcloud dataproc jobs submit spark > --BUCKET=gs://${BUCKET} > > > env: > > > BUCKET: ${bucket} > > > on-error: retry > > > - set-sensor ${sensor_name} = ${1.stdout} > > > > > > A user could then write a step: > > > > > > - retryable-gcloud-dataproc-with-bucket-and-sensor > > > > > > And optionally use the shorthand per the shorthand_definition, matching > > the > > > quoted string literals and inferring the indicated parameters, e.g.: > > > > > > - retryable-gcloud-dataproc-with-bucket-and-sensor bucket my-bucket > > sensor > > > my-spark-output > > > > > > They could of course also use the longhand: > > > > > > - type: retryable-gcloud-dataproc-with-bucket-and-sensor > > > input: > > > bucket: my-bucket > > > sensor_name: my-spark-output > > > > > > > > > Best > > > Alex > > > > > > > > > > > > On Sat, 17 Sept 2022 at 21:13, Geoff Macartney <geom...@apache.org> > > wrote: > > > > > > > Hi Alex, > > > > > > > > Belatedly reviewed the PR. It's looking good! And surprisingly simple > > > > in the end. Made a couple of minor comments on it. > > > > > > > > Cheers > > > > Geoff > > > > > > > > On Thu, 8 Sept 2022 at 09:35, Alex Heneveld <a...@cloudsoft.io> > wrote: > > > > > > > > > > Hi team, > > > > > > > > > > An initial PR with a few types and the ability to define an > effector > > is > > > > > available [1]. > > > > > > > > > > This is enough for the next steps to be parallelized, e.g. new > steps > > > > > added. The proposal has been updated with a work plan / list of > > tasks > > > > > [2]. Any volunteers to help with some of the upcoming tasks let me > > know. > > > > > > > > > > Finally I've been thinking about the "shorthand syntax" and how to > > bring > > > > us > > > > > closer to Peter's proposal of a DSL. The original proposal allowed > > > > instead > > > > > of a map e.g. > > > > > > > > > > step_sleep: > > > > > type: sleep > > > > > duration: 5s > > > > > > > > > > or > > > > > > > > > > step_update_service_up: > > > > > type: set-sensor > > > > > sensor: > > > > > name: service.isUp > > > > > type: boolean > > > > > value: true > > > > > > > > > > being able to use a shorthand _map_ with a single key being the > > type, and > > > > > value interpreted by that type, so in the OLD SHORTHAND PROPOSAL > the > > > > above > > > > > could be written: > > > > > > > > > > step_sleep: > > > > > sleep: 5s > > > > > > > > > > step_update_service_up: > > > > > set-sensor: service.isUp = true > > > > > > > > > > Having played with syntaxes a bit I wonder if we should instead say > > the > > > > > shorthand DSL kicks in when the step _body_ is a string (instead > of a > > > > > single-key map), and the first word of the string being the type, > > and the > > > > > remainder interpreted by the type, and we allow it to be a bit more > > > > > ambitious. > > > > > > > > > > Concretely this NEW SHORTHAND PROPOSAL would look something like: > > > > > > > > > > step_sleep: sleep 5s > > > > > step_update_service_up: set-sensor service.isUp = true > > > > > # also supporting a type, ie `set-sensor [TYPE] NAME = VALUE`, eg > > > > > step_update_service_up: set-sensor boolean service.isUp = true > > > > > > > > > > You would still need the full map syntax whenever defining flow > > logic -- > > > > eg > > > > > condition, next, retry, or timeout -- or any property not supported > > by > > > > the > > > > > shorthand syntax. But for the (majority?) simple cases the > > expression > > > > > would be very concise. In most cases I think it would feel like a > > DSL > > > > but > > > > > has the virtue of a very clear translation to the actual workflow > > model > > > > and > > > > > the underlying (YAML) model needed for resumption and UI. > > > > > > > > > > As a final example, the example used at the start of the proposal > > > > > (simplified a little -- removing on-error retry and env map as > those > > > > > wouldn't be supported by shorthand): > > > > > > > > > > brooklyn.initializers: > > > > > - type: workflow-effector > > > > > name: run-spark-on-gcp > > > > > steps: > > > > > 1: > > > > > type: container > > > > > image: my/google-cloud > > > > > command: gcloud dataproc jobs submit spark > > > > > --BUCKET=gs://$brooklyn:config("bucket") > > > > > 2: > > > > > type: set-sensor > > > > > sensor: spark-output > > > > > value: ${1.stdout} > > > > > > > > > > Could be written in this shorthand as follows: > > > > > > > > > > steps: > > > > > 1: container my/google-cloud command "gcloud dataproc jobs > submit > > > > spark > > > > > --BUCKET=gs://${entity.config.bucket}" > > > > > 2: set-sensor spark-output ${1.stdout} > > > > > > > > > > Thoughts? > > > > > > > > > > Best > > > > > Alex > > > > > > > > > > > > > > > [1] https://github.com/apache/brooklyn-server/pull/1358 > > > > > [2] > > > > > > > > > > > > https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.gbadaqa2yql6 > > > > > > > > > > > > > > > On Wed, 7 Sept 2022 at 09:58, Alex Heneveld <a...@cloudsoft.io> > > wrote: > > > > > > > > > > > Hi Peter, > > > > > > > > > > > > Yes - thanks for the extra details. I did take your suggestion > to > > be a > > > > > > procedural DSL not YAML, per the illustration at [1] (second code > > > > block). > > > > > > Probably where I was confusing was in saying that unlike DSLs > which > > > > just > > > > > > run (and where the execution can be delegated to eg > > java/groovy/ruby), > > > > here > > > > > > we need to understand and display, store and resume the workflow > > > > progress. > > > > > > So I think it needs to be compiled to some representation that is > > well > > > > > > described and that new Apache Brooklyn code can reason about, > both > > in > > > > the > > > > > > UI (JS) and backend (Java). Parsing a DSL is much harder than > > using > > > > YAML > > > > > > for this "reasonable" representation (as in we can reason _about_ > > it > > > > :) ), > > > > > > because we already have good backend processing, persistence, > > > > > > serialization; and frontend processing and visualization support > > for > > > > > > YAML-based models. So I think we almost definitely want a > > > > well-described > > > > > > declarative YAML model of the workflow. > > > > > > > > > > > > We might *also* want a Workflow DSL because I agree with you a > DSL > > > > would > > > > > > be nicer for a user to write (if writing by hand; although if > > composing > > > > > > visually a drag-and-drop to YAML is probably easier). However it > > > > should > > > > > > probably get "compiled" into a Workflow YAML. So I'm suggesting > > we do > > > > the > > > > > > workflow YAML at this stage, and a DSL that compiles into that > YAML > > > > can be > > > > > > designed later. (Designing a good DSL and parser and > > reason-about-able > > > > > > representation is a big task, so being able to separate it feels > > good > > > > too!) > > > > > > > > > > > > Best > > > > > > Alex > > > > > > > > > > > > [1] > > > > > > > > > > > > > https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.75wm48pjvx0h > > > > > > > > > > > > > > > > > > On Fri, 2 Sept 2022 at 20:17, Geoff Macartney < > > > > geoff.macart...@gmail.com> > > > > > > wrote: > > > > > > > > > > > >> Hi Peter, > > > > > >> > > > > > >> Thanks for such a detailed writeup of how you see this working. > I > > fear > > > > > >> I've too little experience with this sort of thing to be able to > > say > > > > > >> anything very useful about it. My thought on the matter would > be, > > > > > >> let's get started with the yaml based approach and see how it > > goes. I > > > > > >> think that experience would then give us a much better feel for > > what a > > > > > >> really nice and usable DSL for workflows would look like > > (probably to > > > > > >> address all the pain points of the yaml approach! :-) The > > outline > > > > > >> above will then be a good starting point, I'm sure. > > > > > >> > > > > > >> Cheers > > > > > >> Geoff > > > > > >> > > > > > >> On Thu, 1 Sept 2022 at 21:26, Peter Abramowitsch > > > > > >> <pabramowit...@gmail.com> wrote: > > > > > >> > > > > > > >> > Hi All > > > > > >> > I just wanted to clarify something in my comment the other day > > about > > > > > >> DSLs > > > > > >> > since I see that the acronym was also used in Alex's original > > > > document. > > > > > >> > Unless I misunderstood, Alex was proposing to create a DSL for > > > > Brooklyn > > > > > >> > using yaml as syntax and writing a code layer to translate > > between > > > > that > > > > > >> > syntax and underlying APIs which are presumably all in Java. > > > > > >> > > > > > > >> > What I was suggesting was a DSL written directly in Java (I > > guess) > > > > > >> whose > > > > > >> > syntax would be that language, but whose grammar would be > > keywords > > > > that > > > > > >> > were also Java functions. Some of these functions would be > > > > pre-defined > > > > > >> in > > > > > >> > the DSL, while others could be defined by the user and could > > use > > > > other > > > > > >> > functions of the DSL. The result would be turned into a JAR > > file > > > > (or > > > > > >> > equivalent in another platform) But during the compile > phase, > > it > > > > > >> would be > > > > > >> > checked for errors, and it could be debugged line by line > either > > > > > >> invoking > > > > > >> > live functionality or using a library of mock versions of the > > > > Brooklyn > > > > > >> API. > > > > > >> > > > > > > >> > In this 'native' DSL one could provide different types of > > workflow > > > > > >> > constructs as functions (In the BaseClass), taking function > > names as > > > > > >> method > > > > > >> > pointers, or using Lambdas. It would be a lot easier in Ruby > or > > > > Python > > > > > >> > > > > > > >> > // linear > > > > > >> > brooklynRun(NamedTaskMethod, NamedTaskMethod) > > > > > >> > > > > > > >> > // chained > > > > > >> > TaskMethodA()TaskMethodB(). > > > > > >> > > > > > > >> > // asynchronous > > > > > >> > brooklynJoin(NamedTaskMethod, NamedTaskMethod,...) > > > > > >> > > > > > > >> > // conditional > > > > > >> > brooklynRunIf(NamedTaskMethod, NamedConditionMethod,...) > > > > > >> > > > > > > >> > // iterative > > > > > >> > brooklynRunWhile(NamedTaskMethod, NamedConditionMethod,...) > > > > > >> > brooklynRunUntil(NamedTaskMethod, NamedConditionMethod,...) > > > > > >> > > > > > > >> > // there could even be a utility to implement legacy syntax > > (this of > > > > > >> course > > > > > >> > would require the extra code layer I was trying to avoid) > > > > > >> > runYaml(Path) > > > > > >> > > > > > > >> > A basic class structure might be > > > > > >> > > > > > > >> > // where BrooklynRecipeBase implements the utility functions > > > > including, > > > > > >> > among others Join, Run, If, While, Until mentioned above > > > > > >> > // and the BrooklynWorkflowInterface would dictate the > > functional > > > > > >> > requirements for the mandatory aspects of the Recipe. > > > > > >> > class MyRecipe extends BrooklynRecipeBase implements, > > > > > >> > BrooklynWorkflowInterface > > > > > >> > { > > > > > >> > Initialize() > > > > > >> > createContext() - spin up resources > > > > > >> > workflow() - the main launch sequence using aspects of the DSL > > > > > >> > monitoring() - an asynchronous workflow used to manage sensor > > > > output or > > > > > >> for > > > > > >> > whatever needs to be done while the "orchestra" is plating > > > > > >> > shutdownHook() - called whenever shutdown is happening > > > > > >> > } > > > > > >> > > > > > > >> > For those who don't like the smell of Java, the source file > > could > > > > just > > > > > >> be > > > > > >> > the contents, which would then be injected into the class > > framing > > > > code > > > > > >> > before compilation. > > > > > >> > > > > > > >> > These are just ideas. I'm not familiar enough with Brooklyn > in > > its > > > > > >> current > > > > > >> > implementation to be able to create realistic pseudocode. > > > > > >> > > > > > > >> > Peter > > > > > >> > > > > > > >> > On Thu, Sep 1, 2022 at 9:24 AM Geoff Macartney < > > > > > >> geoff.macart...@gmail.com> > > > > > >> > wrote: > > > > > >> > > > > > > >> > > Hi Alex, > > > > > >> > > > > > > > >> > > That's great, I'll be excited to hear all about it. 7th > > September > > > > > >> > > suits me fine; I would probably prefer 4.00 p.m. over 11.00. > > > > > >> > > > > > > > >> > > Cheers > > > > > >> > > Geoff > > > > > >> > > > > > > > >> > > On Thu, 1 Sept 2022 at 12:41, Alex Heneveld < > > a...@cloudsoft.io> > > > > > >> wrote: > > > > > >> > > > > > > > > >> > > > Thanks for the excellent feedback Geoff and yes there are > > some > > > > very > > > > > >> cool > > > > > >> > > and exciting things added recently -- containers, > conditions, > > and > > > > > >> terraform > > > > > >> > > and kubernetes support, all of which make writing complex > > > > blueprints > > > > > >> much > > > > > >> > > easier. > > > > > >> > > > > > > > > >> > > > I'd love to host a session to showcase these. > > > > > >> > > > > > > > > >> > > > How does Wed 7 Sept sound? I could do 11am UK or 4pm UK > -- > > > > > >> depending > > > > > >> > > what time suits for people who are interested. Please RSVP > > and > > > > > >> indicate > > > > > >> > > your time preference! > > > > > >> > > > > > > > > >> > > > Best > > > > > >> > > > Alex > > > > > >> > > > > > > > > >> > > > > > > > > >> > > > On Wed, 31 Aug 2022 at 22:17, Geoff Macartney < > > > > > >> geoff.macart...@gmail.com> > > > > > >> > > wrote: > > > > > >> > > >> > > > > > >> > > >> Hi Alex, > > > > > >> > > >> > > > > > >> > > >> Another thought occurred to me when reading that workflow > > > > > >> proposal. You > > > > > >> > > wrote > > > > > >> > > >> > > > > > >> > > >> "and with the recent support for container-based tasks > and > > > > > >> declarative > > > > > >> > > >> conditions, we have taken big steps towards enabling YAML > > > > > >> authorship" > > > > > >> > > >> > > > > > >> > > >> Unfortunately over the past while I haven't been able to > > keep > > > > up as > > > > > >> > > >> closely as I would like with developments in Brooklyn. > I'm > > just > > > > > >> > > >> wondering if it might be possible to get together some > > time, on > > > > > >> Google > > > > > >> > > >> Meet or Zoom or whatnot, if you or a colleague could > spare > > > > half an > > > > > >> > > >> hour to demo some of these recent developments? But don't > > worry > > > > > >> about > > > > > >> > > >> it if you're too busy at present. > > > > > >> > > >> > > > > > >> > > >> Adding dev@ to this in CC for the sake of Openness. > Others > > > > might > > > > > >> also > > > > > >> > > >> be interested! > > > > > >> > > >> > > > > > >> > > >> Cheers > > > > > >> > > >> Geoff > > > > > >> > > > > > > > >> > > > > > > > > > > > > >