On Wed, Jul 21, 2010 at 05:30:57PM -0400, Rich Meyers wrote:
>
> > > 1. Ruote state is a black box. How do I know what workflows are running?
> >
> > engine.processes
> >
> > Warning, can be a costly operation depending on the storage implementation.
> >
> > > What is the status of each running workflow?
> >
> > engine.processes.each { |ps| p ps }
> > p engine.process(wfid)
>
> Correct me if I'm wrong but for each running workflow I can only get its
> wfid. In practice, workflows have descriptive names (like
> download-google-report) that users and scripts use. Displaying the fact that
> a download-google-report workflow launched on 1/1/2001 is currently running
> requires me to maintain my own mapping of workflow names to wfids.
Hello Rich,

what about http://gist.github.com/485341 :
---8<---
require 'rubygems'
require 'ruote'
engine = Ruote::Engine.new(Ruote::Worker.new(Ruote::HashStorage.new))
pdef = Ruote.process_definition :name => 'download', :revision => '0.1' do
  alpha
end
engine.register_participant 'alpha', Ruote::StorageParticipant
wfid = engine.launch(pdef)
engine.wait_for(:alpha)
p engine.process(wfid).definition_name
p engine.process(wfid).definition_revision
p engine.process(wfid).launched_time
--->8---
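Building on the gist, a listing keyed by definition name could be sketched as
below. This is only a sketch: `describe` is a hypothetical helper, and it
assumes nothing more than the three accessors used above plus `wfid` on each
process status. The Struct is a stand-in so that the snippet runs without an
engine.

```ruby
# Hypothetical helper: one line per running process, built from the
# accessors shown in the gist (plus wfid).
def describe(statuses)
  statuses.map do |ps|
    "#{ps.definition_name} #{ps.definition_revision} (#{ps.wfid}) " \
      "launched #{ps.launched_time}"
  end
end

# Stand-in for the objects returned by engine.processes (assumption).
StatusStub = Struct.new(
  :wfid, :definition_name, :definition_revision, :launched_time)

lines = describe([
  StatusStub.new('20100721-abc', 'download', '0.1', '2010-07-21 21:30:57 UTC')
])
puts lines
```

With a real engine, the call would presumably be `puts describe(engine.processes)`,
keeping in mind that engine.processes can be costly depending on the storage.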
> This is the first requirement for having persistent storage parallel to
> ruote's storage.
I don't understand that implication.
> > > Where is the list of workflows that finished?
> >
> > Out of the box, there is no such list. Terminated workflows are simply
> > removed. You can write a history service to log them.
>
> The history service may be a valid alternative. Does ruote storage support
> persisting arbitrary additional data (besides workflows/expressions)? If I'm
> going to have a history service that is more part of ruote than my own app
> I'd like it to use ruote storage.
It can.
> > > 2. Waiting for workflows is not reliable.
> >
> > Waiting for workflows is used when testing ruote or when showing small
> > quickstarts.
> >
> > > Ruote cannot wait for workflows that finish before the wait starts.
> >
> > This could help (2.1.11) :
> >
> > http://github.com/jmettraux/ruote/commit/0eb09d354992e27778f87fe64d418632b4281d9c
> >
> Thanks, this seems like a step in the right direction. For my purposes
> however @seen would probably need to be unbounded and persistent. Without
> persistent @seen in particular starting up a fresh process to wait for a
> workflow that finished a long time ago wouldn't work from what I can see.
Feel free to take inspiration from TestLogger and WaitLogger and don't hesitate
to have a peek at History and co if you really want to look further.
> > > 3. Ruote does not limit concurrency. If some of my workflows use external
> > > resources such as downloading files from the internet I don't want an
> > > unbounded number of these workflows running at once.
> >
> > Ruote is written in Ruby. When running on MRI (C) Ruby, there is only 1
> > thread running at a time. Ruote brings no magic concurrency to the table. For 1
> > worker, there is only 1 workflow instance performing something at a time.
>
> In ruby code, yes. But I'm using curb (curl binding for ruby) which allows me
> to have multiple concurrent downloads being managed by C code.
So it's your responsibility as a participant implementer, not ruote's.
> > > I considered typical ruote use cases mentioned in the documentation and
> > > on the mailing list and I suppose when dealing with human processes that
> > > are persisted externally some or all of these issues do not appear. I'm
> > > starting to think that ruote works ok for state transitions for objects
> > > persisted elsewhere but I don't see how it can effectively manage
> > > processes that only exist in ruote.
> >
> > Could you please expand on that ? Ruote isn't about state/transitions. I
> > understand the part "for objects persisted elsewhere", where ruote
> > processes alter the state of objects, but I don't get the "I don't see how
> > it (ruote) can effectively manage processes that only exist in ruote".
>
> Suppose I want to check stock prices on various exchanges every day. I have
> one workflow for each exchange that knows how to get data for a particular
> stock and parse it. I launch these workflows from cron daily.
>
> I need to know:
>
> - What stocks are currently being downloaded from what exchanges?
What about a smarter participant that knows what's going on ?
> - What workflows have been running for over an hour? What step are they on
> right now?
See engine.process(wfid).launched_time in the above code gist.
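A minimal sketch of the "running for over an hour" check, on top of
launched_time. `running_longer_than` is a hypothetical helper, and the sketch
assumes launched_time is either a Time or a string that Time.parse
understands. The Struct again stands in for real process statuses.

```ruby
require 'time'

# Hypothetical helper: selects the process statuses launched more than
# max_age seconds before `now`. Assumes launched_time is a Time or a
# string that Time.parse can read.
def running_longer_than(statuses, max_age, now = Time.now)
  statuses.select do |ps|
    launched = ps.launched_time
    launched = Time.parse(launched.to_s) unless launched.is_a?(Time)
    now - launched > max_age
  end
end

# Stand-in for the objects returned by engine.processes (assumption).
TimedStub = Struct.new(:wfid, :launched_time)

now = Time.utc(2010, 7, 21, 22, 0, 0)
statuses = [
  TimedStub.new('old', '2010-07-21 20:30:00 UTC'),
  TimedStub.new('recent', Time.utc(2010, 7, 21, 21, 45, 0))
]

slow = running_longer_than(statuses, 3600, now)
p slow.map(&:wfid)
```

With a real engine, `running_longer_than(engine.processes, 3600)` would be
the corresponding call.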
> - Were there any downloading or parsing failures in the last week? What were
> the causes?
I guess you could query ruote's error journal or directly ask the smart
download participant.
> Ideally I would like ruote to manage the workflows from start to finish. As
> such I would like ruote to tell me that currently GOOG is being retrieved
> from NY stock exchange, and yesterday all retrievals failed because
> connection to stock exchange couldn't be established.
>
> On my site users indicate which stock quotes they want to see, and I retrieve
> only those stock quotes. But I may have a sudden spike in user activity. I
> want to limit the number of active downloads to 10 per stock exchange. I
> don't want to limit the number of stock quotes that are being parsed since
> it's a relatively quick operation. I also don't know how long each download
> would take, and I want everything to be downloaded as soon as possible after
> scheduled downloads begin. Ideally I want to submit a potentially huge list
> of stocks to ruote and have it only invoke 10 download participants at a time.
What about a smarter participant that queues downloads ?
Sorry, but ruote is a workflow engine, it's not a quote downloading workflow
engine.
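A minimal sketch of the queueing such a participant could delegate to, in
plain Ruby. `BoundedDownloader` is a hypothetical class, not part of ruote. It
runs at most `limit` jobs at a time per exchange, and everything lives in
memory, so nothing here is persisted. How the participant hands the workitem
back to the engine once a job is done is left out.

```ruby
require 'thread'

# Hypothetical helper, not part of ruote: runs submitted blocks on at
# most `limit` threads per key (here, per stock exchange).
class BoundedDownloader
  def initialize(limit)
    @limit = limit
    @queues = {}   # exchange name => job queue
    @workers = []
    @mutex = Mutex.new
  end

  # Queues a job for the given exchange. The worker threads for an
  # exchange are started the first time that exchange is seen.
  def submit(exchange, &job)
    queue = @mutex.synchronize do
      @queues[exchange] ||= start_workers(Queue.new)
    end
    queue << job
  end

  # Lets the workers drain their queues, then waits for them to exit.
  # Jobs submitted after this call are never run.
  def shutdown
    @mutex.synchronize do
      @queues.each_value { |q| @limit.times { q << :stop } }
    end
    @workers.each(&:join)
  end

  private

  def start_workers(queue)
    @limit.times do
      @workers << Thread.new do
        while (job = queue.pop) != :stop
          job.call
        end
      end
    end
    queue
  end
end

# Usage: 50 fake downloads for one exchange, never more than 10 at once.
downloader = BoundedDownloader.new(10)
lock = Mutex.new
active = 0
peak = 0
done = []

50.times do |i|
  downloader.submit('NYSE') do
    lock.synchronize { active += 1; peak = [peak, active].max }
    sleep 0.01 # stands in for the actual download
    lock.synchronize { active -= 1; done << i }
  end
end

downloader.shutdown
puts "peak concurrency: #{peak}, completed: #{done.size}"
```

Parsing is not throttled here: only the jobs passed to submit count against
the limit, so a parse step can run elsewhere without waiting for a slot.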
> And, again, all of this should be persisted in ruote storage.
Participants can "stash" data in the engine. Maybe it could help.
http://github.com/jmettraux/ruote/blob/ruote2.1/test/functional/ft_38_participant_more.rb#L160-191
> Normally workflows are asynchronous in the sense that the cron script
> launches them and does not wait for completion. A separate cron job would
> check for errors daily after some suitable interval. However, during
> development I want to run the workflows synchronously so that if any part of
> them fails I find out about the error as soon as possible. Thus I need to be
> able to run the workflows both synchronously and asynchronously.
So are you suggesting I should write a synchronous ruote ?

Rich, I'd be glad to help integrate ruote into your architecture. Stop labelling
as ruote flaws issues that are clearly on your side. I plead guilty to the
lack of documentation, but not to the "it doesn't work like I think it should
work" things.

Best regards,
--
John Mettraux - http://jmettraux.wordpress.com