Re: [ruote:2531] Integrating ruote with applications

John Mettraux Wed, 21 Jul 2010 17:16:51 -0700

On Wed, Jul 21, 2010 at 05:30:57PM -0400, Rich Meyers wrote:
> 
> > > 1. Ruote state is a black box. How do I know what workflows are running?
> > 
> > engine.processes
> > 
> > Warning, can be a costly operation depending on the storage implementation.
> > 
> > > What is the status of each running workflow?
> > 
> > engine.processes.each { |ps| p ps }
> > p engine.process(wfid)
> 
> Correct me if I'm wrong but for each running workflow I can only get its 
> wfid. In practice, workflows have descriptive names (like 
> download-google-report) that users and scripts use. Displaying the fact that 
> a download-google-report workflow launched on 1/1/2001 is currently running 
> requires me to maintain my own mapping of workflow names to wfids.


Hello Rich,

what about http://gist.github.com/485341 :

---8<---
require 'rubygems'
require 'ruote'

engine = Ruote::Engine.new(Ruote::Worker.new(Ruote::HashStorage.new))

pdef = Ruote.process_definition :name => 'download', :revision => '0.1' do
  alpha
end

engine.register_participant 'alpha', Ruote::StorageParticipant

wfid = engine.launch(pdef)

engine.wait_for(:alpha)

p engine.process(wfid).definition_name
p engine.process(wfid).definition_revision
p engine.process(wfid).launched_time
--->8---


> This is the first requirement for having persistent storage parallel to 
> ruote's storage.

I don't understand that implication.


> > > Where is the list of workflows that finished?
> > 
> > Out of the box, there is no such list. Terminated workflows are simply 
> > removed. You can write a history service to log them.
> 
> The history service may be a valid alternative. Does ruote storage support 
> persisting arbitrary additional data (besides workflows/expressions)? If I'm 
> going to have a history service that is more part of ruote than my own app 
> I'd like it to use ruote storage.

It can.
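
As a rough illustration of what such a history log could keep track of, here is a plain-Ruby sketch. It is not ruote's history API and it is not wired into the engine: the class name, the `on_msg` method and the message shape (`'action' => 'terminated'` plus a `'wfid'` key) are assumptions made for the example.

```ruby
# Hypothetical sketch, not ruote's history API: remembers which workflows
# have finished, fed with message hashes by whatever observes the engine.
class TerminatedLog
  attr_reader :entries

  def initialize
    @entries = []
  end

  # Assumed message shape: { 'action' => 'terminated', 'wfid' => '...' }.
  # Messages with any other action are ignored.
  def on_msg(msg, at = Time.now)
    return unless msg['action'] == 'terminated'
    @entries << { 'wfid' => msg['wfid'], 'terminated_at' => at }
  end

  # True if a termination was recorded for this wfid.
  def terminated?(wfid)
    @entries.any? { |e| e['wfid'] == wfid }
  end
end

log = TerminatedLog.new
log.on_msg('action' => 'apply', 'wfid' => 'wfid-a')
log.on_msg('action' => 'terminated', 'wfid' => 'wfid-a')
p log.terminated?('wfid-a')
```

Swapping the in-memory array for a persistent store would let a fresh process ask about a workflow that finished long ago.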


> > > 2. Waiting for workflows is not reliable.
> > 
> > Waiting for workflows is used when testing ruote or when showing small 
> > quickstarts.
> > 
> > > Ruote cannot wait for workflows that finish before the wait starts.
> > 
> > This could help (2.1.11) :
> > 
> > http://github.com/jmettraux/ruote/commit/0eb09d354992e27778f87fe64d418632b4281d9c
> > 
> Thanks, this seems like a step in the right direction. For my purposes 
> however @seen would probably need to be unbounded and persistent. Without 
> persistent @seen in particular starting up a fresh process to wait for a 
> workflow that finished a long time ago wouldn't work from what I can see.

Feel free to take inspiration from TestLogger and WaitLogger and don't hesitate 
to have a peek at History and co if you really want to look further.


> > > 3. Ruote does not limit concurrency. If some of my workflows use external 
> > > resources such as downloading files from the internet I don't want an 
> > > unbounded number of these workflows running at once.
> > 
> > Ruote is written in Ruby. When running on MRI (C) Ruby, there is only 1 
> > thread on at a time. Ruote brings no magic concurrency to the table. For 1 
> > worker, there is only 1 workflow instance performing something at a time.
> 
> In ruby code, yes. But I'm using curb (curl binding for ruby) which allows me 
> to have multiple concurrent downloads being managed by C code.

So it's your responsibility as a participant implementer, not ruote's.


> > > I considered typical ruote use cases mentioned in the documentation and 
> > > on the mailing list and I suppose when dealing with human processes that 
> > > are persisted externally some or all of these issues do not appear. I'm 
> > > starting to think that ruote works ok for state transitions for objects 
> > > persisted elsewhere but I don't see how it can effectively manage 
> > > processes that only exist in ruote.
> > 
> > Could you please expand on that ? Ruote isn't about state/transitions. I 
> > understand the part "for objects persisted elsewhere", where ruote 
> > processes alter the state of objects, but I don't get the "I don't see how 
> > it (ruote) can effectively manage processes that only exist in ruote".
> 
> Suppose I want to check stock prices on various exchanges every day. I have 
> one workflow for each exchange that knows how to get data for a particular 
> stock and parse it. I launch these workflows from cron daily.
> 
> I need to know:
> 
> - What stocks are currently being downloaded from what exchanges?

What about a smarter participant that knows what's going on ?

> - What workflows have been running for over an hour? What step are they on 
> right now?

See engine.process(wfid).launched_time in the above code gist.
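
To make the "over an hour" check concrete, here is a small sketch. It uses a stand-in Struct instead of a live engine, so `ProcessInfo` and `long_running` are hypothetical names; the only thing assumed about the real process objects is that they answer `wfid`, `definition_name` and `launched_time`, and that `launched_time` is a Time.

```ruby
require 'time'

# Stand-in for the objects returned by engine.processes; only the three
# readers used below are assumed.
ProcessInfo = Struct.new(:wfid, :definition_name, :launched_time)

# Returns the entries launched more than `max_age` seconds before `now`.
def long_running(processes, max_age = 3600, now = Time.now)
  processes.select { |ps| now - ps.launched_time > max_age }
end

now = Time.utc(2010, 7, 21, 12, 0, 0)
sample = [
  ProcessInfo.new('wfid-a', 'download', now - 2 * 3600), # two hours old
  ProcessInfo.new('wfid-b', 'download', now - 600)       # ten minutes old
]

long_running(sample, 3600, now).each do |ps|
  puts "#{ps.definition_name} #{ps.wfid} launched at #{ps.launched_time.iso8601}"
end
```

With a real engine the same filter would be fed with `engine.processes`, keeping in mind that this call can be costly depending on the storage.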

> - Were there any downloading or parsing failures in the last week? What were 
> the causes?

I guess you could query ruote's error journal or directly ask the smart 
download participant.


> Ideally I would like ruote to manage the workflows from start to finish. As 
> such I would like ruote to tell me that currently GOOG is being retrieved 
> from NY stock exchange, and yesterday all retrievals failed because 
> connection to stock exchange couldn't be established.
> 
> On my site users indicate which stock quotes they want to see, and I retrieve 
> only those stock quotes. But I may have a sudden spike in user activity. I 
> want to limit the number of active downloads to 10 per stock exchange. I 
> don't want to limit the number of stock quotes that are being parsed since 
> it's a relatively quick operation. I also don't know how long each download 
> would take, and I want everything to be downloaded as soon as possible after 
> scheduled downloads begin. Ideally I want to submit a potentially huge list 
> of stocks to ruote and have it only invoke 10 download participants at a time.

What about a smarter participant that queues downloads ?

Sorry, but ruote is a workflow engine, it's not a quote downloading workflow 
engine.
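
A minimal sketch of the queueing idea, in plain Ruby and independent of ruote: a fixed pool of worker threads drains a job queue, so at most `limit` jobs run at once. `BoundedQueue` is a hypothetical name, and the blocks pushed onto it stand in for whatever the download participant actually does (curb calls, for instance).

```ruby
require 'thread'

# Hypothetical helper, not part of ruote: runs queued jobs on a fixed
# number of worker threads so that at most `limit` run at the same time.
class BoundedQueue
  def initialize(limit)
    @limit = limit
    @jobs = Queue.new
  end

  # Enqueue a block; nothing runs until #run is called.
  def push(&job)
    @jobs << job
  end

  # Drains the queue with `limit` threads and returns the job results
  # in completion order.
  def run
    results = Queue.new
    workers = Array.new(@limit) do
      Thread.new do
        loop do
          job = begin
            @jobs.pop(true) # non-blocking pop, raises ThreadError when empty
          rescue ThreadError
            break
          end
          results << job.call
        end
      end
    end
    workers.each(&:join)
    Array.new(results.size) { results.pop }
  end
end

queue = BoundedQueue.new(10)
%w[GOOG AAPL IBM].each do |symbol|
  queue.push { "fetched #{symbol}" } # a real job would download here
end
p queue.run.sort
```

One such queue per stock exchange would give the "10 per exchange" limit, while parsing stays outside the queue and remains unbounded.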


> And, again, all of this should be persisted in ruote storage.

Participants can "stash" data in the engine. Maybe it could help :

  http://github.com/jmettraux/ruote/blob/ruote2.1/test/functional/ft_38_participant_more.rb#L160-191


> Normally workflows are asynchronous in the sense that the cron script 
> launches them and does not wait for completion. A separate cron job would 
> check for errors daily after some suitable interval. However, during 
> development I want to run the workflows synchronously so that if any part of 
> them fails I find out about the error as soon as possible. Thus I need to be 
> able to run the workflows both synchronously and asynchronously.

So are you suggesting I should write a synchronous ruote ?

Rich, I'd be glad to help integrate ruote in your architecture. Stop labelling 
issues that are clearly on your side as ruote flaws. I plead guilty to the 
lack of documentation, but not to "it doesn't work like I think it should 
work" things.


Best regards,

-- 
John Mettraux - http://jmettraux.wordpress.com

