Hello John,

I worked on this issue today, and here is what I implemented:

A synchronization mechanism between two processes (a 1-1 relation) that only
lets the processes continue once both of them have reached the same
synchronization participant.
It works by storing the synchronization key along with the first workitem
to reach this point. Once a second participant is called with the same key,
we fetch the stored workitem and receive it, which resumes the first
process, and we reply immediately to the second one, so the second process
continues as well.

So there are no db hits when there is no activity, and it does not matter
which process gets to the synchronization first :)
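
To make the idea concrete, here is a stripped-down, in-memory sketch of the
rendezvous logic described above. It is only an illustration, not the code
from the repository: the `Rendezvous` class and its `arrive` / `waiting?`
methods are hypothetical names, and a plain Hash stands in for the ruote
storage.

```ruby
# Hypothetical illustration only: a Hash stands in for the ruote storage,
# and "workitems" can be any Ruby objects.
class Rendezvous
  def initialize
    @waiting = {}      # synchronization key => parked workitem
    @lock = Mutex.new  # guards @waiting when several threads arrive
  end

  # Called when a process reaches the synchronization point.
  # Returns [] if the caller is first (its workitem is parked), or
  # [parked_workitem, workitem] if it is second: both may now continue.
  def arrive(key, workitem)
    @lock.synchronize do
      if @waiting.key?(key)
        [@waiting.delete(key), workitem]
      else
        @waiting[key] = workitem
        []
      end
    end
  end

  # True while exactly one side is parked under this key.
  def waiting?(key)
    @lock.synchronize { @waiting.key?(key) }
  end
end
```

Whichever side calls `arrive` first is parked, and the second call with the
same key releases both, so the order of arrival does not matter.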

*master process*

    pdef = Ruote.process_definition :name => 'receipt_processing' do

      concurrent_iterator :on_field => 'receipt_subpicture_uuids', :to_var => 'uuid' do
        synchronize :key => "processed_image-${v:uuid}"
        do_something_with_the_ocr_results
      end

      # ... rest of master process

    end

*child processes*

    pdef = Ruote.process_definition :name => 'receipt_subpicture_processing' do

      process_image
      synchronize :key => "processed_image-${f:receipt_subpicture_uuid}"

    end


The code is available on GitHub:
https://github.com/adrienkohlbecker/ruote-synchronize

Best regards,
Adrien


On Fri, Apr 5, 2013 at 2:43 AM, John Mettraux <[email protected]> wrote:

>
> On Thu, Apr 04, 2013 at 03:38:25PM -0700, Adrien Kohlbecker wrote:
> >
> > We are doing OCR, so we need the pictures to be there :) But besides the
> > initial API call (receipt creation) and the upload of the pictures, there
> > is no user interaction at this point in the process.
>
> Hello,
>
> so why not start the process when the pictures are uploaded?
>
>
> > I've given some thought about what you suggested, but I still see some
> > flaws :
> >
> > - If we have Ruote poll our database for the status of the upload, we may
> > hit some performance problems, as we would have 10-20 threads polling our
> > db during peak time.
> >
> > - If we implement a timeout in the participant, the master process will
> > fail, and then if a user reopens the app and the upload restarts the
> > master process will not pick it up (correct me if I'm wrong regarding the
> > timeout implementation). This regularly happens, especially with our iOS
> > app in which we can't upload in the background.
>
> What do you currently have in place for the uploads?
>
> Can it be wrapped in a service that ruote talks to via a participant
> (and/or a receiver as you suggested below)?
>
>
> > I guess the issue here is that we are trying to be as fast as possible
> > with the OCR so we need to treat each picture as soon as they are
> > uploaded and in parallel.
>
> OK.
>
>
> > On the other hand, if a single thread is polling the db then it's
> > possible, what about a special kind of Receiver that knows "these are the
> > waiting participants with their picture id, and these are the uploaded
> > pictures in the last X seconds" ?
>
> Yes, much better than my Participant + ImageUploadService suggestion. It
> would have to survive a restart though. That knowledge in the receiver is a
> cache somehow. When the participant "arrives" it may be allowed a db query
> (cache miss, pass through) to see if the images are really not there (the
> other case would be the images are there, but the receiver forgot about
> them).
>
>
> > From another angle, we could also do the synchronization after the
> > processing of each picture, which could be easier to implement ? If
> > each upload launches a separate ruote process, is there a way for these
> > subprocesses to communicate with a master process inside Ruote ? They are
> > still not launched by the master process, but this time everything
> > happens inside Ruote
> >
> > I think that having a way to do inter-process communication inside ruote
> > could be nice to have, what do you think ?
>
> There is the "listen" expression that brings some kind of inter-process
> communication, although participants [+ receivers] are probably better
> fits, they are sitting between ruote processes and external systems (other
> ruote processes are external systems somehow).
>
> ```
>   RuoteEngine = Ruote::Dashboard.new(...)
>
>   main_flow =
>     Ruote.define do
>       wait_for_images
>       # ... the rest
>     end
>
>   upload_flow =
>     Ruote.define do
>       wait_for_upload
>       perform_ocr
>       notify_main_flow
>     end
>
>   main_wfid =
>     RuoteEngine.launch(
>       main_flow, 'images' => images)
>
>   images.each do |image_info|
>     RuoteEngine.launch(
>       upload_flow, 'main_wfid' => main_wfid, 'image_info' => image_info)
>   end
>
> #
> # OR
> #
>
>   main_flow =
>     Ruote.define do
>       iterator :on => 'f:images', :to => 'f:image_info' do
>         wait_for_upload
>         perform_ocr
>         notify_main_flow
>       end
>       # ... the rest
>     end
>
>   # ...
> ```
>
> I prefer the first version, smaller processes interacting.
>
> So many ways to skin a cat.
>
> Though I still prefer launching the main flow when the images are uploaded
> and OCRized.
>
> Not sure if it's a case of "hey ruote needs to be adapted to our needs",
> sounds more like the classical case of "let's clearly define services so
> that ruote and other components in our architecture can leverage them".
>
> Please remember that I know nothing of your
> architecture/intentions/requirements, just letting my imagination loose.
>
> Thanks for reminding me of the receivers.
>
>
> Best regards,
>
> --
> John Mettraux - http://lambda.io/jmettraux
>
> --
> --
> you received this message because you are subscribed to the "ruote users"
> group.
> to post : send email to [email protected]
> to unsubscribe : send email to
> [email protected]
> more options : http://groups.google.com/group/openwferu-users?hl=en
> ---
> You received this message because you are subscribed to a topic in the
> Google Groups "ruote" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/openwferu-users/Y75D_6cXf3M/unsubscribe?hl=en
> .
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>
