Hi John,

If we manage to get this working, we owe you a dinner (and more) the next time 
you're in Tokyo.

So here's the story with Sequel.  We are using Ruby 1.9.2-p290, Sequel 3.37.0, 
and mysql2 0.3.11.

Here's a sample of the SQL traces we get when we add a logger to a Sequel 
connection:

https://gist.github.com/3175854

The trace is from a new, clean database with no data in it, after simply 
instantiating the Ruote::Engine like so:

  db = Sequel.connect(db_settings)
  db.loggers << OurLogging.logger

  RuoteKit.engine = Ruote::Engine.new(
    Ruote::Worker.new(
      Ruote::Sequel::Storage.new(db)))

Even though we started no process definitions, Ruote's Engine immediately 
hits the DB.  But more importantly, it never stops.  The log just grows and 
grows and grows.
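
To put a number on it, we could tally the logged statements by SQL verb.  
Below is a minimal sketch, assuming each log line carries at most one 
statement somewhere after the logger's prefix.  The exact line format is a 
guess on my part, and tally_sql_verbs is just a name I made up:

```ruby
# Matches the first SQL verb found on a line, case-insensitively.
SQL_VERBS = /\b(SELECT|INSERT|UPDATE|DELETE)\b/i

# Count log lines by their first SQL verb.
# Assumption: one statement per line; lines without a verb are skipped.
def tally_sql_verbs(lines)
  counts = Hash.new(0)
  lines.each do |line|
    verb = line[SQL_VERBS, 1]
    counts[verb.upcase] += 1 if verb
  end
  counts
end
```

Something like tally_sql_verbs(File.foreach('sequel.log')) (the file name 
is hypothetical) would then show how much of the log is plain polling 
SELECTs versus actual writes.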

Regarding the FsStorage, your megabyte numbers are dramatically smaller 
than ours.  We are basically doing nothing as well, but I'll run your 
tests.  I grant that disk space is cheap, but there's not enough 
disk space in the world if we keep Ruote running for a year 
without deleting some of the stuff Ruote saves on the file system.  If we 
have to use FsStorage, we would need to run a cron job to clean things up 
occasionally.  Which files would we need to purge?  The structure of the 
FsStorage is rather daunting.
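
In the meantime, to see where the space actually goes, we could break the 
usage down by top-level subdirectory of the storage directory ('work' in 
your scripts).  This is only a rough sketch: usage_by_subdir is a made-up 
helper, and it sums file sizes rather than disk blocks, so the totals will 
not match `du` exactly:

```ruby
require 'find'

# Sum the size in bytes of all regular files under each top-level
# entry of +root+. Reports apparent file size, not allocated blocks.
def usage_by_subdir(root)
  usage = Hash.new(0)
  Dir.foreach(root) do |entry|
    next if entry == '.' || entry == '..'
    Find.find(File.join(root, entry)) do |f|
      begin
        usage[entry] += File.size(f) if File.file?(f)
      rescue Errno::ENOENT
        # The file vanished between listing and stat, which can happen
        # while a worker is running; skip it.
      end
    end
  end
  usage
end
```

Sorting the result by size, e.g. usage_by_subdir('work').sort_by { |_, n| -n }, 
should at least tell us which document types keep growing.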

Also, when inspecting either the file system or the database, it appears 
that Ruote serializes, as a JSON object, any data that is passed in as 
input or produced during the processes themselves.  So it looks like it's 
not a good idea to pass large files around inside workitems, since they 
would be persisted in the DB or on the file system indefinitely.  Is that a 
correct conclusion?
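
If so, what we have in mind is to write large payloads somewhere else and 
carry only a small reference in the workitem fields.  A minimal sketch of 
the idea in plain Ruby follows; stash_payload and fetch_payload are 
hypothetical helpers of ours, not part of ruote:

```ruby
require 'digest/sha1'
require 'fileutils'

# Hypothetical helper: write +data+ into +dir+ under a content-derived
# name and return a small hash that is cheap to serialize as JSON.
def stash_payload(data, dir)
  FileUtils.mkdir_p(dir)
  path = File.join(dir, Digest::SHA1.hexdigest(data))
  File.open(path, 'wb') { |f| f.write(data) }
  { 'path' => path, 'bytes' => data.bytesize }
end

# Hypothetical helper: read the payload back from such a reference.
def fetch_payload(ref)
  File.open(ref['path'], 'rb') { |f| f.read }
end
```

A participant would then put the returned hash into a workitem field 
instead of the file contents.  The directory would have to be reachable 
from every server running a worker, which matters if we ever scale out.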

So lots of questions, but I'm interested to hear about the Sequel storage 
issues.

Thanks in advance for helping us.  We planned on deploying on Friday, but 
this is looking like a deal breaker. :(

Chad



On Wednesday, July 25, 2012 8:30:48 PM UTC+9, John Mettraux wrote:
>
>
> On Wed, Jul 25, 2012 at 03:21:25AM -0700, Chad wrote: 
> > 
> > We are building a service on ruote, and have encountered some 
> > performance problems.  We originally used Ruote::Sequel::Storage with 
> > the mysql2 adapter.  On deployment we noticed that ruote made a massive 
> > number of SQL requests, producing numerous timeouts. 
>
> Hello Chad, 
>
> do you have details? I'd like to make ruote-sequel better. 
>
>
> > For comparison, we switched to Ruote::FsStorage.  (We do not want to use 
> > the filesystem for storage, in case we need to scale ruote to another 
> > server).  Anyway, I ran 200 requests (via thin sinatra web service) to a 
> > simple process definition that takes a string "Hello" and outputs it via 
> > one participant.  The FsStorage directory grew to 4.1MB.  This seems like 
> > a lot for just inputting and outputting.  This is going to be a problem 
> > for our larger jobs that we have been writing. 
>
> I ran two tests: 
>
> ---8<--- 
> require 'rufus-json/automatic' 
> require 'ruote' 
> require 'ruote-fs' 
> require 'fileutils' 
>
> FileUtils.rm_rf('work') rescue nil 
>
> dboard = 
> Ruote::Dashboard.new(Ruote::Worker.new(Ruote::FsStorage.new('work'))) 
> dboard.noisy = ENV['NOISY'].to_s == 'true' 
>
> dboard.register_participant 'toto' do |workitem| 
>   print '.' 
> end 
>
> 200.times do 
>   wfid = dboard.launch(Ruote.define do 
>     toto 
>   end) 
>   dboard.wait_for(wfid) 
> end 
>
> puts 
> puts `du -sh work/` 
> --->8--- 
>
> Those 200 terminated flows take up 12K. 
>
> Then I moved to 200 un-terminated flows: 
>
> ---8<--- 
> require 'rufus-json/automatic' 
> require 'ruote' 
> require 'ruote-fs' 
> require 'fileutils' 
>
> FileUtils.rm_rf('work') rescue nil 
>
> dboard = 
> Ruote::Dashboard.new(Ruote::Worker.new(Ruote::FsStorage.new('work'))) 
> dboard.noisy = ENV['NOISY'].to_s == 'true' 
>
> dboard.register_participant 'toto', Ruote::NullParticipant 
>
> 200.times do 
>   dboard.launch(Ruote.define do 
>     toto 
>   end) 
> end 
>
> sleep 14.0 
>
> puts 
> puts `du -sh work/` 
> --->8--- 
>
> and it reached 1.6M. 
>
>
> > So I have 2 questions.  Is the massive number of SQL requests normal, and 
>
> Yes, it's normal. Ruote needs to keep up with incoming msgs (orders) and 
> schedules so it polls its storage. 
>
> When activity slows down, the polling slows down a bit. 
>
> Some storage implementations (sorry, ruote-swf for example) use long 
> polling techniques. 
>
> If you need less polling I could adapt (or provide means to adapt) the 
> polling frequency. 
>
>
> > is the amount of file system storage usual?  If so, how do people 
> > handle this in production environments? 
>
> Yes, it's normal. I chose not to compress files at all for the fs storage 
> implementation... Disk space is cheap (I guess you work with virtual 
> servers). 
>
> It's not too difficult to modify a storage implementation to have some 
> degree of compression. 
>
> Ruote keeps track of lots of redundant information (especially in 
> expressions); they are like many save points. It's uneconomical but easier 
> to implement and easier to fix. 
>
>
> Best regards, 
>
> -- 
> John Mettraux - http://lambda.io/jmettraux 
>
>

