Hi,
I recently discovered an interesting bug that occurs when you make a lot
of parallel requests to the Deltacloud API.
Let's say you start the Deltacloud API with 'mock' as the default
driver. Then you make 3 parallel requests to retrieve RHEV-M images,
realms and hardware_profiles. In that case I get this error:
<snip>
E, [2013-04-18T12:44:13.629135 #11892] ERROR -- 500: [LoadError]
uninitialized constant Deltacloud::Drivers::Rhevm
/home/mfojtik/code/core/server/lib/deltacloud/helpers/driver_helper.rb:57:in
`rescue in driver'
/home/mfojtik/code/core/server/lib/deltacloud/helpers/driver_helper.rb:54:in
`driver'
127.0.0.1 - - [18/Apr/2013 12:44:13] "GET /api HTTP/1.1" rhevm
http://provider.url 500 127410 0.1480
</snip>
Note that when I make the curl request manually with the same params
(provider, accept), I get 200 back.
I think this might be a threading issue, but I'm not sure how to fix it.
I started looking at the 'driver' method. What we do there is require
the driver source file and then return the initialized driver.
We do this for every request (except that we don't 'require' the driver
source if the driver class already exists in the current namespace).
My impression is that the 'require' method is not thread-safe. This is
the code in question:
  begin
    driver_class
  rescue NameError => e
    require_relative(driver_source_name) ? retry :
      raise(LoadError.new(e.message))
  end
In this case, we try to return the 'driver_class', and if the constant
does not exist, we require the driver source file and call 'retry',
which then tries to return it again. I think that if you have multiple
parallel requests, 'require_relative' (which is 'require' with the path
resolved relative to the current file) behaves incorrectly, because
multiple threads are requiring the same file in parallel.
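
One detail that may matter here: 'require' returns true only when the
call actually loads the file, and false when the file was already
loaded. So if another thread gets to load the driver source first, the
ternary above could take the 'raise' branch even though nothing really
failed. A minimal sketch of that return-value behavior, using a
throwaway file (FakeDriverDemo and the temp path are made up for
illustration, they are not Deltacloud code):

```ruby
require 'tmpdir'

# Hypothetical stand-in for a driver source file (not Deltacloud code).
dir  = Dir.mktmpdir
path = File.join(dir, 'fake_driver_demo.rb')
File.write(path, "module FakeDriverDemo; end\n")

first  = require(path) # true: this call loaded the file
second = require(path) # false: already loaded, so nothing was done

# With "require(...) ? retry : raise(...)", the second result would
# end up in the raise branch even though the constant now exists.
REQUIRE_RESULTS = [first, second]
p REQUIRE_RESULTS
```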
My fix for this, which works for me so far, is to change the line after
'rescue NameError => e' to:
  Thread.exclusive { require_relative(driver_source_name) } ? retry :
    raise(LoadError.new(e.message))
With this I don't run into any problems with parallel requests (so far).
However, I'm no expert on Ruby threads, so any advice from somebody more
experienced is appreciated.
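
For comparison, here is a rough sketch of the same idea with an explicit
Mutex instead of Thread.exclusive (which later Ruby versions deprecated
and then removed). It is only an illustration under assumptions:
SketchDriver, DRIVER_LOAD_LOCK and load_driver_class are hypothetical
names, not the real driver_helper.rb code. The constant check and the
require both happen inside the lock, and the return value of 'require'
is ignored, so a thread that loses the race still gets the class back.

```ruby
require 'tmpdir'

# Hypothetical stand-in for a driver source file (not Deltacloud code).
DRIVER_DIR    = Dir.mktmpdir
DRIVER_SOURCE = File.join(DRIVER_DIR, 'sketch_driver.rb')
File.write(DRIVER_SOURCE, "class SketchDriver; end\n")

# One process-wide lock that guards lazy loading of driver sources.
DRIVER_LOAD_LOCK = Mutex.new

# Load `source` at most once, then return the constant `name`.
# The result of require is deliberately not used for control flow.
# const_get raises NameError if the file did not define the constant.
def load_driver_class(name, source)
  DRIVER_LOAD_LOCK.synchronize do
    require(source) unless Object.const_defined?(name)
    Object.const_get(name)
  end
end

# Simulate several parallel requests asking for the same driver.
threads = 8.times.map do
  Thread.new { load_driver_class(:SketchDriver, DRIVER_SOURCE) }
end
LOADED = threads.map(&:value)
```

Every thread should come back with the same class here, whichever one
wins the race to load the file.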
Also, I think one task we need to do in Deltacloud/CIMI in the near
future is to identify spots in our code base that are potentially not
thread-safe.
Having somebody write a reasonable benchmarking tool (like 'ab', but
with different urls/drivers/etc) would help to identify these spots.
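
As a starting point, such a tool could be as small as the sketch below:
it fires batches of parallel GET requests and tallies the status codes.
Everything specific in it is an assumption for illustration: the
'hammer' helper and HTTP_FETCH are made-up names, and the base URL,
collection paths and X-Deltacloud-* headers in the usage comment would
need to be checked against the running server. It handles plain http
only.

```ruby
require 'net/http'
require 'uri'

# Default fetcher: one GET with the given headers.
# Returns the HTTP status code as a string, e.g. "200".
HTTP_FETCH = lambda do |url, headers|
  uri = URI(url)
  Net::HTTP.start(uri.host, uri.port) do |http|
    http.get(uri.request_uri, headers).code
  end
end

# Run `rounds` batches; in each batch all urls are requested in
# parallel, one thread per url. Returns a tally of status codes,
# with exception class names standing in for failed requests.
def hammer(urls, headers, rounds: 10, fetch: HTTP_FETCH)
  tally = Hash.new(0)
  rounds.times do
    threads = urls.map do |url|
      Thread.new do
        begin
          fetch.call(url, headers)
        rescue StandardError => e
          e.class.name
        end
      end
    end
    threads.each { |t| tally[t.value] += 1 }
  end
  tally
end

# Usage (assumed URL and header names, adjust to your setup):
#   base = 'http://localhost:3001/api'
#   urls = %w[images realms hardware_profiles].map { |c| "#{base}/#{c}" }
#   p hammer(urls, 'X-Deltacloud-Driver'   => 'rhevm',
#                  'X-Deltacloud-Provider' => 'http://provider.url')
```

Any 500s showing up in the tally would point at requests worth a closer
look in the server log.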
-- Michal
--
Michal Fojtik <mfoj...@redhat.com>
Deltacloud API, CloudForms