Re: [ZODB-Dev] Wrong blob file being returned (similar to https://mail.zope.org/pipermail/zodb-dev/2011-February/014067.html )

2011-07-13 Thread steve
Hello Jim,

On 07/12/2011 07:28 PM, Jim Fulton wrote:
 On Tue, Jul 12, 2011 at 6:33 AM, steve st...@lonetwin.net wrote:
  Hi,
 [...snip...]
  a. our site is image heavy (36293 blob files) and the servers are behind a
  load balancer, so in a single request to the web-app (a repoze.bfg site) we
  might even load collectively 20+ blobs from any of the 4 servers.

  b. zeo connection string on the clients
  zodb_uri =
  zeo://xxx..xxx.xxx:8886/?blob_dir=%(here)s/../var/blobs&shared_blob_dir=false&connection_pool_size=50&cache_size=1024MB&drop_cache_rather_verify=true

  c. $ cat var/blobs/.layout
  zeocache

  Any comments/suggestions on how to isolate and fix this problem would be
  appreciated.

 We have a number of large apps with multiple terabytes of blobs and a
 vaguely similar configuration. We haven't seen this sort of problem.
 One difference is that we set the blob cache size.  I don't suppose
 you're running out of disk space?


No, we aren't running out of disk space, and I hadn't really thought about
setting a limit on the blob cache size (I assumed there'd be a built-in
default). In fact, I didn't know this was configurable, since it isn't mentioned
in the docs for repoze.zodbconn (which is what we use to connect to the db)[1].
Fortunately, it appears that repoze.zodbconn just passes along the parameters it
doesn't understand to the underlying ClientStorage connector. I shall try
setting a limit (which instinctively seems like a good thing to do anyway, to
keep the original state and the cache state from going out of sync).

 The only suggestion I have is to keep an eye on it and try to
 reproduce the problem.

Yes, I shall attempt to do that on a dev instance of the app.

 I would think that if a request returns an
 incorrect Blob, it would continue to. If someone reports a bad blob,
 get the URL and see if you can reproduce by making the same request to
 each of the app servers, bypassing the load balancer.  If one server
 is being bad, you can remove it from the LB pool to debug it.

That's the problem. I'd assumed the same behavior. Unfortunately, it appears
that incorrect blobs aren't always returned for subsequent requests. I traced
one such request to a single server, removed it from the LB and checked the
same URL, but the request didn't give me an incorrect image. I wonder, would
this also depend on the connection pool or the thread serving the request?
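
Since a single follow-up request may not show the problem, one way to look for
an intermittent mismatch is to request the same URL many times from one app
server and count how many distinct response bodies come back. This is only a
sketch: `tally_digests` and `is_consistent` are hypothetical helpers, and the
URL in the usage note is a placeholder.

```python
import hashlib
from collections import Counter
from urllib.request import urlopen


def fetch(url):
    """Return the response body for url via a plain HTTP GET."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def tally_digests(url, attempts=50, fetcher=fetch):
    """Request url repeatedly and count how often each body digest appears."""
    counts = Counter()
    for _ in range(attempts):
        counts[hashlib.sha256(fetcher(url)).hexdigest()] += 1
    return counts


def is_consistent(counts):
    """True when every attempt returned byte-identical content."""
    return len(counts) <= 1
```

Usage would look like `tally_digests("http://app1.example.com:6543/some/image")`.
The requests here are sequential, so they would not by themselves exercise
several pooled connections or threads at once; running the same loop from a few
threads would be needed for that.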

Anyways, thanks for your help. I shall test and report back if I find a
reliable way to reproduce this.

cheers,
- steve

[1] http://docs.repoze.org/zodbconn/narr.html#zeo-uri-scheme
-- 
random spiel: http://lonetwin.net/
what i'm stumbling into: http://lonetwin.stumbleupon.com/
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
https://mail.zope.org/mailman/listinfo/zodb-dev


Re: [ZODB-Dev] Wrong blob file being returned (similar to https://mail.zope.org/pipermail/zodb-dev/2011-February/014067.html )

2011-07-13 Thread steve
Hi William,

On 07/13/2011 02:26 AM, William Heymann wrote:
 On Tuesday 12 July 2011, steve wrote:
  Hi,

  I have a setup where 4 ZEO clients running on separate machines connect to
  a single DB server which runs on a different system by itself. The ZEO
  clients and the DB server all are at version ZODB3-3.10.2. Now, since the
  last few weeks some of our users have been reporting that they
  occasionally see incorrect images being returned.

 One thing you may want to look at is the load balancer. Apache has a bug that
 keeps being opened and closed again for swapping data between requests under
 load. Because it happens at the apache level and not the zope level you will
 never see this problem in any of the zope logs.

Hmmm, interesting. Since we are using Amazon's Elastic LB to load balance these
app servers running on EC2 instances, and we do not have any direct access to
the ELB logs or systems, pinning this problem on the ELB might be difficult
(although AFAIK, the ELB setup also uses apache -- I could be mistaken though).

If it isn't too much effort, could you please point me to the apache bug you
mentioned?

 Just make sure you don't have a similar situation or you could end up
 debugging the wrong thing to a huge waste of time. In my case I spent a lot of
 time debugging zope and when I finally discovered it was apache that was
 screwing up I ended up just dumping apache for nginx.


Thanks for the info. I shall keep a lookout for this.

cheers,
- steve



Re: [ZODB-Dev] Wrong blob file being returned (similar to https://mail.zope.org/pipermail/zodb-dev/2011-February/014067.html )

2011-07-13 Thread Jim Fulton
On Wed, Jul 13, 2011 at 7:57 AM, steve st...@lonetwin.net wrote:
...
 That's the problem. I'd assumed the same behavior. Unfortunately it appears
 like incorrect blobs aren't always returned for subsequent requests. I
 traced down one such request to a single server, removed it from the LB and
 checked the same URL but the request didn't give me an incorrect image.

Are you *ever* able to get incorrect blob data when going directly
against an app instance?

 I
 wonder, would this also be dependent on the connection pool or the thread
 serving the request?

If it's a bug in ZODB, it's one I haven't seen and I'd have no idea
what its parameters might be. I suppose anything's possible. :)

Note that, in the actual database file, blobs have no data.
All that matters is their oid and serial.  I can't imagine
how this could behave differently across threads.

Jim

-- 
Jim Fulton
http://www.linkedin.com/in/jimfulton


Re: [ZODB-Dev] Wrong blob file being returned (similar to https://mail.zope.org/pipermail/zodb-dev/2011-February/014067.html )

2011-07-13 Thread William Heymann
On Wednesday 13 July 2011, you wrote:

 Hmmm, interesting. Since we are using Amazon's Elastic LB to load balance
 these app servers running on EC2 instances, and we do not have any direct
 access to the ELB logs or systems, pinning this problem on the ELB might be
 difficult (although AFAIK, the ELB setup also uses apache -- I could be
 mistaken though).
 
 If it isn't too much effort, could you please point me to the apache bug you
 mentioned?

https://issues.apache.org/bugzilla/show_bug.cgi?id=46949

That is one I ran into before. There are some others like it, but I did not
immediately find them. The bug has been closed as fixed, but people still seem
to be having that problem. I found others besides myself who ran into that
problem on supposedly fixed versions as well, and most just ended up going to
nginx instead.




Re: [ZODB-Dev] Wrong blob file being returned (similar to https://mail.zope.org/pipermail/zodb-dev/2011-February/014067.html )

2011-07-12 Thread Jim Fulton
On Tue, Jul 12, 2011 at 6:33 AM, steve st...@lonetwin.net wrote:
 Hi,

 I have a setup where 4 ZEO clients running on separate machines connect to a
 single DB server which runs on a different system by itself. The ZEO clients
 and the DB server all are at version ZODB3-3.10.2. Now, since the last few
 weeks some of our users have been reporting that they occasionally see
 incorrect images being returned.

 On googling I came across the thread below and was wondering whether I am
 seeing the same thing as this:

 https://mail.zope.org/pipermail/zodb-dev/2011-February/014067.htm

 ...although the setup and version are different (i.e. ZODB-3.8 and
 RelStorage).

Yeah, quite a bit different.


 Unfortunately, I do not know enough about ZODB internals to be able to say
 for sure. Is there a way I can test whether the problem is indeed with the
 wrong blob file being returned from the blobcache? FWIW, we haven't figured
 out a way to consistently reproduce this error ourselves.

Dang.

 Other things that
 may/may-not be relevant:

 a. our site is image heavy (36293 blob files) and the servers are behind a
 load balancer, so in a single request to the web-app (a repoze.bfg site) we
 might even load collectively 20+ blobs from any of the 4 servers.

 b. zeo connection string on the clients
 zodb_uri =
 zeo://xxx..xxx.xxx:8886/?blob_dir=%(here)s/../var/blobs&shared_blob_dir=false&connection_pool_size=50&cache_size=1024MB&drop_cache_rather_verify=true

 c. $ cat var/blobs/.layout
 zeocache

 Any comments/suggestions on how to isolate and fix this problem would be
 appreciated.

We have a number of large apps with multiple terabytes of blobs and a
vaguely similar configuration. We haven't seen this sort of problem.
One difference is that we set the blob cache size.  I don't suppose
you're running out of disk space?

The only suggestion I have is to keep an eye on it and try to
reproduce the problem. I would think that if a request returns an
incorrect Blob, it would continue to. If someone reports a bad blob,
get the URL and see if you can reproduce by making the same request to
each of the app servers, bypassing the load balancer.  If one server
is being bad, you can remove it from the LB pool to debug it.
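
The per-server check described above could be scripted roughly as follows. This
is a sketch under assumptions, not a tested procedure: `find_outliers` is a
hypothetical helper, the backend addresses are placeholders for the app
servers' direct (non-LB) addresses, and "bad" is taken to mean "differs from
what most servers return".

```python
import hashlib
from urllib.request import urlopen


def fetch(url):
    """Return the response body for url via a plain HTTP GET."""
    with urlopen(url, timeout=10) as response:
        return response.read()


def find_outliers(backends, path, fetcher=fetch):
    """Fetch path from each backend; return backends that differ from the majority."""
    by_digest = {}
    for base in backends:
        digest = hashlib.sha256(fetcher(base + path)).hexdigest()
        by_digest.setdefault(digest, []).append(base)
    if not by_digest:
        return []
    # The largest group is assumed to hold the correct content; on a tie the
    # group seen first wins, so a tie needs a manual look.
    majority = max(by_digest.values(), key=len)
    return sorted(base for base in backends if base not in majority)
```

For example, `find_outliers(["http://10.0.0.1:6543", "http://10.0.0.2:6543"],
"/path/from/the/bug/report")` would list any server whose response differs, and
that server could then be pulled from the LB pool for debugging.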

Jim



Re: [ZODB-Dev] Wrong blob file being returned (similar to https://mail.zope.org/pipermail/zodb-dev/2011-February/014067.html )

2011-07-12 Thread William Heymann
On Tuesday 12 July 2011, steve wrote:
 Hi,
 
 I have a setup where 4 ZEO clients running on separate machines connect to
 a single DB server which runs on a different system by itself. The ZEO
 clients and the DB server all are at version ZODB3-3.10.2. Now, since the
 last few weeks some of our users have been reporting that they
 occasionally see incorrect images being returned.

One thing you may want to look at is the load balancer. Apache has a bug that 
keeps being opened and closed again for swapping data between requests under 
load. Because it happens at the apache level and not the zope level you will 
never see this problem in any of the zope logs.

Just make sure you don't have a similar situation or you could end up 
debugging the wrong thing to a huge waste of time. In my case I spent a lot of 
time debugging zope and when I finally discovered it was apache that was 
screwing up I ended up just dumping apache for nginx.
