Getchell, Adam wrote:
And now that we've learned that Plone products live in the
filesystem (of the front ends) as well as the database (on the
backend), I'm curious if there's any work being done to scale out
the back end too.

Only data lives in the database. "Data" includes settings and
content, and can include templates, but whilst they are loaded from
a ZEO DB server, they are executed by the ZEO client handling the
request.

If you upgrade certain products, like PloneCAS, the settings in the
database will prevent the Product from working, even with the new
product files.

So, we have to go through and uninstall the product in Plone, before
upgrading the product.

Unfortunately, for some badly-behaved products (like PloneCAS), you
can't successfully uninstall it in the ZMI, either, to get rid of the
troublesome database settings.

And, it doesn't work on Plone 3.

Sure - badly behaved products suck.

But this has nothing whatsoever to do with scalability or the difference between a ZEO server and a ZEO client.

Our Zope/Plone setup with 20+ instances falls over 4-5 times a
day.

I would caution against having 20+ instances in a single Zope
setup.

Sorry, to be more specific, I mean 20+ Plone sites on an instance. Or
is that still too many?

20+ Plone sites in one Zope instance is probably not a good idea, unless those 20+ sites are related in some way and share the same set of products.

Should we separate out more instances (thus, ZEO backends), or
servers?

Probably.

That's the aforementioned scalability issue.

What is?

Theoretically, I'd like a flock of Zope front ends, a group of ZEO
storage units, and as many sites as needed on them. (That way, I
could get redundancy for 20 sites in under 60 boxes.)

This only makes sense if that flock of sites is completely homogeneous. If you have 20 different sites each supported by 20 different software stacks (obviously with some overlap, but with enough divergence for them to be different, which is almost always the case), then each of them should be on its own deployment. Otherwise, you've got an enormous headache in making sure that they don't conflict. At the most basic level, what happens when site 1 wants to upgrade to Plone 3 and site 2 needs to stay with Plone 2.5 for a bit longer?
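
To make the "homogeneous" test concrete, here is a minimal Python sketch that groups sites by their exact software stack. The site names, product names and versions are made-up placeholders, and "identical stack" is only one possible rule for deciding what can share a deployment.

```python
from collections import defaultdict


def group_by_stack(sites):
    """Group site names by identical {product: version} stacks.

    `sites` maps a site name to a dict of product -> version.
    Returns a sorted list of sorted name lists, one list per distinct stack.
    """
    groups = defaultdict(list)
    for name, stack in sites.items():
        # frozenset of (product, version) pairs is hashable and order-independent
        groups[frozenset(stack.items())].append(name)
    return sorted(sorted(names) for names in groups.values())


# Hypothetical inventory: only sites with the same stack end up together.
example_sites = {
    "site1": {"Plone": "3.0"},
    "site2": {"Plone": "2.5"},
    "site3": {"Plone": "3.0"},
}
deployments = group_by_stack(example_sites)
```

Each inner list is a candidate for one shared deployment. As soon as one site in a group needs a different version of anything, it drops out into a group of its own.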

But yes, we already carved out our main site and put it on a separate
box, just to see if we could make sense of the errors and isolate the
problem.

Good.

Look at www.supervisord.org, but you shouldn't need to restart. You
need to identify why it falls over, not try to plaster over the
symptom. Start by looking at your logs to understand why it's
falling over.
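
As a starting point for reading the logs, a small Python sketch like the following can show which errors dominate and how often they fire. It assumes each entry starts with a line shaped like the sample quoted later in this message (timestamp, level, logger name, message). The regular expression and function name are my own, so adjust them to whatever your event log actually contains.

```python
import re
from collections import Counter

# Assumed entry header, modelled on the sample in this thread:
#   2008-05-09T10:52:41 ERROR root Exception while rendering an error message
ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}) (ERROR|CRITICAL) (\S+) (.*)$"
)


def summarize_errors(lines):
    """Count ERROR/CRITICAL entries by (logger, message) and by minute."""
    by_message = Counter()
    by_minute = Counter()
    for line in lines:
        match = ENTRY.match(line.strip())
        if match is None:
            continue  # traceback lines and lower log levels are skipped
        stamp, _level, logger, message = match.groups()
        by_message[(logger, message)] += 1
        by_minute[stamp[:16]] += 1  # truncate to "YYYY-MM-DDTHH:MM"
    return by_message, by_minute
```

Feed it an open log file (any iterable of lines works) and look at by_message.most_common(5) first. One message accounting for nearly all entries points at a single root cause rather than general instability.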

Logs, yeah, we looked at them all right. Posted questions about them,
hung out on the zope IRC channel, etc. Not much help.

*shrug*

There's not much we can do to help you here with no details or references.

For one thing, they were causing some of the issues by themselves,
writing to disk when certain errors were triggered every 200
milliseconds or so. So we had to curtail them just on that.

That sounds insane. I've never seen a Plone site do that.

Here's an example of what our log files told us (this one fires off
up to twice per second now, on a box running 1 site, since we just
turned detailed logging back on):

If you're getting an error twice per second, it's time to take the site offline and fix the problem, not soldier on mindlessly.

2008-05-09T10:52:41 ERROR root Exception while rendering an error message
Traceback (most recent call last):
  File "/usr/local/lib/zope/lib/python/OFS/SimpleItem.py", line 223, in raise_standardErrorMessage
    v = s(**kwargs)
  File "/home/zope/instance1/Products/CMFCore/FSPythonScript.py", line 108, in __call__
    return Script.__call__(self, *args, **kw)
  File "/usr/local/lib/zope/lib/python/Shared/DC/Scripts/Bindings.py", line 311, in __call__
    return self._bindAndExec(args, kw, None)
  File "/usr/local/lib/zope/lib/python/Shared/DC/Scripts/Bindings.py", line 348, in _bindAndExec
    return self._exec(bound_data, args, kw)
  File "/home/zope/instance1/Products/CMFCore/FSPythonScript.py", line 164, in _exec
    result = f(*args, **kw)
  File "Script (Python)", line 18, in standard_error_message
  File "/usr/local/lib/zope/lib/python/Shared/DC/Scripts/Bindings.py", line 311, in __call__
    return self._bindAndExec(args, kw, None)
  File "/usr/local/lib/zope/lib/python/Shared/DC/Scripts/Bindings.py", line 348, in _bindAndExec
    return self._exec(bound_data, args, kw)
  File "/home/zope/instance1/Products/CMFCore/FSPageTemplate.py", line 195, in _exec
    result = self.pt_render(extra_context=bound_names)
  File "/home/zope/instance1/Products/CacheSetup/patch_cmf.py", line 48, in FSPT_pt_render
    result = FSPageTemplate.inheritedAttribute('pt_render')(
  File "/home/zope/instance1/Products/CacheSetup/patch_cmf.py", line 123, in PT_pt_render
    tal=not source, strictinsert=0)()
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 238, in __call__
    self.interpret(self.program)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 281, in interpret
    handlers[opcode](self, args)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 749, in do_useMacro
    self.interpret(macro)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 281, in interpret
    handlers[opcode](self, args)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 457, in do_optTag_tal
    self.do_optTag(stuff)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 442, in do_optTag
    return self.no_tag(start, program)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 437, in no_tag
    self.interpret(program)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 281, in interpret
    handlers[opcode](self, args)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 780, in do_defineSlot
    self.interpret(block)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 281, in interpret
    handlers[opcode](self, args)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 780, in do_defineSlot
    self.interpret(block)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 281, in interpret
    handlers[opcode](self, args)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 457, in do_optTag_tal
    self.do_optTag(stuff)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 442, in do_optTag
    return self.no_tag(start, program)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 437, in no_tag
    self.interpret(program)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 281, in interpret
    handlers[opcode](self, args)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 728, in do_defineMacro
    self.interpret(macro)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 281, in interpret
    handlers[opcode](self, args)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 772, in do_defineSlot
    self.interpret(slot)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 281, in interpret
    handlers[opcode](self, args)
  File "/usr/local/lib/zope/lib/python/TAL/TALInterpreter.py", line 507, in do_setLocal_tal
    self.engine.setLocal(name, self.engine.evaluateValue(expr))
  File "/usr/local/lib/zope/lib/python/Products/PageTemplates/TALES.py", line 221, in evaluate
    return expression(self)
  File "/usr/local/lib/zope/lib/python/Products/PageTemplates/ZRPythonExpr.py", line 47, in __call__
    return eval(code, g, {})
  File "Python expression "here.portal_redirection.getRedirectFromPathInfo(request.PATH_INFO)"", line 1, in <expression>
  File "/home/zope/instance1/Products/RedirectionTool/RedirectionTool.py", line 205, in getRedirectFromPathInfo
    pathelements = pathelements[pathelements.index(siteroot[-1])+1:]
ValueError: list.index(x): x not in list

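This is a problem in RedirectionTool. The ValueError is raised at line 205 of RedirectionTool.py, in getRedirectFromPathInfo, and it is triggered while standard_error_message is rendering an error page for some other failure.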
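
For illustration only, here is a minimal Python sketch of how the expression on that last line fails. It assumes (my guess, not something the traceback confirms) that pathelements is PATH_INFO split on "/" and that siteroot is the site's physical path, so siteroot[-1] is the site id. The function names and the fallback behaviour are hypothetical. This is not RedirectionTool's code and not necessarily the right fix.

```python
def strip_site_prefix(pathelements, siteroot):
    """Mimic the failing expression: drop everything up to and including the site id."""
    return pathelements[pathelements.index(siteroot[-1]) + 1:]


def strip_site_prefix_safe(pathelements, siteroot):
    """Same slicing, but return the path unchanged when the site id is absent.

    Returning the path as-is is an assumption about the desired behaviour.
    """
    site_id = siteroot[-1]
    if site_id not in pathelements:
        return list(pathelements)
    return pathelements[pathelements.index(site_id) + 1:]
```

With siteroot ["", "plone"], a path of ["", "plone", "news"] slices cleanly to ["news"], while ["", "news"] makes the first function raise ValueError: list.index() cannot find "plone". Which requests reach the tool without the site id in the path is exactly what needs finding out on the live instance.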

If you have any ideas on how to fix this, I'd love to hear about it.
We've spent a couple hundred hours on the problems so far ...

I don't wish to be negative, but if you've spent "a couple of hundred hours" and you haven't yet got to the point where you've put a pdb in that line and figured out what the problem is (or paid someone to do it for you, if you don't know how), you need to look at how you're investing your time and think about how you can get outside help in to diagnose the problem before you sink any more cost into it.
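
By "put a pdb in that line" I mean the usual trick: add "import pdb; pdb.set_trace()" just before the failing statement, run the instance in the foreground, and inspect pathelements and siteroot when it stops. A variation is post-mortem debugging, sketched below in modern Python 3 syntax (the Zope in your traceback will be on an older Python, so treat this as an illustration). The helper name and the interactive flag are my own invention.

```python
import pdb
import sys


def call_with_post_mortem(func, *args, interactive=True, **kwargs):
    """Call func; on ValueError, optionally open pdb at the failing frame, then re-raise."""
    try:
        return func(*args, **kwargs)
    except ValueError:
        if interactive:
            # Drops into the debugger in the frame that raised, so the
            # local variables of the failing function can be inspected.
            pdb.post_mortem(sys.exc_info()[2])
        raise
```

Only do this on a development copy started from a terminal. A debugger prompt on a production instance blocks the thread that hit it.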

Martin

--
Author of `Professional Plone Development`, a book for developers who
want to work with Plone. See http://martinaspeli.net/plone-book

