On Thu, 2010-12-16 at 23:06 -0500, Dan Scott wrote:
> So, given the readily available virtual images and the ability to
> install as many copies of Evergreen wherever you want, the idea is that
> you should be able to practice these steps and gain knowledge before you
> go into production.

But I don't even trust ME at this point. Uncheck a box, BOOM! Check it back and it is still dead, with no useful error message, and bang on it all ya want, it is apparently determined to stay dead. Game over, reload the last savepoint.

> It's too bad you waited this long to ask for help; there might be
> something else going on in your configuration that we could help sort
> out - have you told us what error message autogen.sh blows up with? Does
> it blow up with a stock install of Evergreen?

Well, I had been avoiding the lists during the install phase because I'm hoping to be forced to learn enough about how this sucker works to be able to maintain it. Maybe even patch it if needed; my OO perl skills are still a bit weak but I'm working on that. But this issue was just driving me nuts. There is a complete dump of the errors posted today.

But now the story goes really weird. Poked around some more. The kaboom was always happening in the same place, same line number. OK, where is that? Tracked it down to /openils/bin/org_tree_html_options.pl, so if it won't yield useful info it is time to add some debug code of my own. Not exactly. Added a couple of print statements and now it works every time. Took 'em back out, still works. Diffed it against the copy that came out of the tarball: same file. The md5sum matches the one captured on a backup server a month ago, before I messed with it today. It hasn't changed between versions 1.6.0.3 and 1.6.0.8, so none of the point updates I have been getting around to today should have made any difference in that file. Never heard of a hardware glitch producing such a consistent problem over days.

So now what am I hunting? I have a dump of the database now with the org chart how I want it, but do I dare continue before figuring out what is actually going on? Do I nuke the database again and try to recreate the error?

The install is in a pair of kvm virtual machines, postgres on one, everything else on the other.
Both are running debian-lenny for AMD64. The physical host for both of them is also running lenny. The physical host is a basic 2U server based on an AMI MB with two sockets for a dozen AMD64 cores, 32GB of RAM and a RAID1, nothing too exotic. The buildhost has been rock solid so far; installed everything last year and it has just sat in the rack and done its thing. It currently has 401 days of uptime.
