On May 20, 2008, at 11:27 AM, Mark Smith wrote:
Hi all, I very much appreciate the patient help and advice, but I'm still
having trouble getting even small files stored in my mogile setup.
Given the error message you've pasted (403?), this seems like a
configuration/setup problem. Are you sure that your MogileFS setup is
working at all, even without touching mogtool? Well, it's easy
to find out. Try this little script:
---
If the process fails, can you copy its output and paste it to the
mailing list here? There should be a lot of text covering all of the work
the library is doing, which will tell you what's going on. Or
anyway, it will tell us what's going on; I don't expect most of it to
make sense unless you know the internals of MogileFS. :)
Thanks Mark. The test script worked fine. The 403 errors were only
occurring with "lighttpd" used in place of perlbal. This was a
suggestion (Ask's) which seemed like a good thing to try, but lighttpd
actually made things worse. With lighttpd, about 1 in 5 requests
failed to store, or failed to close.
I've now reverted to the standard mogstored/perlbal config, and
it's *mostly* working, but I'm concerned about how often mogstored
just plain dies... I have to keep a "keepalive" script running that
checks my 16 storage nodes every 5 min and relaunches any mogstored
procs that have mysteriously stopped.
I'm also worried about intermittent problems when pushing large
numbers of files (currently using mogtool). I'm not sure if this
corresponds to mogstored dying, or trying to hit a dead node before
the restart kicks in, or what. The errors given out by mogtool in
these intermittent cases are one of these:
> MogileFS backend error message: unknown_key unknown_key
> System error message: MogileFS::NewHTTPFile: unable to write to any
allocated storage node at /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi/IO/Handle.pm line 399
> System error message: Close failed at /usr/bin/mogtool line 816,
<Sock_minime336:7001> line 215.
I can live with transmit errors once in a while, and for now mogtool
seems to be retrying and recovering. But if they crash the storage
node, that's a showstopper. If it's not normal for mogstored to just
die like that, I will spend some time trying to figure out why that
is. If it *is* normal for mogstored to just die sometimes, I need to
get rid of it quickly and get lighttpd over its intermittent 403
problems. I don't think I have time to do both, so I need to pick a
direction that's more likely to succeed. My time to evaluate this
solution for our application is running out quickly.
Thanks again for the replies. I would be lost without the help from
the list (which probably means the documentation is weak and puny, but
c'est la vie).