Re: mogstored dying: redux

2008-05-21 Thread Greg Connor


On May 20, 2008, at 11:27 AM, Mark Smith wrote:

>> Hi all, I very much appreciate the patient help and advice, but I'm
>> still having trouble getting even small files stored in my mogile setup.
>
> Given the error message you've pasted (403?) this seems like a
> configuration/setup problem.  Are you sure that your MogileFS setup is
> even working at all, even without touching mogtool?  Well, it's easy
> to figure out if it is or not.  Here, this little script:
>
> ---
>
> If the process fails, can you copy the output of it and paste on the
> mailing list here?  There should be a lot of text for all of the work
> that the library is doing that will tell you what's going on.  Or
> anyway, will tell us what's going on, I don't expect most of it to
> make sense unless you know the internals of MogileFS.  :)



Thanks Mark.  The test script worked fine.  The 403 errors were only  
occurring with lighttpd used in place of perlbal.  This was a  
suggestion (Ask's) which seemed like a good thing to try, but lighttpd  
actually made things worse.  With lighttpd, about 1 in 5 requests  
failed to store, or failed to close.


I've now reverted to the standard mogstored/perlbal config, and
it's *mostly* working, but I'm concerned about how often mogstored
just plain dies.  I have to keep a keepalive script running that
checks my 16 storage nodes every 5 minutes and relaunches any
mogstored procs that have mysteriously stopped running.
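
For illustration, a keepalive check of the kind described above might look
roughly like this.  It is a minimal sketch, not the script actually in use:
the host names are taken from the command line, and port 7500 is an
assumption (the usual default mogstored HTTP port).  It only reports nodes
that refuse a TCP connection; relaunching mogstored is left out.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# Assumption: mogstored answers HTTP on port 7500 (its usual default).
my $PORT = 7500;

# Return 1 if a TCP connection to $host:$port succeeds within $timeout
# seconds (default 3), 0 otherwise.
sub is_listening {
    my ($host, $port, $timeout) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => $timeout || 3,
    );
    return 0 unless $sock;
    close $sock;
    return 1;
}

# Usage: check_mogstored.pl node1 node2 ...   (host names are placeholders)
for my $node (@ARGV) {
    next if is_listening($node, $PORT);
    print scalar(localtime), " $node:$PORT is not answering\n";
}
```

Run from cron every few minutes, the output gives a timestamped record of
when each node stopped answering, which can then be compared against the
times of the client-side errors.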


I'm also worried about intermittent problems when pushing large  
numbers of files (currently using mogtool).  I'm not sure if this  
corresponds to mogstored dying, or trying to hit a dead node before  
the restart kicks in, or what.  The errors given out by mogtool in  
these intermittent cases are one of these:

 MogileFS backend error message: unknown_key unknown_key
 System error message: MogileFS::NewHTTPFile: unable to write to any allocated storage node at /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi/IO/Handle.pm line 399
 System error message: Close failed at /usr/bin/mogtool line 816, <Sock_minime336:7001> line 215.



I can live with transmit errors once in a while, and for now mogtool  
seems to be retrying and recovering.  But if they crash the storage  
node, that's a showstopper.   If it's not normal for mogstored to just  
die like that, I will spend some time trying to figure out why that  
is.  If it *is* normal for mogstored to just die sometimes, I need to  
get rid of it quickly and get lighttpd over its intermittent 403  
problems.  I don't think I have time to do both, so I need to pick a
direction that's more likely to succeed.  My time to evaluate this  
solution for our application is running out quickly.


Thanks again for the replies.  I would be lost without the help from  
the list (which probably means the documentation is weak and puny, but  
c'est la vie).


Re: mogstored dying: redux

2008-05-21 Thread Ask Bjørn Hansen


On May 21, 2008, at 3:17, Greg Connor wrote:

> Thanks Mark.  The test script worked fine.  The 403 errors were only
> occurring with lighttpd used in place of perlbal.  This was a
> suggestion (Ask's) which seemed like a good thing to try, but
> lighttpd actually made things worse.  With lighttpd, about 1 in 5
> requests failed to store, or failed to close.


Oh, I'm sorry.  I realize now that the patch to make lighttpd work was
never committed, darn.  Try the patch below.


http://lists.danga.com/pipermail/mogilefs/2007-November/001401.html

--- server/lib/MogileFS/Device.pm   (revision 1177)
+++ server/lib/MogileFS/Device.pm   (working copy)
@@ -371,7 +371,7 @@
     my $ans = <$sock>;

     # if they don't support this method, remember that
-    if ($ans && $ans =~ m!HTTP/1\.[01] (400|405|501)!) {
+    if ($ans && $ans =~ m!HTTP/1\.[01] (400|501)!) {
         $self->{no_mkcol} = 1;
         # TODO: move this into method on device, which propogates to parent
         # and also receive from parent.  so all query workers share this knowledge
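
The effect of the one-line change can be seen by running a few status lines
through the old and new patterns.  This is only a sketch: the helper name
marks_no_mkcol and the sample status lines are made up for illustration and
are not part of MogileFS.  The point is that a 405 reply no longer makes the
device look as if it cannot do MKCOL; presumably this matters because a
WebDAV server answers 405 when MKCOL targets a directory that already exists.

```perl
use strict;
use warnings;

# Status patterns before and after the patch quoted above.
my $before = qr!HTTP/1\.[01] (400|405|501)!;
my $after  = qr!HTTP/1\.[01] (400|501)!;

# Hypothetical helper: return 1 if this response line would cause the
# device to be flagged as not supporting MKCOL, 0 otherwise.
sub marks_no_mkcol {
    my ($pattern, $status_line) = @_;
    return ($status_line && $status_line =~ $pattern) ? 1 : 0;
}

# Sample status lines (illustrative, not captured from a real server).
my @samples = (
    "HTTP/1.1 201 Created",
    "HTTP/1.1 405 Method Not Allowed",
    "HTTP/1.1 501 Not Implemented",
);

for my $line (@samples) {
    printf "%-35s before=%d after=%d\n",
        $line,
        marks_no_mkcol($before, $line),
        marks_no_mkcol($after,  $line);
}
```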



--
http://develooper.com/ - http://askask.com/




Re: mogstored dying: redux

2008-05-21 Thread Arthur Bebak

Greg Connor wrote:





MogileFS backend error message: unknown_key unknown_key
System error message: Close failed at /usr/bin/mogtool line 816, <Sock_minime336:7001> line 78.
This was try #1 and it's been 1.06 seconds since we first tried.  Retrying...

I am also seeing a large number of these errors:

System error message: MogileFS::Backend: tracker socket never became readable (minime336:7001) when sending command: [create_open domain=dbbackups&fid=0&class=dbbackups-recent&multi_dest=1&key=dwh-20080519-vol9,99 ] at /usr/lib/perl5/site_perl/5.8.5/MogileFS/Client.pm line 268

  Close failed at /usr/bin/mogtool line 816
  unable to write to any allocated storage node at /usr/lib64/perl5/5.8.5/x86_64-linux-thread-multi/IO/Handle.pm line 399
  Connection reset by peer
  tracker socket never became readable
  socket closed on read at /usr/lib/perl5/site_perl/5.8.5/MogileFS/NewHTTPFile.pm line 335
  couldn't connect to mogilefsd backend at /usr/lib/perl5/site_perl/5.8.5/MogileFS/Client.pm line 268


Greg, superficially looking at this it seems that all the errors are
networking related with failing socket calls and connectivity issues.

You may want to check for packet loss and latency issues on your network.
It may even be something as simple as a bad switch or cable somewhere, or
somebody else intermittently pushing a lot of traffic through your local
LAN while you're testing (which I assume is on a GBit network, right?).
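
One rough way to look for this from the client side is to time repeated TCP
connects to the tracker or a storage node and count the failures.  The sketch
below does only that; the function names, the 2-second timeout and the
default of 20 attempts are arbitrary choices for illustration, and it is no
substitute for proper tools such as ping or the interface error counters on
the switch.

```perl
use strict;
use warnings;
use IO::Socket::INET;
use Time::HiRes qw(time);

# Time a single TCP connect.  Returns elapsed seconds, or undef if the
# connection failed or timed out.
sub connect_time {
    my ($host, $port, $timeout) = @_;
    my $start = time;
    my $sock  = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => $timeout,
    ) or return undef;
    my $elapsed = time - $start;
    close $sock;
    return $elapsed;
}

# Summarize a list of samples, where undef marks a failed connect.
sub summarize {
    my @samples = @_;
    my @ok  = grep { defined } @samples;
    my $max = 0;
    for my $t (@ok) { $max = $t if $t > $max }
    return {
        attempts => scalar @samples,
        failures => @samples - @ok,
        max      => $max,
    };
}

# Usage: probe.pl host port [count]   e.g. probe.pl minime336 7001 50
if (@ARGV >= 2) {
    my ($host, $port, $count) = @ARGV;
    $count ||= 20;
    my $s = summarize(map { connect_time($host, $port, 2) } 1 .. $count);
    printf "%d attempts, %d failures, slowest connect %.3fs\n",
        @$s{qw(attempts failures max)};
}
```

On a healthy local network the connects should almost never fail; repeated
failures or connects that take whole seconds would point toward the network
rather than mogstored itself.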

Anyway, something to look at.



--
Arthur Bebak
[EMAIL PROTECTED]


Re: mogstored dying: redux

2008-05-20 Thread Mark Smith
> Hi all, I very much appreciate the patient help and advice, but I'm still
> having trouble getting even small files stored in my mogile setup.

Given the error message you've pasted (403?) this seems like a
configuration/setup problem.  Are you sure that your MogileFS setup is
even working at all, even without touching mogtool?  Well, it's easy
to figure out if it is or not.  Here, this little script:

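---
use strict;
use warnings;
use MogileFS::Client;

# Print the library's debug output so we can see what it is doing.
$MogileFS::DEBUG = 1;

my $key = "some_key";

# Point domain and hosts at your own domain and tracker.
my $mogc = MogileFS::Client->new(
    domain => "foo.com::my_namespace",
    hosts  => ['10.0.0.2:1234'],
);

# Replace some_class with a class that exists in your domain.
my $fh = $mogc->new_file($key, "some_class")
    or die "Error creating file: " . $mogc->errcode . ": " . $mogc->errstr . "\n";

print $fh "test";

unless ($fh->close) {
    die "Error writing file: " . $mogc->errcode . ": " . $mogc->errstr . "\n";
}

# Give the file a moment to settle, then ask where it ended up.
sleep 5;
my @urls = $mogc->get_paths($key);
print "path: $_\n" foreach @urls;

$mogc->delete($key);
---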

Take that, put it on a machine that has the MogileFS client libraries,
and change the values it's using to connect to the server to point at
your tracker.  Then put in a valid class instead of some_class and
give it a shot.  Does it work?  Do you get paths printed?  (I haven't
tested this script, so you might need to kick it a little if there are
any syntax errors and the like.  Just kinda tossed it together.)

If the process fails, can you copy the output of it and paste on the
mailing list here?  There should be a lot of text for all of the work
that the library is doing that will tell you what's going on.  Or
anyway, will tell us what's going on, I don't expect most of it to
make sense unless you know the internals of MogileFS.  :)

Thanks!


-- 
Mark Smith / xb95
[EMAIL PROTECTED]


Re: mogstored dying: redux

2008-05-19 Thread Andy Lo A Foe
Hi,

In my experience WebDAV storage setups (lighttpd, nginx) are much
better at handling large chunks/files than mogstored. I use nginx in a
production environment with files ranging from a couple of bytes to a
gigabyte with no problems. In my pre-production tests, mogstored died
reliably with OOMs when handling 100MB+ files. In that case, use
mogstored only to manage the usage stats on your storage nodes.

Gr,
Andy

On Mon, May 19, 2008 at 3:25 AM, Greg Connor wrote:
>
> On May 18, 2008, at 5:59 PM, Ask Bjørn Hansen wrote:
>
>> On May 18, 2008, at 17:54, Greg Connor wrote:
>>
>>>  Running.
>>>  Out of memory!
>>>  Out of memory!
>>
>> Yikes.   64MB chunks shouldn't be that bad.  Are the storage nodes
>> otherwise loaded (high IO wait or some such).
>
> Nope, the storage nodes are doing nothing other than mogstored at this time.
>
>> Did you try using another HTTP server (lighttpd, nginx, apache, ...) for
>> the file transfers to the storage nodes?  I suspect most/many users use that
>> so mogstored doesn't get used that much in high traffic environments ...
>
> No I have not tried this.  Do you believe mogstored is pretty useless in a
> production environment?  If that's true and widely known, it's too bad the
> documents don't reflect this... Is there a document or list posting that
> explains what parts of mogilefs should be tuned (or outright replaced) for a
> high-traffic application?
>
> Are there documents stashed somewhere that I'm missing?  I looked at the
> new wiki (last updates about 5 and 10 months ago) and read everything
> available there, and I've read most of the man pages.  I keep finding stuff
> that I'm totally not getting.  I would welcome some advice or pointers on
> how to get apache set up to replace mogstored for file transfers...


Re: mogstored dying: redux

2008-05-19 Thread Greg Connor

Andy Lo A Foe wrote:

> Hi,
>
> In my experience WebDAV storage setup (lighttpd, nginx) are much
> better at handling large chunks/files than mogstored. I use nginx in a
> production environment with files ranging from a couple of bytes to a
> gigabyte, no problem. In the pre-production tests I ran mogstored died
> reliably with OOM's when handling 100MB+ files. Use mogstored only to
> manage the usage stats on your storage nodes in that case.



Hi Andy, thanks for the reply.

Do you feel nginx is better than lighttpd for this?  How about apache?

Is it simply a matter of having the other httpd listen on another port, 
and entering that port number in a config file?  Did you have to do 
anything special to configure httpd (for example, to automatically 
create directories that don't yet exist for PUT requests?)


thanks again


Re: mogstored dying: redux

2008-05-19 Thread Justin Huff
We've been using lighttpd, and it works OK. We did run into problems
with the default mogile-generated config not being able to fully
utilize the devices. I *think* we have that solved now, though. We also
saw possible stat caching issues around new dir creation.

server.stat-cache-engine = "disable"
server.network-backend = "linux-sendfile"
server.event-handler = "linux-sysepoll"
server.max-worker = 8

lighttpd-1.4.15

--Justin

Greg Connor wrote:
> Andy Lo A Foe wrote:
>> Hi,
>>
>> In my experience WebDAV storage setup (lighttpd, nginx) are much
>> better at handling large chunks/files than mogstored. I use nginx in a
>> production environment with files ranging from a couple of bytes to a
>> gigabyte, no problem. In the pre-production tests I ran mogstored died
>> reliably with OOM's when handling 100MB+ files. Use mogstored only to
>> manage the usage stats on your storage nodes in that case.
>
> Hi Andy, thanks for the reply.
>
> Do you feel nginx is better than lighttpd for this?  How about apache?
>
> Is it simply a matter of having the other httpd listen on another port,
> and entering that port number in a config file?  Did you have to do
> anything special to configure httpd (for example, to automatically
> create directories that don't yet exist for PUT requests?)
>
> thanks again
 


Re: mogstored dying: redux

2008-05-19 Thread Ask Bjørn Hansen


On May 19, 2008, at 8:49 AM, Greg Connor wrote:

> Is it simply a matter of having the other httpd listen on another
> port, and entering that port number in a config file?  Did you have
> to do anything special to configure httpd (for example, to
> automatically create directories that don't yet exist for PUT
> requests?)


Enabling WebDAV should do that.  However, mogilefs should be able to
configure at least apache and lighttpd automatically.  Be sure to use
svn trunk, as there were some fixes to some of that recently:


http://code.sixapart.com/svn/mogilefs/trunk/server/CHANGES


 - ask

--
http://develooper.com/ - http://askask.com/




Re: mogstored dying: redux

2008-05-18 Thread Greg Connor


On May 18, 2008, at 5:59 PM, Ask Bjørn Hansen wrote:

> On May 18, 2008, at 17:54, Greg Connor wrote:
>
>>   Running.
>>   Out of memory!
>>   Out of memory!
>
> Yikes.   64MB chunks shouldn't be that bad.  Are the storage nodes
> otherwise loaded (high IO wait or some such).

Nope, the storage nodes are doing nothing other than mogstored at this
time.

> Did you try using another HTTP server (lighttpd, nginx, apache, ...)
> for the file transfers to the storage nodes?  I suspect most/many
> users use that so mogstored doesn't get used that much in high
> traffic environments ...


No I have not tried this.  Do you believe mogstored is pretty useless  
in a production environment?  If that's true and widely known, it's  
too bad the documents don't reflect this... Is there a document or  
list posting that explains what parts of mogilefs should be tuned (or  
outright replaced) for a high-traffic application?


Are there documents stashed somewhere that I'm missing?  I looked at  
the new wiki (last updates about 5 and 10 months ago) and read  
everything available there, and I've read most of the man pages.  I  
keep finding stuff that I'm totally not getting.  I would welcome some  
advice or pointers on how to get apache set up to replace mogstored  
for file transfers...