Hi Dormando, thanks for your help. I ran !jobs on both trackers; here are the results:

[EMAIL PROTECTED] /]# telnet 127.0.0.1 6001
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
!jobs
delete count 1
delete desired 1
delete pids 12118
fsck count 1
fsck desired 1
fsck pids 12126
monitor count 1
monitor desired 1
monitor pids 12124
queryworker count 5
queryworker desired 5
queryworker pids 12119 12120 12121 12122 12123
reaper count 1
reaper desired 1
reaper pids 12125
replicate count 1
replicate desired 1
replicate pids 12117

[EMAIL PROTECTED] /]# telnet 127.0.0.1 6001
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
!jobs
delete count 1
delete desired 1
delete pids 11500
fsck count 1
fsck desired 1
fsck pids 13505
monitor count 1
monitor desired 1
monitor pids 11516
queryworker count 5
queryworker desired 5
queryworker pids 13498 13499 13500 13501 13502
reaper count 1
reaper desired 1
reaper pids 13546
replicate count 1
replicate desired 1
replicate pids 11499

This matches the number of DB connections I see on the tracker, but on
the DB server there are many more connections coming from the tracker
server. The total number of connections I see directly in MySQL equals
the number of connections I see with netstat on the DB server. All the
connections are in state ESTABLISHED.

Do you have any idea what might be happening?

Thanks again!

Fernando

On Thu, 2008-04-10 at 16:04 -0700, dormando wrote:
> telnet to the tracker's management port and run !jobs.
>
> you should have one DB connection per job type that exists. I figure
> you're probably sending most of your traffic to one of the trackers?
>
> If you have fewer workers than connections something else might be up.
> Are those connections busy with queries? Can you tell what kind of
> queries? etc.
>
> -Dormando
>
> Fernando Gomes wrote:
> > Hello
> >
> > I'm using MogileFS with a MySQL DB, with two trackers and two storage
> > nodes. There are around 65000 files stored, and as they are
> > replicated, about 130000 in total, distributed across the different
> > devices.
> >
> > All was working as expected until recently, when I noticed two
> > problems (perhaps related). One is that from time to time the file I
> > got from the client (Java) wasn't the requested file (I suppose it
> > might be something like what is reported here:
> > http://davidrasch.com/2008/01/29/mogilefs-and-race-condition). I'll
> > try to look at the Java client code to see what can be done.
> >
> > The other problem is that today the trackers started creating a lot
> > of database connections (one tracker more than the other), almost
> > taking the database down. After finding that the database performance
> > problem was caused by one of the trackers I restarted it and things
> > got a bit better, but I still see many connections on the database
> > from the trackers.
> >
> > When trying to diagnose the problem I also found something that seems
> > strange to me (perhaps caused by the hours spent on this problem):
> > the number of connections from one tracker to the database, as
> > counted by netstat -n | grep 3306, is not the same when I run netstat
> > on the tracker server as when I run it on the database server. (I now
> > have 8 connections from tracker 2 to the database server if I run
> > netstat on the tracker, but I see 68 connections from tracker 2 to
> > the database if I run it on the database server.)
> >
> > If you can give me some tips about what the problem might be, it
> > would be very useful to me!
> >
> > Thanks!
> >
> > Fernando
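
The comparison discussed in this thread (workers reported by !jobs versus connections seen by netstat) can be sketched in a few lines of Python. This is only an illustration, not part of MogileFS: the function names are made up, it assumes roughly one DB connection per worker process (one reading of dormando's reply), and it assumes Linux-style `netstat -n` columns (proto, recv-q, send-q, local address, foreign address, state).

```python
from collections import Counter

# !jobs output from the first tracker, as pasted earlier in this thread.
SAMPLE_JOBS = """\
delete count 1
delete desired 1
delete pids 12118
fsck count 1
fsck desired 1
fsck pids 12126
monitor count 1
monitor desired 1
monitor pids 12124
queryworker count 5
queryworker desired 5
queryworker pids 12119 12120 12121 12122 12123
reaper count 1
reaper desired 1
reaper pids 12125
replicate count 1
replicate desired 1
replicate pids 12117
"""


def parse_jobs(text):
    """Parse '!jobs' output into {job: {'count': int, 'desired': int, 'pids': [int]}}."""
    jobs = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip telnet banner lines and anything not shaped like "<job> <field> <values...>".
        if len(parts) < 3 or parts[1] not in ("count", "desired", "pids"):
            continue
        name, field, values = parts[0], parts[1], parts[2:]
        entry = jobs.setdefault(name, {})
        if field == "pids":
            entry["pids"] = [int(v) for v in values]
        else:
            entry[field] = int(values[0])
    return jobs


def expected_connections(jobs):
    """Total worker count; ASSUMES about one DB connection per worker process."""
    return sum(job.get("count", 0) for job in jobs.values())


def connections_by_peer(netstat_text, port=3306):
    """Count ESTABLISHED TCP connections to local `port`, grouped by remote host.

    Intended for `netstat -n` output captured on the DB server (assumed Linux layout).
    """
    peers = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        if state != "ESTABLISHED":
            continue
        if local.rsplit(":", 1)[-1] == str(port):
            peers[foreign.rsplit(":", 1)[0]] += 1
    return peers
```

Running `expected_connections(parse_jobs(SAMPLE_JOBS))` on a tracker and comparing it with `connections_by_peer(...)[tracker_ip]` computed from the DB server's netstat output would show how many connections on the DB side have no matching worker.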
