how to diagnose server freeze with ddb?

2011-05-11 Thread cronfy
Hello,

I have a server running FreeBSD 7.3 that sometimes freezes under high
load. It responds neither over the network nor to the keyboard. At the
same time, I can press Ctrl-Alt-Esc and drop into the debugger, which
works.

What can I try in DDB to find out why the server freezes?

Thanks in advance!

-- 
// cronfy
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "freebsd-questions-unsubscr...@freebsd.org"


7.3 - optimizing filesystem - cache all metadata

2011-04-19 Thread cronfy
Hello!

I have a FreeBSD 7.3 server that is used for web sites. It performs many
filesystem operations, so filesystem performance is very important. I am
looking for ways to improve it.

I already use vfs.lookup_shared=1; some time ago it helped me reduce the
CPU time spent on filesystem operations. I also increased
vfs.ufs.dirhash_maxmem to 67108864. But it still sometimes takes several
seconds to ls a directory that is not in the cache, and fstat() calls are
sometimes slow when I/O is high.

Can filesystem performance be improved further? I think it would benefit
from more memory for the file metadata cache. One of the most frequent
operations is fstat(). If it were possible to tell FreeBSD to keep all
metadata in memory and never evict it, all repeated fstat() operations
would become very fast.

How can I see how much memory is used for the filesystem cache? Is it
possible to increase this memory and to increase the time that a cache
entry is kept in memory (possibly to infinity)?

Thanks in advance.

-- 
Олег Петрачев


memoryuse vs vmemoryuse

2010-12-06 Thread cronfy
Hello!

I am trying to set user limits in login.conf, and I see there are
'memoryuse' and 'vmemoryuse'. The Handbook describes only the former.
What is the difference between them?

-- 
// cronfy


Rescan hard drives

2010-11-23 Thread cronfy
Hello,

I am using an Adaptec 3405 with FreeBSD 7.3. After hot-swapping some hard
drives and deleting and re-creating a RAID1 mirror, I am stuck: I cannot
work with the newly created device.

It is aacd1. Previously it was a 1000G drive without RAID. Now it is a
300G RAID1 mirror. But `geom disk list` reports:

Geom name: aacd1
Providers:
1. Name: aacd1
   Mediasize: 999642103808 (931G)
   Sectorsize: 512
   Mode: r0w0e0

Geom name: aacd1
Providers:
1. Name: aacd1
   Mediasize: 299563483136 (279G)
   Sectorsize: 512
   Mode: r0w0e0
   fwsectors: 63
   fwheads: 255

If I delete this RAID1 with arcconf, `geom disk list` still reports the
931G drive at aacd1, though it was removed from the server.

`dmesg | tail` shows that the kernel detected the new drive:

aacd1:  on aac0
aacd1: 285686MB (585084928 sectors)


But fdisk, bsdlabel, dd and others do not work with aacd1; they fail with
the error message 'Device not configured'. Also, the old partition table
of the previous disk is still visible in `ls /dev/aacd1*`, though the new
disk was not partitioned.

Is there a way to make FreeBSD reread the information about connected
drives? I tried:

# atacontrol list
ATA channel 2:
Master:  no device present
Slave:   no device present
ATA channel 3:
Master:  no device present
Slave:   no device present
ATA channel 4:
Master:  no device present
Slave:   no device present
ATA channel 5:
Master:  no device present
Slave:   no device present

And

# camcontrol devlist
camcontrol: couldn't open /dev/xpt0: No such file or directory

Neither of them helps :(

I would appreciate any hint.

-- 
// cronfy


Re: zfs on 7.3 with 7.2 world

2010-11-15 Thread cronfy
Hello,


> > I want to start using ZFS v13 and I have FreeBSD 7.2 world with 7.3
> > kernel.
> >
> > And if I need to upgrade something in the world - what should it be?
>
> Why do you not update FreeBSD properly? If you want to use 7.3, install
> kernel _and_ world. (I would suggest using 8.1 though.)

If it were my own desktop, I would surely upgrade it as described and
switch to 8.1 too. But this is a production server, so I am trying to keep
changes as minimal as possible and make them only when they are really
required.


-- 
// cronfy


zfs on 7.3 with 7.2 world

2010-11-14 Thread cronfy
Hello,

I want to start using ZFS v13 and I have a FreeBSD 7.2 world with a 7.3
kernel. Do I have to upgrade the zfs/zpool binaries (and maybe some
libraries) to 7.3, or is a recent kernel alone enough to work with
ZFS v13 safely?

And if I need to upgrade something in the world, what should it be?

Thanks.

-- 
// cronfy


fsck reports errors on clean filesystem (mounted rw)

2010-09-10 Thread cronfy
Hello.

I ran fsck on my filesystems while the system was running (partitions
were mounted rw with moderate FS usage). fsck reported errors
(INCORRECT BLOCK COUNT and others). I decided to reboot into single-user
mode and check all filesystems. But in single-user mode fsck did not find
any errors.

 1. Can I be sure my filesystem is consistent?
 2. If fsck reports nonexistent errors (and would probably try to fix
them if asked), isn't it dangerous to run fsck on a running system at
all?
 3. How can I check (not fix) filesystems while the partitions are
mounted rw and in use?

FreeBSD 7.3/kernel, 7.2/world.

Thanks in advance.

-- 
// cronfy


quotaon stucked in 'syncer' state

2010-05-08 Thread cronfy
Hello,

On FreeBSD 7.3-STABLE I have a job in my root crontab that is executed
every night:

5 0 * * *  /usr/sbin/quotaoff -a; /sbin/quotacheck -aug; /usr/sbin/quotaon -a;

Today I found that two quotaon processes are stuck in the 'syncer'
state in top ('D' state in ps). No quotaon/quotaoff can be started
now:

# /usr/sbin/quotaoff -a
quotaoff: /home: Operation already in progress
quotaoff: /home: Operation already in progress

And these processes cannot be killed. Here is the ps/top output:

# ps auwwx | grep 'quot[a]'
root   2462  0.0  0.0  4608   912  ??  D    Thu12AM   0:00.03 /usr/sbin/quotaon -a
root  60450  0.0  0.0  4608   928  ??  D    Fri12AM   0:00.04 /usr/sbin/quotaon -a

# top -b -Uroot 100500 | grep quota
60450 root  1  -40  4608K   928K syncer 4   0:00  0.00% quotaon
 2462 root  1  -40  4608K   912K syncer 5   0:00  0.00% quotaon



Is there any way to get rid of these stuck processes without a reboot?


-- 
// cronfy


Re: User cpu time VS system cpu time

2010-05-06 Thread cronfy
Hello,

>> I want to understand difference between user CPU time and system CPU
>> time in system accounting.
> But keep in mind that "kernel time" is a broad category - while IO time in
> itself does not count as CPU time, file system operations for example do,
> because they really can be CPU intensive.

Ivan, thanks for the great explanation.

I think that I can measure per-user filesystem usage with sa: it reports
the number of I/O operations per user/command. In which other cases is
kernel time charged to a process instead of user time? I do not mean all
of them, just those that usually occur in practice.

I've noticed that there are moments when the system time shown in top is
very high (60-80%, while user time is 15-25%; this also produces a very
high load average). All processes that run during such a period show high
kernel time usage, although they usually do not. The system gets back to
normal after an Apache restart (I think this is somehow related to Apache
shared memory, but I am not sure).

This makes me suspect that system time in sa cannot be relied on when
measuring a user's system usage, because it varies notably under some
circumstances for the same operations. Am I wrong?


-- 
// cronfy


User cpu time VS system cpu time

2010-05-03 Thread cronfy
Hello,

I want to understand the difference between user CPU time and system CPU
time in system accounting.

When a process uses a lot of system CPU time, does it really mean that
the process produces heavy load on the server and takes up resources
that could be used by other tasks instead? Or does it only mean that the
process spends a lot of time waiting for, say, I/O operations?
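
The two counters in question can be observed directly through
getrusage(2). Below is a minimal Python sketch, assuming a Unix system
with the standard `resource` module; the helper names (cpu_times,
burn_user, burn_system, measure) are made up for illustration. Time spent
computing in the process itself is reported as user time, and the work
the kernel does on behalf of its system calls is reported as system time.

```python
import os
import resource
import tempfile


def cpu_times():
    """Return (user_seconds, system_seconds) consumed so far by this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_utime, usage.ru_stime


def burn_user(iterations=200_000):
    """Pure computation: time spent here is charged as user time."""
    total = 0
    for i in range(iterations):
        total += i * i
    return total


def burn_system(calls=2_000):
    """Many small system calls: the kernel's work is charged as system time."""
    with tempfile.TemporaryFile() as handle:
        for _ in range(calls):
            handle.write(b"x")
            handle.flush()              # pushes the byte out with write(2)
            os.fstat(handle.fileno())   # one more system call per iteration


def measure(work):
    """Run work() and return the (user, system) CPU seconds it added."""
    user_before, system_before = cpu_times()
    work()
    user_after, system_after = cpu_times()
    return user_after - user_before, system_after - system_before


if __name__ == "__main__":
    print("computation :", measure(burn_user))
    print("system calls:", measure(burn_system))
```

The exact numbers depend on the machine, so none are shown here. Time a
process spends asleep while waiting for I/O is CPU time of neither kind.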

Thanks in advance!

-- 
// cronfy


Unique process id (not pid) and accounting daemon

2010-01-24 Thread cronfy
Hello.

I am trying to create an accounting daemon that would be more precise
than the usual BSD system accounting. It should read the whole process
tree from time to time (say, every 10 seconds) and log changes in CPU
usage, I/O operations and memory per process. After the daemon notices
that a process has exited, it should read /var/account/acct to get the
last portion of accounting data and make a final entry for the process.
The daemon should also read /var/account/acct to find information about
processes that started and finished between process tree snapshots.

There is a problem: it is not always possible to link a process in the
process tree to the matching record in the accounting file. Only the
command name, user/group id and start time can be compared, but:

 * the start time may change (e.g. after ntpdate);
 * the command name saved in /var/account/acct is 15 characters max
(AC_COMM_LEN in sys/sys/acct.h), while the command name in the process
tree is 19 characters max (MAXCOMLEN in sys/sys/param.h).
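
The ambiguity described above can be sketched in a few lines of Python.
This is only an illustration: the ProcInfo record, the may_be_same()
helper and the clock_slack parameter are hypothetical, and the two length
limits are simply the values quoted in this message.

```python
from typing import NamedTuple

# Limits as quoted in the message above; treat them as assumptions.
ACCT_COMM_MAX = 15   # characters of the command name kept in the acct file
PROC_COMM_MAX = 19   # characters of the command name kept in the process tree


class ProcInfo(NamedTuple):
    comm: str
    uid: int
    gid: int
    start: float   # start time, seconds since the epoch


def may_be_same(live: ProcInfo, acct: ProcInfo, clock_slack: float = 0.0) -> bool:
    """Return True if an accounting record could describe the live process."""
    # Only the part of the name that survives in the acct file can be compared.
    same_name = live.comm[:ACCT_COMM_MAX] == acct.comm[:ACCT_COMM_MAX]
    same_owner = (live.uid, live.gid) == (acct.uid, acct.gid)
    close_start = abs(live.start - acct.start) <= clock_slack
    return same_name and same_owner and close_start


if __name__ == "__main__":
    record = ProcInfo("abcdefghijklmno", 1001, 1001, 100.0)
    first = ProcInfo("abcdefghijklmnopqrs", 1001, 1001, 100.0)
    second = ProcInfo("abcdefghijklmnoXYZ", 1001, 1001, 100.0)
    # Both live processes match the same truncated record.
    print(may_be_same(first, record), may_be_same(second, record))
```

Two different commands that share their first 15 characters, the same
owner and the same start time cannot be told apart this way, and a clock
step larger than clock_slack makes a genuine match fail.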

To ensure that a process in the process tree and a process in the
accounting file are the same, I want to add a unique process identifier
(uint64_t) to the 'proc' struct in sys/sys/proc.h and increment it on
every process fork. I see it is possible to do this just before
sx_sunlock() in fork1() in sys/kern/kern_fork.c. I'll also have to save
this identifier in kern_acct.c, of course.
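
The idea can be illustrated with a toy model in Python. This is not
kernel code: the ProcessTable class, its deliberately tiny pid space and
the fork()/reap() methods are all hypothetical. It only shows why a
counter that is incremented on every fork stays unique while pids are
reused.

```python
import itertools

UINT64_MASK = 0xFFFFFFFFFFFFFFFF


class ProcessTable:
    """Toy model: a small pid space that wraps, plus a never-reused serial."""

    def __init__(self, pid_max=5):
        self.pid_max = pid_max
        self._next_pid = 1
        self._serial = itertools.count(1)
        self.live = {}   # pid -> serial of the process currently holding it

    def fork(self):
        """Allocate the next free pid and a fresh serial number."""
        for _ in range(self.pid_max):
            pid = self._next_pid
            self._next_pid = pid % self.pid_max + 1   # wrap around
            if pid not in self.live:
                serial = next(self._serial) & UINT64_MASK
                self.live[pid] = serial
                return pid, serial
        raise RuntimeError("no free pid")

    def reap(self, pid):
        """Remove an exited process and return its serial."""
        return self.live.pop(pid)


if __name__ == "__main__":
    table = ProcessTable(pid_max=2)
    first = table.fork()
    table.fork()
    table.reap(first[0])
    reused = table.fork()
    print(first, reused)   # same pid, different serial
```

A daemon that remembers the serial instead of the pid cannot confuse the
exited process with the one that later reuses its pid.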

This way it will be extremely easy to remember a process in the process
tree and find the matching one in the accounting file after it finishes.

Am I looking in the right direction, or should I try some other way?
Thanks in advance.

--
// cronfy


Unique id of a process (not pid)

2010-01-21 Thread cronfy
Hello,

Is there any unique identifier of a process in FreeBSD (not PID)?

I am trying to get the list of processes and watch for changes
with kvm_getprocs(). I want to catch every process start and exit (except
for processes that were started and finished between calls to
kvm_getprocs()).

But between calls to this function a process may exit and be replaced by
another process with the same pid and the same command name. The only
difference is the start time of the processes. That looks like a
solution, but the process start time may change if the system time was
shifted (e.g. with ntpdate). I can track these shifts too, but that looks
too complex.
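
The start-time approach described above can be sketched as a comparison
of two snapshots. This Python sketch assumes each snapshot has already
been reduced to a dict mapping pid to start time; diff_snapshots() is a
hypothetical helper and does not call kvm_getprocs() itself.

```python
def diff_snapshots(previous, current):
    """Compare two snapshots given as {pid: start_time} dicts.

    Returns (started, exited): sorted lists of (pid, start_time) pairs that
    appear only in the current or only in the previous snapshot.
    """
    previous_keys = set(previous.items())
    current_keys = set(current.items())
    started = sorted(current_keys - previous_keys)
    exited = sorted(previous_keys - current_keys)
    return started, exited


if __name__ == "__main__":
    before = {100: 10.0, 200: 11.0}
    after = {100: 10.0, 200: 25.0, 300: 26.0}
    started, exited = diff_snapshots(before, after)
    print("started:", started)   # pid 200 was reused, pid 300 is new
    print("exited: ", exited)    # the old holder of pid 200 is gone
```

Because the pair (pid, start time) is the key, a reused pid shows up as
one exit plus one start. If the reported start times shift between
snapshots, every process looks replaced, which is exactly the weakness
mentioned above.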

Is there any simpler way to identify a process? Thanks in advance.

-- 
// cronfy


Re: Max kernel dump size

2009-12-28 Thread cronfy
> > How can I calculate max kernel dump size? I want to create my swap
> > partition as small as possible, just to fit kernel dump needs.
>
> I'm not sure you really can.  You'll definitely have enough if you allow
> a bit more than you have memory, but these days that's going to be
> overkill most of the time.

Yes, at the moment I use the formula SWAP = RAM + 1G. And yes, this is
overkill, especially on expensive SAS drives. I've noticed that my kernel
dumps usually do not exceed 2-3.5G.

Maybe I can collect stats on the amount of Active memory used and assume
that a kernel dump will not get larger than, say, Active memory + 50%?
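
The two sizing rules mentioned here are simple arithmetic. A small Python
sketch, with hypothetical function names and made-up sample values, shows
how they compare. It computes the estimates only and says nothing about
whether the second rule is a safe upper bound for a real dump.

```python
GIB = 1024 ** 3


def swap_ram_plus_margin(ram_bytes, margin_bytes=GIB):
    """Current rule from this message: SWAP = RAM + 1G."""
    return ram_bytes + margin_bytes


def swap_from_active(active_samples, headroom=0.5):
    """Proposed rule: peak observed Active memory plus 50% headroom."""
    peak = max(active_samples)
    return int(peak * (1 + headroom))


if __name__ == "__main__":
    ram = 16 * GIB                                    # example machine
    samples = [2 * GIB, 3 * GIB, 2 * GIB + GIB // 2]  # made-up Active readings
    print(swap_ram_plus_margin(ram) / GIB, "GiB with RAM + 1G")
    print(swap_from_active(samples) / GIB, "GiB with Active + 50%")
```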



-- 
// cronfy


Max kernel dump size

2009-12-26 Thread cronfy
Hello everybody.

How can I calculate the maximum kernel dump size? I want to make my swap
partition as small as possible, just big enough to hold a kernel dump.

Thanks.

-- 
// cronfy


Re: FreeBSD is too filesystem errors sensitive

2009-12-21 Thread cronfy
>> After panic data *is* getting corrupted anyway - MySQL tables that were
>> open are broken, soft-updates are unsync'ed etc etc.
> If it's an option for you, you may want to look into disabling soft
> updates as well so that you don't have to just hope that everything gets
> synced before the end of the world.  Depending on your usage, however,
> this might result in unacceptably poor performance.

I am thinking about it. I am using a RAID controller with a battery and
write cache enabled, but I have not tested the performance of
soft updates vs. no soft updates + write cache. Perhaps someone has
done this already?


-- 
// cronfy


Re: FreeBSD is too filesystem errors sensitive

2009-12-08 Thread cronfy



>>> panics like 'freeing free block' or 'ffs_valloc: dup alloc'
>>
>> Is there a way to say "Dear kernel, don't panic, I'm holding your
>> hand, keep working please-please-please?" If so, could it really lead
>> to complete filesystem corruption, or is it not that serious?
>
> Afaik you can't do this. And you shouldn't do it even if it were
> possible. The file system errors you mention above should not happen
> under any normal circumstances. They may happen after a crash caused by
> other reasons but should get repaired by fsck. The kernel cannot
> continue with such errors because the whole file system metadata cannot
> be trusted anymore until repaired.



Thanks.

What I can definitely state is that after a reboot nothing will get any
better. I will have the same filesystem with the same errors, plus new
errors that appeared because soft updates were not synced, and I will
have fsck running in the background. I'd prefer to just start fsck in the
background, skipping that annoying reboot phase ;-) Is that a strange
thing to want?





Re: FreeBSD is too filesystem errors sensitive

2009-12-08 Thread cronfy



>> ...
>> Is there a way to say "Dear kernel, don't panic, I'm holding your hand,
>> keep working please-please-please?" If so, could it really lead to
>> complete filesystem corruption, or is it not that serious?
>
> Drop to DDB, fix it, and 'continue'?

If I type 'continue', the kernel says 'Dumping... rebooting...'. What
magic am I missing that you probably meant by "fix it"?




Re: FreeBSD is too filesystem errors sensitive

2009-12-08 Thread cronfy



>>> .. but why the hell is it required to panic and kill everything
>>> that would be working happily even if something very disastrous
>>> happens to the /backup partition, for example?
>>
>> All those errors indicate file system corruption. To protect other data
>> from getting corrupted (e.g. by invalid pointers or calculations), the
>> kernel panics.
>
> ...and (hopefully) reboots, determines that there were filesystem
> errors, and attempts to correct them with fsck(8)

Why isn't it possible to do the same without a reboot?



Re: FreeBSD is too filesystem errors sensitive

2009-12-08 Thread cronfy


>> Please forgive me for what is probably a very stupid question. But why
>> is FreeBSD so sensitive to filesystem errors that it ends up with
>> panics like 'freeing free block' or 'ffs_valloc: dup alloc'? I just
>> can't get it. Failed to allocate a vnode? Go allocate another one!
>> Freeing a free block? Leave it free then! I understand these situations
>> should never happen, but why the hell is it required to panic and kill
>> everything that would be working happily even if something very
>> disastrous happens to the /backup partition, for example?
>
> Probably because UFS is not designed to be a backup file system but a
> working one :)
>
> All those errors indicate file system corruption. To protect other
> data from getting corrupted (e.g. by invalid pointers or
> calculations), the kernel panics.


To protect us against terrorists our government does strange things too ;-)

After a panic data *is* getting corrupted anyway: MySQL tables that were
open are broken, soft updates are unsynced, etc.
The server has to reboot and run fsck, and time is wasted while this
happens. Why should all this happen because of a single vnode failure?
Why not just log a message to /var/log/messages, return "oh, I failed to
save a file" to the process that initiated the operation and just go on?
Are the consequences of an attempt to "free an already free block" *so*
dangerous that it is necessary to give up on EVERYTHING? Let's say it was
not the /backup partition, ok, it was /var/tmp/some-php-session or even
a /var/cron/tabs/someuser file that failed. So what? Even
/boot/kernel/kernel corruption is not critical if you are not going to
reboot right now (or if you have /boot/kernel.old :)


Is there a way to say "Dear kernel, don't panic, I'm holding your hand,
keep working please-please-please?" If so, could it really lead to
complete filesystem corruption, or is it not that serious?


Thanks.


FreeBSD is too filesystem errors sensitive

2009-12-07 Thread cronfy

Hello.

Please forgive me for what is probably a very stupid question. But why is
FreeBSD so sensitive to filesystem errors that it ends up with panics
like 'freeing free block' or 'ffs_valloc: dup alloc'? I just can't get
it. Failed to allocate a vnode? Go allocate another one! Freeing a free
block? Leave it free then! I understand these situations should never
happen, but why the hell is it required to panic and kill everything
that would be working happily even if something very disastrous happens
to the /backup partition, for example?


I would really appreciate it if someone could explain that... thanks.


nice for disk I/O

2009-11-27 Thread cronfy

Hello.

It is well known that nice changes the CPU scheduling priority of a
process. But is there something that would tune the disk I/O priority for
a particular process? Thanks in advance.


--
cronfy


2 processes reproducible read same file with different speed

2009-11-26 Thread cronfy

Hello.

I've noticed very weird behavior of 2 Apache processes that should read
the same file to process a request (they are configured to read it on
every request). One spends about 6ms reading the file, and the second
spends about 114ms (I used ktrace to find this out). The problem is
reproducible every time, on every request. The Apaches are the same; the
only difference between them is that they run as different users to
serve different sites. Same binary, same config.


The first Apache used to behave the same way some time ago: it spent
~120ms reading the file. But at some point it changed, and now it is
fast.


Restarting Apache does not seem to affect anything.

The file that Apache reads is 315k long. Apache reads it in small blocks
of 4096 bytes each. Maybe FreeBSD keeps some memory of how a process
works with files and after some time enables some optimization or
caching? I just do not have a clue... :(


Can anyone explain this, please?




FreeBSD 7.2 Fatal trap 9 - general protection fault while in kernel mode

2009-11-20 Thread cronfy

Hello,

I get "Fatal trap 9: general protection fault while in kernel mode" with
FreeBSD 7.2 and a kernel csup'ed and built on 22 Oct using
standard-supfile. How can I find out what the problem is?


Message:

Fatal trap 9: general protection fault while in kernel mode
cpuid = 11; apic id = 13
instruction pointer     = 0x8:0x802a65c1
stack pointer           = 0x10:0x79d75380
frame pointer           = 0x10:0x79d753a0
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 114 (php)

Backtrace:

db> bt

Tracing pid 114 tid 100403 td 0xff00452ec370
devstat_start_transaction() at devstat_start_transaction+0x11
g_io_request() at g_io_request+0x11f
breadn() at breadn+0xd3
bread() at bread+0x1e
ffs_vgetf() at ffs_vgetf+0x2dc
ufs_root() at ufs_root+0x21
lookup() at lookup+0x981
namei() at namei+0x33e
kern_statfs() at kern_statfs+0x60
statfs() at statfs+0x2a
syscall() at syscall+0x256
Xfast_syscall() at Xfast_syscall+0xab
--- syscall (396, FreeBSD ELF64, statfs), rip = 0x8022ade1c, rsp = 0x7fffc528, rbp = 0x802536bb8 ---



Kernel config:

The GENERIC config was changed: I disabled options for hardware that I
do not need on this server and added some options:


options QUOTA
options KDB
options DDB

I am using the aacu RAID driver from Adaptec's site:

aacu0: Adaptec 2405, aac driver 2.2.8-17517



What can it be? Soft-updates? aacu driver problem? Something else? Any 
help would be appreciated.




Re: get accounting info for running process

2009-11-19 Thread cronfy


> Is it possible to find out how much CPU user time/system time/IO
> operations a process has used so far, by its pid? Like in sa, but for
> a running process.



Dan, Mel, thanks for your answers. I examined the 'ps' sources and
decided to use kvm_getprocs() and the rusage structure.


I am trying to create a daemon that would report system accounting stats
every X seconds, let's say 10. 'sa' reports only on terminated
processes, but it would be nice to have more detailed system usage
stats per user for a given time interval (e.g. the last 10 seconds),
including tasks that have not finished at the moment of querying.


I can achieve this by querying the list of processes every 10 seconds,
producing diffs between the previous and current lists, saving these to
a log and combining the data with the /var/account/acct file.
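
The diffing step can be sketched in Python as follows. The snapshot
layout, a dict mapping (pid, start time) to (uid, user seconds, system
seconds), and the usage_delta_by_user() helper are assumptions made for
this illustration; filling such snapshots from kvm_getprocs() is not
shown.

```python
from collections import defaultdict


def usage_delta_by_user(previous, current):
    """Sum per-user CPU time consumed between two process snapshots.

    Each snapshot maps (pid, start_time) -> (uid, user_sec, system_sec).
    Returns {uid: [user_delta, system_delta]} for the interval.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for key, (uid, user_sec, system_sec) in current.items():
        # A process missing from the previous snapshot started in this
        # interval, so all of its time so far belongs to the interval.
        _, old_user, old_system = previous.get(key, (uid, 0.0, 0.0))
        totals[uid][0] += user_sec - old_user
        totals[uid][1] += system_sec - old_system
    return dict(totals)


if __name__ == "__main__":
    before = {(1, 10.0): (1001, 1.0, 0.5)}
    after = {(1, 10.0): (1001, 1.5, 0.75), (2, 20.0): (1002, 0.25, 0.25)}
    print(usage_delta_by_user(before, after))
```

Processes that exited during the interval are absent from the current
snapshot, so their final usage would still have to come from the
accounting file.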


The only thing I do not want to do is reinvent the wheel ;-) I googled
a lot for such solutions, but did not find any. Maybe someone knows of
an existing product that already has this functionality?


Thanks in advance.


get accounting info for running process

2009-11-18 Thread cronfy

Hello.

Is it possible to find out how much CPU user time/system time/IO
operations a process has used so far, by its pid? Like in sa, but for
a running process.


Thanks in advance.
