RE: bad RAM? prove it with a crash dump?

2010-05-06 Thread Nate Eldredge

On Thu, 6 May 2010, Andrew Duane wrote:

It is also useful to make sure that the garbage itself is different. As 
mentioned before, a single bit error in an otherwise valid value, or 
maybe a missing/scrambled byte, these are good indications of memory 
problems. If random places are often overwritten with something else, 
that could just be another piece of misbehaving code that is writing 
someplace it shouldn't. I've often found code that writes some buffer 
into e.g. a piece of memory it no longer owns that looks like memory 
corruption until you realize the garbage is always something specific 
like a vnode structure.
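One way to apply the "single bit error in an otherwise valid value" test is to XOR the value you expected with the value found in the dump and count the bits that differ. The following is only a sketch; the function names are made up for illustration and nothing here comes from a real crash-dump tool.

```c
#include <stdint.h>

/* Count the bits that differ between the expected and observed values. */
int differing_bits(uint64_t expected, uint64_t observed)
{
    uint64_t diff = expected ^ observed;
    int count = 0;

    while (diff != 0) {
        diff &= diff - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Exactly one differing bit is what a single flipped memory cell looks like. */
int looks_like_bit_flip(uint64_t expected, uint64_t observed)
{
    return differing_bits(expected, observed) == 1;
}
```

A value that differs from the expected one in many bits, or that looks like part of some recognizable structure, points toward stray writes by software rather than a bad cell.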


There are trickier things too.  I once had a machine with bad cache memory 
where once in a while you would get a cache line that had come from 
somewhere else in memory.  This was particularly vexing when it happened 
to an I/O buffer, and I wound up with a large zip file that had 32 bytes 
of libc.so somewhere in the middle... :-(


And of course, swapping out the RAM wouldn't have fixed it.

--

Nate Eldredge
n...@thatsmathematics.com
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Modifying ELF files

2010-04-08 Thread Nate Eldredge

On Thu, 8 Apr 2010, Patrick Mahan wrote:



In my job, we are producing applications and KLM's for our product
that require them to be signed so that our installer will recognize
and validate our images.

The signature is stored in each app as

unsigned char signature[40] __attribute__((section(".compsign")));

What I need to do is open the file for writing, locate the .compsign
section and stuff in the signature, write it out and close the file.
(simple ELF manipulation)

An 'ls -l' shows the following:

% ls compklm.ko
-rw-r--r--  1 pmahan  pmahan  125296 Apr  6 22:50 /home/pmahan/temp/compklm.ko


When I try to run my program
./signfile --signature=A203239897C8EB360D1EB2C84E8E77B16E5B7C9A compklm.ko
open: Text file busy

Googling and looking at the kernel sources, it seems that it detects
this file contains 'shared text', that is, it is an executable file
and does not allow me to open it for writing.


My understanding was that ETXTBSY occurs when you attempt to open for 
writing a file which is actually being executed, i.e. is mapped into some 
process.  I'm not aware that open(2) actually looks at the file itself to 
see if it is an executable; that would be very surprising to me.


What does fstat -m compklm.ko say?

What happens if you cp compklm.ko foo.ko and try to sign foo.ko?  You 
should then be able to do mv foo.ko compklm.ko; if compklm.ko is 
in fact mapped into some process, it will continue to use the original 
version, which will be kept around (invisibly) until all mappings go away. 
This is what compilers, install(8), etc, normally do.
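The write-elsewhere-then-rename pattern described above can be sketched in C roughly as follows. The function name and the idea of passing the temporary path explicitly are assumptions made for the example; the temporary file has to live on the same filesystem as the target for rename(2) to work.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Replace the contents of `path` by writing a new file at `tmp_path`
 * and renaming it into place.  Returns 0 on success, -1 on failure.
 */
int replace_via_rename(const char *path, const char *tmp_path,
                       const void *data, size_t len)
{
    const char *p = data;
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            close(fd);
            unlink(tmp_path);
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    if (close(fd) < 0) {
        unlink(tmp_path);
        return -1;
    }
    /* rename(2) swaps the directory entry; the old file stays alive
       for any process that still has it open or mapped. */
    if (rename(tmp_path, path) < 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

Because the original file is never opened for writing, this path cannot hit ETXTBSY even if the old copy is in use.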


Does your signfile program do anything with the target file before 
open(..., O_RDWR)?


--

Nate Eldredge
n...@thatsmathematics.com


Re: namei() returns EISDIR for / (Re: svn commit: r203990 - head/lib/libc/sys)

2010-03-01 Thread Nate Eldredge

On Sun, 28 Feb 2010, Garrett Cooper wrote:


On Sun, Feb 28, 2010 at 5:11 PM, Alexander Best alexbes...@wwu.de wrote:

i have a small test app to check {rm|mk}dir()'s errnos with certain args like
/, ., /proc and non-empty dirs. i'll submit it to this thread as soon as i
also add testcases for syscalls like rename(), unlink(), etc.

most of the errno codes returned after applying your patch look correct. i
wonder however why rmdir("/proc") returns EACCES as unprivileged user.
wouldn't it make more sense to also return EBUSY? why complain about
permission related matters when even root won't be able to perform the
operation.


Hmm.. good question. POSIX doesn't fully expound on this case
(http://www.opengroup.org/onlinepubs/009695399/functions/rmdir.html),
and either seem possible...


At:
http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_03.html#tag_02_03
we have

If more than one error occurs in processing a function call, any one of 
the possible errors may be returned, as the order of detection is 
undefined.


So we're okay standard-wise.

In general, though, I'd think it makes sense to do permissions checks 
before anything else, because in some cases the error code can leak 
information.  For instance, if you try to open() a nonexistent file in a 
directory for which you don't have search permission ('x' bit), it's very 
important that open() fail with EACCES instead of ENOENT, since you aren't 
supposed to be able to find out whether or not the file exists. 
Obviously that doesn't apply in this case, because anyone is entitled to 
know that /proc is the root of a mounted filesystem, but it seems to me 
that it's a good habit to check permission first.
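The information-leak point can be observed with a small experiment: open a nonexistent file inside a directory, first with search permission and then without it. This is a sketch with made-up function names and a throwaway directory under /tmp; a process running as root normally bypasses the permission check and will see ENOENT both times.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Try to open `path` read-only; return 0 on success, else the errno. */
int open_errno(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}

/*
 * Probe a nonexistent file in a scratch directory, with and without
 * search permission on that directory.  Returns 0 if the probe ran.
 */
int probe_search_permission(int *with_x, int *without_x)
{
    char dir[] = "/tmp/probe_XXXXXX";
    char path[64];

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof path, "%s/no-such-file", dir);

    *with_x = open_errno(path);       /* directory is searchable */
    if (chmod(dir, 0000) < 0) {
        rmdir(dir);
        return -1;
    }
    *without_x = open_errno(path);    /* search permission removed */

    rmdir(dir);
    return 0;
}
```

For an unprivileged caller the second result should be EACCES, which tells the caller nothing about whether the file exists.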


--

Nate Eldredge
n...@thatsmathematics.com


RE: ntpd hangs under FBSD 8

2010-02-24 Thread Nate Eldredge

On Mon, 22 Feb 2010, Peter Steele wrote:

Just out of curiosity, can you attach to the process via gdb and get a 
backtrace? This smells like a locked pthread_join I hit in my own code 
a few weeks ago


I'm not using the debug version of ntpd so the backtrace isn't too 
useful, but here's what I get:


(gdb) bt
#0  0x000800d52bfc in select () from /lib/libc.so.7
#1  0x00425273 in ?? ()
#2  0x0040540e in ?? ()
#3  0x00080058 in ?? ()
#4  0x in ?? ()


I bet ntpd doesn't call select() in all that many places.  Instead of 
going to all this trouble to build a debugging libc, you could just grep 
for select() and place breakpoints on all occurrences.  (It might also be 
obvious from looking at them which one is the offender.)


Also, since a system call is causing the trouble, you might learn 
something from truss or ktrace.


--

Nate Eldredge
n...@thatsmathematics.com


Re: NFS write corruption on 8.0-RELEASE

2010-02-12 Thread Nate Eldredge

On Fri, 12 Feb 2010, Dmitry Marakasov wrote:


* Oliver Fromme (o...@lurza.secnetix.de) wrote:


This is an excerpt from Solaris' mount_nfs(1M) manpage:

File systems that are mounted read-write or that  con-
tain  executable  files  should always be mounted with
the hard option.  Applications using soft mounted file
systems  may incur unexpected I/O errors, file corrup-
tion, and unexpected  program  core  dumps.  The  soft
option is not recommended.

FreeBSD's manual page doesn't contain such a warning, but
maybe it should.  (It contains a warning not to use soft
with NFSv4, though, for different reasons.)


Interesting, I'll try disabling it. However now I really wonder why
is such dangerous option available (given it's the cause) at all,
especially without a notice. Silent data corruption is possibly the
worst thing to happen ever.


Tell me about it. :)

But in this case I'm not sure I understand.  As I understand it, the 
difference between soft and hard is that in the case of soft, a timeout 
will result in the operation failing and returning EIO or the like (hence 
unexpected I/O errors).  And if the operation is being done to fault in 
a mapped page, you'd have to notify the process asynchronously by sending 
a signal like SIGBUS which it may not be expecting (hence unexpected core 
dumps).  But in what scenario would you see file corruption?  Unless you 
have a buggy program that doesn't check return values from system calls or 
handles signals in a stupid way, I don't see how this can happen, and I'm 
not sure what the Sun man page is referring to.
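For reference, "checking return values" for a write looks something like the sketch below. The helper name is made up; the point is only that a failed or short write is noticed and reported instead of being silently dropped, so an error such as EIO from a timed-out soft mount reaches the caller.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Write all `len` bytes or report failure.  Returns 0 on success and
 * -1 on error, with errno left as set by write(2).
 */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before writing anything; retry */
            return -1;      /* real error: do not pretend the data landed */
        }
        p += n;             /* short write: carry on with the remainder */
        len -= (size_t)n;
    }
    return 0;
}
```

A program that ignores the -1 here and keeps going is the kind of buggy program that could end up with a corrupted file after an I/O error.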


--

Nate Eldredge
n...@thatsmathematics.com


Re: ptrace bug or feature?

2010-01-17 Thread Nate Eldredge

On Sun, 17 Jan 2010, Kostik Belousov wrote:


It may be a missed feature, not a bug. There is obvious hack value
in ability to modify syscall arguments from the debugger.

Do you know whether other operating systems allow this ?


Linux does, I've used it.

--

Nate Eldredge
n...@thatsmathematics.com


Re: Suggestion: rename killall to fkill, but wait five years to phase the new name in

2009-12-22 Thread Nate Eldredge

On Tue, 22 Dec 2009, Craig Small wrote:


I also agree with Daniel; why would anyone want to literally kill every
process?


AFAIK, it's a helper program for shutdown(8) (or shutdown(1M) as they call 
it) and isn't really intended to be useful otherwise.


--

Nate Eldredge
n...@thatsmathematics.com


Re: Email sent from at command going to the wrong account

2009-12-14 Thread Nate Eldredge

On Mon, 14 Dec 2009, Holger Kunst wrote:


Hi,

The at command sends an email with the output of the scheduled job. I've 
experienced inconsistent results when running jobs, receiving emails in 
accounts not associated with the user currently logged in.


To reproduce in FreeBSD 7.2-RELEASE-p2

Case #1
login as user a (new shell through ssh)
echo echo 1 | at now
-- user a will receive an email containing 1 - this is as expected

Case #2
login as user a (new shell through ssh)
login as user b


How are you accomplishing this?


exit
echo echo 1 | at now
-- user b will receive an email containing 1 - this is not as expected, 
since I am user a again


A look at the source for at reveals that at is getting the mailname from 
getlogin(). Running a small test program that outputs getlogin(), confirms 
the above behavior: A log-in and out of another account makes getlogin() 
return that account's name, even though the shell has been closed and we are 
back to the original shell and the original user a.


Is this the intended behavior? Any hints would be appreciated.
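A test program of the kind mentioned above might look like this sketch, which prints getlogin() next to the passwd name of the real uid so the two can be compared. The function name is made up, and this is not the poster's actual program.

```c
#include <pwd.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Compare getlogin() with the passwd name of the real uid.
 * Returns 1 if they match, 0 if they differ, -1 if either is unavailable.
 */
int login_matches_uid(void)
{
    const char *login = getlogin();
    struct passwd *pw = getpwuid(getuid());

    if (login == NULL || pw == NULL)
        return -1;
    printf("getlogin: %s, uid name: %s\n", login, pw->pw_name);
    return strcmp(login, pw->pw_name) == 0;
}
```

In the situation described, the two names would be expected to differ after the detour through the second account.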


--

Nate Eldredge
n...@thatsmathematics.com


Re: Superpages on amd64 FreeBSD 7.2-STABLE

2009-12-10 Thread Nate Eldredge

On Thu, 10 Dec 2009, Linda Messerschmidt wrote:


Also...

On Thu, Dec 10, 2009 at 9:50 AM, Bernd Walter ti...@cicely7.cicely.de wrote:

I use fork myself, because it is easier sometimes, but people writing
big programs such as squid should know better.
If squid doesn't use vfork they likely have a reason.


Actually they are probably going to switch to vfork().  They were
previously not using it because they thought there was some ambiguity
about whether it was going to be around long term.


Well, the worst that would likely happen to vfork() is it would become an 
alias of fork(), and you'd be back to where you are now (or better if 
fork() were fixed in the meantime).  I'd be more worried about the 
mysterious bugs which it's so easy to introduce with vfork() if you do 
anything at all nontrivial before exec() and accidentally touch the 
parent's memory.


What about using posix_spawn(3)?  This is implemented in terms of 
vfork(), so you'll gain the same performance advantages, but it avoids 
many of vfork's pitfalls.  Also, since it's a POSIX standard function, you 
needn't worry that it will go away or change its semantics someday.
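A minimal posix_spawn sketch, for the simple case of running a helper and waiting for it, could look like the following. The wrapper name is made up, and it uses posix_spawnp so that the program is looked up on PATH.

```c
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Run argv[0] (looked up on PATH) and wait for it.  Returns the child's
 * exit status, or -1 if it could not be spawned or did not exit normally.
 */
int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawnp reports failure through its return value, not errno. */
    if (posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ) != 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Nothing runs in the child between creation and exec except what the library does itself, so there is no opportunity to touch the parent's memory by accident.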



I actually am not a huge fan of vfork() since it stalls the parent
process until the child exec()'s.


If you're doing so much work between vfork() and exec() that this delay is 
significant, then I would think you're really abusing vfork().



To me, this case actually highlights why that's an issue.  If the
explanation is that stuff is happening in the parent process between
fork() and the child's exec() causes the fragmentation, that's stuff
that would be deferred in a vfork() regime, with unknown potential
consequences.  (At a minimum, decreased performance.)


Not necessarily.  In the fork() case, presumably copy-on-write is to blame 
for the fragmentation.  In the vfork() case, there's no copy at all.


--

Nate Eldredge
n...@thatsmathematics.com


Re: ucred when euid/egid

2009-11-29 Thread Nate Eldredge

On Sun, 29 Nov 2009, Clifton Royston wrote:


On Sun, Nov 29, 2009 at 01:19:02PM +0300, Anthony Pankov wrote:


Thank you for reply.

So, seteuid/gid isn't enough to gain group access as for real uid.
But how i can achieve this? What functions should i call from
'theprog' to gain access for the groups euid user belongs to?

May be i solve the problem in wrong way?

The full problem is:

There is a file owned by group filegroup:
 rw-rw   someone:filegroup   thefile

There is a program's data file owned by group proggroup:

 rw-rw   someone2:proggroup   progdata

I need a program (theprog) that can access 'thefile' and
'progdata' simultaneously. Program can be executed by anyone.


This is a clearer statement of the problem, in terms of what you're
trying to accomplish.

If you can make the program data owned by a special program user, and
require the users of the program to make their files group-accessible
by this special filegroup, then you can do it fairly simply, like this:

Make each user's thefile be owned by group filegroup, for example:
 rw-rw   someone:filegroup    ~someone/thefile
 rw-rw   someone2:filegroup   ~someone2/thefile
 rw-rw   someone3:filegroup   ~someone3/thefile
 ...

Make the program's data file owned by *user* proguser:
 rw-rw   proguser:proggroup   progdata

Now you can make the program setuid proguser/setgid filegroup:
 r-sr-sr-x   proguser:filegroup   theprog

This lets it be executed by any user and access its own data (via the
suid) and the files the users have put into filegroup (via the sgid).


If you can't make progdata owned by proguser, or if more groups are 
needed, you might be able to abuse newgrp(1), which will let you run a 
program with your real and effective gids set to any specified group of 
which your real uid is a member.  This would require, though, that you 
break the code that requires access to those files into separate programs. 
(Though maybe they are as simple as cat'ing a file into a pipe or 
something.)


Example:

setuid(proguser);
FILE *data = popen("echo \"cat progdata\" | newgrp proggroup", "r");
/* read data */

etc.

If your program needs to do something really elaborate with the files that 
can't be factored out into a separate program, you could use newgrp to run 
a program that opens the file and passes its fd over a unix socket.  But 
then it's really becoming a hack. :)


Caution: I haven't tested any of this.
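For the "pass its fd over a unix socket" idea, the usual mechanism is an SCM_RIGHTS control message. The sketch below shows only the two ends of that transfer; the function names are made up, and how the helper gets started (newgrp or otherwise) is left out.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send the open descriptor `fd` across the unix-domain socket `sock`. */
int send_fd(int sock, int fd)
{
    char byte = 0;                      /* at least one data byte must go along */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr hdr;             /* forces correct alignment */
        char buf[CMSG_SPACE(sizeof(int))];
    } ctl;
    struct msghdr msg;
    struct cmsghdr *cm;

    memset(&ctl, 0, sizeof ctl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive a descriptor sent by send_fd(); returns it, or -1 on failure. */
int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctl;
    struct msghdr msg;
    struct cmsghdr *cm;
    int fd;

    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS ||
        cm->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;
}
```

The receiver ends up with its own descriptor for the same open file, so it can read the data even though it could not have opened the file itself.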

--

Nate Eldredge
n...@thatsmathematics.com


Re: [patch] burncd: honour for envar SPEED

2009-11-10 Thread Nate Eldredge

On Tue, 10 Nov 2009, Alexander Best wrote:


ps: would be nice if strcasecmp could protect itself from segfault with one or
both of the args being NULL.


I disagree.  What do you think it should do instead?  Return 0?  If it 
did, would you have found your bug?


The same argument could be made for any of the string.h functions, but I 
don't think it actually holds water.  Such checks add overhead, and only 
provide an illusion of safety.  Sure, strcasecmp could avoid causing the 
segfault itself, but at the cost of letting a broken program continue and 
possibly cause more damage.  It could call abort(), but then you'd just 
have the same result (program terminates) with a different signal, and 
doing your check in software rather than letting the MMU hardware do it. 
It could print a message, but that pollutes the program's output, and 15 
seconds debugging the core dump will reveal the problem anyway.


Having a library function protect itself in this manner is not actually 
helpful, IMHO.
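The alternative is for the caller to decide what a missing value means before calling into the library. The sketch below assumes the string came from something like getenv("SPEED"), which returns NULL when the variable is unset; the function name and the "max" keyword are illustrative assumptions, not taken from the burncd patch.

```c
#include <stddef.h>
#include <strings.h>

/*
 * Return 1 if the (possibly unset) value asks for maximum speed.
 * The NULL test lives in the caller, where "unset" has a meaning.
 */
int speed_is_max(const char *value)
{
    if (value == NULL)
        return 0;           /* variable not set: use the default speed */
    return strcasecmp(value, "max") == 0;
}
```

Here the caller knows that NULL means "not set" and can pick a sensible default, which is something strcasecmp itself has no way of knowing.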


--

Nate Eldredge
n...@thatsmathematics.com


Re: help needed to fix contrib/ee crash/exit when receiving SIGWINCH

2009-10-22 Thread Nate Eldredge

On Fri, 23 Oct 2009, Antony Mawer wrote:


On Fri, Oct 23, 2009 at 1:35 PM, Alexander Best
alexbes...@math.uni-muenster.de wrote:

hi everyone,

together with hugh mahon (the author of ee) i've been trying to fix a nasty
bug in ee. for some reason ee exits (not crashes) and leaves the console
corrupted when receiving SIGWINCH (`killall -SIGWINCH ee` should exit all
running ee instances).


I noticed this the other day when working on a new 8.0-RC1 system...
in my case I was using putty (Windows ssh client) to access the system
and maximised the window I had ee running in, and noticed ee just
dumped me straight to the prompt.


Seems a good start might be to compile ncurses with -g, link ee against 
it, put a breakpoint on the SIGWINCH handler, and start single stepping...


--

Nate Eldredge
n...@thatsmathematics.com


Re: mmap(2) segfaults with certain len values and MAP_ANON|MAP_FIXED

2009-10-20 Thread Nate Eldredge

On Wed, 21 Oct 2009, Alexander Best wrote:


hi there,


This is on a 32-bit platform I take it?


just a little mmap(2) related question. running the following code causes a
segfault:

mmap( (void*)0x1000, 0x80047000, PROT_NONE, MAP_ANON|MAP_FIXED, -1, 0 );


I don't doubt it.  You mapped over a big chunk of your address space with 
memory that's inaccessible (PROT_NONE).  This probably includes your 
program's code.  So when the mmap call returns from the kernel and tries 
to execute the next instruction of your program, it finds that the 
instruction pointer is pointing to inaccessible memory.  Result: segfault. 
This is quite normal.


What are you actually trying to accomplish with this?


while the following doesn't:

mmap( (void*)0x1000, 0x, PROT_NONE, MAP_ANON|MAP_FIXED, -1, 0 );


Did you check whether the mmap actually succeeded?  I bet it didn't.  You 
have a length that isn't a multiple of the page size and wraps around 32 
bits.  I bet you got an EINVAL, and the mmap call didn't actually do 
anything.



is this a known problem? seems reproducible on all branches.


Not a problem at all, I suspect.

--

Nate Eldredge
n...@thatsmathematics.com


Re: Running a program through gdb without interfering

2009-10-09 Thread Nate Eldredge

On Fri, 9 Oct 2009, Mel Flynn wrote:


On Friday 09 October 2009 11:38:29 Dag-Erling Smørgrav wrote:

Mel Flynn mel.flynn+fbsd.hack...@mailing.thruhere.net writes:

is there a way to have a program run through gdb and gdb only record a
segfault, but otherwise let the program run?


Yes, just run gdb /path/to/program and type run.


Not what I was looking for. The segfaults are random and the only way to
somewhat reliably reproduce it is to have portmaster invoke it as its
PM_SU_CMD. And no, running that same command again doesn't trigger the
segfault, so it's something environmental. Hence I'm looking for something
like:
gdb -batch -x script_with_run_cmd.gdb -exec /usr/local/bin/sudo $argv

where somehow I need $argv to be passed as arguments to sudo. I'm thinking i
should just wrap it and mktemp(1) a new command script for gdb to use with set
args $*, but if anyone has a more clever idea, I'd love to hear it.


This won't work.  You can't debug setuid programs (for reasons which 
should be obvious).  You could do it if you ran everything as root, but it 
sounds like the bug doesn't occur in that case.


--

Nate Eldredge
n...@thatsmathematics.com

Re: genuine cpu I386_CPU kernel support

2009-09-23 Thread Nate Eldredge

On Wed, 23 Sep 2009, John Baldwin wrote:


On Wednesday 23 September 2009 1:21:59 pm Julian Elischer wrote:

John Baldwin wrote:

Other things added since then assume at least a 486.  Not having cmpxchg is a
bit of a killer.


I think a 386 can assume non-SMP in which case that can be simulated
just fine :-)
  it also simplifies a lot of the other breakages..

#if (CPU == 80386) && defined(SMP)
#error can't have smp on a 386
#endif


No, it actually does not.  The in-kernel version of cmpset for 386 was to
disable interrupts while doing a cmp and jmp around a mov (even 386's have
preemption, so you do have to disable interrupts).  You can't do that in
userland (cli is a privileged instruction), which probably mandates doing a
cmpxchg emulator in the kernel for userland code.  That and disabling
interrupts is actually far less efficient than spl() for a UP 80386 machine.
I suspect newer kernels will run slower on an 80386 than 4.x.
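For readers unfamiliar with the operation being discussed, compare-and-set can be written out as below. The first function is the "cmp and jmp around a mov" sequence in plain C, which is only safe if nothing else can run in the middle; the second uses C11 atomics, which compilers typically implement with cmpxchg on 486 and later x86 CPUs. Both names are made up, and this is not FreeBSD's atomic(9) code.

```c
#include <stdatomic.h>

/*
 * Store `src` into *dst only if *dst still holds `expect`.
 * NOT atomic by itself: something (such as disabled interrupts on a
 * uniprocessor) must keep other code out between compare and store.
 */
int cmpset_plain(unsigned *dst, unsigned expect, unsigned src)
{
    if (*dst != expect)     /* the compare, and the jump around the store */
        return 0;
    *dst = src;             /* the store */
    return 1;
}

/* The same operation performed atomically via C11 atomics. */
int cmpset_atomic(atomic_uint *dst, unsigned expect, unsigned src)
{
    return atomic_compare_exchange_strong(dst, &expect, src);
}
```

Userland code cannot disable interrupts, so on a CPU without cmpxchg it has no cheap way to make the first version safe.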


Another issue that I know affected Linux is that the 386 would allow 
kernel code (CPL 0) to write to a page that was marked read-only.  The 486 
and later would generate a page fault.  Linux takes advantage of the 486 
behavior to avoid having to do explicit access checks when copying to user 
space, though AFAIK it checks the CPU at boot time to decide if this can 
be done.  I haven't checked whether FreeBSD uses this feature, but it 
would be another thing to watch out for.


--

Nate Eldredge
n...@thatsmathematics.com


Re: genuine cpu I386_CPU kernel support

2009-09-22 Thread Nate Eldredge

On Tue, 22 Sep 2009, John Baldwin wrote:


My comment is to just use 4.x (seriously).  A true 386 is going to be quite
slow and the overhead of many things added that work well on newer processors
is going to be very painful on a 386 (probably on a 486 as well).  4.x runs
fine on a 386 and should support all the hardware you can stick into a
machine with an 80386 CPU.


Unless, of course, you plan to put it on a network.  I doubt that 4.x is 
up to date with respect to security patches.


--

Nate Eldredge
n...@thatsmathematics.com


Re: ZFS group ownership

2009-09-15 Thread Nate Eldredge

On Sat, 12 Sep 2009, Giulio Ferro wrote:


I don't know if this is the correct list to discuss this matter, if not
I apologize in advance.


freebsd-questions might have been better, but I don't think you're too far 
off.  It wasn't necessary to post three times though :)


[On UFS, files are created with the same group as the directory that 
contains them.  On ZFS, they are created with the primary group of the 
user who creates them.]



What I ask now is: is this a bug or a feature?


Both, I think :)

The behavior you describe on UFS (group comes from the directory) is 
standard for BSD-based systems like FreeBSD.  On SysV-based systems, 
however, the default is that the group comes from the user, as you 
describe on ZFS.  ZFS was originally developed for Solaris, a descendant 
of SysV, so it's not surprising that it also has this behavior.  However, 
this is at least a documentation bug, since the open(2) man page describes 
the BSD behavior without mentioning exceptions.



How can I achieve my goal in ZFS, that is allowing members of the same
group to operate with the files / dirs they create?


On SysV, you can get BSD-type behavior by setting the sgid bit on the 
directory in question, e.g. chmod g+s dir.  Then new files will inherit 
their group from the directory.  I suspect this will work on FreeBSD/ZFS 
too even though chmod g+s on a directory is undocumented.


--

Nate Eldredge
n...@thatsmathematics.com


Re: Intermittent system hangs on 7.2-RELEASE-p1

2009-08-26 Thread Nate Eldredge

On Wed, 26 Aug 2009, Linda Messerschmidt wrote:


I'm trying to troubleshoot an intermittent Apache performance problem,
and I've narrowed it down to what appears to be a brief
whole-system hang that lasts from 0.5 to 3 seconds.  They occur every
few minutes.


One thought would be to use ps to try to determine which process, if 
any, is charged with CPU time during the hang.


If you could afford a little downtime, it would be worth seeing if the 
hang occurs in single-user mode (perhaps with a simple program that loops 
calling gettimeofday() and warns when the time between successive 
iterations is large).
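The core of such a gettimeofday() loop might look like the sketch below, which reports the largest gap seen between successive calls. The function name is made up; a real monitor would run it repeatedly and print a warning whenever the gap exceeds some threshold chosen by the user.

```c
#include <stddef.h>
#include <sys/time.h>

/*
 * Call gettimeofday() `iterations` times back to back and return the
 * largest gap between successive calls, in microseconds.
 */
long max_gap_usec(long iterations)
{
    struct timeval prev, now;
    long worst = 0;

    gettimeofday(&prev, NULL);
    for (long i = 0; i < iterations; i++) {
        gettimeofday(&now, NULL);
        long gap = (now.tv_sec - prev.tv_sec) * 1000000L
                 + (now.tv_usec - prev.tv_usec);
        if (gap > worst)
            worst = gap;
        prev = now;
    }
    return worst;
}
```

On an idle machine the gaps should be tiny, so a gap of half a second or more would line up with the hangs being described.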


I once had a problem like this that I eventually traced to a power 
management problem.  (Specifically, the machine had a modem, and would 
hang for a few seconds whenever the line would ring.  It was apparently 
related to the Wake-On-Ring feature.)  If I remember correctly, disabling 
ACPI made it go away.  So that might be something to try, if rebooting is 
an option.


What are the similarities and differences in hardware and software among 
the affected machines (you mentioned there were several)?


--

Nate Eldredge
neldre...@math.ucsd.edu


Re: mmap/munmap with zero length

2009-07-04 Thread Nate Eldredge

On Sun, 5 Jul 2009, Alexander Best wrote:


i'm wondering why mmap and munmap behave differently when it comes to a length
argument of zero. allocating memory with mmap for a zero length file returns a
valid pointer to the mapped region.

munmap however isn't able to remove a mapping with no length.

wouldn't it be better to either forbid this in mmap or to allow it in munmap?


POSIX has an opinion:

http://www.opengroup.org/onlinepubs/9699919799/functions/mmap.html

If len is zero, mmap() shall fail and no mapping shall be established.

http://www.opengroup.org/onlinepubs/9699919799/functions/munmap.html

The munmap() function shall fail if:
...
[EINVAL]
The len argument is 0.


--

Nate Eldredge
neldre...@math.ucsd.edu
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?

2009-05-21 Thread Nate Eldredge

On Wed, 20 May 2009, Yuri wrote:


Seems like failing system calls (mmap and sbrk) that allocate memory is more
graceful and would allow the program to at least issue the reasonable error 
message.
And more intelligent programs would be able to reduce used memory instead of 
just dying.


It's a feature, called memory overcommit.  It has a variety of pros and 
cons, and is somewhat controversial.  One advantage is that programs often 
allocate memory (in various ways) that they will never use, which under a 
conservative policy would result in that memory being wasted, or programs 
failing unnecessarily.  With overcommit, you sometimes allocate more 
memory than you have, on the assumption that some of it will not actually 
be needed.


Although memory allocated by mmap and sbrk usually does get used in fairly 
short order, there are other ways of allocating memory that are easy to 
overlook, and which may allocate memory that you don't actually intend 
to use.  Probably the best example is fork().


For instance, consider the following program.

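#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SIZE 1000000000 /* 1 GB */
int main(void) {
  char *buf = malloc(SIZE); /* 1 GB */
  if (buf == NULL) {
    perror("malloc");
    exit(1);
  }
  memset(buf, 'x', SIZE); /* touch the buffer */
  pid_t pid = fork();
  if (pid == 0) {
    /* child: replace ourselves with true(1) right away */
    execlp("true", "true", (char *)NULL);
    perror("true");
    _exit(1);
  } else if (pid > 0) {
    for (;;); /* do work */
  } else {
    perror("fork");
    exit(1);
  }
  return 0;
}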

Suppose we run this program on a machine with just over 1 GB of memory. 
The fork() should give the child a private copy of the 1 GB buffer, by 
setting it to copy-on-write.  In principle, after the fork(), the child 
might want to rewrite the buffer, which would require an additional 1GB to 
be available for the child's copy.  So under a conservative allocation 
policy, the kernel would have to reserve that extra 1 GB at the time of 
the fork(). Since it can't do that on our hypothetical 1+ GB machine, the 
fork() must fail, and the program won't work.


However, in fact that memory is not going to be used, because the child is 
going to exec() right away, which will free the child's copy.  Indeed, 
this happens most of the time with fork() (but of course the kernel can't 
know when it will or won't.)  With overcommit, we pretend to give the 
child a writable private copy of the buffer, in hopes that it won't 
actually use more of it than we can fulfill with physical memory.  If it 
doesn't use it, all is well; if it does use it, then disaster occurs and 
we have to start killing things.


So the advantage is you can run programs like the one above on machines 
that technically don't have enough memory to do so.  The disadvantage, of 
course, is that if someone calls the bluff, then we kill random processes. 
However, this is not all that much worse than failing allocations: 
although programs can in theory handle failed allocations and respond 
accordingly, in practice they don't do so and just quit anyway.  So in 
real life, both cases result in disaster when memory runs out; with 
overcommit, the disaster is a little less predictable but happens much 
less often.


If you google for memory overcommit you will see lots of opinions and 
debate about this feature on various operating systems.


There may be a way to enable the conservative behavior; I know Linux has 
an option to do this, but am not sure about FreeBSD.  This might be useful 
if you are paranoid, or run programs that you know will gracefully handle 
running out of memory.  IMHO for general use it is better to have 
overcommit, but I know there are those who disagree.


--

Nate Eldredge
neldre...@math.ucsd.edu


Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?

2009-05-21 Thread Nate Eldredge

On Thu, 21 May 2009, per...@pluto.rain.com wrote:


Nate Eldredge neldre...@math.ucsd.edu wrote:

With overcommit, we pretend to give the child a writable private
copy of the buffer, in hopes that it won't actually use more of it
than we can fulfill with physical memory.


I am about 99% sure that the issue involves virtual memory, not
physical, at least in the fork/exec case.  The incidence of such
events under any particular system load scenario can be reduced or
eliminated simply by adding swap space.


True.  When I said a system with 1GB of memory, I should have said a 
system with 1 GB of physical memory + swap.


--

Nate Eldredge
neldre...@math.ucsd.edu


Re: Why kernel kills processes that run out of memory instead of just failing memory allocation system calls?

2009-05-21 Thread Nate Eldredge

On Thu, 21 May 2009, Yuri wrote:


Nate Eldredge wrote:
Suppose we run this program on a machine with just over 1 GB of memory. The 
fork() should give the child a private copy of the 1 GB buffer, by 
setting it to copy-on-write.  In principle, after the fork(), the child 
might want to rewrite the buffer, which would require an additional 1GB to 
be available for the child's copy.  So under a conservative allocation 
policy, the kernel would have to reserve that extra 1 GB at the time of the 
fork(). Since it can't do that on our hypothetical 1+ GB machine, the 
fork() must fail, and the program won't work.


I don't have strong opinion for or against memory overcommit. But I can 
imagine one could argue that fork with intent of exec is a faulty scenario 
that is a relic from the past. It can be replaced by some atomic method that 
would spawn the child without overcommitting.


I would say rather it's a centerpiece of Unix design, with an unfortunate 
consequence.  Actually, historically this would have been much more of a 
problem than at present, since early Unix systems had much less memory, no 
copy-on-write, and no virtual memory (this came in with BSD, it appears; 
it's before my time.)


The modern atomic method we have these days is posix_spawn, which has a 
pretty complicated interface if you want to use pipes or anything.  It 
exists mostly for the benefit of systems whose hardware is too primitive 
to be able to fork() in a reasonable manner.  The old way to avoid the 
problem of needing this extra memory temporarily was to use vfork(), 
but this has always been a hack with a number of problems.  IMHO neither 
of these is preferable in principle to fork/exec.
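
To give a feel for the interface, here is a minimal sketch of running a
shell command through posix_spawn with its stdout connected to a pipe.
The helper name spawn_capture and the use of /bin/sh are assumptions made
for illustration; the file-actions calls stand in for the dup2/close steps
that fork/exec code would perform in the child between fork() and exec.

```c
#include <assert.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Run `/bin/sh -c cmd' via posix_spawn() with its stdout connected to a
 * pipe, and copy up to len-1 bytes of its output into out (always
 * NUL-terminated).  Returns the number of bytes stored, or -1 on error.
 * Hypothetical helper, not part of any library.
 */
ssize_t
spawn_capture(const char *cmd, char *out, size_t len)
{
	posix_spawn_file_actions_t fa;
	char *argv[] = { "sh", "-c", (char *)cmd, NULL };
	size_t total = 0;
	ssize_t n;
	pid_t pid;
	int fds[2], rc, status;

	assert(cmd != NULL && out != NULL && len > 0);
	if (pipe(fds) == -1)
		return (-1);

	/* Describe in advance what the child should do to its descriptors. */
	if (posix_spawn_file_actions_init(&fa) != 0) {
		close(fds[0]);
		close(fds[1]);
		return (-1);
	}
	posix_spawn_file_actions_adddup2(&fa, fds[1], STDOUT_FILENO);
	posix_spawn_file_actions_addclose(&fa, fds[0]);
	posix_spawn_file_actions_addclose(&fa, fds[1]);

	/* posix_spawn() returns an error number, not -1, on failure. */
	rc = posix_spawn(&pid, "/bin/sh", &fa, NULL, argv, environ);
	posix_spawn_file_actions_destroy(&fa);
	close(fds[1]);			/* parent keeps only the read end */
	if (rc != 0) {
		close(fds[0]);
		return (-1);
	}

	while (total < len - 1) {
		n = read(fds[0], out + total, len - 1 - total);
		if (n <= 0)
			break;
		total += (size_t)n;
	}
	out[total] = '\0';
	close(fds[0]);
	if (waitpid(pid, &status, 0) == -1)
		return (-1);
	return ((ssize_t)total);
}
```

Calling spawn_capture("echo hello", buf, sizeof(buf)) should leave
"hello\n" in buf.  Everything that fork/exec code would do inline in the
child has to be described up front through a file-actions object, which is
where the extra complexity comes from.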


Note another good example is a large process that forks, but the child 
rather than exec'ing performs some simple task that writes to very little 
of its copied address space.  Apache does this, as Bernd mentioned. 
This also is greatly helped by having overcommit, but can't be 
circumvented by replacing fork() with something else.  If it really 
doesn't need to modify any of its shared address space, a thread can 
sometimes be used instead of a forked subprocess, but this has issues of 
its own.
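
A rough sketch of that scenario, assuming a POSIX system: the parent fills
a buffer, forks, and the child dirties a single byte before exiting.  The
function name fork_and_touch is invented for this example.  Under
copy-on-write only the page the child writes to needs a private copy,
although a conservative policy would have to reserve room for the whole
buffer at fork() time.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that inherits a buffer of `size' bytes copy-on-write and
 * writes to its first byte only.  Returns 0 if the child saw the expected
 * contents and the parent's copy was left untouched, -1 on any failure.
 */
int
fork_and_touch(size_t size)
{
	char *buf;
	pid_t pid;
	int status, parent_ok;

	assert(size >= 2);
	buf = malloc(size);
	if (buf == NULL)
		return (-1);
	memset(buf, 'p', size);		/* the parent really uses every page */

	pid = fork();
	if (pid == -1) {		/* a conservative kernel may refuse here */
		free(buf);
		return (-1);
	}
	if (pid == 0) {
		buf[0] = 'c';		/* only this page gets a private copy */
		_exit(buf[0] == 'c' && buf[size - 1] == 'p' ? 0 : 1);
	}
	if (waitpid(pid, &status, 0) == -1) {
		free(buf);
		return (-1);
	}
	parent_ok = (buf[0] == 'p');	/* the child's write is not visible */
	free(buf);
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0 || !parent_ok)
		return (-1);
	return (0);
}
```

With a buffer close to the size of physical memory plus swap, the fork()
in this function is exactly the call that either succeeds on a bluff
(overcommit) or fails outright (conservative accounting).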


Of course all these problems are solved, under any policy, by having more 
memory or swap.  But overcommit allows you to do more with less.


Are there any other than fork (and mmap/sbrk) situations that would 
overcommit?


Perhaps, but I can't think of good examples offhand.

--

Nate Eldredge
neldre...@math.ucsd.edu


Re: C99: Suggestions for style(9)

2009-04-28 Thread Nate Eldredge

On Tue, 28 Apr 2009, Peter Jeremy wrote:


On 2009-Apr-26 09:02:36 +0200, Christoph Mallon <christoph.mal...@gmx.de> wrote:

as some of you may have noticed, several years ago a new millennium
started and a decade ago there was a new C standard.


Your implication that FreeBSD is therefore a decade behind the times
is unfair.  Whilst the C99 standard was published a decade ago,
compilers implementing that standard are still not ubiquitous.


HEAD recently
switched to C99 as default (actually gnu99, but that's rather close).


Note that gcc 4.2 (the FreeBSD base compiler) states that it is not
C99 compliant.


However, if you take a look at http://gcc.gnu.org/gcc-4.2/c99status.html , 
you will see that it is very close.  The vast majority of C99 features are 
implemented and working correctly.  Even those which are marked as 
broken generally work in most cases, and fail only in rather obscure 
corner cases that real programs are unlikely to encounter.  In particular, 
the features Christoph proposes to use work fine.


--

Nate Eldredge
neldre...@math.ucsd.edu


Re: Bug in tcp wrappers?

2009-03-15 Thread Nate Eldredge

On Sun, 15 Mar 2009, Mikko Työläjärvi wrote:


The real fix involves rewriting chunks of the libwrap code, or finding
a version where someone has already done so.


It doesn't seem like it should be too bad.  xgets is only called in three 
places.  It would be easy enough to replace it with something like glibc's 
getline(3), that uses realloc to size a buffer appropriately.
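
As a rough idea of what such a replacement could look like, here is a
sketch of a line reader that grows its buffer with realloc.  It is not
taken from libwrap or glibc; read_line and its starting capacity of 16
bytes are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read one line of any length from fp into a malloc'ed buffer.  The
 * trailing newline, if present, is dropped.  Returns NULL at end of file
 * (with nothing read) or if memory runs out; the caller frees the result.
 */
char *
read_line(FILE *fp)
{
	size_t cap = 16, len = 0;
	char *buf, *tmp;
	int c;

	assert(fp != NULL);
	if ((buf = malloc(cap)) == NULL)
		return (NULL);
	while ((c = getc(fp)) != EOF && c != '\n') {
		if (len + 1 >= cap) {		/* keep room for the NUL */
			cap *= 2;
			if ((tmp = realloc(buf, cap)) == NULL) {
				free(buf);
				return (NULL);
			}
			buf = tmp;
		}
		buf[len++] = (char)c;
	}
	if (c == EOF && len == 0) {
		free(buf);
		return (NULL);
	}
	buf[len] = '\0';
	return (buf);
}
```

Unlike a fixed-size buffer, this never truncates or splits a long line;
the cost is that the caller now owns the returned memory.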


If nobody else feels like doing this, maybe I will.

--

Nate Eldredge
neldre...@math.ucsd.edu

Re: Debugging init process.

2009-03-10 Thread Nate Eldredge

On Tue, 10 Mar 2009, vasanth raonaik wrote:


Hello Team,

I need to debug init process. I am not able to attach init to gdb and it
throws


As others mentioned, this is explicitly disabled.  You could re-enable it 
by hacking the kernel, but it could cause other unexpected problems.


Alternatively, there's always printf debugging.

What is wrong with init, that you need to debug it?  It's a fairly simple 
program that's been around for a long time and should be pretty stable.


--

Nate Eldredge
neldre...@math.ucsd.edu


Re: Spin down HDD after disk sync or before power off

2009-03-05 Thread Nate Eldredge

On Thu, 5 Mar 2009, Tobias Blersch wrote:


Oliver Fromme wrote:

 Joerg Sonnenberger wrote:
 This is not true. Many hard disks don't like having to do an emergency
 shutdown as it affects the disk life time negatively. That's what
 happens if you poweroff the machine when the disks are still spinning.

Can you point to any authoritative information (URL) about
that claim, such as vendor specs, white paper or similar?


http://www.hitachigst.com/tech/techlib.nsf/techdocs/28DCCB17E0EEC5A086256F4E006E2F5B

That's the specification for my notebook's hard drive. Section 6.6
Reliability gives data about how to power-off the disk. It also contains
numbers of supported load/unloads and emergency unloads. Emergency
unloads are invoked when the heads are still loaded and power fails.


Ok, I didn't know that.  There are some drives that can unload the 
heads normally on power loss and don't need any special handling, and I 
was under the mistaken impression that this was universal.


But the documentation suggests that this should be a BIOS function.  When 
the kernel tries to poweroff the system, isn't that normally done via the 
BIOS (perhaps with ACPI/APM)?  So maybe the BIOS is supposed to unload the 
heads (by sending a standby/sleep command) before cutting the power.


This makes sense in some ways.  Suppose the drive is attached to a weird 
ATA controller that FreeBSD doesn't know anything about.  (Maybe it's used 
by the other system in a dual-boot setup.)  There's no way that FreeBSD 
could send it a power-down sequence, but the BIOS could.


Perhaps the OP's BIOS for some reason doesn't do this correctly.

--

Nate Eldredge
neldre...@math.ucsd.edu


Re: ln: posixly confused

2009-03-03 Thread Nate Eldredge

On Tue, 3 Mar 2009, Andriy Gapon wrote:



Test case.
Preparation:
$ mkdir linktest
$ cd linktest
$ mkdir some_dir
$ mkdir other_dir
The test:
$ ln -s some_dir the_link
$ ln -s -f other_dir the_link

Expected: the_link points to other_dir.
Actual result: some_dir contains symlink other_dir -> other_dir.


From ln(1):

SYNOPSIS
ln [-s [-F]] [-f | -iw] [-hnv] source_file [target_file]
ln [-s [-F]] [-f | -iw] [-hnv] source_file ... target_dir

I thought that only true directory would trigger the second form.
I thought that the second argument being a symlink (to a file or to a directory)
should trigger the first form.

I also read this:
http://www.opengroup.org/onlinepubs/009695399/utilities/ln.html

I think that the text there (and in ln(1)) implies what I expected, but this is
not spelled out clearly.


FWIW, Linux and Solaris have the same behavior as FreeBSD.

The standard says the second form is triggered if the second argument 
names an existing directory.  An informative note in the symlink() 
specification at 
http://www.opengroup.org/onlinepubs/009695399/functions/symlink.html says 
a symbolic link allows a file to have multiple logical names. 
Therefore, I think it's a fair interpretation to say that a symbolic link 
to an existing directory names it.
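
One way to see this reading in code: stat() follows symbolic links and
lstat() does not, so a link to a directory passes an is-it-a-directory
test made with stat().  The sketch below recreates the test case in a
scratch directory; the helper names and the /tmp template are assumptions,
and it does not claim to show how ln(1) is actually implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Nonzero if path names an existing directory; stat() follows symlinks. */
int
names_directory(const char *path)
{
	struct stat sb;

	assert(path != NULL);
	return (stat(path, &sb) == 0 && S_ISDIR(sb.st_mode));
}

/* Nonzero if path itself is a symbolic link; lstat() does not follow it. */
int
is_symlink(const char *path)
{
	struct stat sb;

	assert(path != NULL);
	return (lstat(path, &sb) == 0 && S_ISLNK(sb.st_mode));
}

/*
 * Create some_dir and the_link -> some_dir in a fresh scratch directory.
 * Returns 1 if the_link is a symlink that nevertheless names a directory,
 * 0 if not, and -1 if the scratch files could not be set up.
 */
int
link_to_dir_names_directory(void)
{
	char dir[] = "/tmp/linktest.XXXXXX";
	char target[64], linkpath[64];
	int result = -1;

	if (mkdtemp(dir) == NULL)
		return (-1);
	snprintf(target, sizeof(target), "%s/some_dir", dir);
	snprintf(linkpath, sizeof(linkpath), "%s/the_link", dir);
	if (mkdir(target, 0700) == 0) {
		if (symlink("some_dir", linkpath) == 0) {
			result = is_symlink(linkpath) &&
			    names_directory(linkpath);
			unlink(linkpath);
		}
		rmdir(target);
	}
	rmdir(dir);
	return (result);
}
```

A program that decides between the two synopsis forms with a stat()-style
check would treat the_link as a directory, which matches the behavior
reported above.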


--

Nate Eldredge
neldre...@math.ucsd.edu


portupgrade spurious skips

2009-02-26 Thread Nate Eldredge

Hi folks,

In the past few months I've noticed a bug in portupgrade.  When I update 
my ports tree and do `portupgrade -a', often a few ports will be skipped, 
supposedly because another port on which they depend failed to install. 
However, the apparently failed port actually did not fail, and if I rerun 
`portupgrade -a', some of the skipped ports will install successfully 
without complaint.  After enough iterations I can eventually get all of 
them.


I'd like to file a PR about this, but it's a little bit tricky coming up 
with a test case, since the behavior depends on having outdated packages 
installed, and on the dependencies between them.  Moreover, after I run 
`portupgrade -a' and notice the problem, the state of the installed 
packages has changed and the same packages aren't skipped the next time. 
So my question is whether anyone has ideas about how to construct a 
reasonable test case that could help me make this reproducible and easier 
to investigate.  Any thoughts?


Thanks in advance.

--

Nate Eldredge
neldre...@math.ucsd.edu


Re: shmmax tops out at 2G?

2009-02-24 Thread Nate Eldredge

On Tue, 24 Feb 2009, Garrett Cooper wrote:


On Mon, Feb 23, 2009 at 12:16 PM, Bill Moran <wmo...@potentialtech.com> wrote:

In response to Christian Peron <c...@freebsd.org>:


On Mon, Feb 23, 2009 at 11:58:09AM -0800, Garrett Cooper wrote:
[..]


    Why isn't the field an unsigned int / size_t? I don't see much value
in having the size be signed...


No idea :) This code long predates me.


It's that way because the original Sun spec for the API said so.

It makes little sense to change it just to unsigned.  The additional 2G
it would give doesn't really solve the tuning problem on a 64G system.
This is simply a spec that has become outdated by modern hardware.


Ah, but an unsigned integer on a 64-bit system supports that kind of
precision ;). Or are you saying you're crazy enough to run PAE mode
with that much RAM 0-o?


int and unsigned on amd64 are 32-bit types.  To get a 64-bit integer, you 
need (unsigned) long.
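
A small sketch of the arithmetic, with helper names invented for
illustration.  Only the limits.h constants are used, so the answers follow
whatever the compiler's ABI says; on amd64, INT_MAX is 2^31 - 1 and
UINT_MAX is 2^32 - 1, while unsigned long is 64 bits wide.

```c
#include <assert.h>
#include <limits.h>

/* Nonzero if a byte count fits in a (signed) int. */
int
fits_in_int(unsigned long long nbytes)
{
	return (nbytes <= (unsigned long long)INT_MAX);
}

/* Nonzero if a byte count fits in an unsigned int. */
int
fits_in_unsigned(unsigned long long nbytes)
{
	return (nbytes <= (unsigned long long)UINT_MAX);
}

/* Nonzero if a byte count fits in an unsigned long. */
int
fits_in_ulong(unsigned long long nbytes)
{
	return (nbytes <= (unsigned long long)ULONG_MAX);
}

/* Narrow a byte count to int; the caller must have checked that it fits. */
int
narrow_to_int(unsigned long long nbytes)
{
	assert(fits_in_int(nbytes));
	return ((int)nbytes);
}
```

With these, 2 GB (2ULL << 30) already fails fits_in_int, and 64 GB
(64ULL << 30) fails fits_in_unsigned as well, so switching the field to
unsigned would not be enough for a 64 GB machine.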


--

Nate Eldredge
neldre...@math.ucsd.edu

Re: pahole - Finding holes in kernel structs

2009-02-12 Thread Nate Eldredge

On Thu, 12 Feb 2009, Marcel Moolenaar wrote:



On Feb 12, 2009, at 8:34 AM, Jille Timmermans wrote:


Julian Stacey schreef:

1) Is it worth my time trying to rearrange structs?

I wondered whether as a sensitivity test, some version of gcc (or
its competitor ?) might have capability to automatically re-order
variables ?  but found nothing in man gcc Optimization Options.
There is a __packed attribute, I don't know what it exactly does and 
whether it is an improvement.




__packed is always a gross pessimization. The side-effect of
packing a structure is that the alignment of the structure
drops to 1. That means that any field will be read 1 byte at
a time and reconstructed by logical operations.


The other alternative is to read/write that member by unaligned 
operations, on platforms that support it.  This also typically comes with 
a performance penalty, of course.  Usually it means the hardware reads the 
two words that overlap the member and pieces it back together.  But on 
such a platform the software does not need to handle it specially; it 
executes the same instruction, but it takes more time.


The only reason to use this would be (1) if you needed to have your 
structure occupy as little memory as possible; for instance, if your 
structure had two elements, one 'int' and one 'char', and you had 1 
billion of them, using __packed__ would save you 3 gigabytes.  Or (2) if 
you need to conform to an externally defined data structure that already 
does this.  Most places in the kernel, I don't think either of these would 
be true.
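
To make the arithmetic in (1) concrete, here is a sketch assuming gcc or
clang, whose __attribute__((__packed__)) syntax is used directly.  The
struct and function names are made up for the example.  On a typical ABI
with a 4-byte int aligned to 4 bytes, the plain struct occupies 8 bytes
and the packed one 5, so a billion elements differ by 3 GB.

```c
#include <assert.h>
#include <stddef.h>

/* Natural layout: the compiler pads after `c' to keep `i' aligned. */
struct plain {
	int	i;
	char	c;
};

/* Packed layout (gcc/clang extension): no padding, alignment drops to 1. */
struct packed {
	int	i;
	char	c;
} __attribute__((__packed__));

/* Bytes saved by storing n packed elements instead of n plain ones. */
size_t
packed_savings(size_t n)
{
	assert(sizeof(struct plain) >= sizeof(struct packed));
	return (n * (sizeof(struct plain) - sizeof(struct packed)));
}
```

The saving comes entirely from the padding that the natural layout adds,
which is also what keeps every element's int aligned in an array.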


--

Nate Eldredge
neldre...@math.ucsd.edu


Re: gcc 4.3.2 libgcc_s.so exception handling broken?

2009-01-17 Thread Nate Eldredge

On Sat, 17 Jan 2009, xorquew...@googlemail.com wrote:


Hello.

I have some C code that's compiled with -fexceptions using
the lang/gnat-gcc43 port. I'm on 6.4-RELEASE-p2.

A function c_function in the C code takes a callback as an argument.

I'm passing this function the address of a function ext_function defined
in another language (Ada, to be precise, but it seems to happen
with C++ too). The main body of my program is written in this language
so C is effectively the foreign code (whatever).

If ext_function raises an exception, the exception is NOT propagated
through the C code, the process simply exits.


I tried a simple example of this in C++ and it works as expected.  (I am 
on 7.0-RELEASE/amd64.)  So it isn't completely busted, at least.


Can you post an example that exhibits the problem?  Ideally, something 
complete that can be compiled and is as simple as possible.  If you can do 
it with C++ rather than Ada it might be easier, so people don't have to 
install the Ada compiler.  Also please mention the commands you use to 
compile, and what they output when you compile using -v, and what 
architecture you are on.


--

Nate Eldredge
neldre...@math.ucsd.edu


Re: Confused by segfault with legitimate call to strerror(3) on amd64 / sysctl (3) setting `odd' errno's

2009-01-16 Thread Nate Eldredge

On Fri, 16 Jan 2009, Garrett Cooper wrote:


On Fri, Jan 16, 2009 at 2:52 AM, Thierry Herbelot
<thierry.herbe...@free.fr> wrote:

Le Friday 16 January 2009, Garrett Cooper a écrit :

On Fri, Jan 16, 2009 at 2:21 AM, Christoph Mallon

#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>

int
main()
{
        struct stat sb;
        int o_errno;

        if (stat("/some/file/that/doesn't/exist", &sb) != 0) {
                o_errno = errno;
                printf("Errno: %d\n", errno);
                printf("%s\n", strerror(o_errno));
        }

        return 0;
}


with this, it's better on an amd64/ RELENG_7 machine :

% diff -ub badfile.c.ori badfile.c
--- badfile.c.ori   2009-01-16 11:49:44.778991057 +0100
+++ badfile.c   2009-01-16 11:49:03.470465677 +0100
@@ -1,6 +1,7 @@
 #include <errno.h>
 #include <stdio.h>
 #include <sys/stat.h>
+#include <string.h>

 int
 main()

   Cheers

   TfH


That's hilarious -- why does it pass though without issue on x86 though?
-Garrett


As pointed out, when you don't have a declaration for strerror, it's 
implicitly assumed to return `int'.  This feature was widely used in the 
early days of C and so continues to be accepted by compilers, and gcc by 
default doesn't warn about it.


On x86, int and char * are the same size.  So even though the compiler 
thinks strerror is returning an int which is being passed to printf, the 
code it generates is the same as for a char *.  On amd64, int is 32 bits 
but char * is 64.  When the compiler thinks it's using int, it only keeps 
track of the lower 32 bits, and the upper 32 bits get zeroed.  So the 
pointer that printf receives has had its upper 32 bits zeroed, and no 
longer points where it should.  Hence segfault.
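
The truncation can be modelled without relying on an undeclared function,
which would be undefined behavior.  The sketch below pushes a 64-bit
address value through a 32-bit integer, following the description above;
the helper names and the two sample addresses are made up for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model a 64-bit pointer value passing through a 32-bit int: only the
 * low 32 bits survive and the upper 32 bits come back as zero.
 */
uint64_t
through_int32(uint64_t ptrval)
{
	uint32_t low = (uint32_t)ptrval;	/* what fits in the int */

	return ((uint64_t)low);
}

/* Nonzero if the value is unchanged by the round trip. */
int
survives_int32(uint64_t ptrval)
{
	return (through_int32(ptrval) == ptrval);
}

/*
 * Two invented addresses: one that fits in 32 bits, as every i386
 * pointer does, and one above 4 GB, as amd64 pointers commonly are.
 */
void
check_examples(void)
{
	assert(survives_int32(UINT64_C(0x08048000)));
	assert(!survives_int32(UINT64_C(0x00007fffdeadbeef)));
}
```

The second address comes back as 0xdeadbeef, a pointer into some unrelated
and probably unmapped part of the address space.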


Since running on amd64 I've seen a lot of bugs where people carelessly 
assume (perhaps without noticing) that ints and pointers are practically 
interchangeable, which works on x86 and the like but breaks on amd64. 
Variadic functions are special offenders because the compiler can't do 
much type checking.


Pop quiz: which of the following statements is correct?

#include <stdlib.h>
#include <unistd.h>

execl("/bin/sh", "/bin/sh", 0);
execl("/bin/sh", "/bin/sh", NULL);
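
For reference, a form that is safe on every platform casts the terminator
explicitly, as the POSIX synopsis for execl does.  A bare 0 is an int,
which on amd64 is narrower than a pointer, and C allows NULL to be defined
as a plain 0, so only the cast guarantees that a real null pointer is
passed through the variadic argument list.  The sketch below wraps that
call in a hypothetical helper, run_shell, that is not part of any library.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run `/bin/sh -c cmd' in a child process and return its exit status,
 * or -1 if the child could not be started or did not exit normally.
 */
int
run_shell(const char *cmd)
{
	pid_t pid;
	int status;

	assert(cmd != NULL);
	pid = fork();
	if (pid == -1)
		return (-1);
	if (pid == 0) {
		/* The cast makes the terminator a genuine char * null. */
		execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
		_exit(127);		/* only reached if execl() failed */
	}
	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
		return (-1);
	return (WEXITSTATUS(status));
}
```

For example, run_shell("exit 7") should return 7.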

--

Nate Eldredge
neldre...@math.ucsd.edu

Re: lzo2 shows insane speed gap

2008-12-29 Thread Nate Eldredge

On Mon, 29 Dec 2008, Christian Weisgerber wrote:


The archivers/lzo2 port runs a series of regression tests after the
actual build.  These tests show extremely divergent behavior on
different machines.  There are two types of machines:

Type #1:
 Running the tests takes roughly the same time as configure and
 compile did, whether it's 30 seconds on a fast machine or 10
 minutes on an old slow one.

Type #2:
 Running the tests takes much, much, MUCH longer.

I've tried this across alpha, amd64, i386, and sparc64, partially
on FreeBSD, partially on OpenBSD.  The operating system doesn't
matter and there is no pattern related to endianness or 32/64 bits.

You can find machines that are the same architecture (e.g. amd64)
and are of similar overall speed (e.g. an Intel Xeon E5405 and
an AMD Phenom 9350e) and one of these machines will be type #1 and
the other will be #2 and take _a hundred_ times longer to run the
tests.  A hundred times.

I have never seen anything like this before.


It might be good first to rule out compiler / library differences.

First, can you isolate a single lzo command / input combination whose time 
differs dramatically?  This would simplify tests compared to running the 
whole test suite.  (It should be easy because it looks like the test suite 
prints the time for each test.)  It might also simplify things to work on 
one fast and one slow machine.


Then try copying the lzo binary from the fast machine to the slow 
machine (and vice versa) and see if the same test speeds up with the 
copied binary.  If not, try again with the binary statically linked.  If 
still not, it would be good to have a copy of the binary made available, 
along with more information about the fast and slow machines (CPU, 
amount of memory, load on the machine, kernel version, disk, etc).


If the copied binary isn't faster than the natively produced one, then it 
would be good to have information about the compiler options, versions, 
etc.


--

Nate Eldredge
neldre...@math.ucsd.edu


Re: AMD64 qemu completely broken?

2008-12-07 Thread Nate Eldredge

On Sun, 7 Dec 2008, Juergen Lock wrote:


On Thu, Dec 04, 2008 at 02:43:47PM -0800, Nate Eldredge wrote:

On Thu, 4 Dec 2008, Juergen Lock wrote:


I forgot to say the qemu-devel port (as well as the later snapshots I
posted about on -emulation) also support -curses, which shows the emulated
vga text(!)console on qemu's tty.  This works quite well with FreeBSD guests
(even the isos) if you extend your xterm/whatever by one line (the default
vga textconsole is 80x25 instead of 80x24.)


As long as we're sharing tips about qemu:

I've recently been working with qemu on amd64 and have set up a Debian etch
i386 guest which is working well.  I am using the qemu-devel and
kqemu-kmod-devel ports.  I am not using -kernel-kqemu at the moment; I
thought I would get things working before trying to speed up.

Using qemu I've finally achieved my goal of being able to use flash on
FreeBSD/amd64 (in some sense :-O).


Actually at least on RELENG_7 and later the original www/linux-flashplugin9
+ www/nspluginwrapper don't work too bad at least for video sites these
days (on 6 and 7.0 you need a patch and there it probably doesn't quite
work on SMP because another patch concerning SMP can't be merged.)  See
e.g. this thread on -emulation for more:
   
http://lists.freebsd.org/pipermail/freebsd-emulation/2008-October/005433.html


Thanks for the pointer.  I will probably wait until 7.1 is out and ports 
are defrosted, so I can go straight to flash10 and not to have to do 
everything twice, but this information should be very helpful.



 '-net tap' works fine, but requires root privileges and
is more work to set up.


Actually it doesn't require root privs to run, only to setup.
(Ok you _might_ need sudo to ifconfig the tap device and/or bridge
in the qemu-ifup script...  But qemu itself can certainly run as user.)


Okay.  I was being lazy and letting qemu do some of that work for me.


[*] Out of curiosity, I looked at some Unix Archive stuff and found the
identical code in BSD's Net2, circa 1991.  It is identified in a comment as
a quick hack and adorned with several /* XXX */.  Naturally the code and
the comments survive intact, 17 years later. :-(


This might be somewhat more understandable if you know that the original
slirp code was written many moons ago and only later resurrected for
emulation purposes.  (It was originally invented for dialup users that
logged into shellservers' gettys via serial modem lines so they could
also use the box' inet connection locally before things like ppp were
available...)


Yep, I think I remember trying to use some slip implementation over a 
serial modem once.  It's just unfortunate that qemu chose that code for 
their TCP/IP implementation rather than something else more modern.  Not 
that I'm volunteering to update it :)


--

Nate Eldredge
[EMAIL PROTECTED]


Re: RFC: small syscons and kbd patch

2008-12-05 Thread Nate Eldredge

On Fri, 5 Dec 2008, Garrett Cooper wrote:


On Fri, Dec 5, 2008 at 1:11 AM, Christoph Mallon
[EMAIL PROTECTED] wrote:

Garrett Cooper schrieb:


(I feel like I'm getting off on a bikeshed topic, but...)

1. What dialect of C was it defined in? Is it still used in the
standard dialect (honestly, this is the first time I've ever seen it
before, but then again I am a younger generation user)?


Dialect? The ! operator is plain vanilla standard C. It takes a scalar
operand and returns 1, if it compares equal to 0, otherwise it returns 0.
!!, i.e. two consecutive ! operators, is one of the oldest tricks in the
book, right next to (a > b) - (a < b) for comparison functions and countless
other idioms.


3. What's the real loss of going to `? :', beyond maybe 3 extra
keystrokes if it's easier for folks who may not be as experienced to
read?


I'd like my bikeshed grass green, please.

   Christoph


If you really want to split hairs, ! only negates the logic value,
whereas ~ actually negates the bits. So technically, you're not
flipping 0 to make 1 and vice versa, but instead flipping 0 to make
non-zero, etc. There is a clear distinction in hardware.

The point was that !! isn't obvious at first glance in the C code. It's
important for code to be readable as well as functional (that's why we
have style(9)). Getting down to it I'd like to see what the compiler
optimizes each as, because I can see dumb compilers saying `!!'
translates to `not, bne = set, else set, continue', whereas `? :'
could be translated to `bne, set, else set, continue'; I'm sure gcc
has moved past these really minute details.


Out of curiosity, I tried some various compilers, including gcc on i386, 
amd64, and sparc; Intel's C compiler on i386; tcc (tiny, non-optimizing C 
compiler) on i386; and Sun's compiler (old version) on sparc.


I compiled the following file:

int bangbang(int x) { return !!x; }
int ternary(int x) { return x ? 1 : 0; }

Intel's compiler generated different code for these two functions when 
optimization was turned off.  bangbang used a conditional set instruction, 
while ternary used conditional jumps.  With optimization on the two were 
identical.


All other compilers generated identical code for the two functions whether 
optimization was on or off.  (Of course, the generated code varied between 
compilers; tcc's in particular was decidedly non-optimized.)


I really don't think something as simple as this is worth worrying about 
in terms of code efficiency.  Even if they weren't identical, the 
difference is at most a couple of instructions and a pipeline flush, and 
if that's a serious problem you need to be using assembly anyway. 
Besides, it's not a piece of code that comes up all that often.


The only basis for arguing about it is style, and I think we've 
established that it's purely a matter of taste.  In particular, there 
isn't a clear favorite for which is easier to read.  IMHO, style(9) should 
remain agnostic and let the programmer decide.


However, if people really feel that consistency is necessary here, I 
propose the following: if the cents digit of the closing price of the Dow 
Jones Industrial Average on this coming Monday, December 8, 2008, is even, 
then style(9) shall be edited to indicate that `!!x' is preferred.  If 
odd, then style(9) shall prefer `x ? 1 : 0'.


:-)

--

Nate Eldredge
[EMAIL PROTECTED]


Re: RFC: small syscons and kbd patch

2008-12-05 Thread Nate Eldredge

On Fri, 5 Dec 2008, Stephen Montgomery-Smith wrote:


Nate Eldredge wrote:


int bangbang(int x) { return !!x; }
int ternary(int x) { return x ? 1 : 0; }


Stylewise, I prefer

int notzero(int x) { return x!=0; }


icc -O0 compiles notzero the same as bangbang (better than ternary).  tcc 
produces better code for notzero than the other two.  Sun cc without 
optimization produces slightly better code for notzero than the other two 
(one jump instead of two).  For everything else all three produce 
equivalent code.


`x && 1' and `x || 0' are some other possibilities.
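
The five spellings mentioned in this thread can be checked against each
other directly.  This is only a sketch: the function names beyond
bangbang, ternary and notzero are invented here, and it says nothing about
what code a particular compiler generates for each.

```c
#include <assert.h>

int bangbang(int x) { return (!!x); }
int ternary(int x) { return (x ? 1 : 0); }
int notzero(int x) { return (x != 0); }
int and_one(int x) { return (x && 1); }
int or_zero(int x) { return (x || 0); }

/* Nonzero if all five forms normalize x to the same 0-or-1 value. */
int
all_agree(int x)
{
	int r = bangbang(x);

	assert(r == 0 || r == 1);	/* ! always yields 0 or 1 */
	return (ternary(x) == r && notzero(x) == r &&
	    and_one(x) == r && or_zero(x) == r);
}
```

Since the C standard defines all of these operators to yield exactly 0 or
1, the choice between them really is one of taste.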

Anyway, maybe there is something more useful we could all be doing. :)

--

Nate Eldredge
[EMAIL PROTECTED]


Re: AMD64 qemu completely broken?

2008-12-04 Thread Nate Eldredge

On Thu, 4 Dec 2008, Juergen Lock wrote:


I forgot to say the qemu-devel port (as well as the later snapshots I
posted about on -emulation) also support -curses, which shows the emulated
vga text(!)console on qemu's tty.  This works quite well with FreeBSD guests
(even the isos) if you extend your xterm/whatever by one line (the default
vga textconsole is 80x25 instead of 80x24.)


As long as we're sharing tips about qemu:

I've recently been working with qemu on amd64 and have set up a Debian 
etch i386 guest which is working well.  I am using the qemu-devel and 
kqemu-kmod-devel ports.  I am not using -kernel-kqemu at the moment; I 
thought I would get things working before trying to speed up.


Using qemu I've finally achieved my goal of being able to use flash on 
FreeBSD/amd64 (in some sense :-O).


savevm and loadvm don't work due to a security patch.  Since my guest 
system is trusted I reverted the patch.  I filed a PR as ports/129417 .


I found that '-net user' is horribly broken on amd64 (qemu segfaults). 
It uses some ancient [*] BSD TCP/IP code (via slirp) which assumes that 
pointers are 32 bits and doesn't hesitate to shove them into random 32-bit 
corners of externally defined structures if it's convenient.  Looks like a 
pain to clean up.  '-net tap' works fine, but requires root privileges and 
is more work to set up.


[*] Out of curiosity, I looked at some Unix Archive stuff and found the 
identical code in BSD's Net2, circa 1991.  It is identified in a comment 
as a quick hack and adorned with several /* XXX */.  Naturally the code 
and the comments survive intact, 17 years later. :-(


--

Nate Eldredge
[EMAIL PROTECTED]


Re: tcsh loses the foreground process group?

2008-12-03 Thread Nate Eldredge

On Tue, 2 Dec 2008, Steve Watt wrote:


In article [EMAIL PROTECTED] you write:
[ ... ]

I'm running 6-STABLE (6.4-PRE as of 24 Nov right now), tcsh 6.15.00, which
shows

 tcsh 6.15.00 (Astron) 2007-03-03 (i386-intel-FreeBSD) options 
wide,nls,dl,al,kan,sm,rh,color,filec

as $version.

The symptom is that when I do a long-ish running task inside a `` expansion
that I then ^C, nobody gets the foreground process group...  I never get
a prompt back after the ^C, and ^T gets me

 load: 0.27 no foreground process group


[ ... ]


One portable reproduction:
# cd /usr/src
# less `egrep -lir '^Foo.*baz' *`
^Cload: 0.02 no foreground process group

(I typed ^C ^T)

SIGKILL to the shell seems to be the only way to get things back to
normal.


I've gotten one me too, which indicated that SIGHUP to the shell
will also make it go away, but does not solve the problem.

I've got another FreeBSD machine available that was running tcsh
6.14.00, and it does _NOT_ display the problem.  When I build
6.15.00 on that same box (/usr/src is more up to date than the
install right now), that does fail.

Thus I'm pretty comfortable saying that it's a tcsh bug of some
sort, and probably a regression.  Hopefully this can be fixed
(PR being filed now) before 6.4 releases...


Thanks for the report.  It looks like this is yet another manifestation of 
a problem in tcsh, where it does inappropriate things in a vfork'ed 
subshell.  In my tests, running tcsh with -F (which causes it to use fork 
instead of vfork) causes the problem to go away.  It is also present in 
7.0-RELEASE and probably all later versions.


There are several open bugs related to this problem, but so far they do 
not seem to have attracted the interest of any committers.  Among them 
are:


bin/41297
bin/52746
bin/125185
amd64/128259
bin/129378 (which you just opened)

The fix is simple: make -F the default.  There is a minor performance 
penalty, but that's a small price to pay for correct behavior.  A more 
involved fix would be to make tcsh not do inappropriate things after vfork 
(modifying global variables), or at least clean up before exiting, but 
IMHO that is less clean; vfork really shouldn't be used here at all.
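
To illustrate why fork() is the safe choice here, the sketch below lets a
forked child modify a global variable and then checks what the parent
sees.  The variable shell_state is a stand-in invented for this example,
not anything from tcsh.  After fork() the child has its own copy, so the
parent's value is untouched; after vfork() the child borrows the parent's
address space until it calls exec or _exit, so the same assignment could
show up in the parent, and the vfork() specification leaves the behavior
undefined.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int shell_state = 1;	/* stands in for the shell's global state */

/*
 * Let a fork()ed child scribble on a global, then report the value the
 * parent sees afterwards.  Returns -1 if fork() or waitpid() fails.
 */
int
parent_view_after_fork(void)
{
	pid_t pid;
	int status;

	pid = fork();
	if (pid == -1)
		return (-1);
	if (pid == 0) {
		shell_state = 99;	/* private copy: harmless after fork() */
		_exit(0);
	}
	if (waitpid(pid, &status, 0) == -1)
		return (-1);
	assert(WIFEXITED(status));
	return (shell_state);		/* still 1 in the parent */
}
```

The performance penalty mentioned above is the price of setting up that
private copy-on-write address space for the child.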


--

Nate Eldredge
[EMAIL PROTECTED]


Re: tcsh loses the foreground process group?

2008-12-03 Thread Nate Eldredge

On Wed, 3 Dec 2008, Nate Eldredge wrote:

Thanks for the report.  It looks like this is yet another manifestation of a 
problem in tcsh, where it does inappropriate things in a vfork'ed subshell. 
In my tests, running tcsh with -F (which causes it to use fork instead of 
vfork) causes the problem to go away.  It is also present in 7.0-RELEASE and 
probably all later versions.


There are several open bugs related to this problem, but so far they do not 
seem to have attracted the interest of any committers.  Among them are:


bin/41297
bin/52746
bin/125185
amd64/128259
bin/129378 (which you just opened)


I have opened bin/129405 as an omnibus PR for these problems.

--

Nate Eldredge
[EMAIL PROTECTED]


Re: [Testers wanted] /dev/console cleanups

2008-11-20 Thread Nate Eldredge

On Thu, 20 Nov 2008, Jeremy Chadwick wrote:


On Wed, Nov 19, 2008 at 11:48:36PM -0800, Nate Eldredge wrote:

On Wed, 19 Nov 2008, Jeremy Chadwick wrote:


On Thu, Nov 20, 2008 at 05:39:36PM +1100, Peter Jeremy wrote:



I hope that never gets committed - it will make debugging kernel
problems much harder.  There is already a kern.msgbuf_clear sysctl and
maybe people who are concerned about msgbuf leakage need to learn to
use it.


And this sysctl is only usable *after* the kernel loads, which means
you lose all of the messages shown from the time the kernel loads to
the time the sysctl is set (e.g. hardware detected/configured).  This is
even less acceptable, IMHO.


But surely you can arrange that the contents are written out to
/var/log/messages first?

E.g. a sequence like

- mount /var
- write buffer contents via syslogd
- clear buffer via sysctl
- allow user logins


This has two problems, but I'm probably missing something:

1) See my original post, re: users of our systems use dmesg to find
out what the status of the system is.  By status I don't mean from
the point the kernel finished to now, I literally mean they *expect*
to see the kernel device messages and all that jazz.  No, I'm not
making this up, nor am I arguing just to hear myself talk (despite
popular belief).  I can bring these users into the discussion if people
feel it would be useful.


I forgot about that point.  I can sympathize with those users; I 
feel the same way.  It's a good way to learn about a system as a 
mere user (since usually sysadmins don't remember or bother to 
disable it).


However, in my experience dmesg isn't really the best thing for that 
purpose; the kernel message buffer tends to get wiped out once the system 
has been up for a while.  (It fills with ipfw logs, ethernet link state 
changes, etc.)


Maybe a better approach would be to point them to /var/log/messages 
or whichever log file stores them permanently.  Or, better yet, do some 
syslogd magic to make a logfile that can be appropriately readable but 
doesn't have any overly sensitive messages directed there (e.g. kernel 
yes, sshd no).



2) I don't understand how this would work (meaning, technically and
literally: I do not understand).  How do messages like CPU: Intel(R)
Core(TM)2 Duo CPU E8400 @ 3.00GHz (2992.52-MHz K8-class CPU) get
written to syslog when syslogd isn't even running (or any filesystems)
mounted at that time?  There must be some magic involved there (since
syslog == libc, not syscall) when syslogd starts, but I don't know
how it works.


I think you're conflating a couple of things, and I also explained my idea 
poorly.


As I understand it (from memory, which is a little vague), syslogd gets 
messages from two places: from the kernel via /dev/klog, and from other 
processes via a Unix domain socket in /var/run.  These messages then get 
sent to the appropriate log files.  The syslog(3) function of libc just 
connects and writes the message to the Unix domain socket.  If syslogd 
isn't running to listen on that socket, syslog(3) won't work very well.
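
For reference, the libc side of this looks roughly like the sketch below.  It uses only the standard openlog/syslog/closelog calls; the function name log_note and whatever ident string is passed in are placeholders, not anything from the base system.  If syslogd isn't listening, the message will most likely just be lost, and the caller is not told.

```c
#include <syslog.h>

/* Hand one line of text to syslogd through syslog(3).
   ident is the program name that gets prefixed to the message.
   syslog(3) reports no errors, so this always returns 0. */
int log_note(const char *ident, const char *text)
{
  openlog(ident, LOG_PID, LOG_USER);
  /* Pass the text through "%s" so that stray '%' characters in it
     are not interpreted as format directives. */
  syslog(LOG_NOTICE, "%s", text);
  closelog();
  return 0;
}
```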


Now /dev/klog should be a direct line to the kernel's message buffer.  So 
when syslogd starts and reads /dev/klog for the first time, it will get 
everything that's accumulated so far, including the earliest boot 
messages.  It should then write them to the appropriate log files.  This 
already works, which is why /var/log/messages contains the kernel 
copyright message, etc.


My idea is, after syslogd does this, but before the system goes 
multi-user, you should clear the kernel buffer.  Early boot messages are 
already in the log files, so they won't be lost.  Maybe the best thing 
would be to build this functionality into syslogd itself, to minimize the 
possibility of losing messages due to a race.
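
If this were built into syslogd, the "clear" step might look something like the sketch below.  It assumes that writing 1 to the kern.msgbuf_clear sysctl mentioned earlier in the thread is what clears the buffer; the function name is invented, it needs root, and on anything other than FreeBSD it simply reports failure.

```c
#include <stddef.h>
#ifdef __FreeBSD__
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Ask the kernel to empty its message buffer.
   Returns 0 on success, -1 on failure (not root, not FreeBSD, ...). */
int clear_kernel_msgbuf(void)
{
#ifdef __FreeBSD__
  int one = 1;

  /* We do not want the old value back, so oldp and oldlenp are NULL. */
  if (sysctlbyname("kern.msgbuf_clear", NULL, NULL, &one, sizeof one) != 0)
    return -1;
  return 0;
#else
  return -1;
#endif
}
```

It would have to be called only after the first read of /dev/klog has been written out, otherwise the early boot messages are lost.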



This way the buffer is cleared before any unprivileged users get to do
anything.  No kernel changes needed, just a little tweaking of the init
scripts at most.

If you should have a crash and suspect there is useful data in the
buffer, you can boot to single-user mode (avoiding the clear) and
retrieve it manually.

Seems like this should make everyone happy.


What I'm not understanding is the resistance towards Rink's patch,
assuming the tunable defaults to disabled/off.


It seems reasonable to me.  The only catch I can see is that if you have a 
crash and you want to look at the message buffer after rebooting, you'll 
have to remember to stop at the loader prompt and turn off that tunable. 
Which might be easy to forget in the heat of the moment.


--

Nate Eldredge
[EMAIL PROTECTED]


Re: [Testers wanted] /dev/console cleanups

2008-11-19 Thread Nate Eldredge

On Wed, 19 Nov 2008, Jeremy Chadwick wrote:


On Thu, Nov 20, 2008 at 05:39:36PM +1100, Peter Jeremy wrote:



I hope that never gets committed - it will make debugging kernel
problems much harder.  There is already a kern.msgbuf_clear sysctl and
maybe people who are concerned about msgbuf leakage need to learn to
use it.


And this sysctl is only usable *after* the kernel loads, which means
you lose all of the messages shown from the time the kernel loads to
the time the sysctl is set (e.g. hardware detected/configured).  This is
even less acceptable, IMHO.


But surely you can arrange that the contents are written out to 
/var/log/messages first?


E.g. a sequence like

- mount /var
- write buffer contents via syslogd
- clear buffer via sysctl
- allow user logins

This way the buffer is cleared before any unprivileged users get to do 
anything.  No kernel changes needed, just a little tweaking of the init 
scripts at most.


If you should have a crash and suspect there is useful data in the buffer, 
you can boot to single-user mode (avoiding the clear) and retrieve it 
manually.


Seems like this should make everyone happy.

--

Nate Eldredge
[EMAIL PROTECTED]


Re: How can I add new binaries to the mfsroot image?

2008-11-16 Thread Nate Eldredge

On Sun, 16 Nov 2008, Peter Steele wrote:


I want to make a custom FreeBSD install CD-ROM with additional commands
available in the mfsroot image. Adding the new commands to the image is
easy enough, and I've made an install.cfg file on the CD-ROM as well so
that when the CD runs the commands in install.cfg are automatically
executed. This all works, except none of the new binaries I add to the
mfsroot image run during the automated sysinstall session. If I
reference one of the default commands (the ones stored in /stand) they
run fine, but if I add a new FreeBSD binary to the /stand directory
(e.g. gmirror), the command fails.


How does it fail?

Is the binary you added statically linked?


What's weird is that I can open a fixit shell after the install.cfg
script fails and then run the same commands interactively and they work
fine. Why would these commands work in an interactive fixit shell
but not during the automated sysinstall session?


Wild guess: the shared libraries are present somewhere else on the CD, 
which perhaps is either not mounted or not pointed to by LD_LIBRARY_PATH 
or similar until the fixit shell is run.


--

Nate Eldredge
[EMAIL PROTECTED]


Re: Unprivileged user can't set sticky bit on a file; why?

2008-11-14 Thread Nate Eldredge

On Fri, 14 Nov 2008, Volodymyr Kostyrko wrote:


Nate Eldredge wrote:

I came across this when trying to rsync some files which had the sticky bit 
set on the remote side.  (It's the historical Unix archive from tuhs.org; 
the files in question are part of an unpacked V7 UNIX installation, for 
which the sticky bit of course had meaning. :-) )  It's annoying that this 
makes rsync fail; it messes up my mirroring script.


You can ask rsync to change file attributes on the fly with the --chmod 
option. Just my 2c.


Thanks for this hint.  --chmod=F-t solves my problem.  But I am still 
curious about this behavior.


--

Nate Eldredge
[EMAIL PROTECTED]


Unprivileged user can't set sticky bit on a file; why?

2008-11-13 Thread Nate Eldredge

Hi folks,

FreeBSD doesn't allow an unprivileged user to set the sticky bit (mode 
S_ISTXT, octal 01000) on a file, though it does allow root to do so.


[EMAIL PROTECTED]:/tmp$ chmod +t foo
chmod: foo: Inappropriate file type or format
[EMAIL PROTECTED]:/tmp$ su
Password:
vulcan# chmod +t foo
vulcan# ls -l foo
-rw-r--r-T  1 nate  wheel  0 Nov 13 22:46 foo

Why is this?

I don't expect the sticky bit to actually do anything on a regular file in 
this day and age (I know what its historical behavior was, and what it 
does for directories), but I'd think it would be harmless to set it. 
Linux lets a user set the sticky bit, and Solaris silently masks it off.


I came across this when trying to rsync some files which had the sticky 
bit set on the remote side.  (It's the historical Unix archive from 
tuhs.org; the files in question are part of an unpacked V7 UNIX 
installation, for which the sticky bit of course had meaning. :-) )  It's 
annoying that this makes rsync fail; it messes up my mirroring script.


sticky(8) says the bit is ignored for regular files, which evidently 
isn't accurate.  chmod(2) says on UFS-based file systems (FFS, LFS) the 
sticky bit may only be set upon directories, which isn't right either 
since root is able to do it.  src/sys/ufs/ufs/ufs_vnops.c has the 
following comment:


/*
 * Privileged processes may set the sticky bit on non-directories,
 * as well as set the setgid bit on a file with a group that the
 * process is not a member of.  Both of these are allowed in
 * jail(8).
 */

but does not explain why an unprivileged process should be forbidden to set 
the sticky bit.


--

Nate Eldredge
[EMAIL PROTECTED]


Re: ukbd attachment and root mount

2008-11-12 Thread Nate Eldredge

On Wed, 12 Nov 2008, Andriy Gapon wrote:


on 05/11/2008 17:24 Andriy Gapon said the following:

[...]

I have a legacy-free system (no PS/2 ports, only USB) and I wanted to
try a kernel without atkbd and psm (with ums, ukbd, kbdmux), but was
bitten hard when I made a mistake and kernel could not find/mount root
filesystem.

So I was stuck at the mountroot prompt without a keyboard to enter anything.
This was repeatable about 10 times, after which I resorted to a live CD.

Since then I put back atkbdc into my kernel. I guess BIOS or USB
hardware emulate AT or PS/2 keyboard, so the USB keyboard works before
the driver attaches. I guess I need such emulation e.g. for loader or
boot0 configuration. But I guess I don't have to have atkbd driver in
kernel.


This turned out not to be a complete solution as it seems that there are
some quirks about legacy USB here, sometimes keyboard stops working even
at loader prompt (this is described in a different thread).

ukbd attachment still puzzles me a lot.
I look at some older dmesg, e.g. this 7.0-RELEASE one:
http://www.mavetju.org/mail/view_message.php?list=freebsd-usb&id=2709973
and see that ukbd attaches along with ums before mountroot.

I look at newer dmesg and I see that ums attaches at about the same time
as before but ukbd consistently attaches after mountroot.
I wonder what might cause such behavior and how to fix it.
I definitely would like to see ukbd attach before mountroot, I can debug
this issue, but need some hints on where to start.


I haven't been following this thread, and I'm pretty sleepy right now, so 
sorry if this is irrelevant, but I had a somewhat similar problem that was 
fixed by adding


hint.atkbd.0.flags=0x1

to /boot/device.hints .

--

Nate Eldredge
[EMAIL PROTECTED]


Re: Asynchronous pipe I/O

2008-11-05 Thread Nate Eldredge

On Wed, 5 Nov 2008, rihad wrote:


Imagine this shell pipeline:

sh prog1 | sh prog2


As given above, prog1 blocks if prog2 hasn't yet read previously written
data (actually, newline separated commands) or is busy. What I want is
for prog1 to never block:

sh prog1 | buffer | sh prog2


[and misc/buffer is unsuitable]

I found an old piece of code lying around that I wrote for this purpose. 
Looking at it, I can see a number of inefficiencies, but it might do in a 
pinch.  You're welcome to use it; I hereby release it to the public 
domain.


Another hack that you could use, if you don't mind storing the buffer on 
disk rather than memory, is


sh prog1 > tmpfile &
tail -f -c +0 tmpfile | sh prog2

Here's my program.

/* Buffering filter. */

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/select.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>

/* Size of a single buffer. */

#define BUFSIZE 512

struct buffer {
  struct buffer *next;
  size_t length;
  unsigned char buf[BUFSIZE];
};

/* The buffers form a singly linked queue: the reader appends at the
   tail, the writer consumes from the head. */
struct buffer *reader;
struct buffer *writer;

/* Upper limit on the memory used for buffers, in bytes. */
int max_mem = 100 * 1024;

int current_mem;

#define OK 0
#define WAIT 1
#define GIVEUP 2

/* Read one block from fd into a freshly allocated buffer. */
int read_one (int fd)
{
  int result;

  if ((size_t) current_mem > max_mem - sizeof(*reader->next))
    {
      fprintf(stderr, "Reached max_mem!\n");
      return WAIT;
    }
  /* Get a new buffer. */
  reader->next = malloc(sizeof(*reader->next));
  if (reader->next)
    {
      current_mem += sizeof(*reader->next);
      fprintf(stderr, "\rReading: \t%d bytes in buffer ",
              current_mem);
    }
  else
    {
      fprintf(stderr, "Virtual memory exhausted\n");
      return WAIT;
    }

  reader = reader->next;
  reader->next = NULL;
  reader->length = 0;

  result = read(fd, reader->buf, BUFSIZE);
  if (result > 0)
    reader->length = result;
  else if (result == 0)
    {
      fprintf(stderr, "Hit EOF on reader\n");
      return GIVEUP;
    }
  else if (result < 0)
    {
      fprintf(stderr, "Error on reader: %s\n", strerror(errno));
      return GIVEUP;
    }
  return OK;
}

/* Write out the oldest buffer to fd and release it. */
int write_one (int fd)
{
  struct buffer *newwriter;

  if (reader == writer)
    return WAIT; /* the reader owns the last buffer */

  if (writer->length > 0)
    {
      int result;
      result = write(fd, writer->buf, writer->length);
      if (result == 0)
        {
          fprintf(stderr, "Hit EOF on writer\n");
          return GIVEUP;
        }
      else if (result < 0)
        {
          fprintf(stderr, "Error on writer: %s\n", strerror(errno));
          return GIVEUP;
        }
      else if ((size_t) result < writer->length)
        {
          /* Partial write: keep the unwritten tail for the next call. */
          writer->length -= result;
          memmove(writer->buf, writer->buf + result, writer->length);
          return OK;
        }
    }
  newwriter = writer->next;
  free(writer);
  current_mem -= sizeof(*writer);
  fprintf(stderr, "\rWriting: \t%d bytes in buffer ",
          current_mem);
  writer = newwriter;
  return OK;
}

void move_data(int in_fd, int out_fd)
{
  int reader_state = OK;
  int writer_state = OK;

  int maxfd = ((in_fd > out_fd) ? in_fd : out_fd) + 1;

  reader = malloc(sizeof(*reader));
  if (!reader)
    {
      fprintf(stderr, "No memory at all!\n");
      return;
    }
  reader->next = NULL;
  reader->length = 0;
  writer = reader;
  current_mem = sizeof(*reader);

  while (1) /* break when done */
    {
      int result;
      fd_set read_set, write_set;
      FD_ZERO(&read_set);
      FD_ZERO(&write_set);
      if (reader_state == OK)
        FD_SET(in_fd, &read_set);
      if (writer_state == OK)
        FD_SET(out_fd, &write_set);
      result = select(maxfd, &read_set, &write_set, NULL, NULL);
      if (result < 0)
        {
          if (errno == EINTR)
            continue; /* interrupted by a signal; just try again */
          fprintf(stderr, "Error in select: %s\n", strerror(errno));
          break;
        }

      /* If we're ready to do something, do it.  Also let the other
         end get a chance if something changed. */

      if (FD_ISSET(in_fd, &read_set))
        {
          reader_state = read_one(in_fd);
          if (writer_state == WAIT)
            writer_state = OK;
        }

      if (FD_ISSET(out_fd, &write_set))
        {
          writer_state = write_one(out_fd);
          if (reader_state == WAIT)
            reader_state = OK;
        }

      /* Check for termination */
      if (writer_state == GIVEUP)
        break; /* can't write any more */
      if (reader_state == GIVEUP && writer_state == WAIT)
        break; /* can't read any more, and wrote all we have */
    }
}

int main(void)
{
  move_data(0, 1);
  return 0;
}

--

Nate Eldredge
[EMAIL PROTECTED]


Re: memtest86+ can not link: binutils issue?

2008-10-31 Thread Nate Eldredge

On Fri, 31 Oct 2008, Andriy Gapon wrote:


on 30/10/2008 20:46 Peter Jeremy said the following:

On 2008-Oct-30 18:08:35 +0200, Andriy Gapon [EMAIL PROTECTED] wrote:

ld --warn-constructors --warn-common -static -T memtest_shared.lds \
   -o memtest_shared head.o reloc.o main.o test.o init.o lib.o
patn.o screen_buffer.o config.o linuxbios.o memsize.o pci.o controller.o
random.o extra.o spd.o error.o dmi.o && \
   ld -shared -Bsymbolic -T memtest_shared.lds -o memtest_shared
head.o reloc.o main.o test.o init.o lib.o patn.o screen_buffer.o
config.o linuxbios.o memsize.o pci.o controller.o random.o extra.o spd.o
error.o dmi.o
head.o(.text+0x7): In function `startup_32':
: undefined reference to `_GLOBAL_OFFSET_TABLE_'
Segmentation fault (core dumped)
gmake: *** [memtest_shared] Error 139


I can't help here.  _GLOBAL_OFFSET_TABLE_ is related to the binutils
PIC support and it appears that the linker doesn't like the code (in
head.S) is explicitly referencing it.


Not only linking fails, but ld even crashes.


I agree this shouldn't happen.


Can anybody suggest anything about this problem?


It looks like stand-alone PIC code on FreeBSD needs some different
incantations to Linux.  My understanding is that several of the
i386 bootstraps are relocatable so you might like to peruse the
code in /usr/src/sys/boot/i386 for ideas.


I wonder if this is something about our port of binutils or is it an
issue in the older version of binutils.
I'll try to look at the boot code, thank you for the hint.


FreeBSD's version of binutils is quite old.  I've definitely found bugs in 
it which are fixed in GNU's current version.  So you might try building 
the official GNU binutils and see if that works any better.  I don't know 
if it will fix your error but maybe it at least won't crash.


ld crashing is definitely a bug, and it would be nice if you could file a 
PR, including the object files.  If the GNU version doesn't crash that 
would be useful information for the PR also, as it might encourage Them to 
consider importing a newer version.


--

Nate Eldredge
[EMAIL PROTECTED]


Re: includes, configure, /usr/lib vs. /usr/local/lib, and linux coders

2008-10-31 Thread Nate Eldredge

On Fri, 31 Oct 2008, Steve Franks wrote:


Let's back up.  What's the 'right' way to get a bloody linux program
that expects all its headers in /usr/include to compile on freebsd
where all the headers are in /usr/local/include?  That's all I'm
really asking.  Specifically, it's looking for libusb & libftdi.  If I
just type gmake, it can't find it, but if I manually edit the
Makefiles to add -I/usr/local/include, it can.  Obviously, manually
editing the makefiles is *not* the right way to fix it (plus it's
driving me crazy).


C_INCLUDE_PATH=$C_INCLUDE_PATH:/usr/local/include
LIBRARY_PATH=$LIBRARY_PATH:/usr/local/lib
export C_INCLUDE_PATH LIBRARY_PATH
./configure
gmake

Adjust as appropriate if using csh.

Personally, I set those environment variables in my .profile.

By the way, I think you're being a little unfair to blame this on Linux 
programs or programmers.  Normally it's the user's responsibility to 
ensure that their compiler searches for include files, etc, in the 
appropriate place.  Many Linux distributions put everything in 
/usr/include, which is searched by default.  FreeBSD puts stuff from ports 
in /usr/local/include which isn't searched by default.  I find that 
behavior inconvenient, which is why I set those environment variables, so 
I don't have to think about it.


--

Nate Eldredge
[EMAIL PROTECTED]


Re: neophyte: tcsetattr() gives 22 error in i386, not in amd64?

2008-10-24 Thread Nate Eldredge

On Fri, 24 Oct 2008, Steve Franks wrote:


Hi,

I'm getting a 22 errno from tcsetattr() on 7-STABLE i386 in code which
was working under 7-STABLE amd64.  Serial device is a ucom (silabs
cp2103).  Permissions on /dev/cuaU0 look fine.  Cutecom/Minicom
appears to open the port without error...


I don't see anything obviously wrong, but I'd bet a bug related to 
32/64-bit types.  Can you post a complete piece of code that can be 
compiled and run and demonstrates the problem?  Also, try compiling with 
-Wall -W and investigate any warnings that are produced.


By the way, errno 22 is EINVAL, Invalid argument.  perror() is your 
friend.
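
Since the original code was snipped, here is a generic sketch of the pattern rather than a fix for it: check every termios call and let perror() print the reason for a failure.  The function name set_line_speed is invented for the example; the termios calls themselves are the standard POSIX ones.

```c
#include <stdio.h>
#include <termios.h>

/* Set input and output speed on the terminal open on fd.
   Returns 0 on success; on failure prints the reason and returns -1. */
int set_line_speed(int fd, speed_t speed)
{
  struct termios tio;

  /* Start from the current settings instead of a zeroed struct. */
  if (tcgetattr(fd, &tio) != 0)
    {
      perror("tcgetattr");
      return -1;
    }
  if (cfsetispeed(&tio, speed) != 0 || cfsetospeed(&tio, speed) != 0)
    {
      perror("cfsetispeed/cfsetospeed");
      return -1;
    }
  if (tcsetattr(fd, TCSANOW, &tio) != 0)
    {
      /* perror() prints the text for errno, so EINVAL shows up as
         words rather than as the bare number 22. */
      perror("tcsetattr");
      return -1;
    }
  return 0;
}
```

A call such as set_line_speed(fd, B9600) on the open /dev/cuaU0 descriptor would then report which of the three steps is the one returning EINVAL.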


[snip code]

--

Nate Eldredge
[EMAIL PROTECTED]


Re: Why does adding /usr/lib32 to LD_LIBRARY_PATH break 64-bit binaries?

2008-10-23 Thread Nate Eldredge

On Thu, 23 Oct 2008, Alexander Sack wrote:


Alright, well I found some weirdness:

[EMAIL PROTECTED] ~]# export LD_LIBRARY_PATH=/usr/bin:/usr/lib:/usr/lib32:/usr/lib64
[EMAIL PROTECTED] ~]# LD_DEBUG=1 ls
/libexec/ld-elf.so.1 is initialized, base address = 0x800506000
RTLD dynamic = 0x80062ad78
RTLD pltgot  = 0x0
processing main program's program header
Filling in DT_DEBUG entry
lm_init((null))
loading LD_PRELOAD libraries
loading needed objects
Searching for libutil.so.5
 Trying /usr/bin/libutil.so.5
 Trying /usr/lib/libutil.so.5
 Trying /usr/lib32/libutil.so.5
loading /usr/lib32/libutil.so.5
/libexec/ld-elf.so.1: /usr/lib32/libutil.so.5: unsupported file layout

That's because libutil.so.5 does not exist in /usr/lib only in /lib.
The /usr/lib directory has:

[EMAIL PROTECTED] ~]# ls -l  /usr/lib/libutil*
-r--r--r--  1 root  wheel  100518 Aug 21  2007 /usr/lib/libutil.a
lrwxrwxrwx  1 root  wheel  17 Sep 11 11:44 /usr/lib/libutil.so -> /lib/libutil.so.5
-r--r--r--  1 root  wheel  103846 Aug 21  2007 /usr/lib/libutil_p.a

So rtld is looking for major number 5 of libutil, without the standard
/lib in my LD_LIBRARY_PATH it searches /usr/lib, doesn't find it but:

[EMAIL PROTECTED] ~]# ls -l  /usr/lib32/libutil*
-r--r--r--  1 root  wheel  65274 Aug 21  2007 /usr/lib32/libutil.a
lrwxrwxrwx  1 root  wheel 12 Sep 11 11:45 /usr/lib32/libutil.so -> libutil.so.5
-r--r--r--  1 root  wheel  46872 Aug 21  2007 /usr/lib32/libutil.so.5
-r--r--r--  1 root  wheel  66918 Aug 21  2007 /usr/lib32/libutil_p.a

And voilà, I'm broken, since there is a libutil.so.5 in there.

So my question to anyone out there, WHY does /usr/lib32 contain major
numbers but /usr/lib does not?  This seems like a bug to me (FreeBSD
7.0-RELEASE is the same) or at least a dubious design decision.


I think the distinction is this.  rtld is looking for libutil.so.5 (with 
version number).  This file has to be in /lib, in the root filesystem, so 
that programs can run before /usr is mounted.


libutil.so on the other hand is not searched for by rtld, but by ld 
(driven by cc), when the program is built.  /usr/lib is the traditional 
place for it to search; I'm not sure if it searches /lib at all.  In the 
case of static libraries, /usr/lib is certainly the right place for 
libutil.a to go, so having libutil.so there makes sense in my mind.


I think your best bet is to dig into whatever is setting LD_LIBRARY_PATH 
and get it set correctly.  Remove /usr/lib32 or at least ensure that /lib 
is searched first.  Trying to change rtld's behavior is not the right 
approach, IMHO.
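
The LD_DEBUG trace above suggests a plain first-match search: each directory is tried in order and the first file with the right name is taken, whatever architecture it was built for.  The sketch below imitates that behavior; it is not rtld's code, and first_match is an invented name.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Look for name in each directory of the colon-separated list dirs,
   in order.  On the first hit, store the full path in out and
   return 0.  Returns -1 if no directory contains name; out is then
   left in an unspecified state. */
int first_match(const char *dirs, const char *name, char *out, size_t outlen)
{
  const char *p = dirs;

  while (*p != '\0')
    {
      size_t len = strcspn(p, ":");   /* length of this directory */

      if (len > 0)
        {
          int n = snprintf(out, outlen, "%.*s/%s", (int) len, p, name);

          /* First hit wins; later directories are never tried. */
          if (n > 0 && (size_t) n < outlen && access(out, F_OK) == 0)
            return 0;
        }
      p += len;
      if (*p == ':')
        p++;
    }
  return -1;
}
```

With the path from the transcript, /usr/lib32 is reached before any directory that holds a 64-bit libutil.so.5, which is why putting /lib earlier in the list (or dropping /usr/lib32) changes the outcome.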


--

Nate Eldredge
[EMAIL PROTECTED]


Re: Laptop suggestions?

2008-10-22 Thread Nate Eldredge

On Wed, 22 Oct 2008, Gary Kline wrote:


On Wed, Oct 22, 2008 at 01:06:29PM +0200, Dag-Erling Smørgrav wrote:

martinko [EMAIL PROTECTED] writes:

I have always thought that Fn key in left most bottom corner of the
keyboard is, especially for programmers, a very bad idea.  :-(


Seconded.  Worse still, on my Lenovo T60, if the Fn key is held down
longer than a fraction of a second, it generates an input event which
just happens to correspond to Gnome's default key binding for the next
track function in media players...




I've seen that Fn key, but don't know what it is for.  What? you press
it, then follow with the integers [ 1, 2, 3 ... ]?   At any rate, maybe
you can remap the key with ~/.xmodmaprc.


Fn is usually used on laptop keyboards to allow two logical keys to share 
a single physical key.  For example, see the keyboard pictured at
http://www.notebookreview.com/assets/3415.jpg .  On the extreme lower 
right is a key with - in white and End in blue.  Pressing it by 
itself sends the keycode corresponding to an ordinary keyboard's - key. 
Holding Fn and pressing that key sends the keycode corresponding to an 
ordinary keyboard's End key.  On many keyboards, pressing Fn by itself 
sends no keycode at all, so it cannot be remapped.


It is also sometimes used to control hardware features which on a desktop 
machine might have a different interface.  For instance, on the laptop 
pictured, holding Fn and pressing F6 would increase the screen brightness, 
probably without sending a keycode.  A desktop machine would probably have 
a button on the monitor itself to do this.


--

Nate Eldredge
[EMAIL PROTECTED]


Re: indicating a debug image

2008-10-16 Thread Nate Eldredge

On Thu, 16 Oct 2008, Chuck Robey wrote:


I was wondering, for FreeBSD images, is there a symbol that one could look for,
to indicate if image had debug symbols?  I know you could destroy that by just
stripping, I just wanted to know if there is any way to definitely tell, short
of firing up gdb and looking for info.


There's really three possibilities:

1. Image has no symbols
2. Image has only non-debug symbols (e.g. global functions and variables)
3. Image has debug symbols (e.g. line numbers, local variables)

strip(1) or gcc -s produces #1.  gcc without -g produces #2.  gcc -g 
produces #3.


You can distinguish #1 because 'nm image' will give no output.  nm and 
objdump don't appear able to distinguish #2 and #3, but readelf -w will 
give a bunch of output for #3 and none for #2.
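
The same three-way test can be done programmatically by walking the ELF section headers.  The sketch below rests on a few assumptions: that case #1 corresponds to having no .symtab section and case #3 to having at least one .debug_* section, that the image is a 64-bit ELF file in the host's byte order, and that <elf.h> is available.  The function and enum names are invented for the example.

```c
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { SYM_ERROR = -1, SYM_NONE = 1, SYM_PLAIN = 2, SYM_DEBUG = 3 };

/* Classify a 64-bit ELF image as having no symbols (SYM_NONE),
   only non-debug symbols (SYM_PLAIN) or debug info (SYM_DEBUG).
   Returns SYM_ERROR if the file cannot be read or parsed. */
int classify_symbols(const char *path)
{
  Elf64_Ehdr eh;
  Elf64_Shdr *sh = NULL;
  char *names = NULL;
  size_t names_len = 0;
  int has_symtab = 0, has_debug = 0, result = SYM_ERROR;
  unsigned i;
  FILE *f = fopen(path, "rb");

  if (f == NULL)
    return SYM_ERROR;

  if (fread(&eh, sizeof eh, 1, f) != 1
      || memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0
      || eh.e_ident[EI_CLASS] != ELFCLASS64
      || eh.e_shentsize != sizeof *sh
      || eh.e_shnum == 0
      || eh.e_shstrndx >= eh.e_shnum)
    goto out;

  /* Read the whole section header table. */
  sh = malloc(eh.e_shnum * sizeof *sh);
  if (sh == NULL
      || fseek(f, (long) eh.e_shoff, SEEK_SET) != 0
      || fread(sh, sizeof *sh, eh.e_shnum, f) != eh.e_shnum)
    goto out;

  /* Read the string table that holds the section names. */
  names_len = (size_t) sh[eh.e_shstrndx].sh_size;
  names = malloc(names_len + 1);
  if (names == NULL
      || fseek(f, (long) sh[eh.e_shstrndx].sh_offset, SEEK_SET) != 0
      || fread(names, 1, names_len, f) != names_len)
    goto out;
  names[names_len] = '\0';

  for (i = 0; i < eh.e_shnum; i++)
    {
      if (sh[i].sh_type == SHT_SYMTAB)
        has_symtab = 1;
      if (sh[i].sh_name < names_len
          && strncmp(names + sh[i].sh_name, ".debug_", 7) == 0)
        has_debug = 1;
    }
  result = has_debug ? SYM_DEBUG : has_symtab ? SYM_PLAIN : SYM_NONE;

out:
  free(sh);
  free(names);
  fclose(f);
  return result;
}
```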


Does that help?

--

Nate Eldredge
[EMAIL PROTECTED]


Re: ZFS boot

2008-10-11 Thread Nate Eldredge

On Sat, 11 Oct 2008, Pegasus Mc Cleaft wrote:


FWIW, my system is amd64 with 1 G of memory, which the page implies is
insufficient.  Is it really?


This may be purely subjective, as I have never benchmarked the speeds, but
when I was first testing zfs on an i386 machine with 1 gig of RAM, I thought the
performance was mediocre. However, when I loaded the system on a quad-core
Core 2 with 8 gigs of RAM, I was quite impressed. I put localized changes in my
/boot/loader.conf to give the kernel more breathing room and disabled the
prefetch for zfs.

#more loader.conf
vm.kmem_size_max=1073741824
vm.kmem_size=1073741824
vfs.zfs.prefetch_disable=1


I was somewhat confused by the suggestions on the wiki.  Do the kmem_size 
sysctls affect the allocation of *memory* or of *address space*?  It seems 
a bit much to reserve 1 G of memory solely for the use of the kernel, 
expecially in my case when that's all I have :)  But on amd64, it's 
welcome to have terabytes of address space if it will help.



The best advice I can give is for you to find an old machine and test-bed zfs
for yourself. I personally have been pleased with it, and it has saved my
machine's data 4 times already (dying hardware, unexpected power bounces, etc.)


Sure, but if my new machine isn't studly enough to run it, there's no 
hope for an old machine.  So I'm trying to figure out what I actually 
need.


--

Nate Eldredge
[EMAIL PROTECTED]


Re: Is it possible to recover from SEGV?

2008-10-11 Thread Nate Eldredge

On Sat, 11 Oct 2008, Yuri wrote:


Let's say I have signal(3) handler set.
And I know exactly what instruction caused SEGV and why.

Is there a way to access from signal handler CPU registers as they
were before signal, modify some of them, clear the signal and
continue from the instruction that caused SEGV initially?


Absolutely.  Declare your signal handler as

void handler(int sig, int code, struct sigcontext *scp);

You will need to cast the pointer passed to signal(3).  struct sigcontext 
is defined in <machine/signal.h>, I believe.  struct sigcontext contains 
the CPU registers as they were when the faulting instruction began to 
execute.  You can modify them and then return from the signal handler. 
The program will resume the faulting instruction with the new registers. 
You can also alter the copy of the instruction pointer in the struct 
sigcontext if you want it to resume somewhere else.


There is also a libsigsegv which looks like it wraps some of this process 
in a less machine-specific way.


Out of curiosity, what are you looking to achieve with this?  And what 
architecture are you on?


--

Nate Eldredge
[EMAIL PROTECTED]


Re: continuous backup solution for FreeBSD

2008-10-08 Thread Nate Eldredge

On Wed, 8 Oct 2008, Evren Yurtesen wrote:


Thanks again for pointing out snapshots. It is more or less suitable :)


I'll just warn you that if you're planning to use snapshots for your own 
purposes, to first do an extensive stress test on a non-critical machine 
with backed up data.  I've had a lot of problems with snapshots 
occasionally causing deadlocks which hang the machine.  This was under 6.x 
but I had the same problem under many previous versions, so I don't 
necessarily expect that it's fixed.  Also, while it's never happened to 
me, I've heard other people report data corruption.


--

Nate Eldredge


Debugging reboot with Linux emulation

2008-08-13 Thread Nate Eldredge

Hi folks,

I recently tried to run a Linux binary of Maple (commercial math software) 
on my FreeBSD 7.0-RELEASE/amd64 box, and the machine rebooted.  I tried it 
again while watching the console, and no panic message appeared to be 
produced.  Does anyone have any ideas on how to debug problems of this 
nature?  I realize I may not be able to get Maple to work, but in any case 
the system should not die like this, so I can at least try to fix that 
bug.


Incidentally, is it possible to run kdb with a USB keyboard?  Hitting 
Ctrl-Alt-Esc gives me the kdb prompt, but I can't type, so I can do 
nothing except hit the power button.  I do have hint.atkbd.0.flags=0x1 
in /boot/device.hints.  Unfortunately I don't have a PS/2 keyboard on 
hand, though I can try and get a hold of one if all else fails.


--

Nate Eldredge


Re: Debugging reboot with Linux emulation

2008-08-13 Thread Nate Eldredge

On Wed, 13 Aug 2008, Kostik Belousov wrote:


Then, the issue of mixing our reboot(2)/linux fcntl(2) is irrelevant.
The original reporter said that the system just rebooted, and I believe
that filesystems were not synced and not unmounted properly. Our
reboot(2) does not have a flag combination that could cause such
behaviour, I think.


You are right, file systems were not unmounted, and I doubt that they 
were synced either.  They had to be fscked when the system came back up.



Also, I doubt that the program being run is statically linked or
run by root. Confirmation?


I did not run it as root.  Sorry, I should have said that before.

It is a little hard to trace their maze of shell scripts to figure out 
which binary was being run, but if I am looking at the right one, it is 
dynamically linked and branded SVR4.  I will make sure later today.



Overall, this looks like a nasty bug, hopefully in the linuxolator.


Indeed.

--

Nate Eldredge


Re: read with timeout ??

2008-08-08 Thread Nate Eldredge

On Fri, 8 Aug 2008, Chuck Robey wrote:



Pieter de Goeje wrote:


I think poll(2) is also simpler than select for this purpose.



It does look like that.  I need to check the implementation a bit, because the
name of this thing makes me really suspicious about how often it checks whether
an fd is ready for a read.  I know select comes right back; I was under the
impression that poll didn't use signals to do this.


AFAIK the effects are identical; just the arguments are set up in a 
different way.  Both of them will block until the fd is ready and then 
return immediately (subject to other processes running, of course).  The 
name "poll" is a misnomer because it doesn't actually work by polling, 
but you can pretend that it does (and does so infinitely often). 
Neither one uses signals per se, though if the underlying hardware device 
is interrupt-driven, that will be what (indirectly) triggers the wake-up.


poll does seem to be more convenient than messing about with fd_sets. 
select is older and so it comes to my mind first, that's all.


--

Nate Eldredge


Re: read with timeout ??

2008-08-07 Thread Nate Eldredge

On Thu, 7 Aug 2008, Chuck Robey wrote:



I have my head lost in a code problem.  I just hit a point where I need to do a
read from an fd, but I need to associate it with a timeout, on the order of 1
second, something like that.  I had the feeling that there's a function in
FreeBSD's libc that makes that simple, but I forget the function name.  If
anyone can remember something like what I'm talking about, I sure would
appreciate a function name.  I can figure out how it works, if I could only
dredge up that name.


man 2 select

--

Nate Eldredge


Re: consolekit on 7.0-STABLE i386

2008-07-30 Thread Nate Eldredge

On Wed, 30 Jul 2008, sam wrote:


hello

my trouble


FreeBSD static 7.0-STABLE FreeBSD 7.0-STABLE #23: Mon Jul 28 18:10:51 MSD 
2008 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/STATIC  i386



top_output-
|874 root17   00  8296K  2660K waitvt 1   0:00  0.00% 
console-kit-daemon|



---vmstat_output---
| procs       memory  page                        disks          faults      cpu
 r  b w    avm   fre  flt  re  pi  po    fr  sr ad4 ad6   in    sy   cs us sy id
 0 19 0  1113M   29M  493   1   0   0   265 129   0   0  133 45119 4588  8  5 87
 0 20 0  1113M   29M  249   0   2   0  3311   0   0  22  157  7872 2262  5  7 88
 0 19 0  1113M   29M  346   0   0   0   148   0   0   0  110 78963 1793  4  9 87
 0 19 0  1113M   29M  115   0   0   0     0   0   0   0  105  5743 1731 13  1 85
 0 19 0  1113M   29M  318   0   0   0   138   0   0   0  108 78837 1732  3 10 87
 0 19 0  1113M   29M  112   0   0   0    32   0   0   1  100  5549 1682 11  1 88
 0 19 0  1113M   29M  297   0   0   0   136   0   0   2  122 78880 1749  6  7 87

|

consolekit in |waitvt state, influencing on high volumes in procs-b


I don't understand what the problem is.  It looks like consolekit is 
sleeping and not using any CPU.  "waitvt" just indicates where in the 
kernel it's sleeping.  I don't understand what you mean by "high volumes 
in procs-b".


--

Nate Eldredge


Re: General questions about virtual memory

2008-07-30 Thread Nate Eldredge

On Wed, 30 Jul 2008, FreeBSD Hackers wrote:


If anyone is willing to help me understand this, I would greatly appreciate
it.  I would also value your input if there are other resources (people,
mailing lists, books, web pages, etc.) that you want to recommend instead of
taking some time to help teach me.


As a slightly less orthodox suggestion, I learned a lot of this from the 
practice side rather than the theory side, and it seems like maybe 
this is where some of your questions lie.  In addition to a textbook, you 
might find it useful to get a copy of the manual for your favorite CPU, 
which will explain, at the level of assembly language, how all these 
features work.  (They are usually available free on the manufacturer's 
website, though you may have to hunt around a bit or register for a 
developer program or something.)  You can read it in conjunction with the 
FreeBSD kernel source to see an actual example.  I found this approach 
very instructive.


--

Nate Eldredge


Re: SCHED_4BSD bad interactivity on 7.0 vs 6.3

2008-07-14 Thread Nate Eldredge

On Sun, 13 Jul 2008, Nate Eldredge wrote:


On Sun, 13 Jul 2008, Kris Kennaway wrote:


Nate Eldredge wrote:

On Sun, 13 Jul 2008, Kris Kennaway wrote:


Nate Eldredge wrote:

Hi folks,

Hopefully this is a good list for this topic.

It seems like there has been a regression in interactivity from 
6.3-RELEASE to 7.0-RELEASE when using the SCHED_4BSD scheduler.  After 
upgrading my single-cpu amd64 box, 7.0 has much worse latency.  When 
running a kernel compile, there is a noticeable lag to echo my typing or 
scroll my browser windows, and playing an mp3 frequently cuts out for a 
second or two.  This did not happen on 6.3-RELEASE.


Are you sure it's not the x.org server bug that was present in the 
version shipped with 7.0?  Update to the latest version and see if your X 
interactivity improves.


Yes, I had not yet upgraded my x.org port when testing this, so it was the 
same x.org that was fine under 6.3.  Also:


I wrote a small program which forks two processes that run 
gettimeofday() in a tight loop to see how long they get scheduled out. 
On 6.3 the maximum latency is usually under 100 ms.  On 7.0 it is 500 ms 
or more even when nothing else is running on the system.  When a compile 
is also running it is sometimes 1400 ms or more.


This test shows a difference even in single user mode, when X is not 
running at all.




It shows *a* difference, but perhaps not the *same* difference.  Please 
humour me and rule it out.


Okay.  I am in the process of recompiling all my ports, so after that is done 
I will boot with a GENERIC kernel and see what happens.


After trying this, I can't seem to reproduce the sound skipping behavior, 
unless I do something fairly extreme like make -j 6.  But the mouse does 
seem to skip when a compile is running, so I do believe there is a 
regression.


--

Nate Eldredge


Re: SCHED_4BSD bad interactivity on 7.0 vs 6.3

2008-07-13 Thread Nate Eldredge

On Sun, 13 Jul 2008, Kris Kennaway wrote:


Nate Eldredge wrote:

Hi folks,

Hopefully this is a good list for this topic.

It seems like there has been a regression in interactivity from 6.3-RELEASE 
to 7.0-RELEASE when using the SCHED_4BSD scheduler.  After upgrading my 
single-cpu amd64 box, 7.0 has much worse latency.  When running a kernel 
compile, there is a noticeable lag to echo my typing or scroll my browser 
windows, and playing an mp3 frequently cuts out for a second or two.  This 
did not happen on 6.3-RELEASE.


Are you sure it's not the x.org server bug that was present in the version 
shipped with 7.0?  Update to the latest version and see if your X 
interactivity improves.


Yes, I had not yet upgraded my x.org port when testing this, so it was the 
same x.org that was fine under 6.3.  Also:


I wrote a small program which forks two processes that run gettimeofday() 
in a tight loop to see how long they get scheduled out.  On 6.3 the maximum 
latency is usually under 100 ms.  On 7.0 it is 500 ms or more even when 
nothing else is running on the system.  When a compile is also running it 
is sometimes 1400 ms or more.


This test shows a difference even in single user mode, when X is not 
running at all.


--

Nate Eldredge


Re: SCHED_4BSD bad interactivity on 7.0 vs 6.3

2008-07-13 Thread Nate Eldredge

On Sun, 13 Jul 2008, Kris Kennaway wrote:


Nate Eldredge wrote:

On Sun, 13 Jul 2008, Kris Kennaway wrote:


Nate Eldredge wrote:

Hi folks,

Hopefully this is a good list for this topic.

It seems like there has been a regression in interactivity from 
6.3-RELEASE to 7.0-RELEASE when using the SCHED_4BSD scheduler.  After 
upgrading my single-cpu amd64 box, 7.0 has much worse latency.  When 
running a kernel compile, there is a noticeable lag to echo my typing or 
scroll my browser windows, and playing an mp3 frequently cuts out for a 
second or two.  This did not happen on 6.3-RELEASE.


Are you sure it's not the x.org server bug that was present in the version 
shipped with 7.0?  Update to the latest version and see if your X 
interactivity improves.


Yes, I had not yet upgraded my x.org port when testing this, so it was the 
same x.org that was fine under 6.3.  Also:


I wrote a small program which forks two processes that run gettimeofday() 
in a tight loop to see how long they get scheduled out.  On 6.3 the 
maximum latency is usually under 100 ms.  On 7.0 it is 500 ms or more 
even when nothing else is running on the system.  When a compile is also 
running it is sometimes 1400 ms or more.


This test shows a difference even in single user mode, when X is not 
running at all.




It shows *a* difference, but perhaps not the *same* difference.  Please 
humour me and rule it out.


Okay.  I am in the process of recompiling all my ports, so after that is 
done I will boot with a GENERIC kernel and see what happens.


--

Nate Eldredge


SCHED_4BSD bad interactivity on 7.0 vs 6.3

2008-07-12 Thread Nate Eldredge

Hi folks,

Hopefully this is a good list for this topic.

It seems like there has been a regression in interactivity from 
6.3-RELEASE to 7.0-RELEASE when using the SCHED_4BSD scheduler.  After 
upgrading my single-cpu amd64 box, 7.0 has much worse latency.  When 
running a kernel compile, there is a noticeable lag to echo my typing or 
scroll my browser windows, and playing an mp3 frequently cuts out for a 
second or two.  This did not happen on 6.3-RELEASE.


I wrote a small program which forks two processes that run gettimeofday() 
in a tight loop to see how long they get scheduled out.  On 6.3 the 
maximum latency is usually under 100 ms.  On 7.0 it is 500 ms or more even 
when nothing else is running on the system.  When a compile is also 
running it is sometimes 1400 ms or more.


SCHED_ULE is much better, so I've switched over.  But it's not the default 
yet, and most people are still going to be using SCHED_4BSD.  It used to 
be acceptable but now it isn't.  Does anyone know why it's regressed so 
badly?


--

Nate Eldredge