Re: read(1) garbage when input redirected from make incorrectly

2010-02-17 Thread Dag-Erling Smørgrav
Jan Mikkelsen janm-freebsd-hack...@transactionware.com writes:
 A redirection doesn't terminate the argument list.  [...]

Huh, you learn something every day...  :)

DES
-- 
Dag-Erling Smørgrav - d...@des.no
___
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to freebsd-hackers-unsubscr...@freebsd.org


Re: Sudden mbuf demand increase and shortage under the load (igb issue?)

2010-02-17 Thread Ivan Voras
Maxim Sobolev wrote:

 So it looks like kernel issue of a sort, which causes all userland
 activity to cease for 2 minutes when the system reaches certain load.

You are not using ZFS, are you? :))



Per core, per device interrupt counts

2010-02-17 Thread Andrew Brampton
After reading through the kernel source I realise what I want isn't
implemented at the moment, but I wanted to discuss whether this feature
would be a useful addition.

Basically I want to see counts of how many interrupts for a particular
interrupt have fired on each core. Linux has provided this kind of
information for a while and I've found it quite useful. I would like
this information when I am pinning particular interrupts to one (or
more) cores. This is useful when I'm tweaking a system with, for
example, 10Gig network cards which have multiple queues (thus multiple
IRQs).

Having a look in the kernel I see that the count is kept in the
is_count field of the intsrc struct. This field seems to be backed by
the global intrcnt array. Could this be modified to perhaps use the
new PCPU macros, so there is a different count for each core? If I was
given a few pointers I might find time to implement this myself.

thanks
Andrew


linprocfs proc/pid/environ patch list question

2010-02-17 Thread Fernando Apesteguía
Hi,

I have a small patch (against 8.0-RELEASE-p2) that _should_ implement
the /proc/pid/environ file
under linprocfs.
However, it seems it does not work properly but I don't know what I'm
doing wrong.
Is this list the place to ask for help? I tried in the forums[1] but
got no answer.

Don't we have a 'kernel newbies'-like list?

Thanks in advance.

[1] http://forums.freebsd.org/showthread.php?t=11329

--- sys/compat/linprocfs/linprocfs.c.orig	2009-10-25 02:10:29.0 +0100
+++ sys/compat/linprocfs/linprocfs.c	2010-02-16 19:38:36.0 +0100
@@ -939,8 +939,38 @@
 static int
 linprocfs_doprocenviron(PFS_FILL_ARGS)
 {
+	int i, error;
+	struct ps_strings pss;
+	char **ps_envstr;
 
-	sbuf_printf(sb, "doprocenviron\n%c", '\0');
+	PROC_LOCK(p);
+	if (p_cansee(td, p) != 0)
+		return (0);
+	PROC_UNLOCK(p);
+
+	error = copyin((void *)p->p_sysent->sv_psstrings, &pss,
+	    sizeof(pss));
+	if (error)
+		return (error);
+
+	ps_envstr = malloc(pss.ps_nenvstr * sizeof(char *),
+	    M_TEMP, M_WAITOK);
+
+	error = copyin((void *)pss.ps_envstr, ps_envstr,
+	    pss.ps_nenvstr * sizeof(char *));
+
+	if (error) {
+		free(ps_envstr, M_TEMP);
+		return (error);
+	}
+
+	/* NULL separated list of variable=value pairs */
+
+	for (i = 0; i < pss.ps_nenvstr; i++) {
+		sbuf_copyin(sb, ps_envstr[i], 0);
+	}
+
+	free(ps_envstr, M_TEMP);
 	return (0);
 }
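
For reference, the output format the patch is aiming for (a NUL-separated
list of variable=value pairs) can be sketched in plain userland C. This is
only an illustration of the layout, not kernel code; pack_environ() and its
return convention are hypothetical names made up for the example.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Pack envp into buf as "VAR=value\0VAR=value\0...".
 * Returns the number of bytes written, or -1 if buf is too small.
 * (Hypothetical helper, for illustration only.)
 */
ssize_t
pack_environ(char *const envp[], char *buf, size_t buflen)
{
	size_t used = 0;

	for (size_t i = 0; envp[i] != NULL; i++) {
		/* Include the terminating NUL: it is the separator. */
		size_t len = strlen(envp[i]) + 1;

		if (len > buflen - used)
			return (-1);
		memcpy(buf + used, envp[i], len);
		used += len;
	}
	return ((ssize_t)used);
}
```

Each entry keeps its terminating NUL, so a reader can walk the buffer with
strlen() until it has consumed the returned byte count.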


Re: linprocfs proc/pid/environ patch list question

2010-02-17 Thread Kostik Belousov
On Wed, Feb 17, 2010 at 07:51:06PM +0100, Fernando Apesteguía wrote:
 Hi,
 
 I have a small patch (against 8.0-RELEASE-p2) that _should_ implement
 the /proc/pid/environ file
 under linprocfs.
 However, it seems it does not work properly but I don't know what I'm
 doing wrong.
 Is this list the place to ask for help? I tried in the forums[1] but
 got no answer.
Putting aside any "does not work" questions, please see the comment below.
 
 Don't we have a 'kernel newbies'-like list?
 
 Thanks in advance.
 
 [1] http://forums.freebsd.org/showthread.php?t=11329
 
 --- sys/compat/linprocfs/linprocfs.c.orig	2009-10-25 02:10:29.0 +0100
 +++ sys/compat/linprocfs/linprocfs.c	2010-02-16 19:38:36.0 +0100
 @@ -939,8 +939,38 @@
  static int
  linprocfs_doprocenviron(PFS_FILL_ARGS)
  {
 +	int i, error;
 +	struct ps_strings pss;
 +	char **ps_envstr;
 
 -	sbuf_printf(sb, "doprocenviron\n%c", '\0');
 +	PROC_LOCK(p);
 +	if (p_cansee(td, p) != 0)
 +		return (0);
 +	PROC_UNLOCK(p);
 +
 +	error = copyin((void *)p->p_sysent->sv_psstrings, &pss,
 +	    sizeof(pss));
 +	if (error)
 +		return (error);
 +
 +	ps_envstr = malloc(pss.ps_nenvstr * sizeof(char *),
 +	    M_TEMP, M_WAITOK);
This is essentially "panic me" code.  ps_nenvstr is user-controlled, so
it can hold an arbitrary integer.

Even ignoring exhaustion of the kernel map, it can cause the allocation of
a large amount of physical memory. Note that the execve(2) implementation
uses swappable memory to store the argument and environment strings passed
from vm spaces.
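
As a rough illustration of the first point, a count read from untrusted
memory can be range-checked before it is used to size an allocation. The
sketch below is plain C, not a proposed fix for the patch: ENV_PTRS_MAX and
env_table_size() are hypothetical, and a cap like this does nothing about
the second point (the memory would still be wired kernel memory).

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical upper bound on the number of environment pointers. */
#define ENV_PTRS_MAX 4096

/*
 * Validate an untrusted count and compute the size in bytes of the
 * pointer table it describes.  Returns 0 on success, -1 if rejected.
 */
int
env_table_size(int nenvstr, size_t *size_out)
{
	/* Reject negative and implausibly large counts up front. */
	if (nenvstr < 0 || nenvstr > ENV_PTRS_MAX)
		return (-1);
	/* Safe: the product is bounded by ENV_PTRS_MAX * sizeof(char *). */
	*size_out = (size_t)nenvstr * sizeof(char *);
	return (0);
}
```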

 +
 +	error = copyin((void *)pss.ps_envstr, ps_envstr,
 +	    pss.ps_nenvstr * sizeof(char *));
 +
 +	if (error) {
 +		free(ps_envstr, M_TEMP);
 +		return (error);
 +	}
 +
 +	/* NULL separated list of variable=value pairs */
 +
 +	for (i = 0; i < pss.ps_nenvstr; i++) {
 +		sbuf_copyin(sb, ps_envstr[i], 0);
 +	}
 +
 +	free(ps_envstr, M_TEMP);
  	return (0);
  }


Re: Ktrace'ing kernel threads

2010-02-17 Thread John Baldwin
On Monday 15 February 2010 6:21:40 am Shrikanth Kamath wrote:
 Can ktrace trace another kernel thread which has roughly the semantics as
 below, right now it
 does not hit any of the designated interesting points that ktrace is built
 for, but what if I could define those,
 will ktrace still allow tracing another kernel thread?
 
 thread(client_info)
 {
 ...
 ...
 build_msg(client_info);  /* this will malloc a mbuf and fill the data in
 it */
 ...
 sosend(client_info);
 }
 
 I want to time the entry/return of build_msg, and the time sosend, dump
 client_info (some specific fields).

It is probably easier to do this with DTrace (albeit possibly with more 
overhead).  You can ktrace a kthread fine, but you would need to write your 
own ktrace hooks (and record parser for kdump) which would take a bit longer 
than a D script with DTrace.

-- 
John Baldwin


Re: Per core, per device interrupt counts

2010-02-17 Thread John Baldwin
On Wednesday 17 February 2010 1:09:04 pm Andrew Brampton wrote:
 After reading through the kernel source I realise what I want isn't
 implemented at the moment, but I wanted to discuss whether this feature
 would be a useful addition.
 
 Basically I want to see counts of how many interrupts for a particular
 interrupt have fired on each core. Linux has provided this kind of
 information for a while and I've found it quite useful. I would like
 this information when I am pinning particular interrupts to one (or
 more) cores. This is useful when I'm tweaking a system with, for
 example, 10Gig network cards which have multiple queues (thus multiple
 IRQs).
 
 Having a look in the kernel I see that the count is kept in the
 is_count field of the intsrc struct. This field seems to be backed by
 the global intrcnt array. Could this be modified to perhaps use the
 new PCPU macros, so there is a different count for each core? If I was
 given a few pointers I might find time to implement this myself.

The simplest method would probably be to make intrcnt grow per-CPU counts, but 
that would change the ABI of intrcnt and require a good bit of userland 
hacking to fix vmstat -i, etc.
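
To make the trade-off concrete, here is a small userland sketch of a
counter table that keeps one slot per (source, CPU) pair, plus a helper
that folds it back into the flat one-counter-per-source layout that
existing consumers expect.  NSOURCES, NCPUS and all function names are
made up for the example; this is not the kernel's intrcnt code.

```c
#include <stddef.h>
#include <stdint.h>

#define NSOURCES 4	/* hypothetical number of interrupt sources */
#define NCPUS    2	/* hypothetical number of CPUs */

/* One counter per (source, cpu) pair instead of one per source. */
static uint64_t percpu_cnt[NSOURCES][NCPUS];

/* Record one interrupt from 'source' handled on 'cpu'. */
void
count_interrupt(size_t source, size_t cpu)
{
	percpu_cnt[source][cpu]++;
}

/* Per-CPU view: what a per-core report would read. */
uint64_t
percpu_count(size_t source, size_t cpu)
{
	return (percpu_cnt[source][cpu]);
}

/* Legacy view: sum across CPUs into a flat per-source array. */
void
fold_counts(uint64_t flat[NSOURCES])
{
	for (size_t s = 0; s < NSOURCES; s++) {
		flat[s] = 0;
		for (size_t c = 0; c < NCPUS; c++)
			flat[s] += percpu_cnt[s][c];
	}
}
```

Exporting the folded array alongside the per-CPU table is one way the old
layout could stay readable by unmodified tools, at the cost of extra work
on every read.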

-- 
John Baldwin


Re: Per core, per device interrupt counts

2010-02-17 Thread Fernando Gleiser




----- Original Message -----
 From: John Baldwin j...@freebsd.org
 To: freebsd-hackers@freebsd.org
 Cc: Andrew Brampton brampton+free...@gmail.com
 Sent: Wed, February 17, 2010 4:17:24 PM
 Subject: Re: Per core, per device interrupt counts
 
 
 The simplest method would probably be to make intrcnt grow per-CPU counts,
 but that would change the ABI of intrcnt and require a good bit of userland
 hacking to fix vmstat -i, etc.

Or he can add some DTrace SDT probes after intrcnt gets updated and export the
device name, CPU index and count from there as the probe argument list.
Then he can get the stats he wants from a D script.

Solaris' intrstat is built as a DTrace consumer, IIRC



Fer


  


tar tfv /dev/cd0 speedup patch

2010-02-17 Thread Juergen Lock
Hi!

 I recently wanted to quickly look at an optical disc without mounting it
and since bsdtar/libarchive know iso9660 I just did the command in the
Subject.  It worked, but it was sloow... :(  Apparently it read all of
the disc without seeking.  The following patch fixes this; is something
like this desired?  If yes, I could look at how to do the same for Linux.
I _think_ there you could just check for S_ISBLK and try to lseek to
the end and back, at least that seems to be how you find out the size
of a block device there...
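
The "lseek to the end and back" idea can be sketched as below.  This is a
minimal, untested-on-real-devices illustration using only POSIX calls;
seekable_size() and blockdev_can_skip() are hypothetical names, and
whether a positive size is a sufficient test on Linux is an assumption.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Return the offset reached by seeking to the end of fd, or -1 if fd
 * cannot seek.  The original file offset is restored before returning.
 */
off_t
seekable_size(int fd)
{
	off_t cur, end;

	cur = lseek(fd, 0, SEEK_CUR);
	if (cur == (off_t)-1)
		return (-1);
	end = lseek(fd, 0, SEEK_END);
	if (end == (off_t)-1)
		return (-1);
	if (lseek(fd, cur, SEEK_SET) == (off_t)-1)
		return (-1);
	return (end);
}

/* Hypothetical check: a block device whose size can be found by seeking. */
int
blockdev_can_skip(int fd)
{
	struct stat st;

	if (fstat(fd, &st) != 0)
		return (0);
	return (S_ISBLK(st.st_mode) && seekable_size(fd) > 0);
}
```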

 Cheers,
Juergen

Index: lib/libarchive/archive_read_open_filename.c
@@ -44,6 +44,10 @@
 #ifdef HAVE_UNISTD_H
 #include <unistd.h>
 #endif
+#ifdef __FreeBSD__
+#include <sys/ioctl.h>
+#include <sys/disk.h>
+#endif
 
 #include "archive.h"
 
@@ -83,6 +87,9 @@
 	struct read_file_data *mine;
 	void *b;
 	int fd;
+#ifdef __FreeBSD__
+	off_t mediasize = 0;
+#endif
 
 	archive_clear_error(a);
 	if (filename == NULL || filename[0] == '\0') {
@@ -143,6 +150,17 @@
 		 */
 		mine->can_skip = 1;
 	}
+#ifdef __FreeBSD__
+	/*
+	 * on FreeBSD if a device supports the DIOCGMEDIASIZE ioctl
+	 * it is a disk-like device and should be seekable.
+	 */
+	else if (S_ISCHR(st.st_mode) &&
+	    !ioctl(fd, DIOCGMEDIASIZE, &mediasize) && mediasize) {
+		archive_read_extract_set_skip_file(a, st.st_dev, st.st_ino);
+		mine->can_skip = 1;
+	}
+#endif
 	return (archive_read_open2(a, mine,
 		NULL, file_read, file_skip, file_close));
 }


Re: tar tfv /dev/cd0 speedup patch

2010-02-17 Thread Tim Kientzle

Juergen Lock wrote:


 ...  since bsdtar/libarchive know iso9660 I just did the command in the
Subject.  It worked, but it was sloow... :(  Apparently it read all of
the disc without seeking.  The following patch fixes this, is something
like this desired?  If yes I could look how to do the same for Linux,


Juergen,

This is great!  If you can figure out how to get this
right, I would really appreciate it.  If you have a
tape drive handy, definitely test with that.  My first
attempts here actually broke reading from tape drives,
which is why the current code is so conservative.

Minor style comments:

 else if (S_ISCHR(st.st_mode) &&
    !ioctl(fd, DIOCGMEDIASIZE, &mediasize) && mediasize) {


Please be explicit:  S_ISCHR() && ioctl() == 0 && mediasize > 0


archive_read_extract_set_skip_file(a, st.st_dev, st.st_ino);


extract_skip_file isn't needed here; we don't read the
contents of device nodes.

Let me know as soon as you have something you're confident of.

Cheers,

Tim



unix socket: race on close?

2010-02-17 Thread Mikolaj Golub
Hi,

Below is a simple test code with unix sockets: the client does
connect()/close() in loop and the server -- accept()/close().

Sometimes close() fails with 'Socket is not connected' error:

a.out: parent: close error: 57

or

a.out: child: close error: 57

It looks to me like a race in close(). Looking at soclose() in uipc_socket.c:

int
soclose(struct socket *so)
{
	int error = 0;

	KASSERT(!(so->so_state & SS_NOFDREF), ("soclose: SS_NOFDREF on enter"));

	CURVNET_SET(so->so_vnet);
	funsetown(&so->so_sigio);
	if (so->so_state & SS_ISCONNECTED) {
		if ((so->so_state & SS_ISDISCONNECTING) == 0) {
			error = sodisconnect(so);
			if (error)
				goto drop;
		}

Isn't the problem here? so_state is checked for SS_ISCONNECTED and
SS_ISDISCONNECTING without locking, and then sodisconnect() is called, which
closes both sockets of the connection. So it looks to me that if close()
is called on both ends simultaneously, sodisconnect() may be called for
both ends, and one of the calls will return ENOTCONN. Or have I
missed something?
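
The suspected interleaving can be modelled deterministically in userland.
The sketch below is only a toy model of the check-then-act pattern
described above (one shared "connected" flag, checked without a lock and
then acted on); it is not the kernel code, and all names are hypothetical.

```c
#include <errno.h>

/* Toy model: one flag shared by both ends of the connection. */
struct conn_model {
	int connected;
};

/* Step 1 of close(): unlocked check, like testing SS_ISCONNECTED. */
int
model_check_connected(const struct conn_model *c)
{
	return (c->connected);
}

/* Step 2: disconnect both ends; fails if the peer already did so. */
int
model_disconnect(struct conn_model *c)
{
	if (!c->connected)
		return (ENOTCONN);
	c->connected = 0;
	return (0);
}

/* Run close() on ends A and B, either interleaved or one after the other. */
void
model_two_closes(int interleaved, int *err_a, int *err_b)
{
	struct conn_model c = { 1 };

	*err_a = *err_b = 0;
	if (interleaved) {
		/* Both ends pass the check before either disconnects. */
		int a_sees = model_check_connected(&c);
		int b_sees = model_check_connected(&c);

		if (a_sees)
			*err_a = model_disconnect(&c);
		if (b_sees)
			*err_b = model_disconnect(&c);
	} else {
		if (model_check_connected(&c))
			*err_a = model_disconnect(&c);
		if (model_check_connected(&c))
			*err_b = model_disconnect(&c);
	}
}
```

In the interleaved order the second end gets ENOTCONN; when the two
closes run one after the other, the second end sees the flag already
cleared and skips the disconnect, so neither reports an error.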

We have been periodically observing ENOTCONN errors on unix socket close in
our applications, so it is not just curiosity :-) (I posted about our problem
to freebsd-net@ some time ago, but it did not attract any attention:
http://lists.freebsd.org/pipermail/freebsd-net/2009-December/024047.html).

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <strings.h>
#include <string.h>
#include <unistd.h>
#include <sys/select.h>
#include <err.h>

#define UNIXSTR_PATH "/tmp/mytest.socket"
#define USLEEP  100

int main(int argc, char **argv)
{
	int listenfd, connfd, pid;
	struct sockaddr_un  servaddr;

	pid = fork();
	if (-1 == pid)
		errx(1, "fork(): %d", errno);

	if (0 != pid) { /* parent */

		if ((listenfd = socket(AF_LOCAL, SOCK_STREAM, 0)) < 0)
			errx(1, "parent: socket error: %d", errno);

		unlink(UNIXSTR_PATH);
		bzero(&servaddr, sizeof(servaddr));
		servaddr.sun_family = AF_LOCAL;
		strcpy(servaddr.sun_path, UNIXSTR_PATH);

		if (bind(listenfd, (struct sockaddr *) &servaddr,
		    sizeof(servaddr)) < 0)
			errx(1, "parent: bind error: %d", errno);

		if (listen(listenfd, 1024) < 0)
			errx(1, "parent: listen error: %d", errno);

		for ( ; ; ) {
			if ((connfd = accept(listenfd, (struct sockaddr *)
			    NULL, NULL)) < 0)
				errx(1, "parent: accept error: %d", errno);

			//usleep(USLEEP / 2); // (I) uncomment this or (II) below to avoid the race

			if (close(connfd) < 0)
				errx(1, "parent: close error: %d", errno);
		}

	} else { /* child */

		sleep(1); /* give the parent some time to create the socket */

		for ( ; ; ) {

			if ((connfd = socket(AF_LOCAL, SOCK_STREAM, 0)) < 0)
				errx(1, "child: socket error: %d", errno);

			bzero(&servaddr, sizeof(servaddr));
			servaddr.sun_family = AF_LOCAL;
			strcpy(servaddr.sun_path, UNIXSTR_PATH);

			if (connect(connfd, (struct sockaddr *) &servaddr,
			    sizeof(servaddr)) < 0)
				errx(1, "child: connect error %d", errno);

			// usleep(USLEEP); // (II) uncomment this or (I) above to avoid the race

			if (close(connfd) != 0)
				errx(1, "child: close error: %d", errno);

			usleep(USLEEP);
		}
	}

	return 0;
}

-- 
Mikolaj Golub