Re: [RFC] LZO1X de/compression support
On Fri, May 18, 2007 at 11:14:57PM +0200, Krzysztof Halasa wrote:
> I'm certainly missing something but what are the advantages of this
> code (over current gzip etc.), and what will be using it?

Richard's patchset added it to the crypto library and wired it into
the JFFS2 file system.

We recently started using LZO in a userland UDP proxy to do stateless
per-packet payload compression over a WAN link. With ~1000 octet packets,
our particular data stream sees 60% compression with zlib, and 50%
compression with (mini-)LZO, but LZO runs at ~5.6x the speed of zlib.
IIRC, that translates into > 700Mbps on the input side on a 2GHz Opteron,
without any further tuning.

Once LZO is in the kernel, I'd like to see it wired into IPComp.
Unfortunately, last I checked only the "deflate" algorithm had an
assigned compression parameter index (CPI), so one will have to use a
private index until an official one is assigned.

Regards,

   Bill Rugolsky
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [Feature Request?] Inline compression of process core dumps
On Thu, Apr 12, 2007 at 11:52:38AM -0400, Christopher S. Aker wrote:
> I've been trying to find a method for compressing process core dumps
> before they hit disk.
>
> I ask because we've got some fairly large UML processes (1GB for some),
> and we're trying to capture dumps to help Jeff debug an evasive bug.
> Our systems use a small root partition and most of the other disk
> resources on the host are allocated towards the UMLs.
>
> There are userspace solutions to this problem: allowing the
> uncompressed core dump to spin out to disk and then coming in afterwards
> and doing the compression, or maybe even a compressed filesystem where
> the core dumps land, but I just thought I'd throw this out there since
> it seems it would be a useful feature :)

See Documentation/sysctl/kernel.txt for kernels >= 2.6.19:

core_pattern:

core_pattern is used to specify a core dumpfile pattern name.
. max length 128 characters; default value is "core"
. core_pattern is used as a pattern template for the output filename;
  certain string patterns (beginning with '%') are substituted with
  their actual values.
. backward compatibility with core_uses_pid:
        If core_pattern does not include "%p" (default does not)
        and core_uses_pid is set, then .PID will be appended to
        the filename.
. corename format specifiers:
        %<NUL>   '%' is dropped
        %%       output one '%'
        %p       pid
        %u       uid
        %g       gid
        %s       signal number
        %t       UNIX time of dump
        %h       hostname
        %e       executable filename
        %<OTHER> both are dropped
. If the first character of the pattern is a '|', the kernel will treat
  the rest of the pattern as a command to run.  The core dump will be
  written to the standard input of that program instead of to a file.

Regards,

   Bill Rugolsky
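The specifier table quoted above can be illustrated with a small userspace sketch. This is not the kernel's implementation; `struct core_info` and `expand_core_pattern()` are hypothetical names invented for the example, and the core_uses_pid fallback is deliberately left out.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the values the specifiers refer to. */
struct core_info {
    int pid, uid, gid, sig;
    long when;          /* UNIX time of dump */
    const char *host;   /* hostname */
    const char *exe;    /* executable filename */
};

/*
 * Expand `pattern` into `out` (at most outlen-1 bytes plus a NUL),
 * following the quoted specifier table. Returns the length written.
 */
size_t expand_core_pattern(const char *pattern, const struct core_info *ci,
                           char *out, size_t outlen)
{
    size_t n = 0;

    if (outlen == 0)
        return 0;
    while (*pattern && n + 1 < outlen) {
        size_t room = outlen - n;
        int w = 0;

        if (*pattern != '%') {
            out[n++] = *pattern++;
            continue;
        }
        pattern++;                      /* skip the '%' itself */
        switch (*pattern) {
        case '\0': goto done;           /* trailing '%' is dropped */
        case '%': w = snprintf(out + n, room, "%%"); break;
        case 'p': w = snprintf(out + n, room, "%d", ci->pid); break;
        case 'u': w = snprintf(out + n, room, "%d", ci->uid); break;
        case 'g': w = snprintf(out + n, room, "%d", ci->gid); break;
        case 's': w = snprintf(out + n, room, "%d", ci->sig); break;
        case 't': w = snprintf(out + n, room, "%ld", ci->when); break;
        case 'h': w = snprintf(out + n, room, "%s", ci->host); break;
        case 'e': w = snprintf(out + n, room, "%s", ci->exe); break;
        default: break;                 /* unknown: both characters dropped */
        }
        pattern++;
        if (w < 0)
            break;
        /* snprintf reports the untruncated length; clamp to what fit. */
        n += ((size_t)w < room) ? (size_t)w : room - 1;
    }
done:
    out[n] = '\0';
    return n;
}
```

With a pattern such as "core.%e.%p" this yields names like core.linux.1234, while a '|' prefix bypasses file naming entirely and hands the dump to a helper on its standard input.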
Re: [Feature Request?] Inline compression of process core dumps
On Thu, Apr 12, 2007 at 05:28:45PM +0100, Alan Cox wrote:
> > There are userspace solutions to this problem: allowing the
> > uncompressed core dump to spin out to disk and then coming in afterwards
> > and doing the compression, or maybe even a compressed filesystem where
> > the core dumps land, but I just thought I'd throw this out there since
> > it seems it would be a useful feature :)
>
> Indeed. So useful that in current kernels you can set the core dump path
> to be
>
> "|application"
>
> and it will call out to the helper. Take care with the helper as it will
> get run for setuid apps, roots core dumps etc.

The current functionality doesn't parse command line arguments into argv,
nor provide the % variable replacements in the environment, so it is
somewhat less useful than it could be. I suppose that parsing the
command line introduces potential problems with file names that include
whitespace. It would probably be better to split the command line on
whitespace, then replace variables in the argv[]?

fs/exec.c:

        if (corename[0] == '|') {
                /* SIGPIPE can happen, but it's just never processed */
                if(call_usermodehelper_pipe(corename+1, NULL, NULL, &file)) {
                        printk(KERN_INFO "Core dump to %s pipe failed\n",
                               corename);
                        goto fail_unlock;
                }
                ispipe = 1;
        } else
                file = filp_open(corename,
                                 O_CREAT | 2 | O_NOFOLLOW | O_LARGEFILE | flag,
                                 0600);

Regards,

   Bill Rugolsky
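The ordering suggested in this message (split on whitespace first, substitute afterwards) can be sketched in userspace as follows. Everything here is an assumption made for illustration: `split_then_expand()`, `MAX_ARGS`, `ARG_LEN`, and the helper path in the usage note are hypothetical, and only the "%e" specifier is handled to keep the sketch short.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_ARGS 8      /* illustrative limits, not kernel values */
#define ARG_LEN  128

/*
 * Split `cmd` on whitespace into at most MAX_ARGS arguments, and only
 * then replace "%e" inside each argument with `exe`. Because the
 * substitution happens after splitting, whitespace inside `exe`
 * cannot create extra arguments. Returns argc.
 */
int split_then_expand(const char *cmd, const char *exe,
                      char argv[MAX_ARGS][ARG_LEN])
{
    int argc = 0;

    while (argc < MAX_ARGS) {
        size_t n = 0;

        while (isspace((unsigned char)*cmd))
            cmd++;
        if (*cmd == '\0')
            break;
        while (*cmd && !isspace((unsigned char)*cmd) && n + 1 < ARG_LEN) {
            if (cmd[0] == '%' && cmd[1] == 'e') {
                int w = snprintf(argv[argc] + n, ARG_LEN - n, "%s", exe);
                /* clamp to what actually fit in the argument buffer */
                n += ((size_t)w < ARG_LEN - n) ? (size_t)w : ARG_LEN - n - 1;
                cmd += 2;
            } else {
                argv[argc][n++] = *cmd++;
            }
        }
        argv[argc][n] = '\0';
        /* discard the remainder of an over-long token */
        while (*cmd && !isspace((unsigned char)*cmd))
            cmd++;
        argc++;
    }
    return argc;
}
```

For a command line like "/usr/local/bin/corezip --name %e.core" (a made-up helper) and an executable called "my prog", the result is three arguments, the last being "my prog.core", rather than four.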
Re: RFD: Kernel release numbering
On Thu, Mar 03, 2005 at 02:33:58PM -0500, Dave Jones wrote:
> If you accelerate the merging process, you're lowering the review process.
> The only answer to get regressions fixed up as quickly as possible
> (because prevention is nigh on impossible at the current rate, so
> any faster is just absurd), would be more regular releases, so that
> they got spotted quicker.

Right. My point, and I think Jeff's, was that being extra careful for
the 'even' releases and waiting around N days to see whether someone
will finally test the -rc and see that it is broken impedes the
whole process. Getting more people to test is not necessarily a function
of the wait duration.

> > Dave has been building "unstable" bleeding-edge Fedora kernels from
> > 2.6.x-rcM-bkN, as well as "test" kernels for Fedora updates;
>
> Actually only rawhide (FC4-to-be) has been getting -rc-bk patches.

When Arjan started testing 2.6, he set up a repo, and FC1 users at the
time could just pull rpms from that repo. I and many others did it.

Currently I'm rolling my own, so I haven't been consistently testing
your Fedora kernels, rawhide or update, but I occasionally do the
"grab the SRPM from Rawhide and rebuild it on FC3"-thing for my laptop.
I think that we should institutionalize that.

> The FC2/FC3 updates have been release versions only, with -ac patches.
> (and also some additional patches backing out bits of the -ac)

I've watched you periodically announce "hey, I'm doing an update for
FC3/FC2, please test" on the mail list, and a handful of people go test.
If we could convince many of the less risk-averse but lazy users
to grab kernels automatically from updates/3/testing/ or
updates/3/unstable/ as part of "yum update", and have a way to manage
the plethora of (even daily) kernel updates by removing old unused kernels,
then we'd only have to convince them *once* to set up their YUM repos,
and then get them to poweroff or reboot [or use a Xen domain] occasionally.
:-)

> This is part of the problem with rebasing the existing releases to
> new kernels. It's shoving a largely untested codebase into a release
> that was never tested in that combination. It's expected that some
> stuff will break, but the volume of breakage is increasing as time goes on.
> Even if _I_ stopped rebasing the Fedora kernel, some of our users
> will still want to build and run the latest kernel.org kernel on their
> FC2 boxes. We shouldn't be expecting them to have to rebuild half of
> their userspace just because we've been sloppy and broke interfaces.

Yes, this is miserable, and exacerbated by the inability of almost
all distros to deal with multiple installed versions of packages,
or easily roll back changes, which crimps my argument w.r.t. wider
testing, since the typical user willing to test while otherwise doing
useful work wants to be able to roll back easily if there is a serious
problem.

The LVM packaging situation between 2.4 and 2.6 illustrated the
problem well; one couldn't boot back into 2.4 unless LVM1 and LVM2
userland could coexist; I spent time rolling packages back then to do
just that. In any case, kernel helper packages (udev, device-mapper,
iptables, etc.) need to be added to a "kernel+related" package repo.

Users could be encouraged to do more testing if they are provided
with a simple mechanism to snapshot, update, test, and then either
keep the changes or roll back. I do this in UML with the COW tools.
Currently LVM2 has writable snapshots, but no easy way (at system boot)
to reintegrate the changes into the base from a snapshot, or "fallback"
to a snapshot.

Still, using Xen/UML/QEMU, it is not difficult to take advantage of
copy-on-write to update the kernel and other packages, then boot the
image to start a shell or web/ftp/mail/... daemon(s). At least that
would exercise the non-hardware-related, (and for now, non-SMP) parts
of the kernel.

   Bill Rugolsky
Re: RFD: Kernel release numbering
On Thu, Mar 03, 2005 at 02:15:06AM -0800, Andrew Morton wrote:
> If we were to get serious with maintenance of 2.6.x.y streams then that is
> a 100% productisation activity. It's a very useful activity, and there is
> demand for it. But it is a very different activity. And a lot of this
> discussion has been getting these two activities confused.

IMHO, Jeff Garzik has made two very useful points in this thread:

   1. The number of changesets flowing towards the Linus kernel
      is accelerating, so the kernel developers should be trying to
      accelerate the merging process, not introducing delays.
      Having an extended -rc period that stuffs up merging just
      creates back pressure and causes changesets that could be getting
      reviewed, merged, and booted somewhere to instead lie dormant.

   2. No matter what one calls it, -rc1, ., or just 2.6.X these days,
      intelligent consumers know a "dot-zero" release when they see one.
      [I've had experience of several boneheaded corporate policies
      dictating an unpatched kernel.org kernel, but they are
      uninteresting users.] The class of users that want to use the
      kernel in production are going to wait days to weeks, no matter
      what. The trick is in encouraging everyone else to overcome
      inertia and test new releases.

As part of a solution to the "production kernel" problem, Jeff suggested
a 2.6.x.y tree that gets pulled to 2.6.x+1. Neil Brown made a similar
point:

   For the kernel, I am the "distribution" for my employer and I choose
   which kernel to use, with which patches.

   I really don't want to hunt around for all those stablisation
   patches, or sift through the patches in 2.6.X+1-pre to find things
   to apply to 2.6.X. I would be really happy there was a central
   place where maintainers can put suitably reviewed "important bug
   fix"es for recent releases, and from where kernel maintainers for
   any distribution (official or not) could pull them.

I'm in the same boat with Neil.

Determined to stay reasonably close to mainline, I started in the
2.6.9-bk series to try to nail down a stable production kernel. I spent
about two months reading lkml and bk-commits-head, picking through
-mm for patches that might be important for my workloads (e.g., vmtrunc),
and spending my days with "quilt", merging up a new -bk kernel every
few days, backing out "dangerous changes", and retesting. At 2.6.10,
I stopped revving up and started to just merge fixes from 2.6.11-bk.
I'm sure Neil and I are not alone.

I perceive four groups of users of kernel.org kernels, with differing
requirements:

   1. Developers. For them, the Linus kernel is a synchronization
      point for merging, as well as their personal test environment.

   2. "Casual" end-users who like to build their own kernels, and for
      whom a kernel oops, crash, or driver failure is not a big hassle;
      they just reboot into their previous kernel. They are content
      if a new kernel doesn't corrupt their data.

   3. "Production" end-users, who need a kernel that is going to run
      stably, usually on many servers, indefinitely [until a bug or
      desired feature forces an upgrade/reboot]. Rolling out a new
      kernel is a hassle, and is usually done to fix a serious kernel
      bug or driver problem.

   4. Vendors, who need a long period of stabilization and testing,
      as well as a (vendor-internal) mechanism for determining what
      features, drivers, etc. to support.

As individuals, many of us live in multiple categories, e.g., I'm a (3)
at work, and a mix of (2) [laptop] and (3) [file server] at home.

Greg KH complained:

   Bug fixes for what? Kernel api changes that fix bugs? That's
   pretty big. Some driver fixes, but not others? Driver fixes that
   are in the middle of bigger, subsystem reworks as a series of
   patches? All of this currently happens today in the main tree in
   a semi-cohesive manner. To try to split it out is a very difficult
   task.

Opinions will differ, but I think things are a lot more clear-cut than
Greg allows.

I certainly don't expect to download, build, and deploy a kernel devoid
of patches without expecting at least a few problems. It's the incredible
duplication of effort to sort through thousands of changesets in order to
cull dozens to hundreds, with the result that everyone is running a subtly
different kernel core. And most of us are far less qualified than
subsystem maintainers to evaluate the risk of individual changesets.

Folks in categories (3) and (4) care very deeply about subtle corruption
[like the recent pty lost bytes], even if rare, as well as easily
triggerable oopses, races, deadlock, livelock, resource leaks, massive
performance regressions, and serious breakage in the (rapidly evolving)
networking stack. These belong in 2.6.x.y. API changes do not, unless
they are required to fix one of the above.

Sure, this is going to c
Re: i8042 access timings
On Sun, Feb 13, 2005 at 09:22:46AM +0100, Vojtech Pavlik wrote:
> And I suppose it was running just fine without the patch as well?

Correct.

> The question was whether the patch helps, or whether it is not needed.

If you look again at the patch I posted, it only borrowed a few lines
of the patch from Dmitry that started this thread; I eliminated Alan's
recent udelay(50) addition, reduced the loop delay, and added debug
printks to the *_wait routines to determine whether the loop is ever
taken. At least so far, those debugging statements have produced no
output. I'll use the machine a bit and report back if I trigger
anything.

Regards,

   Bill Rugolsky
Re: i8042 access timings
On Thu, Jan 27, 2005 at 05:37:14PM +0100, Vojtech Pavlik wrote:
> On Thu, Jan 27, 2005 at 11:34:31AM -0500, Bill Rugolsky Jr. wrote:
> > I have a Digital HiNote collecting dust which had this keyboard problem
> > with the RH 6.x 2.2.x boot disk kernels, IIRC. I can test if you like,
> > but I won't be able to get to it until the weekend.
>
> That'd be very nice indeed.

Sorry for the long delay in replying; the HiNote needed some effort to
get the thing up and running again. [Various bits of hardware are
broken; the power switch, floppy, and CD-ROM are busted/flakey.]

I've now got Fedora Core 3 running on it. I was pleasantly surprised
that the 2.6.9 i82365 PCMCIA module loads, and the internal Xircom CEM56
network/modem works. [Broken with 2.6.10+ though; the fix is probably
trivial.]

I wasn't sure exactly what to test. I applied the following patch to
2.6.11-rc3-bk9, and booted with i8042_debug=1. So far, it works as
expected, and there is nothing of interest in the kernel log.
[Also worked with the FC3 2.6.9 kernel and this patch+DEBUG.]

Now that things are up and running, I will apply any patches that
you would like tested.

   Bill Rugolsky

--- linux/drivers/input/serio/i8042.c.udelay-backout    2005-02-12 16:22:48.647851998 -0500
+++ linux/drivers/input/serio/i8042.c   2005-02-12 16:23:39.963997609 -0500
@@ -131,9 +131,10 @@
 {
        int i = 0;
        while ((~i8042_read_status() & I8042_STR_OBF) && (i < I8042_CTL_TIMEOUT)) {
-               udelay(50);
+               udelay(I8042_STR_DELAY);
                i++;
        }
+       if (i > 0) dbg("i8042_wait_read: looped %d times",i);
        return -(i == I8042_CTL_TIMEOUT);
 }

@@ -141,9 +142,10 @@
 {
        int i = 0;
        while ((i8042_read_status() & I8042_STR_IBF) && (i < I8042_CTL_TIMEOUT)) {
-               udelay(50);
+               udelay(I8042_STR_DELAY);
                i++;
        }
+       if (i > 0) dbg("i8042_wait_write: looped %d times",i);
        return -(i == I8042_CTL_TIMEOUT);
 }

@@ -161,7 +163,6 @@
        spin_lock_irqsave(&i8042_lock, flags);

        while ((i8042_read_status() & I8042_STR_OBF) && (i++ < I8042_BUFFER_SIZE)) {
-               udelay(50);
                data = i8042_read_data();
                dbg("%02x <- i8042 (flush, %s)", data,
                        i8042_read_status() & I8042_STR_AUXDATA ? "aux" : "kbd");
--- linux/drivers/input/serio/i8042.h.udelay-backout    2005-02-12 16:22:48.647851998 -0500
+++ linux/drivers/input/serio/i8042.h   2005-02-12 16:23:39.964997456 -0500
@@ -30,12 +30,18 @@
 #endif

 /*
- * This is in 50us units, the time we wait for the i8042 to react. This
+ * The time (in us) that we wait for the i8042 to react.
+ */
+
+#define I8042_STR_DELAY        20
+
+/*
+ * This is in units of the time we wait for the i8042 to react. This
  * has to be long enough for the i8042 itself to timeout on sending a byte
  * to a non-existent mouse.
  */

-#define I8042_CTL_TIMEOUT 1
+#define I8042_CTL_TIMEOUT 25000

 /*
  * When the device isn't opened and it's interrupts aren't used, we poll it at
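The wait loops touched by the patch above follow a simple bounded-polling pattern, which can be tried outside the kernel with the sketch below. The two constants are the values from the patch; `wait_ready()`, `worst_case_wait_us()`, and the stub callbacks are hypothetical names, with the `delay` callback standing in for udelay().

```c
#include <stddef.h>

/* Values taken from the patch above. */
#define I8042_STR_DELAY   20      /* microseconds per poll */
#define I8042_CTL_TIMEOUT 25000   /* maximum number of polls */

/*
 * Poll `ready` until it returns nonzero or the poll budget is spent.
 * Returns 0 on success and -1 on timeout, like the *_wait routines,
 * and reports the number of loop iterations through `loops`.
 */
int wait_ready(int (*ready)(void *), void (*delay)(unsigned usec),
               void *ctx, int *loops)
{
    int i = 0;

    while (!ready(ctx) && i < I8042_CTL_TIMEOUT) {
        delay(I8042_STR_DELAY);
        i++;
    }
    if (loops)
        *loops = i;
    return -(i == I8042_CTL_TIMEOUT);
}

/* Worst-case wait: 25000 polls * 20 us = 500000 us, i.e. 0.5 s. */
unsigned long worst_case_wait_us(void)
{
    return (unsigned long)I8042_CTL_TIMEOUT * I8042_STR_DELAY;
}

/* Example stubs: a device that becomes ready after *ctx polls. */
static int ready_after(void *ctx)
{
    int *left = ctx;
    return (*left)-- <= 0;
}

static void no_delay(unsigned usec)
{
    (void)usec;     /* a real caller would sleep or busy-wait here */
}
```

The `loops` output plays the role of the added dbg() lines: a nonzero count means the controller was not ready on the first status read.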
Re: i8042 access timings
On Thu, Jan 27, 2005 at 03:14:36PM +, Alan Cox wrote:
> Myths are not really involved here. The IBM PC hardware specifications
> are fairly well defined and the various bits of "we glued a 2Mhz part
> onto the bus" stuff is all well documented. Nowdays its more complex
> because most kbc's aren't standalone low end microcontrollers but are
> chipset integrated cells or even software SMM emulations.
>
> The real test is to fish out something like an old Digital Hi-note
> laptop or an early 486 board with seperate kbc and try it.

I have a Digital HiNote collecting dust which had this keyboard problem
with the RH 6.x 2.2.x boot disk kernels, IIRC. I can test if you like,
but I won't be able to get to it until the weekend.

   Bill Rugolsky
Re: [patch, 2.6.11-rc2] sched: /proc/sys/kernel/rt_cpu_limit tunable
On Tue, Jan 25, 2005 at 02:03:02PM -0800, Chris Wright wrote:
> * Ingo Molnar ([EMAIL PROTECTED]) wrote:
> > did that thread go into technical details? There are some rlimit users
> > that might not be prepared to see the rlimit change under them. The
> > RT_CPU_RATIO one ought to be safe, but generally i'm not so sure.
>
> Not really. I mentioned the above, as well as the security concern.
> Right now, at least the task_setrlimit hook would have to change to take
> into account the task. And I never convinced myself that async changes
> would be safe for each rlimit.

As was mentioned, but not discussed, in the /proc/<pid>/rlimit thread,
it is not difficult to envision conditions where setrlimit() on another
process could make exploiting an application bug much easier, by, e.g.,
setting the number of open files ridiculously low. So IMHO, it ought to
require privileges similar to ptrace() to change some, if not all, of
the rlimits.

   Bill Rugolsky
Re: [RFC][PATCH] /proc/<pid>/rlimit
On Thu, Jan 20, 2005 at 03:43:58PM +0100, Pavel Machek wrote:
> It would be nice if you could make it "value-per-file". That way,
> it could become writable in future. If "max nice level" ever becomes rlimit,
> this would be very usefull.

Agreed, though write support presents difficulties.

My principal concern is that we don't want users changing resource
limits of privileged processes. If we want an ordinary user to be
allowed to change limits, the rules would have to be similar to those
allowed for ptrace(), e.g., non-setuid processes, etc. [With ptrace(),
one can of course attach to the process and invoke the setrlimit()
syscall directly].

Additionally, sys_setrlimit() has an LSM hook:

   security_task_setrlimit(unsigned int resource, struct rlimit *)

One would need to take account of changing the limit from a different
context. It's a bit of a mess, and outside of the standard API; that's
why I didn't bother.

Anyway, for Jan, here's my incomplete and unmergeable cut-n-paste hack
to implement write on top of my previous patch. Format is as was
suggested by Jan:

   <%u|unlimited> <%u|unlimited>

E.g.,

   echo memlock 65536 65536 > /proc/1/rlimit

Writing is limited to root (i.e. CAP_SYS_PTRACE), though see
fs/proc/base.c:may_ptrace_attach() for an idea of how to change that.
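For readers who want to experiment with the line format used in the echo example above (resource name, soft limit, hard limit), here is a minimal userspace parsing sketch. It is independent of the kernel patch in this message: `parse_limit()` and `parse_rlimit_line()` are hypothetical names, and the 31-character field width is an arbitrary choice for the illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

/* Parse one limit value: a decimal number or the word "unlimited". */
static int parse_limit(const char *s, rlim_t *out)
{
    char *end;
    unsigned long long v;

    if (strcmp(s, "unlimited") == 0) {
        *out = RLIM_INFINITY;
        return 0;
    }
    if (*s < '0' || *s > '9')       /* reject signs and empty input */
        return -1;
    errno = 0;
    v = strtoull(s, &end, 10);
    if (errno || *end != '\0')
        return -1;
    *out = (rlim_t)v;
    return 0;
}

/*
 * Parse "<name> <soft> <hard>". Returns 0 on success, -1 if the line
 * is malformed or the soft limit exceeds the hard limit (the same
 * ordering check that setrlimit() applies).
 */
int parse_rlimit_line(const char *line, char name[32], struct rlimit *rl)
{
    char soft[32], hard[32];

    if (sscanf(line, "%31s %31s %31s", name, soft, hard) != 3)
        return -1;
    if (parse_limit(soft, &rl->rlim_cur) || parse_limit(hard, &rl->rlim_max))
        return -1;
    if (rl->rlim_cur > rl->rlim_max)
        return -1;
    return 0;
}
```

A caller could feed it "memlock 65536 65536" and pass the result to setrlimit() for its own process; changing another process's limits is exactly the part that needs the privilege checks discussed in this thread.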
-Bill

--- linux-2.6.11-rc1-bk6/fs/proc/base.c.proc-pid-rlimit-write
+++ linux-2.6.11-rc1-bk6/fs/proc/base.c
@@ -23,6 +23,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -127,7 +128,7 @@
        E(PROC_TGID_ROOT,      "root",    S_IFLNK|S_IRWXUGO),
        E(PROC_TGID_EXE,       "exe",     S_IFLNK|S_IRWXUGO),
        E(PROC_TGID_MOUNTS,    "mounts",  S_IFREG|S_IRUGO),
-       E(PROC_TGID_RLIMIT,    "rlimit",  S_IFREG|S_IRUGO),
+       E(PROC_TGID_RLIMIT,    "rlimit",  S_IFREG|S_IRUGO|S_IWUSR),
 #ifdef CONFIG_SECURITY
        E(PROC_TGID_ATTR,      "attr",    S_IFDIR|S_IRUGO|S_IXUGO),
 #endif
@@ -153,7 +154,7 @@
        E(PROC_TID_ROOT,       "root",    S_IFLNK|S_IRWXUGO),
        E(PROC_TID_EXE,        "exe",     S_IFLNK|S_IRWXUGO),
        E(PROC_TID_MOUNTS,     "mounts",  S_IFREG|S_IRUGO),
-       E(PROC_TID_RLIMIT,     "rlimit",  S_IFREG|S_IRUGO),
+       E(PROC_TID_RLIMIT,     "rlimit",  S_IFREG|S_IRUGO|S_IWUSR),
 #ifdef CONFIG_SECURITY
        E(PROC_TID_ATTR,       "attr",    S_IFDIR|S_IRUGO|S_IXUGO),
 #endif
@@ -595,9 +596,99 @@
        return single_release(inode, file);
 }

+static inline char *skip_ws(char *s)
+{
+       while (isspace(*s))
+               s++;
+       return s;
+}
+
+static inline char *find_ws(char *s)
+{
+       while (!isspace(*s) && *s != '\0')
+               s++;
+       return s;
+}
+
+#define MAX_RLIMIT_WRITE 79
+static ssize_t rlimit_write(struct file * file, const char * buf,
+                            size_t count, loff_t *ppos)
+{
+       struct task_struct *task = proc_task(file->f_dentry->d_inode);
+       struct rlimit new_rlim, *old_rlim;
+       unsigned int i;
+       char *s, *t, kbuf[MAX_RLIMIT_WRITE+1];
+
+       /* changing resources limits can crash or subvert a process */
+       if (!capable(CAP_SYS_PTRACE) || security_ptrace(current,task))
+               return -ESRCH;
+
+        if (count > MAX_RLIMIT_WRITE)
+                return -EINVAL;
+        if (copy_from_user(&kbuf, buf, count))
+                return -EFAULT;
+        kbuf[MAX_RLIMIT_WRITE] = '\0';
+
+       /* parse the resource id */
+       s = skip_ws(kbuf);
+       t = find_ws(s);
+       if (*t == '\0')
+               return -EINVAL;
+       *t++ = '\0';
+       for (i = 0 ; i < RLIM_NLIMITS ; i++)
+               if (rlim_name[i] && !strcmp(s,rlim_name[i]))
+                       break;
+       if (i >= RLIM_NLIMITS) {
+               if (!strncmp(s, "rlimit-",7))
+                       s += 7;
+               if (sscanf(s, "%u", &i) != 1 || i >= RLIM_NLIMITS)
+                       return -EINVAL;
+       }
+
+       /* parse the soft limit */
+       s = skip_ws(t);
+       t = find_ws(s);
+       if (*t == '\0')
+               return -EINVAL;
+       *t++ = '\0';
+       if (!strcmp(s, "unlimited"))
+               new_rlim.rlim_cur = RLIM_INFINITY;
+       else if (sscanf(s, "%lu", &new_rlim.rlim_cur) != 1)
+               return -EINVAL;
+
+       /* parse the hard limit */
+       s = skip_ws(t);
+       t = find_ws(s);
+       *t = '\0';
+       if (!strcmp(s, "unlimited"))
+               new_rlim.rlim_max = RLIM_INFINITY;
+       else if (sscanf(s, "%lu", &new_rlim.rlim_max) != 1)
+               return -EINVAL;
+
+       /* validate the values; copied from sys_setrlimit() */
+       if (new_rlim.rlim_cur > new_rlim.rlim_max)
+               return -EINVAL;
+        old_rlim = task->signal->rlim + i;
+       if ((new_rlim.rlim_max > old_rlim->rlim_max) &&
+           !capable(CAP_SYS_RESOURCE))
+               return -EPERM;
+       if (i == RLIMIT_NOFILE && new_rlim.rlim_max > NR_OPEN)
+               return -EPERM;
+
+
Re: [RFC][PATCH] /proc/<pid>/rlimit
On Wed, Jan 19, 2005 at 11:38:03AM -0800, Chris Wright wrote:
> * Jan Knutar ([EMAIL PROTECTED]) wrote:
> > A "cool feature" would be if you could do
> > echo nofile 8192 8192 >/proc/`pidof thatserverproess`/rlimit
> > :-)
>
> This is security sensitive, and is currently only expected to be changed
> by current.

Sure, I had thought of implementing it, paused to consider the security
implications, and then punted.

Chris, on the other point that you made regarding UGO read access to
"rlimit", the same is true of "maps" (at least sans SELinux policy),
so I don't see an issue. Certainly the map information is more security
sensitive.

Regards,

   -Bill
Re: [RFC][PATCH] consolidate arch specific resource.h headers
On Tue, Jan 18, 2005 at 04:10:56PM -0800, Chris Wright wrote:
> +#define INIT_RLIMITS                                  \
> +{                                                     \
> +      { RLIM_INFINITY, RLIM_INFINITY },               \
> +      { RLIM_INFINITY, RLIM_INFINITY },               \
> +      { RLIM_INFINITY, RLIM_INFINITY },               \
> +      {      _STK_LIM,  _STK_LIM_MAX },               \
> +      {             0, RLIM_INFINITY },               \
> +      { RLIM_INFINITY, RLIM_INFINITY },               \
> +      {             0,             0 },               \
> +      {      INR_OPEN,      INR_OPEN },               \
> +      {   MLOCK_LIMIT,   MLOCK_LIMIT },               \
> +      { RLIM_INFINITY, RLIM_INFINITY },               \
> +      { RLIM_INFINITY, RLIM_INFINITY },               \
> +      { MAX_SIGPENDING, MAX_SIGPENDING },             \
> +      { MQ_BYTES_MAX, MQ_BYTES_MAX },                 \
> +}

While you are rooting around in there, perhaps this block should be
converted to C99 initializer syntax, to avoid problems if arch-specific
changes are later introduced?

Regards,

   Bill Rugolsky
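As an illustration of the C99 suggestion, the sketch below shows designated initializers on a toy table. All names (`DEMO_*`, `struct demo_rlimit`, `demo_init_rlimits`) and values are stand-ins invented for the example; the kernel's real constants and resource numbering are arch-specific and are not reproduced here.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the resource indices and constants. */
enum { DEMO_CPU, DEMO_STACK, DEMO_CORE, DEMO_NLIMITS };

#define DEMO_INFINITY (~0UL)
#define DEMO_STK_LIM  (8UL * 1024 * 1024)

struct demo_rlimit {
    unsigned long cur, max;
};

/*
 * Each entry is tied to its index by name. If an architecture
 * renumbers or reorders the resources, the values move with the
 * names instead of silently landing in the wrong slot, which is the
 * failure mode of a purely positional initializer.
 */
static const struct demo_rlimit demo_init_rlimits[DEMO_NLIMITS] = {
    [DEMO_CPU]   = { DEMO_INFINITY, DEMO_INFINITY },
    [DEMO_STACK] = { DEMO_STK_LIM,  DEMO_INFINITY },
    [DEMO_CORE]  = { 0,             DEMO_INFINITY },
};
```

Any index left out of such an initializer is zero-filled, so a missing entry shows up as a { 0, 0 } limit rather than as a shifted table.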
[RFC][PATCH] /proc/<pid>/rlimit
This patch against 2.6.11-rc1-bk6 adds /proc/<pid>/rlimit to export
per-process resource limit settings. It was written to help analyze
daemon core dump size settings, but may be more generally useful.
Tested on 2.6.10. Sample output:

[EMAIL PROTECTED] ~ # cat /proc/$$/rlimit
cpu unlimited unlimited
fsize unlimited unlimited
data unlimited unlimited
stack 8388608 unlimited
core 0 unlimited
rss unlimited unlimited
nproc 111 111
nofile 1024 1024
memlock 32768 32768
as unlimited unlimited
locks unlimited unlimited
sigpending 1024 1024
msgqueue 819200 819200

Feedback welcome.

Signed-off-by: Bill Rugolsky <[EMAIL PROTECTED]>

--- linux-2.6.11-rc1-bk6/fs/proc/base.c.rlimit  2005-01-18 15:01:10.120960254 -0500
+++ linux-2.6.11-rc1-bk6/fs/proc/base.c 2005-01-18 15:07:28.102661832 -0500
@@ -32,6 +32,7 @@
 #include
 #include
 #include
+#include
 #include "internal.h"

 /*
@@ -61,6 +62,7 @@
        PROC_TGID_MAPS,
        PROC_TGID_MOUNTS,
        PROC_TGID_WCHAN,
+       PROC_TGID_RLIMIT,
 #ifdef CONFIG_SCHEDSTATS
        PROC_TGID_SCHEDSTAT,
 #endif
@@ -87,6 +89,7 @@
        PROC_TID_MAPS,
        PROC_TID_MOUNTS,
        PROC_TID_WCHAN,
+       PROC_TID_RLIMIT,
 #ifdef CONFIG_SCHEDSTATS
        PROC_TID_SCHEDSTAT,
 #endif
@@ -124,6 +127,7 @@
        E(PROC_TGID_ROOT,      "root",    S_IFLNK|S_IRWXUGO),
        E(PROC_TGID_EXE,       "exe",     S_IFLNK|S_IRWXUGO),
        E(PROC_TGID_MOUNTS,    "mounts",  S_IFREG|S_IRUGO),
+       E(PROC_TGID_RLIMIT,    "rlimit",  S_IFREG|S_IRUGO),
 #ifdef CONFIG_SECURITY
        E(PROC_TGID_ATTR,      "attr",    S_IFDIR|S_IRUGO|S_IXUGO),
 #endif
@@ -149,6 +153,7 @@
        E(PROC_TID_ROOT,       "root",    S_IFLNK|S_IRWXUGO),
        E(PROC_TID_EXE,        "exe",     S_IFLNK|S_IRWXUGO),
        E(PROC_TID_MOUNTS,     "mounts",  S_IFREG|S_IRUGO),
+       E(PROC_TID_RLIMIT,     "rlimit",  S_IFREG|S_IRUGO),
 #ifdef CONFIG_SECURITY
        E(PROC_TID_ATTR,       "attr",    S_IFDIR|S_IRUGO|S_IXUGO),
 #endif
@@ -496,6 +501,107 @@
        .release        = mounts_release,
 };

+const char * const rlim_name[RLIM_NLIMITS] = {
+#ifdef RLIMIT_CPU
+       [RLIMIT_CPU] = "cpu",
+#endif
+#ifdef RLIMIT_FSIZE
+       [RLIMIT_FSIZE] = "fsize",
+#endif
+#ifdef RLIMIT_DATA
+       [RLIMIT_DATA] = "data",
+#endif
+#ifdef RLIMIT_STACK
+       [RLIMIT_STACK] = "stack",
+#endif
+#ifdef RLIMIT_CORE
+       [RLIMIT_CORE] = "core",
+#endif
+#ifdef RLIMIT_RSS
+       [RLIMIT_RSS] = "rss",
+#endif
+#ifdef RLIMIT_NPROC
+       [RLIMIT_NPROC] = "nproc",
+#endif
+#ifdef RLIMIT_NOFILE
+       [RLIMIT_NOFILE] = "nofile",
+#endif
+#ifdef RLIMIT_MEMLOCK
+       [RLIMIT_MEMLOCK] = "memlock",
+#endif
+#ifdef RLIMIT_AS
+       [RLIMIT_AS] = "as",
+#endif
+#ifdef RLIMIT_LOCKS
+       [RLIMIT_LOCKS] = "locks",
+#endif
+#ifdef RLIMIT_SIGPENDING
+       [RLIMIT_SIGPENDING] = "sigpending",
+#endif
+#ifdef RLIMIT_MSGQUEUE
+       [RLIMIT_MSGQUEUE] = "msgqueue",
+#endif
+};
+
+static int rlimit_show(struct seq_file *s, void *v)
+{
+       struct rlimit *rlim = (struct rlimit *) s->private;
+       int i;
+
+       for (i = 0 ; i < RLIM_NLIMITS ; i++) {
+               if (rlim_name[i] != NULL)
+                       seq_puts(s, rlim_name[i]);
+               else
+                       seq_printf(s, "rlimit-%d", i);
+
+               if (rlim[i].rlim_cur == RLIM_INFINITY)
+                       seq_puts(s, " unlimited");
+               else
+                       seq_printf(s, " %lu", (unsigned long)rlim[i].rlim_cur);
+
+               if (rlim[i].rlim_max == RLIM_INFINITY)
+                       seq_puts(s, " unlimited\n");
+               else
+                       seq_printf(s, " %lu\n", (unsigned long)rlim[i].rlim_max);
+       }
+       return 0;
+}
+
+static int rlimit_open(struct inode *inode, struct file *file)
+{
+       struct task_struct *task = proc_task(inode);
+       struct rlimit *rlim = kmalloc(RLIM_NLIMITS * sizeof (struct rlimit), GFP_KERNEL);
+       int ret;
+
+       if (!rlim)
+               return -ENOMEM;
+
+       task_lock(task->group_leader);
+       memcpy(rlim, task->signal->rlim, RLIM_NLIMITS * sizeof (struct rlimit));
+       task_unlock(task->group_leader);
+
+       ret = single_open(file, rlimit_show, rlim);
+
+       if (ret)
+               kfree(rlim);
+
+       return ret;
+}
+
+static int rlimit_release(struct inode *inode, struct file *file)
+{
+       struct seq_file *s = file->private_data;
+       kfree(s->private);
+       return single_release(inode, file);
+}
+
+static struct file_operations proc_rlimit_operations = {
+       .open           = rlimit_open,
+       .read           = seq_read,
+       .llseek         = seq_lseek,
+       .release        = rlimit_release,
+};
+
 #define PROC_BLOCK_SIZE        (3*1024)        /* 4K page size but our output routines use some slack for overruns */
 static ssize_t proc_info_read(struct fil
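On kernels without this patch, a process can at least print its own limits in the same layout as the sample output in the message above. The following is a userspace sketch built on the standard getrlimit() call; `format_rlimit()` and `print_own_limit()` are hypothetical helper names, and nothing here reads another process's limits.

```c
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

/*
 * Format one limit as "<name> <soft> <hard>", printing RLIM_INFINITY
 * as "unlimited". Returns the value of the final snprintf().
 */
int format_rlimit(char *buf, size_t len, const char *name,
                  const struct rlimit *rl)
{
    char soft[32], hard[32];

    if (rl->rlim_cur == RLIM_INFINITY)
        snprintf(soft, sizeof soft, "unlimited");
    else
        snprintf(soft, sizeof soft, "%llu",
                 (unsigned long long)rl->rlim_cur);

    if (rl->rlim_max == RLIM_INFINITY)
        snprintf(hard, sizeof hard, "unlimited");
    else
        snprintf(hard, sizeof hard, "%llu",
                 (unsigned long long)rl->rlim_max);

    return snprintf(buf, len, "%s %s %s", name, soft, hard);
}

/* Print one of the calling process's own limits; -1 on failure. */
int print_own_limit(const char *name, int resource)
{
    struct rlimit rl;
    char line[96];

    if (getrlimit(resource, &rl) != 0)
        return -1;
    format_rlimit(line, sizeof line, name, &rl);
    puts(line);
    return 0;
}
```

Calling print_own_limit("core", RLIMIT_CORE) from a daemon's startup path is one way to check the core dump size setting that motivated the patch.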
Re: Disturbing news..
On Wed, Mar 28, 2001 at 04:32:44PM +0200, Romano Giannetti wrote:
> But with the new VFS semantics, wouldn't be possible for a MUA to make a
> thing like the following:
>
> spawn a process with a private namespace. Here a minimun subset of the
> "real" tree (maybe all / except /dev) is mounted readonly. The private /tmp
> and /home/user are substituted by read-write directory that are in the
> "real" tree /home/user/mua/fakehome and /home/user/mua/faketmp. In this
> private namespace, run the "untrusted" binary.

Possible and desirable. You have to turn off access to all the other
dangerous namespaces though, like socket() and shmat(), and make sure
that nosuid and devices are handled properly. Done right, the only
thing that untrusted code can do is consume a little memory, CPU, and
disk, but that's why there are limits and a scheduler. :-)

One might even want to add back limited access to those other namespaces
by implementing a filesystem interface, ala Plan-9/Inferno.

Regards,

   Bill Rugolsky
Re: v2.4.0test9 NFSv3 server woes Linux-->Solaris
On Thu, Oct 05, 2000 at 04:58:39PM +0200, David Weinehall wrote:
> Using the NFSv3 server in the v2.4.0test9 kernel (I haven't tested any
> earlier v2.3.xx or v2.4.0testx kernels) I'm having problems with
> (for instance) compile glib.
>
> The setups I've tried are:
>
> wsize = rsize = 1kB
> Linux NFSv3 server --> Linux NFSv3 client (UDP mounted) -- WORKS
>
> wsize = rsize = 32kB
> Linux NFSv3 server --> Solaris NFSv3 client (UDP mounted) -- BROKEN!
> Linux NFSv3 server --> Solaris NFSv3 client (TCP mounted) -- BROKEN!
>
> wsize = rsize = 2kB
> Linux NFSv3 server --> Solaris NFSv3 client (UDP mounted) -- BROKEN!
> Linux NFSv3 server --> Solaris NFSv3 client (TCP mounted) -- BROKEN!

What do you mean by "BROKEN" ? Anything in syslog? tcpdumps?
Why not test wsize=rsize=1K for the Linux/Solaris combo?

Also, I was unaware that TCP server was supposed to work in 2.4.0-test9.
(It isn't in the 2.2.x patches.) Are you sure that Solaris is not
falling back to UDP mounts?

> Oh, by the way, is there ANY sane reason whatsoever behind the decision
> that the Linux NFSv3 client in the v2.2.18pre15 kernel defaults to wsize
> = rsize = 1kB and the NFSv3 client in v2.4.0test9 defaults to
> wsize = rsize = 4kB?! Every (?) other implementation of NFSv3 defaults
> to 32kB... At least when mounting Solaris NFSv3 server --> Linux NFSv3
> client, 32kB rsize & wsize works perfectly fine (at least for
> v2.2.18pre15, but I hope that v2.4.0test9 isn't worse in this regard.)

The conservatism is similar to that for IDE tuning: many folks have
broken hardware/drivers/networks, and sizes above 1K result in
fragmentation/potential packet loss/RPC timeouts/write errors/corruption.
So for 2.2.x, Alan has decreed 1K size. 2.4.0-test is a bit more
aggressive.

32K is fine, if you are using TCP. But I just went through a day-long
session with NetApp after they updated their default UDP size from 8K
to 32K. 32K UDP == 23 fragments. On a switched network that may be
fine, but with a hub and router, it is seeming death. Our Solaris
clients were generating numerous RPC timeouts on writes. After setting
the NetApp F720 server default back to 8K, the timeouts went away.

You may want to take this over to [EMAIL PROTECTED]; also, tcpdumps
are helpful.

Regards,

   Bill Rugolsky
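The "32K UDP == 23 fragments" figure can be checked with a short calculation. The sketch below assumes IPv4 without options over a 1500-byte Ethernet MTU; those parameters and the name `ip_fragments()` are assumptions made for this example, since the message does not state them, and the RPC/NFS header overhead on top of the data is ignored.

```c
#include <stddef.h>

#define IP_HEADER  20   /* IPv4 header without options (assumed) */
#define UDP_HEADER 8

/*
 * Number of IPv4 fragments needed for one UDP datagram carrying
 * `payload` bytes over a link with the given MTU.
 */
unsigned ip_fragments(size_t payload, size_t mtu)
{
    /* Fragment data must be a multiple of 8 bytes, except the last. */
    size_t per_frag = ((mtu - IP_HEADER) / 8) * 8;
    size_t total = payload + UDP_HEADER;

    return (unsigned)((total + per_frag - 1) / per_frag);
}
```

With a 1500-byte MTU each fragment carries 1480 bytes, so a 32768-byte write needs 23 fragments and an 8192-byte write needs 6. Losing any single fragment forces the whole datagram to be retransmitted after an RPC timeout, which is why the larger size hurts so much on a lossy path.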
Re: Linux 2.2.18pre1
On Fri, Sep 01, 2000 at 12:05:03PM +0100, Alan Cox wrote:
> People would appreciate lots of things but stability happens to come first.
> Thats why its primarily focussed on driver stuff not on revamping the
> internals. Right now Im not happy with the nfsv3 stuff I last looked at and
> it seems to still contain things Linus rejected a while back.

Alan, would you please describe in a few words what items are
problematic? Are the changes simply too extensive for your comfort?
Are there still areas of compatibility in userland that you want
ironed out? If the problem is more specific than that, is it Trond's
client code: SunRPC rewrite? caching? Credentials?

This is not a *me too!* request that you put it in; those of us with
environments heavily dependent on NFS have been patching for so long,
it hardly matters any more, especially now that the patches have been
consolidated by Trond and Dave Higgens, and 2.4 is shaping up. I'm just
curious as to what the perceived problems are, since you inevitably see
lots of reports of breakage that never find their way onto these lists.
Lately I just point everybody who asks me about NFS breakage to
H.J. Lu's rpms, and they go away happy.

Regards,

   Bill Rugolsky