Re: [PATCH] relayfs redux for 2.6.10: lean and mean
Greg KH wrote:
> Are they willing to trade off the performance of LTT to get this? I
> thought this was being touted as a "when you need to test" type of
> thing, not a "run it all the time" type of feature.

The problem is that you never know beforehand when you're going to get that weird glitch on your server, or how much time you're going to need to reproduce it. People who manage thousands of servers will want to be able to fire this off at will without having to reboot/recompile their kernel. What has to be done is to make the cost of the tracing infrastructure as minimal as possible when it is indeed built into the kernel (of course, if it's disabled it should cost the same thing as if it wasn't there to boot: nothing.) This, though, is a separate topic which is being addressed in other threads. Have a look at Werner's recent postings on the "[RFC] instrumentation" thread if you're interested.

> And a driver will never want to have both a relay channel, and a simple
> debug output at the same time? You are now requiring them to look for
> that data in two different points in the fs.
[snip]
> So, since you are proposing that relayfs be mounted all the time, where
> do you want to mount it at? I had to provide a "standard" location for
> debugfs for people to be happy with it, and the same issue comes up
> here.
>
> Also, why not export your relayfs ops so that someone using debugfs can
> create a relay channel in it, or in any other type of fs they might
> create?

Ok, there are a couple of things in there:

- First, I don't object to having the relayfs ops exported so that they can be used in conjunction with other filesystems, in addition to having relayfs live as an independent fs. So, as in the case above, we should be able to accommodate the device driver writer who wants to have all his files in the same fs. However, for the first case relayfs was built for, I think there is merit in having it live as a separate fs. Is this a good compromise for you?

- As for where relayfs should be mounted, this is a very good question. We've taken to the habit of having a /relayfs. If this is too problematic, I don't see any problem with /mnt/relayfs either. In either case, I have to admit frankly that I'm not familiar with the exact formal rules for introducing something like this. Of course I'm aware of the FHS and LSB, but let me know what you think is the best way to proceed here.

Thanks,

Karim
--
Author, Speaker, Developer, Consultant
Pushing Embedded and Real-Time Linux Systems Beyond the Limits
http://www.opersys.com || [EMAIL PROTECTED] || 1-866-677-4546

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
At 03:50 PM 1/23/2005 +1100, Con Kolivas wrote: Looks like the number of steps to convert a modern "standard setup" desktop to a low latency one on linux aren't that big after all :) Yup, modern must be the key. Even Ingo can't help my little ole PIII/500 with YMF-740C. Dang thing can't handle -p64 (alsa rejects that, causing jackd to become terminally upset), and it can't even handle 4 clients at SCHED_FIFO despite latest/greatest RT preempt kernel without xruns. Bugger... downloaded all that nifty sounding stuff for _nothing_ ;-) See attached highly trimmed log for humorous results. (and /tmp isn't the problem here) -Mike *** Started Sun Jan 23 09:04:41 CET 2005 *** Seconds to run(SECS) = 300 Number of clients (CLIENTS) = 4 Ports per client (PORTS) = 4 Frames per buffer (PERIOD) = 256 Number of runs(RUNS) = 1 Playback ports (PLYBK_PORTS) = 32 --- # uname -a Linux mikeg 2.6.11-rc2-RT-V0.7.36-02 #3 Sun Jan 23 08:20:26 CET 2005 i686 unknown --- # cat /proc/asound/version Advanced Linux Sound Architecture Driver Version 1.0.8 (Thu Jan 13 09:39:32 2005 UTC). 
--- # cat /proc/asound/cards 0 [YMF740C]: YMF740C - Yamaha DS-XG (YMF740C) Yamaha DS-XG (YMF740C) at 0xeb00, irq 9 --- # cat /proc/interrupts CPU0 0:2129651 XT-PIC timer 0/29651 1: 3914 XT-PIC i8042 0/3914 2: 0 XT-PIC cascade 0/0 4: 11 XT-PIC serial 1/11 8: 2 XT-PIC rtc 0/2 9: 116530 XT-PIC eth0, YMFPCI 0/16530 12: 27462 XT-PIC i8042 1/27462 14: 27748 XT-PIC ide0 0/27747 15: 12 XT-PIC ide1 0/11 NMI: 0 LOC: 0 ERR: 11 - - - - - - - - - - - - PID TID CLS RTPRIO NI PRI %CPU STAT COMMAND 2 2 FF 49 0 89 0.9 SW IRQ 0 660 660 FF 48 -5 88 0.0 SW< IRQ 8 673 673 FF 47 -5 87 0.0 SW< IRQ 12 687 687 FF 46 -5 86 0.0 SW< IRQ 6 697 697 FF 45 -5 85 0.0 SW< IRQ 14 699 699 FF 44 -5 84 0.0 SW< IRQ 15 717 717 FF 43 -5 83 0.0 SW< IRQ 1 824 824 FF 42 -5 82 0.0 SW< IRQ 4 825 825 FF 41 -5 81 0.0 SW< IRQ 3 900 900 FF 40 -5 80 0.2 SW< IRQ 9 09:04:42 cpu2 0 0 0 0 0 int 404 ctx 1211 bio 0 0 mem 17M 09:04:43 cpu0 0 0 0 0 0 int 1001 ctx 3007 bio 0 0 mem 17M 09:04:44 cpu0 0 0 0 0 0 int 1004 ctx 3027 bio 0 4096 mem 17M --- # jackd -Rv -P60 -p64 -dalsa -dhw:0 -r44100 -p256 -n2 -P getting driver descriptor from /usr/lib/jack/jack_alsa.so 09:04:45 cpu7 3 0 29 0 0 int 1104 ctx 3352 bio 32 0 mem 18M getting driver descriptor from /usr/lib/jack/jack_dummy.so getting driver descriptor from /usr/lib/jack/jack_oss.so jackd 0.99.48 Copyright 2001-2005 Paul Davis and others. jackd comes with ABSOLUTELY NO WARRANTY This is free software, and you are welcome to redistribute it under certain conditions; see the file COPYING for details JACK compiled with System V SHM support. server `default' registered loading driver .. registered builtin port type 32 bit float mono audio new client: alsa_pcm, id = 1 type 1 @ 0x8058658 fd = -1 apparent rate = 44100 creating alsa driver ... 
hw:0|-|256|2|44100|0|0|nomon|swmeter|-|32bit control device hw:0 configuring for 44100Hz, period = 256 frames, buffer = 2 periods Couldn't open hw:0 for 32bit samples trying 24bit instead Couldn't open hw:0 for 24bit samples trying 16bit instead nperiods = 3 for playback new buffer size 256 registered port alsa_pcm:playback_1, offset = 0 registered port alsa_pcm:playback_2, offset = 0 ++ jack_rechain_graph(): client alsa_pcm: internal client, execution_order=0. -- jack_rechain_graph() 20726 waiting for signals SNIP ZILLION LINES late driver wakeup: nframes to process = 512. late driver wakeup: nframes to process = 512. late driver wakeup: nframes to process = 512. late driver wakeup: nframes to process = 512. late driver wakeup: nframes to process = 512. late driver wakeup: nframes to process = 512. delay of 10197.000 usecs exceeds estimated spare time of 5755.000; restart ... delay of 10197.000 usecs exceeds estimated spare time of 5755.000; restart ... late driver wakeup: nframes to process = 512. late driver wakeup: nframes to process = 512. late driver wakeup: nframes to process = 512. late driver wakeup: nframes to process = 512. late driver wakeup: nframes to process = 512. late driver wakeup: nframes to process = 512. late driver wakeup: nframes to process = 512. delay of 10193.000 usecs exceeds estimated spare time of 5755.000; restart ... delay of 10193.000 usecs exceeds estimated spare time of 5755.000; restart ... late driver wakeup: nframes to process = 512. late driver wakeup: nframes to process
Re: [PATCH 1/12] random pt4: Create new rol32/ror32 bitops
On Sat, 22 Jan 2005 at 20:13:24 -0800 Matt Mackall wrote:
> So I think tweaks for x86 at least are unnecessary.

So the compiler looks for that specific sequence of instructions:

(a << b) | (a >> (sizeof(a) * 8 - b))

and recognizes that it means rotation? Wow.

Chuck
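The idiom under discussion can be written as a pair of small helpers. This is only a sketch: the names rol32/ror32 come from the subject line, but the bodies below are not taken from the patch (which is not quoted in this message). One deliberate difference from the expression above is that the shift count is masked, an assumption made here so that a rotate by 0 does not shift a 32-bit value by 32 bits, which C leaves undefined.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by shift bits (count taken modulo 32). */
static inline uint32_t rol32(uint32_t word, unsigned int shift)
{
	/* (-shift) & 31 equals (32 - shift) for 1..31 and 0 for shift == 0. */
	return (word << (shift & 31)) | (word >> ((-shift) & 31));
}

/* Rotate a 32-bit word right by shift bits (count taken modulo 32). */
static inline uint32_t ror32(uint32_t word, unsigned int shift)
{
	return (word >> (shift & 31)) | (word << ((-shift) & 31));
}
```

Whether a given compiler turns this into a single rotate instruction depends on the compiler and target, so it is worth checking the generated assembly rather than assuming it.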
Re: Extend clear_page by an order parameter
Christoph Lameter <[EMAIL PROTECTED]> wrote:
>
> The zeroing of a page of an arbitrary order in page_alloc.c and in hugetlb.c
> may benefit from a clear_page that is capable of zeroing multiple pages at
> once (and scrubd too but that is now an independent patch). The following
> patch extends clear_page with a second parameter specifying the order of the
> page to be zeroed to allow an efficient zeroing of pages. Hope I caught
> everything
>

Sorry, I take it back. As Paul says:

: Wouldn't it be nicer to call the version that takes the order
: parameter "clear_pages" and then define clear_page(p) as
: clear_pages(p, 0) ?

It would make the patch considerably smaller, and our naming is all over the place anyway...

> -static inline void prep_zero_page(struct page *page, int order, int gfp_flags)
> +void prep_zero_page(struct page *page, unsigned int order, unsigned int gfp_flags)
>  {
>  	int i;
>
>  	BUG_ON((gfp_flags & (__GFP_WAIT | __GFP_HIGHMEM)) == __GFP_HIGHMEM);
> +	if (!PageHighMem(page)) {
> +		clear_page(page_address(page), order);
> +		return;
> +	}
> +
>  	for(i = 0; i < (1 << order); i++)
>  		clear_highpage(page + i);
>  }

I'd have thought that we'd want to make the new clear_pages() handle highmem pages too, if only from a regularity POV. x86 hugetlbpages could use it then, if someone thinks up a fast page-clearer.
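Paul's naming suggestion could look roughly like the following. This is a user-space sketch, not kernel code: SKETCH_PAGE_SIZE is an assumed 4096-byte page, memset() stands in for whatever architecture-specific clearing routine would really be used, and highmem handling is left out entirely.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumed page size, for illustration only */

/* Zero 2^order physically contiguous pages starting at addr. */
static void clear_pages(void *addr, unsigned int order)
{
	memset(addr, 0, SKETCH_PAGE_SIZE << order);
}

/* The single-page case becomes a thin wrapper, so existing callers keep working. */
#define clear_page(addr) clear_pages((addr), 0)
```

With this shape, only the callers that want multi-page zeroing need to change, which is why the resulting patch would be smaller.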
Re: 2.6.11-rc1-mm1
Karim Yaghmour wrote:
> This is not good for any client that doesn't know beforehand the exact
> size of their data units, as in the case of LTT. If LTT has to use this
> code that means we are going to lose performance because we will need to
> fill an intermediate data structure which will only be used for relay_write().
> Instead of zero-copy, we would have an extra unnecessary copy. There has
> got to be a way for clients to directly reserve and write as they wish.
> Even Zach Brown recognized this in his tracepipe proposal, here's from
> his patch:
> + * - let caller reserve space and get a pointer into buf

Actually, come to think of it, this code is not good for any client that needs to fill complex data structures, whether they be fixed-size or not, because it requires having a prepackaged structure already available. Any client that wants zero-copy will want to write data directly into the buffer instead of filling an intermediate buffer first. And this requires being able to reserve atomically.

Karim
--
Author, Speaker, Developer, Consultant
Pushing Embedded and Real-Time Linux Systems Beyond the Limits
http://www.opersys.com || [EMAIL PROTECTED] || 1-866-677-4546
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
Jack O'Quin wrote: I'm wondering now if the lack of priority support in the two prototypes might explain the problems I'm seeing. Distinctly possible since my results got better with priority support. However I'm still bugfixing what I've got. Just as a data point here is an incremental patch for testing which applies to mm2. This survives a jackd test run at my end but is not ready for inclusion yet. Cheers, Con Index: linux-2.6.11-rc1-mm2/include/linux/sched.h === --- linux-2.6.11-rc1-mm2.orig/include/linux/sched.h 2005-01-22 20:42:44.0 +1100 +++ linux-2.6.11-rc1-mm2/include/linux/sched.h 2005-01-22 20:50:29.0 +1100 @@ -144,6 +144,10 @@ extern int iso_cpu, iso_period; #define SCHED_RT(policy) ((policy) == SCHED_FIFO || \ (policy) == SCHED_RR) +/* The policies that support a real time priority setting */ +#define SCHED_RT_PRIO(policy) (SCHED_RT(policy) || \ +(policy) == SCHED_ISO) + struct sched_param { int sched_priority; }; @@ -356,7 +360,7 @@ struct signal_struct { /* * Priority of a process goes from 0..MAX_PRIO-1, valid RT * priority is 0..MAX_RT_PRIO-1, and SCHED_NORMAL tasks are - * in the range MAX_RT_PRIO..MAX_PRIO-1. Priority values + * in the range MIN_NORMAL_PRIO..MAX_PRIO-1. Priority values * are inverted: lower p->prio value means higher priority. * * The MAX_USER_RT_PRIO value allows the actual maximum @@ -364,12 +368,19 @@ struct signal_struct { * user-space. This allows kernel threads to set their * priority to a value higher than any user task. Note: * MAX_RT_PRIO must not be smaller than MAX_USER_RT_PRIO. + * + * SCHED_ISO tasks have a rt priority of the same range as + * real time tasks. They are seen as having either a priority + * of ISO_PRIO if below starvation limits or their underlying + * equivalent SCHED_NORMAL priority if above. 
*/ #define MAX_USER_RT_PRIO 100 #define MAX_RT_PRIO MAX_USER_RT_PRIO +#define ISO_PRIO MAX_RT_PRIO +#define MIN_NORMAL_PRIO (ISO_PRIO + 1) -#define MAX_PRIO (MAX_RT_PRIO + 40) +#define MAX_PRIO (MIN_NORMAL_PRIO + 40) #define rt_task(p) (unlikely((p)->prio < MAX_RT_PRIO)) #define iso_task(p) (unlikely((p)->policy == SCHED_ISO)) Index: linux-2.6.11-rc1-mm2/kernel/sched.c === --- linux-2.6.11-rc1-mm2.orig/kernel/sched.c 2005-01-22 09:19:42.0 +1100 +++ linux-2.6.11-rc1-mm2/kernel/sched.c 2005-01-23 17:38:27.848054361 +1100 @@ -55,11 +55,11 @@ /* * Convert user-nice values [ -20 ... 0 ... 19 ] - * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ], + * to static priority [ MIN_NORMAL_PRIO..MAX_PRIO-1 ], * and back. */ -#define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20) -#define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20) +#define NICE_TO_PRIO(nice) (MIN_NORMAL_PRIO + (nice) + 20) +#define PRIO_TO_NICE(prio) ((prio) - MIN_NORMAL_PRIO - 20) #define TASK_NICE(p) PRIO_TO_NICE((p)->static_prio) /* @@ -67,7 +67,7 @@ * can work with better when scaling various scheduler parameters, * it's a [ 0 ... 39 ] range. 
*/ -#define USER_PRIO(p) ((p)-MAX_RT_PRIO) +#define USER_PRIO(p) ((p)-MIN_NORMAL_PRIO) #define TASK_USER_PRIO(p) USER_PRIO((p)->static_prio) #define MAX_USER_PRIO (USER_PRIO(MAX_PRIO)) @@ -184,6 +184,8 @@ int iso_period = 5; /* The time over whi */ #define BITMAP_SIZE MAX_PRIO+1+7)/8)+sizeof(long)-1)/sizeof(long)) +#define ISO_BITMAP_SIZE MAX_USER_RT_PRIO+1+7)/8)+sizeof(long)-1)/ \ + sizeof(long)) typedef struct runqueue runqueue_t; @@ -212,7 +214,9 @@ struct runqueue { unsigned long cpu_load; #endif unsigned long iso_ticks; - struct list_head iso_queue; + unsigned int iso_active; + unsigned long iso_bitmap[ISO_BITMAP_SIZE]; + struct list_head iso_queue[MAX_USER_RT_PRIO]; int iso_refractory; /* * Refractory is the flag that we've hit the maximum iso cpu and are @@ -312,15 +316,20 @@ static DEFINE_PER_CPU(struct runqueue, r # define task_running(rq, p) ((rq)->curr == (p)) #endif -static inline int task_preempts_curr(task_t *p, runqueue_t *rq) +static int task_preempts_curr(task_t *p, runqueue_t *rq) { - if ((!iso_task(p) && !iso_task(rq->curr)) || rq->iso_refractory || - rt_task(p) || rt_task(rq->curr)) { - if (p->prio < rq->curr->prio) -return 1; - return 0; + int p_prio = p->prio, curr_prio = rq->curr->prio; + + if (!iso_task(p) && !iso_task(rq->curr)) + goto check_preemption; + if (!rq->iso_refractory) { + if (iso_task(p)) + p_prio = ISO_PRIO; + if (iso_task(rq->curr)) + curr_prio = ISO_PRIO; } - if (iso_task(p) && !iso_task(rq->curr)) +check_preemption: + if (p_prio < curr_prio) return 1; return 0; } @@ -590,14 +599,13 @@ static inline void sched_info_switch(tas #define sched_info_switch(t, next) do { } while (0) #endif /* CONFIG_SCHEDSTATS */ -static inline int iso_queued(runqueue_t *rq) -{ - return !list_empty(>iso_queue); -} - -static inline void
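The priority layout this patch introduces can be checked in isolation. The macro values below are copied from the hunks above; struct sketch_task and sketch_preempts() are hypothetical stand-ins for task_t and task_preempts_curr(), reduced to the fields the comparison needs. Treat it as a sketch of the logic, not as the scheduler code itself.

```c
#define MAX_USER_RT_PRIO	100
#define MAX_RT_PRIO		MAX_USER_RT_PRIO
#define ISO_PRIO		MAX_RT_PRIO		/* one slot between RT and normal */
#define MIN_NORMAL_PRIO		(ISO_PRIO + 1)
#define MAX_PRIO		(MIN_NORMAL_PRIO + 40)

#define NICE_TO_PRIO(nice)	(MIN_NORMAL_PRIO + (nice) + 20)
#define PRIO_TO_NICE(prio)	((prio) - MIN_NORMAL_PRIO - 20)

/* Hypothetical, minimal stand-in for task_t. */
struct sketch_task {
	int prio;	/* lower value means higher priority */
	int is_iso;	/* non-zero for a SCHED_ISO task */
};

/*
 * Mirrors the patched task_preempts_curr(): outside the refractory period
 * an ISO task is compared as if it sat at ISO_PRIO; inside it, the task's
 * underlying SCHED_NORMAL priority is used.
 */
static int sketch_preempts(const struct sketch_task *p,
			   const struct sketch_task *curr,
			   int iso_refractory)
{
	int p_prio = p->prio, curr_prio = curr->prio;

	if (!iso_refractory) {
		if (p->is_iso)
			p_prio = ISO_PRIO;
		if (curr->is_iso)
			curr_prio = ISO_PRIO;
	}
	return p_prio < curr_prio;
}
```

With these values nice -20 maps to priority 101 and nice 19 to 140, so every SCHED_NORMAL task sits below ISO_PRIO (100), while every real time priority (0..99) sits above it.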
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
Jack O'Quin <[EMAIL PROTECTED]> writes:
>
> I ran three sets of tests with three or more 5 minute runs for each
> case. The results (log files and graphs) are in these directories...
>
> 1) sched-fifo -- as a baseline
> http://www.joq.us/jack/benchmarks/sched-fifo
>
> 2) sched-iso -- Con's scheduler, no privileges
> http://www.joq.us/jack/benchmarks/sched-iso
>
> 3) nice-20 -- Ingo's "nice --20" scheduler hack
> http://www.joq.us/jack/benchmarks/nice-20

> I had some problems with the y2 graph axis (for XRUN and DELAY). In
> most of the graphs it is unreadable. In some it is inconsistent. I
> hacked on the jack_test3_plot.sh script several times, trying to set
> readable values, mostly without success. There is too much variation
> in those numbers. So, be careful reading and comparing that
> information. Some xruns look better or worse than they really are.

I fixed that problem in the script this way...

--- jack_test3_plot.sh~ Fri Jan 21 15:23:04 2005
+++ jack_test3_plot.sh Sat Jan 22 21:21:58 2005
@@ -33,8 +33,8 @@
 set ylabel "CPU Load (%), CTX (x1000/sec)"
 set y2label "XRUN, DELAY (msecs)"
 set yrange [0:100]
-set y2range [0:*]
-set y2tics 0.2
+set y2range [0:10]
+set y2tics 2.0
 set terminal png transparent small size 640,320
 set output "${NAME}.png"
 plot \

Now it gives a consistent, readable range for the XRUN and DELAY data. Anything over 10msec is "off the graph". Successive graphs are easy to compare visually.

I went back and regenerated yesterday's graphs from the original log files with this change, so they're all consistent now for comparison purposes.

> These tests were run without any other heavy demands on the system. I
> want to try some with a compile running in the background. But, I
> won't have time for that until tomorrow at the earliest. So, I'll
> post these preliminary results now for your enjoyment.

I made more runs today with a compile of ardour running continuously in the background. These results were much more dramatic than yesterday's lightly loaded system numbers. My main conclusion is that on my system sched-fifo works almost flawlessly, while neither nice-20 nor sched-iso holds up under load.

All the data are here...

http://www.joq.us/jack/benchmarks/

in these six subdirectories...

http://www.joq.us/jack/benchmarks/nice-20
http://www.joq.us/jack/benchmarks/nice-20+compile
http://www.joq.us/jack/benchmarks/sched-fifo
http://www.joq.us/jack/benchmarks/sched-fifo+compile
http://www.joq.us/jack/benchmarks/sched-iso
http://www.joq.us/jack/benchmarks/sched-iso+compile

In many runs with both nice-20 and sched-iso, some of the test clients failed to meet their deadlines and were evicted from the JACK graph. This was particularly evident under load (see the nice-20+compile and sched-iso+compile logs). But, looking back at the logs from yesterday, I see it also happened without the background compilation. I didn't notice, because the effects were less obvious. But, this may explain the rather inconsistent results I noted at the time.

This run[1] shows a particularly dramatic example of this phenomenon. Note the DSP load dropoff around second 140. After that everything runs fine because almost half of the clients were ejected.

[1] http://www.joq.us/jack/benchmarks/nice-20+compile/jack_test3-2.6.11-rc1-q2-200501221908.png

There were *no* client failures in *any* of the sched-fifo runs.

So, I reluctantly conclude that neither of the new scheduler prototypes performs adequately in its current form. We should get someone else to duplicate these results on a different machine, if possible.

I'm wondering now if the lack of priority support in the two prototypes might explain the problems I'm seeing.
--
joq
a question abt variable names....
hi,

I have seen many functions and variables in kernel code defined as __()/__X. I know that these refer to kernel definitions, but I want to know why the variables are named in this fashion, and also what the difference is between declaring __ and _.

Thanks,
nikhil
Re: seccomp for 2.6.11-rc1-bk8
Chris Wright wrote:
>* David Wagner ([EMAIL PROTECTED]) wrote:
>> There is a simple tweak to ptrace which fixes that: one could add an
>> API to specify a set of syscalls that ptrace should not trap on. To get
>> seccomp-like semantics, the user program could specify {read,write}, but
>> if the user program ever wants to change its policy, it could change that
>> set. Solaris /proc (which is what is used for tracing) has this feature.
>> I coded up such an extension to ptrace semantics a long time ago, and
>> it seemed to work fine for me, though of course I am not a ptrace expert.
>
>Hmm, yeah, that'd be nice. That only leaves the issue of tracer dying
>(say from that crazy oom killer ;-).

Yes, I also implemented a ptrace option which causes the child to be slaughtered if the parent dies for any reason. I could dig up the code, but I don't recall it being very hard. This was ages ago (a 2.0.x kernel) and I have no idea what might have changed. Also, I am definitely not a guru on kernel internals, so it is always possible I missed something. But, at least on the surface, this doesn't seem hard to implement.

A third thing I implemented was an option which would cause ptrace() to be inherited across forks. The way that strace does this (last I looked) is an unreliable abomination: when it sees a request to call fork(), it sets a breakpoint at the next instruction after the fork() by re-writing the code of the parent, then when that breakpoint triggers it attaches to the child, restores the parent's code, and lets them continue executing. This is icky, and I have little confidence in its security to prevent children from escaping a ptrace() jail, so I added a feature to ptrace() that remedies the situation.

Anyway, back to the main topic: ptrace() vs seccomp. I think one plausible reason to prefer some mechanism that allows user level to specify the allowed syscall set is that it might provide more flexibility. What if 6 months from now we discover that we really should have enabled one more syscall in seccomp to accommodate other applications?

At the same time, I truly empathize with Andrea's position that something like seccomp ought to be a lot easier to verify correct than ptrace(). I think several people here are underestimating the importance of clean design. ptrace() is, frankly, a godawful mess, and I don't know about this thinking that you can take a godawful mess and then audit it carefully and call it secure -- well, that seems unlikely to ever lead to the same level of assurance that you can get with a much cleaner design. (This business of overloading as a means of sending ptrace events to user level was in retrospect probably a bad design decision, for instance. See, e.g., Section 12 of my MS thesis for more. http://www.cs.berkeley.edu/~daw/papers/janus-masters.ps) Given this, I can see real value in seccomp.

Perhaps there is a compromise position. What if one started from seccomp, but then extended it so the set of allowed syscalls can be specified by user level? This would push policy to user level, while retaining the attractive simplicity and ease-of-audit properties of the seccomp design. Does something like this make sense?

Let me give you some idea of new applications that might be enabled by this kind of functionality. One cool idea is a 'delegating architecture' for jails. The jailed process inherits an open file descriptor to its jailor, and is only allowed to call read(), write(), sendmsg(), and recvmsg(). If the jailed process wants to interact with the outside world, it can send a request to its jailor to this effect. For instance, suppose the jailed process wants to create a file called "/tmp/whatever", so it sends this request to the jailor. The jailor can decide whether it wants this to be allowed. If it is to be allowed, the jailor can create this file and transfer a file descriptor to the jailed process using sendmsg(). Note that this mechanism allows the jailor to completely virtualize the system call interface; for instance, the jailor could transparently instead create "/tmp/jail17/whatever" and return a fd to it to the jailed process, without the jailed process being any the wiser. (For more on this, see http://www.stanford.edu/~talg/papers/NDSS04/abstract.html and http://www.cs.jhu.edu/~seaborn/plash/plash.html)

So this is one example of an application that is enabled by adding recvmsg() to the set of allowed syscalls. When it comes to the broader question of seccomp vs ptrace(), I don't know what strategy makes most sense for the Linux kernel, but I hope these ideas help give you some idea of what might be possible and how these mechanisms could be used.
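The descriptor hand-off at the heart of the delegating architecture relies on SCM_RIGHTS ancillary data over a Unix-domain socket. The helpers below are a minimal sketch of that one step only. The names send_fd() and recv_fd() are made up for illustration, and the request protocol and policy check a real jailor would need are not shown.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Jailor side: pass one open descriptor over a Unix-domain socket.
 * Returns 0 on success, -1 on failure. */
static int send_fd(int sock, int fd)
{
	char byte = 'F';	/* at least one byte of ordinary data must be sent */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* forces suitable alignment */
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Jailed side: receive one descriptor. Returns the new fd, or -1. */
static int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctrl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&ctrl, 0, sizeof(ctrl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctrl.buf;
	msg.msg_controllen = sizeof(ctrl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the jail scenario the jailor would open (or create) the file itself, after applying its policy and any path rewriting, and then call send_fd() on the socket it shares with the jailed process. The jailed side needs nothing beyond recvmsg() plus read() and write() on the descriptor it gets back.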
Re: 2.6.11-rc1-mm1
Hello Roman,

Roman Zippel wrote:
> Well, let's concentrate for a moment on the last thing and check later
> if and how they fit into relayfs. Since ltt will be first main user, let's
> optimize it for this.
> Also since relayfs is intended for large, fast data transfers, per cpu
> buffers are pretty much always required, so it would make sense to leave
> this to relayfs (less to get wrong for the client).

But how does relayfs organize the namespace then? What if I have multiple channels per CPU, each for a different type of data? Will all channels for the same CPU be under the same directory, or will each type of data have its own directory with one entry per CPU? I don't have an answer to that, and I don't know that we should. Why not just leave it to the client to organize his data as he wishes? If we must assume that everyone will have at least one channel per CPU, then why not provide helper functions built on top of very basic functions instead of fixing the namespace in stone?

> I have to modify it a little (only the if (!buffer) part is new):
>
> cpu = get_cpu();
> buffer = relay_get_buffer(chan, cpu);
> while(1) {
>         offset = local_add_return(buffer->offset, length);
>         if (likely(offset + length <= buffer->size))
>                 break;
>         buffer = relay_switch_buffer(chan, buffer, offset);
>         if (!buffer) {
>                 put_cpu();
>                 return;
>         }
> }
> memcpy(buffer->data + offset, data, length);
> put_cpu();
>
> This has a very short fast path and I need very good reasons to change/add
> anything here. OTOH the slow path with relay_switch_buffer() is less
> critical and still leaves a lot of flexibility.

This is not good for any client that doesn't know beforehand the exact size of their data units, as in the case of LTT. If LTT has to use this code, that means we are going to lose performance because we will need to fill an intermediate data structure which will only be used for relay_write(). Instead of zero-copy, we would have an extra unnecessary copy. There has got to be a way for clients to directly reserve and write as they wish. Even Zach Brown recognized this in his tracepipe proposal, here's from his patch:

+ * - let caller reserve space and get a pointer into buf

>>1) get_cpu() and put_cpu() won't do. You need to outright disable
>>interrupts because you may be called from an interrupt handler.
>
>
> Look closer, it's already interrupt safe, the synchronization for the
> buffer switch is left to relay_switch_buffer().

Sorry, I'm still missing something. What exactly does local_add_return() do? I assume this code has got to be interrupt safe? Something like:

#define local_add_return(OFFSET, LEN) \
do {\
	...
	local_irq_save(); \
	OFFSET += LEN;
	local_irq_restore(); \
	...
} while(0);

I'm assuming local_irq_XXX because we were told by quite a few people in the related thread to avoid atomic ops because they are more expensive on most CPUs than cli/sti.

Also, how does relay_get_buffer() operate? What if I'm writing an event from within a system call and I'm about to switch buffers and get an interrupt at the if(likely(...))? Isn't relay_get_buffer() going to return the same pointer as the one obtained for the syscall, and aren't both cases now going to call relay_switch_buffer(), one of which will be superfluous?

> This adds a conditional and is not really needed. Above shows how to make
> it interrupt safe and if the clients wants to reuse the same buffer, leave
> the locking to the client.

Fine, but how is the client going to be able to reuse the same buffer if relayfs always assumes per-CPU buffers as you said above? This would be solved if at its core relayfs' functions worked on single channels and additional code provided helpers for making the SMP case very simple.

> That's quite a lot of code with at least 14 conditions (or 13 conditions
> too much) and this is just relayfs.

I believe Tom has refactored the code with your comments in mind, and has something ready for review. I just want to clear up the above before we make this final. Among other things, he just dropped all modes, and there's only a basic relay_write() that closely resembles what you have above.

> That's not always true, where performance matters we provide different
> functions (e.g. spinlocks), so having an alternative version of
> relay_write is a possibility (although I'd like to see the user first).

Sure, see above in the case of LTT.

Karim
--
Author, Speaker, Developer, Consultant
Pushing Embedded and Real-Time Linux Systems Beyond the Limits
http://www.opersys.com || [EMAIL PROTECTED] || 1-866-677-4546
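The difference between a copying write and "reserve, then fill in place" can be sketched in user space. Everything here is an assumption made for illustration: struct sketch_buffer, sketch_reserve() and sketch_write() are invented names, a C11 atomic stands in for whatever local_add_return() or interrupt-disabling scheme the kernel code would use, and there is no sub-buffer switching, so a full buffer simply fails the reservation.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single buffer; the real relayfs structures are not shown here. */
struct sketch_buffer {
	atomic_size_t offset;	/* next free byte */
	size_t size;		/* capacity of data[] */
	unsigned char *data;
};

/*
 * Atomically claim length bytes and return a pointer to them, or NULL if
 * they do not fit. The caller then builds its event directly in place.
 */
static void *sketch_reserve(struct sketch_buffer *buf, size_t length)
{
	size_t start = atomic_fetch_add(&buf->offset, length);

	if (length > buf->size || start > buf->size - length)
		return NULL;	/* a real implementation would switch buffers here */
	return buf->data + start;
}

/* The copying variant is then just reserve plus memcpy(). */
static int sketch_write(struct sketch_buffer *buf, const void *src, size_t length)
{
	void *dst = sketch_reserve(buf, length);

	if (!dst)
		return -1;
	memcpy(dst, src, length);
	return 0;
}
```

A client with variable-size events would call sketch_reserve() and write its fields straight into the returned region, avoiding the intermediate structure that a write-only interface forces on it.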
[PATCH] x86_64: use UL on large MODULE_addr constants
Use UL on large constants, to kill sparse warnings (5 of each):

arch/x86_64/kernel/time.c:198:18: warning: constant 0x8800 is so big it is unsigned long
arch/x86_64/kernel/time.c:198:49: warning: constant 0xfff0 is so big it is unsigned long

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>

diffstat:=
 include/asm-x86_64/pgtable.h |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff -Naurp ./include/asm-x86_64/pgtable.h~module_addr_long ./include/asm-x86_64/pgtable.h
--- ./include/asm-x86_64/pgtable.h~module_addr_long 2005-01-22 19:06:33.765150024 -0800
+++ ./include/asm-x86_64/pgtable.h 2005-01-22 21:40:33.487498720 -0800
@@ -119,8 +119,8 @@ extern inline void pgd_clear (pgd_t * pg
 #define MAXMEM 0x3fffUL
 #define VMALLOC_START 0xc200UL
 #define VMALLOC_END 0xe1ffUL
-#define MODULES_VADDR 0x8800
-#define MODULES_END 0xfff0
+#define MODULES_VADDR 0x8800UL
+#define MODULES_END 0xfff0UL
 #define MODULES_LEN (MODULES_END - MODULES_VADDR)

 #define _PAGE_BIT_PRESENT 0

--
[PATCH] x86_64: use UL on TASK_SIZE
Use UL on large constant (kills 3214 sparse warnings :)

include/linux/sched.h:1150:18: warning: constant 0x8000 is so big it is long

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>

diffstat:=
 include/asm-x86_64/processor.h |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -Naurp ./include/asm-x86_64/processor.h~proc_task_size ./include/asm-x86_64/processor.h
--- ./include/asm-x86_64/processor.h~proc_task_size	2005-01-22 19:06:33.765150024 -0800
+++ ./include/asm-x86_64/processor.h	2005-01-22 21:40:48.884158072 -0800
@@ -162,7 +162,7 @@ static inline void clear_in_cr4 (unsigne
 /*
  * User space process size. 47bits.
  */
-#define TASK_SIZE	(0x8000)
+#define TASK_SIZE	(0x8000UL)
 
 /* This decides where the kernel will search for a free chunk of vm
  * space during mmap's.

--
[PATCH] isicom: use NULL for pointer
Use NULL instead of 0 for pointer:

drivers/char/isicom.c:1274:14: warning: Using plain integer as NULL pointer

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>

diffstat:=
 drivers/char/isicom.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -Naurp ./drivers/char/isicom.c~isicom_null ./drivers/char/isicom.c
--- ./drivers/char/isicom.c~isicom_null	2005-01-22 19:06:30.382664240 -0800
+++ ./drivers/char/isicom.c	2005-01-22 21:49:50.077884120 -0800
@@ -1271,7 +1271,7 @@ static void isicom_shutdown_port(struct
 	}
 	port->flags &= ~ASYNC_INITIALIZED;
 	/* 3rd October 2000 : Vinayak P Risbud */
-	port->tty = 0;
+	port->tty = NULL;
 	spin_unlock_irqrestore(>card_lock, flags);
 
 	/*Fix done by Anil .S on 30-04-2001

--
[PATCH] oprofile: use NULL for pointer
Use NULL instead of 0 for pointer:

arch/x86_64/oprofile/../../i386/oprofile/backtrace.c:30:10: warning: Using plain integer as NULL pointer

Signed-off-by: Randy Dunlap <[EMAIL PROTECTED]>

diffstat:=
 arch/i386/oprofile/backtrace.c |    2 +-
 1 files changed, 1 insertion(+), 1 deletion(-)

diff -Naurp ./arch/i386/oprofile/backtrace.c~oprofile_null ./arch/i386/oprofile/backtrace.c
--- ./arch/i386/oprofile/backtrace.c~oprofile_null	2005-01-22 19:06:29.923734008 -0800
+++ ./arch/i386/oprofile/backtrace.c	2005-01-22 22:05:44.485792064 -0800
@@ -27,7 +27,7 @@ dump_backtrace(struct frame_head * head)
 	/* frame pointers should strictly progress back up the stack
 	 * (towards higher addresses) */
 	if (head >= head->ebp)
-		return 0;
+		return NULL;
 
 	return head->ebp;
 }

--
Re: module's parameters could not be set via sysfs in 2.6.11-rc1?
Hi Greg,

> > It looks like module parameters are not settable via sysfs in 2.6.11-rc1
> >
> > E.g.
> > arise parameters # echo -en Y > /sys/module/usbcore/parameters/old_scheme_first
> > -bash: /sys/module/usbcore/parameters/old_scheme_first: Permission denied
> > arise parameters # id
> > uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video)
> > arise parameters #
> > arise parameters # ls -la /sys/module/usbcore/parameters/old_scheme_first
> > -rw-r--r-- 1 root root 0 Jan 13 22:22 /sys/module/usbcore/parameters/old_scheme_first
> > arise parameters #
> >
> > This is sad because it seems that my USB flash stick (Transcend JetFlash)
> > doesn't like the new USB device initialization scheme introduced in 2.6.10.
>
> I'm seeing the same problem here. I'll dig into it later tonight.

Any updates on this? It still results in a "Permission denied" with a recent 2.6.11-rc2 kernel.

Regards

Marcel
Re: kernel oops!
Interesting. That last call trace entry is the call in pty_chars_in_buffer() to

	/* The ldisc must report 0 if no characters available to be read */
	count = to->ldisc.chars_in_buffer(to);

and it looks like it has jumped to address zero. However, we _just_ compared the fn pointer to zero immediately before, and while there could certainly have been a race that cleared it in between the test and the call, normally we wouldn't even have re-loaded the value at all, but kept it in a register instead.

That said, it does act like a race. Somebody clearing the ldisc and racing with somebody using it?

Can you do a

	gdb vmlinux
	disassemble pty_chars_in_buffer

to show what it looks like (whether it reloads the value, and what the registers are - it looks like either %eax or %edi is all zeroes, but I'd like to verify that it matches your code generation).

Alan? Any ideas? The tty_select() path seems to take a ldisc reference, but does that guarantee that the ldisc won't _change_? What happens if the line discipline is reset from ppp to regular (or set to ppp) asynchronously? You've been deep in this area lately..

		Linus

On Sun, 23 Jan 2005, ierdnah wrote:
>
> Jan 22 13:27:59 warsheep Unable to handle kernel NULL pointer dereference at virtual address
> Jan 22 13:27:59 warsheep printing eip:
> Jan 22 13:27:59 warsheep
> Jan 22 13:27:59 warsheep *pgd = cde9ddb4
> Jan 22 13:27:59 warsheep *pmd = cde9ddb4
> Jan 22 13:27:59 warsheep Oops: [#1]
> Jan 22 13:27:59 warsheep SMP
> Jan 22 13:27:59 warsheep CPU:0
> Jan 22 13:27:59 warsheep EIP:0060:[<>]Not tainted VLI
> Jan 22 13:27:59 warsheep EFLAGS: 00010282 (2.6.10-hardened-r2-warsheep62)
> Jan 22 13:27:59 warsheep EIP is at 0x0
> Jan 22 13:27:59 warsheep eax: ebx: de455000 ecx: c02c60e0 edx: c6b41000
> Jan 22 13:27:59 warsheep esi: de455000 edi: ebp: dd0a2680 esp: cde9de9c
> Jan 22 13:27:59 warsheep ds: 007b es: 007b ss: 0068
> Jan 22 13:27:59 warsheep Process pptpctrl (pid: 16689, threadinfo=cde9c000 task=d112ca20)
> Jan 22 13:27:59 warsheep Stack: c02c97bc c6b41000 c02c895c de455000 04949168 c03d0106 de455000
> Jan 22 13:27:59 warsheep de45500c dd0a2680 c02c4141 de455000 dd0a2680 c01c7d49
> Jan 22 13:27:59 warsheep dd0a2680 0020 0005 0005 c01da72f dd0a2680
> Jan 22 13:27:59 warsheep Call Trace:
> Jan 22 13:27:59 warsheep [] pty_chars_in_buffer+0x2c/0x50
> Jan 22 13:27:59 warsheep [] normal_poll+0xfc/0x16b
> Jan 22 13:27:59 warsheep [] schedule_timeout+0x76/0xc0
> Jan 22 13:27:59 warsheep [] tty_poll+0xa1/0xc0
> Jan 22 13:27:59 warsheep [] fget+0x49/0x60
> Jan 22 13:27:59 warsheep [] do_select+0x26f/0x2e0
> Jan 22 13:27:59 warsheep [] __pollwait+0x0/0xd0
> Jan 22 13:27:59 warsheep [] sys_select+0x2db/0x4f0
> Jan 22 13:27:59 warsheep [] sysenter_past_esp+0x52/0x79
> Jan 22 13:27:59 warsheep Code: Bad EIP value.
>
> The oops occurs only when the kernel is built with SMP and HT support; in UP
> mode the oops doesn't occur!
> I have a 2.6.10 kernel compiled with SMP and HT, on a P4 3GHz with HT.
> I have a VPN server with pppd and pptpd (poptop) and an average of 130
> simultaneous connections; the oops doesn't occur at a particular number
> of simultaneous VPN connections. I can build a kernel with debugging enabled or
> something to help track the source of the problem. Please CC me as I am not
> subscribed to this mailing list.
>
> --
> ierdnah <[EMAIL PROTECTED]>
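To illustrate the test-then-call window described in this message, here is a small userspace sketch in C. It is not the tty code: the struct and function names are hypothetical, and the snapshot variant only guarantees that the value tested is the value called. It does nothing about the lifetime of the line discipline itself, which is the harder question raised above.

```c
#include <stddef.h>

/* Hypothetical stand-in for a line discipline's operations table. */
struct ldisc_ops_sketch {
	int (*chars_in_buffer)(void *tty);
};

/*
 * Test-then-call: the compiler is free to load the pointer twice.
 * If another CPU clears it between the two loads, the call jumps to 0.
 */
static int chars_test_then_call(struct ldisc_ops_sketch *ld, void *tty)
{
	if (ld->chars_in_buffer)
		return ld->chars_in_buffer(tty);
	return 0;
}

/*
 * Snapshot: force a single load through a volatile-qualified pointer,
 * then test and call the local copy only.
 */
static int chars_snapshot(struct ldisc_ops_sketch *ld, void *tty)
{
	int (* volatile *slot)(void *) = &ld->chars_in_buffer;
	int (*fn)(void *) = *slot;

	return fn ? fn(tty) : 0;
}

/* Dummy callback, present only so the helpers can be exercised. */
static int demo_chars(void *tty)
{
	(void)tty;
	return 7;
}
```

Both helpers return 0 for a cleared pointer and forward to the callback otherwise; the difference only shows up when another thread changes the pointer concurrently.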
Re: seccomp for 2.6.11-rc1-bk8
On Sat, Jan 22, 2005 at 11:43:06PM -0500, [EMAIL PROTECTED] wrote:
> It's a poor idea to confuse "secure" with "can't break out of the sandbox".

The only point I'm making with seccomp is that if it can't break out of the sandbox, it's secure. Of course I didn't mean that the only way to make it secure is to put it in the sandbox.

> And they don't even depend on seccomp or ptrace for the security either...

Indeed.

> Security people probably won't be interested, specifically because it's
> way too inflexible. Very few real-life applications can be made to fit
> into a "open all the files you might need, then shut yourself into a
> read/write syscalls only" model.

This is exactly correct. Recycled matter is of lower quality. Not everything is going to be printed on recycled paper; your vacation photos cannot be printed on recycled paper. But a few may actually appreciate recycled paper at a much cheaper price for an extremely tiny niche of apps. It'll be a mess to use it the first time, but after they start using it they'll get a ton of it very cheap, and it might work as well as first-quality paper for them. Perhaps somebody who was not buying paper because it was too expensive may also start buying the recycled paper because it becomes affordable (yeah, after the initial dealing with the recycled matter conversion).

> In fact, a case could be made that the unnatural contortions needed to
> restructure applications into a seccomp model actually *decrease* the
> overall security, because of more complicated setup code being more
> vulnerable to attack. Also, the fact that you need to keep open() out

All setup code before the execve of the loader (and the loader is only a few lines of C) is not in C/C++, which means first of all no buffer overflows. It's a quite small piece of code as well. Sure, there can still be a bug there, but clearly some software must exist to start the seccomp mode. But this software won't be binfmt_elf.c and it will not be written in C (which is also why using ptrace is way annoying, since it'd require more C code); it'll be small, and it will be written with security in mind. I've already uploaded that software to the website if you want to check it (ignore the GUI part, it's obsolete). Just the fact it's not in C rules out 90% of possible exploits.

> of the permitted set for seccomp to make any sense means that you need to
> open all the possible files up front. So now you're handing the program
> *more* access to files than they should

They're not files, they're pipes. There are only two open, fd 0 and fd 1, and no data emitted and received by those two pipes is being computed outside seccomp. It's as if you push .mpeg data into fd 0, read from fd 1, and write the result to the framebuffer. Even if something goes wrong in the library, at worst you'll see garbage on the screen.

I don't think a model like this can decrease security. The last YOU update I did fetched an update of some decoding library; if it had been running under seccomp it couldn't have done any damage. The same is true for the zlib trouble some time ago. I'm not suggesting everything should run inside seccomp, and of course such an update would be happening anyway since not every app will run under seccomp, but certainly if you have a _special_ critical app that you don't want to risk being exploited by a libz bug, then seccomp may help, and it's going to be a lot handier to use than ptrace.

> Oh, come *ON*, Andrea. This is a red herring and you *know* it. The only
> people who will be hardcoding syscall numbers are the same idiots that
> hardcoded capability masks instead of "#include " and
> using the CAP_* defines.

I didn't mean hardcoding in terms of numbers, I meant in terms of __NR_read. Just read the 32bit emulation code: I had to use ifdef TIF_IA32, that's the best I could do, and I doubt you would be able to write much cleaner code in userland either.

> And if a filename has a runtime dependency on the untrusted data (consider
> any sort of web server or browser or mail program or anything else that
> accepts a "suggested filename" as input), things get very difficult very
> quickly.
>
> I can pass ptrace a SYSCALL_OPEN, and then call my untrusted code, and then
> look at the filename at runtime and see if there's something hinky going on.
> I can even apply heuristics like "The first file opened should be THIS one,
> then THOSE 4 shared libraries in order, then THIS file, and then the NEXT file
> is dependent on user input, but has to start with $USER/tmp/workdir, and then
> there's two other opens of files X and Y, and then no others should happen".
> Using seccomp, you don't get that choice. You either have to jump through
> hoops to get all that set up beforehand, or allow open() in all its glory.

I don't get what you mean here. Anyway, the file descriptors inside seccomp are never going to be files, and there will be only two. I can add some documentation if it gets merged.
Re: [patch 1/13] Qsort
Hi,

On Sun, Jan 23, 2005 at 03:03:32AM +0100, Felipe Alfaro Solana wrote:
> On 22 Jan 2005, at 22:00, vlobanov wrote:
> >#define SWAP(a, b, size)			\
> >do {						\
> >	register size_t __size = (size);	\
> >	register char * __a = (a), * __b = (b);	\
> >	do {					\
> >		*__a ^= *__b;			\
> >		*__b ^= *__a;			\
> >		*__a ^= *__b;			\
> >		__a++;				\
> >		__b++;				\
> >	} while ((--__size) > 0);		\
> >} while (0)
> >
> >What do you think? :)
>
> AFAIK, XOR is quite expensive on IA32 when compared to simple MOV
> operations. Also, since the original patch uses 3 MOVs to perform the
> swapping, and your version uses 3 XOR operations, I don't see any
> gains.

It will even be worse because we are accessing memory, and most architectures will not be able to use a memory reference for both operands of the XOR. Basically, what will be generated will look like this:

	tmp = *b
	*a ^= tmp
	tmp ^= *a
	*b = tmp
	*a ^= tmp

which is 5 cycles, or 4 if the last two instructions get merged. And there are 3 memory reads + 3 memory writes (assuming that the CPU will be smart enough to reuse *a without accessing memory at instruction 3). The move is quite a bit faster:

	tmp1 = *a
	tmp2 = *b
	*a = tmp2
	*b = tmp1

This is 4 cycles on simple CPUs, or even 2 cycles on most of today's CPUs, which can do the first two fetches at once and the last two writes at once. And there are only two reads and two writes. Clearly this one is better.

Regards,
Willy
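The two swap strategies compared in this message can be written out as a minimal C sketch. The function names are illustrative and not taken from the patch; the sketch says nothing about cycle counts, it only makes the two data flows concrete. One behavioral difference is worth noting: the XOR form zeroes the data when both pointers refer to the same object, while the temporary-based form leaves it unchanged.

```c
#include <stddef.h>

/* Swap through a temporary: per byte, two loads and two stores. */
static void swap_bytes_tmp(void *a, void *b, size_t size)
{
	unsigned char *pa = a, *pb = b;

	while (size--) {
		unsigned char tmp = *pa;

		*pa++ = *pb;
		*pb++ = tmp;
	}
}

/*
 * XOR swap, following the quoted SWAP() macro byte by byte.
 * If a == b, the first XOR clears the byte and it stays cleared.
 */
static void swap_bytes_xor(void *a, void *b, size_t size)
{
	unsigned char *pa = a, *pb = b;

	while (size--) {
		*pa ^= *pb;
		*pb ^= *pa;
		*pa ^= *pb;
		pa++;
		pb++;
	}
}
```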
[PATCH] rename device_init
Rename device_init to make it more unique. Useful when looking through debug initcall bootlogs. While I'm in the area, also make it static.

Signed-off-by: Anton Blanchard <[EMAIL PROTECTED]>

diff -puN drivers/block/genhd.c~rename_device_init drivers/block/genhd.c
--- gr_work/drivers/block/genhd.c~rename_device_init	2005-01-21 19:37:07.585813607 -0600
+++ gr_work-anton/drivers/block/genhd.c	2005-01-21 19:37:07.596811864 -0600
@@ -300,7 +300,7 @@ static struct kobject *base_probe(dev_t
 	return NULL;
 }
 
-int __init device_init(void)
+static int __init genhd_device_init(void)
 {
 	bdev_map = kobj_map_init(base_probe, _subsys);
 	blk_dev_init();
@@ -308,7 +308,7 @@ int __init device_init(void)
 	return 0;
 }
 
-subsys_initcall(device_init);
+subsys_initcall(genhd_device_init);
Re: [patch 1/13] Qsort
On Sun, Jan 23, 2005 at 06:08:36AM +0100, Andreas Gruenbacher wrote:
> On Sunday 23 January 2005 00:28, Matt Mackall wrote:
> > So the stack is going to be either 256 or 1024 bytes. Seems like we
> > ought to kmalloc it.
>
> This will do. I didn't check if the +1 is strictly needed.
>
> -	stack_node stack[STACK_SIZE];
> +	stack_node stack[fls(size) - fls(MAX_THRESH) + 1];

Yes, indeed. Though I think even here, we'd prefer to use kmalloc because gcc generates suboptimal code for variable-sized stack vars.

--
Mathematics is the supreme nostalgia of our time.
Re: [patch 1/13] Qsort
On Sunday 23 January 2005 00:28, Matt Mackall wrote:
> So the stack is going to be either 256 or 1024 bytes. Seems like we
> ought to kmalloc it.

This will do. I didn't check if the +1 is strictly needed.

-	stack_node stack[STACK_SIZE];
+	stack_node stack[fls(size) - fls(MAX_THRESH) + 1];

--
Andreas Gruenbacher <[EMAIL PROTECTED]>
SUSE Labs, SUSE LINUX PRODUCTS GMBH
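As a rough illustration of the sizing expression in the patch line above, the following userspace C sketch computes fls(n) - fls(max_thresh) + 1. Several things here are assumptions: fls_sketch() is a portable stand-in for the kernel's fls(), the parameter n stands for whatever `size` denotes in the quoted line, and the clamp to at least one entry is an addition of this sketch (a variable-length array must have a positive length).

```c
#include <stddef.h>

/*
 * Portable stand-in for fls(): 1-based index of the most significant
 * set bit, or 0 when x is 0.
 */
static int fls_sketch(unsigned long x)
{
	int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}
	return bit;
}

/*
 * Number of stack entries suggested by the quoted expression.
 * Partitions of at most max_thresh elements are never pushed, so
 * their bits are subtracted from the bound.
 */
static int qsort_stack_entries(unsigned long n, unsigned long max_thresh)
{
	int depth = fls_sketch(n) - fls_sketch(max_thresh) + 1;

	/* Assumption of this sketch: never return a non-positive size. */
	return depth > 1 ? depth : 1;
}
```

For example, with n = 1024 and a threshold of 4 the expression gives 11 - 3 + 1 = 9 entries, far fewer than a fixed array sized for every bit of size_t.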
[PATCH] Problems disabling SYSCTL
Create a cond_syscall for sys32_sysctl and make all architectures use it. Also fix the architectures that don't wrap their 32bit compat sysctl code.

Signed-off-by: Anton Blanchard <[EMAIL PROTECTED]>

diff -puN arch/ia64/ia32/sys_ia32.c~sysctl_fixup2 arch/ia64/ia32/sys_ia32.c
--- foobar2/arch/ia64/ia32/sys_ia32.c~sysctl_fixup2	2005-01-13 10:40:35.995198406 +1100
+++ foobar2-anton/arch/ia64/ia32/sys_ia32.c	2005-01-13 10:40:36.058193579 +1100
@@ -1973,10 +1973,10 @@ struct sysctl32 {
 	unsigned int	__unused[4];
 };
 
+#ifdef CONFIG_SYSCTL
 asmlinkage long
 sys32_sysctl (struct sysctl32 __user *args)
 {
-#ifdef CONFIG_SYSCTL
 	struct sysctl32 a32;
 	mm_segment_t old_fs = get_fs ();
 	void __user *oldvalp, *newvalp;
@@ -2015,10 +2015,8 @@ sys32_sysctl (struct sysctl32 __user *ar
 		return -EFAULT;
 
 	return ret;
-#else
-	return -ENOSYS;
-#endif
 }
+#endif
 
 asmlinkage long
 sys32_newuname (struct new_utsname __user *name)
diff -puN arch/mips/kernel/linux32.c~sysctl_fixup2 arch/mips/kernel/linux32.c
--- foobar2/arch/mips/kernel/linux32.c~sysctl_fixup2	2005-01-13 10:40:36.000198023 +1100
+++ foobar2-anton/arch/mips/kernel/linux32.c	2005-01-13 10:40:36.051194115 +1100
@@ -1194,13 +1194,6 @@ asmlinkage long sys32_sysctl(struct sysc
 	return error;
 }
 
-#else /* CONFIG_SYSCTL */
-
-asmlinkage long sys32_sysctl(struct sysctl_args32 *args)
-{
-	return -ENOSYS;
-}
-
 #endif /* CONFIG_SYSCTL */
 
 asmlinkage long sys32_newuname(struct new_utsname * name)
diff -puN arch/parisc/kernel/sys_parisc32.c~sysctl_fixup2 arch/parisc/kernel/sys_parisc32.c
--- foobar2/arch/parisc/kernel/sys_parisc32.c~sysctl_fixup2	2005-01-13 10:40:36.005197640 +1100
+++ foobar2-anton/arch/parisc/kernel/sys_parisc32.c	2005-01-13 10:40:36.060193425 +1100
@@ -165,12 +165,6 @@ asmlinkage long sys32_sysctl(struct __sy
 	return error;
 }
 
-#else /* CONFIG_SYSCTL */
-
-asmlinkage long sys32_sysctl(struct __sysctl_args *args)
-{
-	return -ENOSYS;
-}
 #endif /* CONFIG_SYSCTL */
 
 asmlinkage long sys32_sched_rr_get_interval(pid_t pid,
diff -puN arch/ppc64/kernel/sys_ppc32.c~sysctl_fixup2 arch/ppc64/kernel/sys_ppc32.c
--- foobar2/arch/ppc64/kernel/sys_ppc32.c~sysctl_fixup2	2005-01-13 10:40:36.011197180 +1100
+++ foobar2-anton/arch/ppc64/kernel/sys_ppc32.c	2005-01-13 10:40:36.046194498 +1100
@@ -1106,6 +1106,7 @@ asmlinkage long sys32_umask(u32 mask)
 	return sys_umask((int)mask);
 }
 
+#ifdef CONFIG_SYSCTL
 struct __sysctl_args32 {
 	u32 name;
 	int nlen;
@@ -1155,6 +1156,7 @@ asmlinkage long sys32_sysctl(struct __sy
 	}
 	return error;
 }
+#endif
 
 asmlinkage int sys32_olduname(struct oldold_utsname __user * name)
 {
diff -puN arch/s390/kernel/compat_linux.c~sysctl_fixup2 arch/s390/kernel/compat_linux.c
--- foobar2/arch/s390/kernel/compat_linux.c~sysctl_fixup2	2005-01-13 10:40:36.016196797 +1100
+++ foobar2-anton/arch/s390/kernel/compat_linux.c	2005-01-13 10:40:36.063193195 +1100
@@ -906,6 +906,7 @@ asmlinkage long sys32_adjtimex(struct ti
 	return ret;
 }
 
+#ifdef CONFIG_SYSCTL
 struct __sysctl_args32 {
 	u32 name;
 	int nlen;
@@ -953,6 +954,7 @@ asmlinkage long sys32_sysctl(struct __sy
 	}
 	return error;
 }
+#endif
 
 struct stat64_emu31 {
 	unsigned long long  st_dev;
diff -puN arch/x86_64/ia32/sys_ia32.c~sysctl_fixup2 arch/x86_64/ia32/sys_ia32.c
--- foobar2/arch/x86_64/ia32/sys_ia32.c~sysctl_fixup2	2005-01-13 10:40:36.021196414 +1100
+++ foobar2-anton/arch/x86_64/ia32/sys_ia32.c	2005-01-13 10:40:36.066192966 +1100
@@ -653,6 +653,7 @@ sys32_pause(void)
 }
 
 
+#ifdef CONFIG_SYSCTL
 struct sysctl_ia32 {
 	unsigned int	name;
 	int		nlen;
@@ -667,9 +668,6 @@ struct sysctl_ia32 {
 asmlinkage long
 sys32_sysctl(struct sysctl_ia32 __user *args32)
 {
-#ifndef CONFIG_SYSCTL
-	return -ENOSYS;
-#else
 	struct sysctl_ia32 a32;
 	mm_segment_t old_fs = get_fs ();
 	void *oldvalp, *newvalp;
@@ -710,8 +708,8 @@ sys32_sysctl(struct sysctl_ia32 __user *
 		return -EFAULT;
 
 	return ret;
-#endif
 }
+#endif
 
 /* warning: next two assume little endian */
 asmlinkage long
diff -puN kernel/sys_ni.c~sysctl_fixup2 kernel/sys_ni.c
--- foobar2/kernel/sys_ni.c~sysctl_fixup2	2005-01-13 10:40:36.026196031 +1100
+++ foobar2-anton/kernel/sys_ni.c	2005-01-13 10:40:36.047194422 +1100
@@ -82,3 +82,4 @@ cond_syscall(sys_pciconfig_read)
 cond_syscall(sys_pciconfig_write)
 cond_syscall(sys_pciconfig_iobase)
 cond_syscall(sys32_ipc)
+cond_syscall(sys32_sysctl)
_
Re: [patch 1/13] Qsort
On Sun, 23 Jan 2005, Andi Kleen wrote:

> > How about a shell sort? If the data is mostly sorted, shell sort beats
> > qsort lots of times, and since the data sets are often small in-kernel,
> > shell sort's O(n^2) behaviour won't harm it too much. Shell sort is also
> > faster if the data is already completely sorted. Shell sort is certainly
> > not the simplest algorithm around, but I think (without having done any
> > tests) that it would probably do pretty well for in-kernel use... Then
> > again, I've been known to be wrong :)
>
> I like shell sort for small data sets too. And I agree it would be
> appropriate for the kernel.
>

Even with large data sets that are mostly unsorted, shell sort's performance is close to qsort's, and there's an optimization that gives it O(n^(3/2)) runtime (IIRC). Another nice property is that it's iterative, so it doesn't eat up stack space (as opposed to qsort, which is recursive and eats stack)...
Yeah, I think shell sort would be good for the kernel.

--
Jesper Juhl
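For readers who want to see the shape of what is being proposed, here is a minimal shell sort sketch in C with a qsort-style interface. It is not code from this thread. The gap sequence (1, 4, 13, 40, ...) is Knuth's, chosen here as an assumption because it is the sequence usually associated with the O(n^(3/2)) bound mentioned above; the helper names are made up for the example.

```c
#include <stddef.h>

/* Byte-wise element swap through a temporary. */
static void swap_elems(char *a, char *b, size_t size)
{
	while (size--) {
		char tmp = *a;

		*a++ = *b;
		*b++ = tmp;
	}
}

/*
 * Iterative shell sort: gapped insertion sort with shrinking gaps.
 * Uses no recursion and no auxiliary stack.
 */
static void shell_sort(void *base, size_t nmemb, size_t size,
		       int (*cmp)(const void *, const void *))
{
	char *a = base;
	size_t gap = 1, i, j;

	/* Largest gap of the form 3*gap + 1 below roughly nmemb / 3. */
	while (gap < nmemb / 3)
		gap = 3 * gap + 1;

	for (; gap > 0; gap /= 3) {
		for (i = gap; i < nmemb; i++) {
			/* Sift element i down its gap-strided chain. */
			for (j = i; j >= gap; j -= gap) {
				char *lo = a + (j - gap) * size;
				char *hi = a + j * size;

				if (cmp(lo, hi) <= 0)
					break;
				swap_elems(lo, hi, size);
			}
		}
	}
}

/* Example comparison callback for int elements. */
static int cmp_int(const void *x, const void *y)
{
	int a = *(const int *)x, b = *(const int *)y;

	return (a > b) - (a < b);
}
```

On already sorted input the inner loop exits on its first comparison for every element, which is the cheap case described above.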
Re: [patch 1/13] Qsort
On 23 Jan 2005, at 03:39, Andi Kleen wrote:

> Felipe Alfaro Solana <[EMAIL PROTECTED]> writes:
>> AFAIK, XOR is quite expensive on IA32 when compared to simple MOV
>> operations. Also, since the original patch uses 3 MOVs to perform the
>> swapping, and your version uses 3 XOR operations, I don't see any
>> gains.
>
> Both are one cycle latency for register<->register on all x86 cores
> I've looked at. What makes you think differently?

I thought XOR was more expensive. Anyway, I still don't see any advantage in replacing 3 MOVs with 3 XORs.
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
Jack O'Quin wrote:
> Con Kolivas <[EMAIL PROTECTED]> writes:
>> Jack O'Quin wrote:
>> [snip lots of valid points]
>>> suggest some things to try. First, make sure the JACK tmp directory
>>> is mounted on a tmpfs[1]. Then, try the test with ext2, instead of
>>
>> Looks like the tmpfs is probably the biggest problem. Here's SCHED_ISO
>> with just the /tmp mounted on tmpfs change - running on a complete
>> desktop environment with a 2nd exported X session and my wife
>> browsing the net and emailing at the same time.
>>
>> All invalid runs removed and just this one posted here:
>> http://ck.kolivas.org/patches/SCHED_ISO/iso2-benchmarks/
>>
>> How's that look?
>
> Excellent!
>
> Sorry I didn't warn you about that problem before. JACK audio users
> generally know about it, but there's no reason you should have.
>
> So, that was run with ext3?

Yes. I think I mentioned before that this is a different machine from the Pentium M one. It's a P4HT3.06 with on-board i810 sound and ext3 (which explains the vastly different DSP usage). The only "special" measure taken for jackd was to use the latest jackd code and the tmpfs mount you suggested. Looks like the number of steps to convert a modern "standard setup" desktop to a low-latency one on Linux isn't that big after all :)

Cheers,
Con
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
Con Kolivas <[EMAIL PROTECTED]> writes:

> Meanwhile, I have the priority support working (but not bug free), and
> the preliminary results suggest that the results are better. Do I
> recall someone mentioning jackd uses threads at different priority?

Yes, it does. I'm not sure whether that matters in this test (it might). But, I'm certain it matters for some JACK applications.

--
joq
Re: [patch 1/13] Qsort
> How about a shell sort? If the data is mostly sorted, shell sort beats
> qsort lots of times, and since the data sets are often small in-kernel,
> shell sort's O(n^2) behaviour won't harm it too much. Shell sort is also
> faster if the data is already completely sorted. Shell sort is certainly
> not the simplest algorithm around, but I think (without having done any
> tests) that it would probably do pretty well for in-kernel use... Then
> again, I've been known to be wrong :)

I like shell sort for small data sets too. And I agree it would be appropriate for the kernel.

-Andi
Re: [PATCH 7/7] inifiniband: pass dev_t to class core
Looks fine to me (assuming the core devt stuff goes in, obviously).

In case it matters:

Acked-by: Roland Dreier <[EMAIL PROTECTED]>
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
Con Kolivas <[EMAIL PROTECTED]> writes:

> Jack O'Quin wrote:
> [snip lots of valid points]
>> suggest some things to try. First, make sure the JACK tmp directory
>> is mounted on a tmpfs[1]. Then, try the test with ext2, instead of
>
> Looks like the tmpfs is probably the biggest problem. Here's SCHED_ISO
> with just the /tmp mounted on tmpfs change - running on a complete
> desktop environment with a 2nd exported X session and my wife
> browsing the net and emailing at the same time.
>
> All invalid runs removed and just this one posted here:
> http://ck.kolivas.org/patches/SCHED_ISO/iso2-benchmarks/
>
> How's that look?

Excellent!

Sorry I didn't warn you about that problem before. JACK audio users generally know about it, but there's no reason you should have.

So, that was run with ext3?

--
joq
Re: seccomp for 2.6.11-rc1-bk8
On Sun, 23 Jan 2005 01:52:13 +0100, Andrea Arcangeli said:

> Why should I be kidding? The client code I'm doing, has to be at least as
> secure

Maybe in your estimation it *has* to be that secure. However, actual experience with other operating systems indicates that mail programs and web browsers have *higher* security requirements than ssh - because ssh can afford to trust legitimate users, while MUAs and browsers have to protect the system against actions taken by legitimate users.

> as ssh and the firewall code, what else has to be more secure than that?

Mail programs, web browsers, and I'm sure there are plenty of applications in the various Three Letter Agencies that want even more security.

It's a poor idea to confuse "secure" with "can't break out of the sandbox".

> Nor ssh nor the firewall code depends on ptrace for their security. The

And they don't even depend on seccomp or ptrace for the security either...

> Once seccomp is in, I believe there's a chance that security people uses
> it for more than Cpushare while I don't think there's a chance you'll

Security people probably won't be interested, specifically because it's way too inflexible. Very few real-life applications can be made to fit into a "open all the files you might need, then shut yourself into a read/write syscalls only" model.

In fact, a case could be made that the unnatural contortions needed to restructure applications into a seccomp model actually *decrease* the overall security, because of more complicated setup code being more vulnerable to attack. Also, the fact that you need to keep open() out of the permitted set for seccomp to make any sense means that you need to open all the possible files up front. So now you're handing the program *more* access to files than it should have.

> see security people using ptrace_syscall hardcoding the syscall numbers

Oh, come *ON*, Andrea. This is a red herring and you *know* it. The only people who will be hardcoding syscall numbers are the same idiots that hardcoded capability masks instead of "#include " and using the CAP_* defines.

> in every userland app out there that may have to parse untrusted data
> with potentially buggy bytecode (i.e. decompression bytecode etc..).

And if a filename has a runtime dependency on the untrusted data (consider any sort of web server or browser or mail program or anything else that accepts a "suggested filename" as input), things get very difficult very quickly.

I can pass ptrace a SYSCALL_OPEN, and then call my untrusted code, and then look at the filename at runtime and see if there's something hinky going on. I can even apply heuristics like "The first file opened should be THIS one, then THOSE 4 shared libraries in order, then THIS file, and then the NEXT file is dependent on user input, but has to start with $USER/tmp/workdir, and then there's two other opens of files X and Y, and then no others should happen". Using seccomp, you don't get that choice. You either have to jump through hoops to get all that set up beforehand, or allow open() in all its glory.
Re: Trying to fix radeonfb suspending on IBM Thinkpad T41
I have no knowledge of the internals of the radeon family, but I am under the impression that they require some hacks to work around bugs in the silicon.

There is a rather big patch coming, see
http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.11-rc1/2.6.11-rc1-mm2/broken-out/radeonfb-massive-update-of-pm-code.patch

This patch also rewrites the questionable section in radeon_pm_setup_for_suspend. I just found it and have not yet built a new kernel, so I cannot comment on its effectiveness.

For reference, the power management issues of the T41 have their own bugzilla entry:
http://bugme.osdl.org/show_bug.cgi?id=3022

On a side note, since kernel 2.6.10 I have not been able to successfully resume from ACPI S3 + radeonfb any more (T41 2379-DJU, radeon mobility M9) - works under 2.6.9 and 2.6.11-rc1 + vgaconsole. I'm still trying to isolate the problem/waiting for some of the pm code to settle.

Best,
Volker
Re: [patch 1/13] Qsort
On Sun, Jan 23, 2005 at 03:39:34AM +0100, Andi Kleen wrote:
> Felipe Alfaro Solana <[EMAIL PROTECTED]> writes:
> >
> > AFAIK, XOR is quite expensive on IA32 when compared to simple MOV
> > operations. Also, since the original patch uses 3 MOVs to perform the
> > swapping, and your version uses 3 XOR operations, I don't see any
> > gains.
>
> Both are one cycle latency for register<->register on all x86 cores
> I've looked at. What makes you think differently?
>
> -Andi (who thinks the glibc qsort is vast overkill for kernel purposes
> where there are only small data sets and it would be better to use a
> simpler one optimized for code size)

Mostly agreed. Except:

a) the glibc version is not actually all that optimized
b) it's nice that it's not recursive
c) the three-way median selection does help avoid worst-case O(n^2)
   behavior, which might potentially be triggerable by users in places
   like XFS where this is used

I'll probably whip up a simpler version tomorrow or Monday and do some size/space benchmarking. I've been meaning to contribute a qsort for doubly-linked lists I've got lying around as well.

--
Mathematics is the supreme nostalgia of our time.
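For readers who have not met point (c) before, here is a minimal sketch of three-way median pivot selection. It is not the glibc code; the function name and the restriction to int arrays are assumptions made for illustration only.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the index of the median of v[lo],
 * v[lo + (hi - lo) / 2] and v[hi]. Using that element as the pivot
 * keeps already-sorted or reverse-sorted input from hitting the
 * quadratic case that a fixed first/last pivot runs into.
 */
static size_t median_of_three(const int *v, size_t lo, size_t hi)
{
	size_t mid = lo + (hi - lo) / 2;

	if (v[lo] < v[mid]) {
		if (v[mid] < v[hi])
			return mid;		/* lo < mid < hi */
		return v[lo] < v[hi] ? hi : lo;	/* mid is the largest */
	}
	if (v[lo] < v[hi])
		return lo;			/* mid <= lo < hi */
	return v[mid] < v[hi] ? hi : mid;	/* lo is the largest */
}
```

A quicksort would swap the returned element into the pivot position before partitioning. This only makes the bad case harder to reach; it does not remove it.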
[PATCH 7/7] inifiniband: pass dev_t to class core
Signed-off-by: Kay Sievers <[EMAIL PROTECTED]> = drivers/infiniband/core/user_mad.c 1.2 vs edited = --- 1.2/drivers/infiniband/core/user_mad.c 2005-01-21 06:01:17 +01:00 +++ edited/drivers/infiniband/core/user_mad.c 2005-01-22 15:34:10 +01:00 @@ -518,15 +518,6 @@ static struct ib_client umad_client = { .remove = ib_umad_remove_one }; -static ssize_t show_dev(struct class_device *class_dev, char *buf) -{ - struct ib_umad_port *port = - container_of(class_dev, struct ib_umad_port, class_dev); - - return print_dev_t(buf, port->dev.dev); -} -static CLASS_DEVICE_ATTR(dev, S_IRUGO, show_dev, NULL); - static ssize_t show_ibdev(struct class_device *class_dev, char *buf) { struct ib_umad_port *port = @@ -625,16 +616,13 @@ static void ib_umad_add_one(struct ib_de umad_dev->port[i - s].devnum, 1)) goto err; - umad_dev->port[i - s].class_dev.class = _class; + umad_dev->port[i - s].class_dev.devt = umad_dev->port[i - s].dev.dev; umad_dev->port[i - s].class_dev.dev = device->dma_device; snprintf(umad_dev->port[i - s].class_dev.class_id, BUS_ID_SIZE, "umad%d", umad_dev->port[i - s].devnum); if (class_device_register(_dev->port[i - s].class_dev)) goto err_class; - if (class_device_create_file(_dev->port[i - s].class_dev, -_device_attr_dev)) - goto err_class; if (class_device_create_file(_dev->port[i - s].class_dev, _device_attr_ibdev)) goto err_class; - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
Jack O'Quin wrote:

[snip lots of valid points]

> suggest some things to try. First, make sure the JACK tmp directory is
> mounted on a tmpfs[1]. Then, try the test with ext2, instead of

Looks like the tmpfs is probably the biggest problem. Here's SCHED_ISO with just the /tmp mounted on tmpfs change, running on a complete desktop environment with a 2nd exported X session and my wife browsing the net and emailing at the same time.

* SUMMARY RESULT
Total seconds ran . . . . . . :   300
Number of clients . . . . . . :    14
Ports per client  . . . . . . :     4
Frames per buffer . . . . . . :    64
Number of runs  . . . . . . . :   (1)
*
Timeout Count . . . . . . . . :   (0)
XRUN Count  . . . . . . . . . :     0
Delay Count (>spare time) . . :     0
Delay Count (>1000 usecs) . . :     0
Delay Maximum . . . . . . . . :    72   usecs
Cycle Maximum . . . . . . . . :  1108   usecs
Average DSP Load. . . . . . . :    50.1 %
Average CPU System Load . . . :    10.7 %
Average CPU User Load . . . . :    18.3 %
Average CPU Nice Load . . . . :     0.0 %
Average CPU I/O Wait Load . . :     0.1 %
Average CPU IRQ Load  . . . . :     0.0 %
Average CPU Soft-IRQ Load . . :     0.0 %
Average Interrupt Rate  . . . :  1693.1 /sec
Average Context-Switch Rate . : 18852.7 /sec
*
Delta Maximum . . . . . . . . : 0.0
*
Warning: empty y2 range [0:0], adjusting to [0:1]

All invalid runs removed and just this one posted here:
http://ck.kolivas.org/patches/SCHED_ISO/iso2-benchmarks/

How's that look?

Cheers,
Con
[PATCH 4/7] usb: class driver pass dev_t to the class core
Signed-off-by: Kay Sievers <[EMAIL PROTECTED]> = drivers/usb/core/file.c 1.17 vs edited = --- 1.17/drivers/usb/core/file.c2005-01-15 01:01:44 +01:00 +++ edited/drivers/usb/core/file.c 2005-01-22 15:15:05 +01:00 @@ -107,13 +107,6 @@ void usb_major_cleanup(void) unregister_chrdev(USB_MAJOR, "usb"); } -static ssize_t show_dev(struct class_device *class_dev, char *buf) -{ - int minor = (int)(long)class_get_devdata(class_dev); - return print_dev_t(buf, MKDEV(USB_MAJOR, minor)); -} -static CLASS_DEVICE_ATTR(dev, S_IRUGO, show_dev, NULL); - /** * usb_register_dev - register a USB device, and ask for a minor number * @intf: pointer to the usb_interface that is being registered @@ -184,6 +177,7 @@ int usb_register_dev(struct usb_interfac class_dev = kmalloc(sizeof(*class_dev), GFP_KERNEL); if (class_dev) { memset(class_dev, 0x00, sizeof(struct class_device)); + class_dev->devt = MKDEV(USB_MAJOR, minor); class_dev->class = _class; class_dev->dev = >dev; @@ -195,7 +189,6 @@ int usb_register_dev(struct usb_interfac snprintf(class_dev->class_id, BUS_ID_SIZE, "%s", temp); class_set_devdata(class_dev, (void *)(long)intf->minor); class_device_register(class_dev); - class_device_create_file(class_dev, _device_attr_dev); intf->class_dev = class_dev; } exit: - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 6/7] videodev: pass dev_t to the class core
Signed-off-by: Kay Sievers <[EMAIL PROTECTED]> = drivers/media/video/videodev.c 1.35 vs edited = --- 1.35/drivers/media/video/videodev.c 2004-08-23 10:14:55 +02:00 +++ edited/drivers/media/video/videodev.c 2005-01-22 15:22:49 +01:00 @@ -46,15 +46,7 @@ static ssize_t show_name(struct class_de return sprintf(buf,"%.*s\n",(int)sizeof(vfd->name),vfd->name); } -static ssize_t show_dev(struct class_device *cd, char *buf) -{ - struct video_device *vfd = container_of(cd, struct video_device, class_dev); - dev_t dev = MKDEV(VIDEO_MAJOR, vfd->minor); - return print_dev_t(buf,dev); -} - static CLASS_DEVICE_ATTR(name, S_IRUGO, show_name, NULL); -static CLASS_DEVICE_ATTR(dev, S_IRUGO, show_dev, NULL); struct video_device *video_device_alloc(void) { @@ -347,12 +339,11 @@ int video_register_device(struct video_d if (vfd->dev) vfd->class_dev.dev = vfd->dev; vfd->class_dev.class = _class; + vfd->class_dev.devt = MKDEV(VIDEO_MAJOR, vfd->minor); strlcpy(vfd->class_dev.class_id, vfd->devfs_name + 4, BUS_ID_SIZE); class_device_register(>class_dev); class_device_create_file(>class_dev, _device_attr_name); - class_device_create_file(>class_dev, -_device_attr_dev); #if 1 /* needed until all drivers are fixed */ if (!vfd->release) - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 5/7] i2c: class driver pass dev_t to the class core
Signed-off-by: Kay Sievers <[EMAIL PROTECTED]> = drivers/i2c/i2c-dev.c 1.50 vs edited = --- 1.50/drivers/i2c/i2c-dev.c 2005-01-21 06:02:15 +01:00 +++ edited/drivers/i2c/i2c-dev.c2005-01-22 15:17:50 +01:00 @@ -108,13 +108,6 @@ static void return_i2c_dev(struct i2c_de spin_unlock(_dev_array_lock); } -static ssize_t show_dev(struct class_device *class_dev, char *buf) -{ - struct i2c_dev *i2c_dev = to_i2c_dev(class_dev); - return print_dev_t(buf, MKDEV(I2C_MAJOR, i2c_dev->minor)); -} -static CLASS_DEVICE_ATTR(dev, S_IRUGO, show_dev, NULL); - static ssize_t show_adapter_name(struct class_device *class_dev, char *buf) { struct i2c_dev *i2c_dev = to_i2c_dev(class_dev); @@ -451,11 +444,11 @@ static int i2cdev_attach_adapter(struct else i2c_dev->class_dev.dev = adap->dev.parent; i2c_dev->class_dev.class = _dev_class; + i2c_dev->class_dev.devt = MKDEV(I2C_MAJOR, i2c_dev->minor); snprintf(i2c_dev->class_dev.class_id, BUS_ID_SIZE, "i2c-%d", i2c_dev->minor); retval = class_device_register(_dev->class_dev); if (retval) goto error; - class_device_create_file(_dev->class_dev, _device_attr_dev); class_device_create_file(_dev->class_dev, _device_attr_name); return 0; error: - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 3/7] class_simple: pass dev_t to the class core
Signed-off-by: Kay Sievers <[EMAIL PROTECTED]> = drivers/base/class_simple.c 1.8 vs edited = --- 1.8/drivers/base/class_simple.c 2005-01-21 06:02:15 +01:00 +++ edited/drivers/base/class_simple.c 2005-01-22 16:20:16 +01:00 @@ -10,18 +10,15 @@ #include #include -#include #include struct class_simple { - struct class_device_attribute attr; struct class class; }; #define to_class_simple(d) container_of(d, struct class_simple, class) struct simple_dev { struct list_head node; - dev_t dev; struct class_device class_dev; }; #define to_simple_dev(d) container_of(d, struct simple_dev, class_dev) @@ -35,12 +32,6 @@ static void release_simple_dev(struct cl kfree(s_dev); } -static ssize_t show_dev(struct class_device *class_dev, char *buf) -{ - struct simple_dev *s_dev = to_simple_dev(class_dev); - return print_dev_t(buf, s_dev->dev); -} - static void class_simple_release(struct class *class) { struct class_simple *cs = to_class_simple(class); @@ -75,12 +66,6 @@ struct class_simple *class_simple_create cs->class.class_release = class_simple_release; cs->class.release = release_simple_dev; - cs->attr.attr.name = "dev"; - cs->attr.attr.mode = S_IRUGO; - cs->attr.attr.owner = owner; - cs->attr.show = show_dev; - cs->attr.store = NULL; - retval = class_register(>class); if (retval) goto error; @@ -143,7 +128,7 @@ struct class_device *class_simple_device } memset(s_dev, 0x00, sizeof(*s_dev)); - s_dev->dev = dev; + s_dev->class_dev.devt = dev; s_dev->class_dev.dev = device; s_dev->class_dev.class = >class; @@ -154,8 +139,6 @@ struct class_device *class_simple_device if (retval) goto error; - class_device_create_file(_dev->class_dev, >attr); - spin_lock(_dev_list_lock); list_add(_dev->node, _dev_list); spin_unlock(_dev_list_lock); @@ -200,7 +183,7 @@ void class_simple_device_remove(dev_t de spin_lock(_dev_list_lock); list_for_each_entry(s_dev, _dev_list, node) { - if (s_dev->dev == dev) { + if (s_dev->class_dev.devt == dev) { found = 1; break; } - To unsubscribe from this list: send 
the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 1/13] Qsort
On Sun, Jan 23, 2005 at 03:03:32AM +0100, Felipe Alfaro Solana wrote:
> On 22 Jan 2005, at 22:00, vlobanov wrote:
>
> >Hi,
> >
> >I was just reading over the patch, and had a quick question/comment upon
> >the SWAP macro defined below. I think it's possible to do a tiny bit
> >better (better, of course, being subjective), as follows:
> >
> >#define SWAP(a, b, size)                    \
> >do {                                        \
> >    register size_t __size = (size);        \
> >    register char * __a = (a), * __b = (b); \
> >    do {                                    \
> >        *__a ^= *__b;                       \
> >        *__b ^= *__a;                       \
> >        *__a ^= *__b;                       \
> >        __a++;                              \
> >        __b++;                              \
> >    } while ((--__size) > 0);               \
> >} while (0)
> >
> >What do you think? :)
>
> AFAIK, XOR is quite expensive on IA32 when compared to simple MOV
> operations. Also, since the original patch uses 3 MOVs to perform the
> swapping, and your version uses 3 XOR operations, I don't see any
> gains.
>
> Am I missing something?

No temporary variable needed in the xor version. mov and xor are roughly the same speed, but xor modifies flags.

--
Mathematics is the supreme nostalgia of our time.
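To make the comparison concrete, here is a small sketch of the two swap styles as plain functions. The names swap_tmp and swap_xor are made up for this illustration and are not taken from the patch. Two things differ from the quoted macro: the XOR variant guards against both pointers naming the same element, because x ^= x leaves zero behind, and the loops test the size before the first byte, whereas the macro's inner do/while would wrap around if it were ever handed a size of 0.

```c
#include <stddef.h>

/* Byte-wise swap through a temporary. */
static void swap_tmp(void *a, void *b, size_t size)
{
	unsigned char *pa = a, *pb = b;

	while (size--) {
		unsigned char t = *pa;

		*pa++ = *pb;
		*pb++ = t;
	}
}

/* Byte-wise XOR swap: no temporary, but unsafe when a and b alias. */
static void swap_xor(void *a, void *b, size_t size)
{
	unsigned char *pa = a, *pb = b;

	if (pa == pb)		/* swapping an element with itself is a no-op */
		return;

	while (size--) {
		*pa ^= *pb;
		*pb ^= *pa;
		*pa ^= *pb;
		pa++;
		pb++;
	}
}
```

A sort that can end up swapping an element with itself (for example when the pivot is already in place) needs that aliasing check or an equivalent guarantee from the caller.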
[PATCH 2/7] block core: export MAJOR/MINOR to the hotplug env
Signed-off-by: Kay Sievers <[EMAIL PROTECTED]> = drivers/block/genhd.c 1.109 vs edited = --- 1.109/drivers/block/genhd.c 2005-01-14 20:21:23 +01:00 +++ edited/drivers/block/genhd.c2005-01-23 02:59:27 +01:00 @@ -430,42 +430,57 @@ static int block_hotplug_filter(struct k static int block_hotplug(struct kset *kset, struct kobject *kobj, char **envp, int num_envp, char *buffer, int buffer_size) { - struct device *dev = NULL; struct kobj_type *ktype = get_ktype(kobj); + struct device *physdev; + struct gendisk *disk; + struct hd_struct *part; int length = 0; int i = 0; - /* get physical device backing disk or partition */ if (ktype == _block) { - struct gendisk *disk = container_of(kobj, struct gendisk, kobj); - dev = disk->driverfs_dev; + disk = container_of(kobj, struct gendisk, kobj); + add_hotplug_env_var(envp, num_envp, , buffer, buffer_size, + , "MINOR=%u", disk->first_minor); } else if (ktype == _part) { - struct gendisk *disk = container_of(kobj->parent, struct gendisk, kobj); - dev = disk->driverfs_dev; - } - - if (dev) { - /* add physical device, backing this device */ - char *path = kobject_get_path(>kobj, GFP_KERNEL); + disk = container_of(kobj->parent, struct gendisk, kobj); + part = container_of(kobj, struct hd_struct, kobj); + add_hotplug_env_var(envp, num_envp, , buffer, buffer_size, + , "MINOR=%u", + disk->first_minor + part->partno); + } else + return 0; + + add_hotplug_env_var(envp, num_envp, , buffer, buffer_size, , + "MAJOR=%u", disk->major); + + /* add physical device, backing this device */ + physdev = disk->driverfs_dev; + if (physdev) { + char *path = kobject_get_path(>kobj, GFP_KERNEL); add_hotplug_env_var(envp, num_envp, , buffer, buffer_size, , "PHYSDEVPATH=%s", path); kfree(path); - /* add bus name of physical device */ - if (dev->bus) + if (physdev->bus) add_hotplug_env_var(envp, num_envp, , buffer, buffer_size, , - "PHYSDEVBUS=%s", dev->bus->name); + "PHYSDEVBUS=%s", + physdev->bus->name); - /* add driver name of physical device */ - if 
(dev->driver) + if (physdev->driver) add_hotplug_env_var(envp, num_envp, , buffer, buffer_size, , - "PHYSDEVDRIVER=%s", dev->driver->name); - - envp[i] = NULL; + "PHYSDEVDRIVER=%s", + physdev->driver->name); } + + /* terminate, set to next free slot, shrink available space */ + envp[i] = NULL; + envp = [i]; + num_envp -= i; + buffer = [length]; + buffer_size -= length; return 0; } - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[PATCH 1/7] class core: export MAJOR/MINOR to the hotplug env
Move the creation of the sysfs "dev" file of a class device into the driver core. The struct class_device contains a dev_t value now. If set, the driver core will create the "dev" file containing the major/minor numbers automatically. Signed-off-by: Kay Sievers <[EMAIL PROTECTED]> = drivers/base/class.c 1.57 vs edited = --- 1.57/drivers/base/class.c 2004-12-22 01:29:34 +01:00 +++ edited/drivers/base/class.c 2005-01-23 01:32:18 +01:00 @@ -15,6 +15,7 @@ #include #include #include +#include #include "base.h" #define to_class_attr(_attr) container_of(_attr, struct class_attribute, attr) @@ -298,9 +299,9 @@ static int class_hotplug(struct kset *ks int num_envp, char *buffer, int buffer_size) { struct class_device *class_dev = to_class_dev(kobj); - int retval = 0; int i = 0; int length = 0; + int retval = 0; pr_debug("%s - name = %s\n", __FUNCTION__, class_dev->class_id); @@ -313,26 +314,34 @@ static int class_hotplug(struct kset *ks , "PHYSDEVPATH=%s", path); kfree(path); - /* add bus name of physical device */ if (dev->bus) add_hotplug_env_var(envp, num_envp, , buffer, buffer_size, , "PHYSDEVBUS=%s", dev->bus->name); - /* add driver name of physical device */ if (dev->driver) add_hotplug_env_var(envp, num_envp, , buffer, buffer_size, , "PHYSDEVDRIVER=%s", dev->driver->name); - - /* terminate, set to next free slot, shrink available space */ - envp[i] = NULL; - envp = [i]; - num_envp -= i; - buffer = [length]; - buffer_size -= length; } + if (MAJOR(class_dev->devt)) { + add_hotplug_env_var(envp, num_envp, , + buffer, buffer_size, , + "MAJOR=%u", MAJOR(class_dev->devt)); + + add_hotplug_env_var(envp, num_envp, , + buffer, buffer_size, , + "MINOR=%u", MINOR(class_dev->devt)); + } + + /* terminate, set to next free slot, shrink available space */ + envp[i] = NULL; + envp = [i]; + num_envp -= i; + buffer = [length]; + buffer_size -= length; + if (class_dev->class->hotplug) { /* have the bus specific function add its stuff */ retval = class_dev->class->hotplug (class_dev, 
envp, num_envp, @@ -388,6 +397,12 @@ static void class_device_remove_attrs(st } } +static ssize_t show_dev(struct class_device *class_dev, char *buf) +{ + return print_dev_t(buf, class_dev->devt); +} +static CLASS_DEVICE_ATTR(dev, S_IRUGO, show_dev, NULL); + void class_device_initialize(struct class_device *class_dev) { kobj_set_kset_s(class_dev, class_obj_subsys); @@ -432,6 +447,10 @@ int class_device_add(struct class_device class_intf->add(class_dev); up_write(>subsys.rwsem); } + + if (MAJOR(class_dev->devt)) + class_device_create_file(class_dev, _device_attr_dev); + class_device_add_attrs(class_dev); class_device_dev_link(class_dev); class_device_driver_link(class_dev); = include/linux/device.h 1.135 vs edited = --- 1.135/include/linux/device.h2005-01-11 02:29:25 +01:00 +++ edited/include/linux/device.h 2005-01-22 14:56:15 +01:00 @@ -184,6 +184,7 @@ struct class_device { struct kobject kobj; struct class* class;/* required */ + dev_t devt; /* dev_t, creates the sysfs "dev" */ struct device * dev; /* not necessary, but nice to have */ void* class_data; /* class-specific data */ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
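As a userspace-side illustration of what the "dev" attribute carries, here is a hedged sketch of parsing its contents. It assumes the text has the "major:minor" form followed by a newline, as produced by print_dev_t(); parse_sysfs_dev is a hypothetical helper name, not an existing API.

```c
#include <stdio.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Hypothetical helper: turn the text of a sysfs "dev" file, assumed to
 * look like "180:3\n", into a dev_t.
 * Returns 0 on success, -1 if the text does not match.
 */
static int parse_sysfs_dev(const char *text, dev_t *out)
{
	unsigned int maj, min;

	if (sscanf(text, "%u:%u", &maj, &min) != 2)
		return -1;

	*out = makedev(maj, min);
	return 0;
}
```

A caller would read the file into a small buffer first and then pass the buffer to this function before creating the device node.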
[PATCH 0/7] driver core: export MAJOR/MINOR to the hotplug env
This patch sequence moves the creation of the sysfs "dev" file of the class devices into the driver core. The struct class_device contains a dev_t value now. If set, the driver core will create the "dev" file containing the major/minor numbers automatically.

The MAJOR/MINOR values are also exported to the hotplug environment. This makes it easy for userspace, especially udev, to know if it should wait for a "dev" file to create a device node or if it can just ignore the event. We currently carry a compiled-in blacklist around for that reason.

It would also be possible to run some "tiny udev" while sysfs is not available, just by reading the hotplug call or the netlink-uevent.

Thanks,
Kay
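A rough sketch of how a hotplug helper might use the exported values follows. The variable names MAJOR and MINOR come from the patches in this series; the function name and the decision to treat a missing variable as "no device node needed" are assumptions for illustration only.

```c
#include <stdlib.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Hypothetical helper for a hotplug handler: returns 1 and fills *out
 * when the event carries MAJOR and MINOR, or 0 when it does not, in
 * which case there is no "dev" file to wait for.
 */
static int devt_from_hotplug_env(dev_t *out)
{
	const char *maj = getenv("MAJOR");
	const char *min = getenv("MINOR");

	if (!maj || !min)
		return 0;

	*out = makedev((unsigned int)strtoul(maj, NULL, 10),
		       (unsigned int)strtoul(min, NULL, 10));
	return 1;
}
```

With this, the handler can decide from the environment alone whether to create a node, without opening anything under sysfs.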
Re: [PATCH 1/12] random pt4: Create new rol32/ror32 bitops
On Sun, Jan 23, 2005 at 04:19:21AM +0100, Andi Kleen wrote:
> On Sat, Jan 22, 2005 at 09:10:40PM -0500, Chuck Ebbert wrote:
> > On Fri, 21 Jan 2005 at 15:41:06 -0600 Matt Mackall wrote:
> >
> > > Add rol32 and ror32 bitops to bitops.h
> >
> > Can you test this patch on top of yours? I did it on 2.6.10-ac10 but it
> > should apply OK. Compile tested and booted, but only random.c is using it
> > in my kernel.
>
> Does random really use variable rotates? For constant rotates
> gcc detects the usual C idiom and turns it transparently into
> the right machine instruction.

Nope, random doesn't. The only thing I converted in my sweep that did were CAST5 and CAST6, which are fairly unique in doing key-based rotations.

On the other hand:

	typedef unsigned int __u32;

	static inline __u32 rol32(__u32 word, int shift)
	{
		return (word << shift) | (word >> (32 - shift));
	}

	int foo(int val, int rot)
	{
		return rol32(val, rot);
	}

With 2.95:

   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   8b 4d 0c                mov    0xc(%ebp),%ecx
   6:   8b 45 08                mov    0x8(%ebp),%eax
   9:   d3 c0                   rol    %cl,%eax
   b:   c9                      leave
   c:   c3                      ret

With 3.3.5:

   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   8b 45 08                mov    0x8(%ebp),%eax
   6:   8b 4d 0c                mov    0xc(%ebp),%ecx
   9:   5d                      pop    %ebp
   a:   d3 c0                   rol    %cl,%eax
   c:   c3                      ret

With gcc-snapshot:

   0:   55                      push   %ebp
   1:   89 e5                   mov    %esp,%ebp
   3:   8b 45 08                mov    0x8(%ebp),%eax
   6:   8b 4d 0c                mov    0xc(%ebp),%ecx
   9:   d3 c0                   rol    %cl,%eax
   b:   5d                      pop    %ebp
   c:   c3                      ret

So I think tweaks for x86 at least are unnecessary.

--
Mathematics is the supreme nostalgia of our time.
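One caveat with the rotate idiom when the count is variable: in C, shifting a 32-bit value by 32 is undefined, so rol32(word, 0) as written evaluates word >> 32. Below is a hedged sketch of a variant that masks both shift counts. The _safe names are invented here and are not part of the patch, and whether a given compiler still reduces this form to a single rotate instruction is something to check in the generated code rather than assume.

```c
#include <stdint.h>

/* Rotate left; the count is reduced modulo 32 so that 0 and 32 are safe. */
static inline uint32_t rol32_safe(uint32_t word, unsigned int shift)
{
	shift &= 31;
	return (word << shift) | (word >> ((32 - shift) & 31));
}

/* Rotate right, with the same masking. */
static inline uint32_t ror32_safe(uint32_t word, unsigned int shift)
{
	shift &= 31;
	return (word >> shift) | (word << ((32 - shift) & 31));
}
```

For callers with key-dependent rotation counts, such as the CAST5 and CAST6 cases mentioned above, the masking means a count of zero needs no special handling at the call site.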
Re: [Bug 4081] New: OpenOffice crashes while starting due to a threading error
On Sat, 22 Jan 2005 18:18:55 +0100, Alessandro Suardi <[EMAIL PROTECTED]> wrote:
> On Sat, 22 Jan 2005 08:56:25 -0800, Martin J. Bligh <[EMAIL PROTECTED]> wrote:
> > Please contact bug submitter for more info, not myself.
> >
> > -
> >
> > http://bugme.osdl.org/show_bug.cgi?id=4081

[snip]

> Doesn't happen here:
>
> [EMAIL PROTECTED] asuardi]$ grep openoffice /var/log/rpmpkgs
> openoffice.org-1.1.2-11.4.fc2.i386.rpm
> openoffice.org-i18n-1.1.2-11.4.fc2.i386.rpm
> openoffice.org-libs-1.1.2-11.4.fc2.i386.rpm
> [EMAIL PROTECTED] asuardi]$ cat /proc/version
> Linux version 2.6.11-rc1-bk9 ([EMAIL PROTECTED]) (gcc version 3.4.3) #1
> Fri Jan 21 15:46:16 CET 2005
>
> Will try -rc2 later...

The above OO RPMs are also okay with -rc2 under FC2.

--alessandro

"And every dream, every, is just a dream after all"
(Heather Nova, "Paper Cup")
Re: Supermount / ivman
Gustavo Guillermo Perez wrote:
> Because I play with old toys (floppies), and ivman doesn't work properly
> with floppies on the latest Gentoo, I spent a while reworking the
> supermount patch from SourceForge for kernel 2.6.11-rc1.
>
> I'm a n00b with the kernel; I do this only for general purposes, helping
> some friends. I know supermount should not be used and is not maintained.
> I've tested it only with IDE/ATAPI CD/DVD drives and floppies.
>
> Because supermount seems to be a filesystem, I replaced vfs_permission
> with generic_permission instead of permission, as I read on the lkml.
> There are other changes too, in the SCSI section (I don't have SCSI
> hardware).
>
> In case it helps someone else:
> http://www.compunauta.com/forums/linux/instalarlinux/supermount_en.html

I've been silently maintaining it offlist. No real development, but keeping it in sync and fixing obvious bugs that show up that I can fix.

Here's a patch for 2.6.10-ck5 (should apply fairly cleanly to 2.6.10):
http://ck.kolivas.org/patches/2.6/2.6.10/2.6.10-ck5/patches/supermount-ng208-10ck5.diff

and for 2.6.11-rc1:
http://ck.kolivas.org/patches/2.6/2.6.11-rc1/patches/supermount-ng208-2611rc1.diff

Cheers,
Con
Re: [PATCH] e100 locking up netconsole.
On Sat, 2005-01-22 at 20:52 -0500, Steven Rostedt wrote:
> I'm currently working with Ingo's RT patched kernel, but I believe this
> affects the mainline too.
>
> If the transmit buffer of the e100 overflowed, then the system would
> hang. This was caused because the e100 driver would stop the queue, and
> find_skb in netpoll.c would then loop forever. This is because the e100
  ^
Should be netpoll_send_pkt. That's what I get when I search for "repeat" to remember which function I saw the problem in.

> net_poll would never start the queue again after the transmits have
> completed.
>
> For those that use the e100 and netconsole, all you need to do is a
> sysreq 't' to lock up the system.
>
> Here's the patch: (from Ingo's linux-2.6.11-rc2-V0.7.36-02, but should
> be OK with 2.6.11-rc2)
>
>
> Index: drivers/net/e100.c
> ===
> --- drivers/net/e100.c	(revision 60)
> +++ drivers/net/e100.c	(working copy)
> @@ -1630,6 +1630,7 @@
>  	struct nic *nic = netdev_priv(netdev);
>  	e100_disable_irq(nic);
>  	e100_intr(nic->pdev->irq, netdev, NULL);
> +	e100_tx_clean(nic);
>  	e100_enable_irq(nic);
>  }
>  #endif
>
>
> -- Steve

--
Steven Rostedt
Senior Engineer
Kihon Technologies
Supermount / ivman
Because I play with old toys (floppies), and ivman doesn't work properly with floppies on the latest Gentoo, I spent a while reworking the supermount patch from SourceForge for kernel 2.6.11-rc1.

I'm a n00b with the kernel; I do this only for general purposes, helping some friends. I know supermount should not be used and is not maintained. I've tested it only with IDE/ATAPI CD/DVD drives and floppies.

Because supermount seems to be a filesystem, I replaced vfs_permission with generic_permission instead of permission, as I read on the lkml. There are other changes too, in the SCSI section (I don't have SCSI hardware).

In case it helps someone else:
http://www.compunauta.com/forums/linux/instalarlinux/supermount_en.html

--
Gustavo Guillermo Pérez
Compunauta uLinux
www.userver.tk
Trying to fix radeonfb suspending on IBM Thinkpad T41
Dear community!

The aim of this post is to discuss the radeonfb driver power management issues. Enabling this feature dramatically reduces power consumption during ACPI suspend to RAM. I would appreciate comments from people who are more familiar with Radeon HW programming.

Long and boring background
--------------------------

I have a beast called IBM ThinkPad T41, model no. 2373-2FG.

Since 2.6.9, ACPI suspend to RAM (S3) has been working nicely, but the power consumption during sleep is unacceptably high: the battery will run dry in about 5-6 hours in sleep mode. The root cause of this is the fact that not all hardware is properly turned off. Namely, at least the ethernet adapter, USB and, most notably, the graphics accelerator chip (Radeon Mobility M7 LW) remain powered on.

Unfortunately, the current radeonfb driver is hardcoded to do power management only on the PPC platform. Volker Braun has some kernels on his page (http://www.sas.upenn.edu/~vbraun/computing/T41/kernel.html) that incorporate his patch to enable PM on other platforms as well. He also owns a T41, but it's a slightly different model and those kernels do not resume properly on my laptop: the machine crashes hard on resume.

After a few hours of hacking around in the radeonfb sources I think I found the problem and fixed it for my hardware. The results are promising: the power consumption during sleep is about 4-5 times lower. 15min of ACPI sleep consumes 1100mW/h without radeonfb and only 230mW/h with the properly patched radeonfb driver loaded.

A question to anyone familiar with radeon hardware programming
--------------------------------------------------------------

The radeonfb driver has a power management implementation that is used on the PPC platform (Macintosh laptops ;). The same implementation seems to work fine on some Thinkpads, but crashes on others.

In drivers/video/aty/radeon_pm.c there is a function called radeon_pm_setup_for_suspend. This function has the following section that manages to crash at least my laptop:

	/* AGP PLL control */
	OUTREG(BUS_CNTL1, INREG(BUS_CNTL1) | BUS_CNTL1__AGPCLK_VALID);

	OUTREG(BUS_CNTL1,
	       (INREG(BUS_CNTL1) & ~BUS_CNTL1__MOBILE_PLATFORM_SEL_MASK)
	       | (2
Re: [PATCH 1/12] random pt4: Create new rol32/ror32 bitops
On Sat, Jan 22, 2005 at 09:10:40PM -0500, Chuck Ebbert wrote:
> On Fri, 21 Jan 2005 at 15:41:06 -0600 Matt Mackall wrote:
>
> > Add rol32 and ror32 bitops to bitops.h
>
> Can you test this patch on top of yours? I did it on 2.6.10-ac10 but it
> should apply OK. Compile tested and booted, but only random.c is using it
> in my kernel.

Does random really use variable rotates? For constant rotates gcc detects the usual C idiom and turns it transparently into the right machine instruction.

If that's the case I would use it because it'll avoid some messy code everywhere.

-Andi
Re: [patch 1/13] Qsort
On Sun, 23 Jan 2005, Andi Kleen wrote:
> Felipe Alfaro Solana <[EMAIL PROTECTED]> writes:
> >
> > AFAIK, XOR is quite expensive on IA32 when compared to simple MOV
> > operations. Also, since the original patch uses 3 MOVs to perform the
> > swapping, and your version uses 3 XOR operations, I don't see any
> > gains.
>
> Both are one cycle latency for register<->register on all x86 cores
> I've looked at. What makes you think differently?
>
> -Andi (who thinks the glibc qsort is vast overkill for kernel purposes
> where there are only small data sets and it would be better to use a
> simpler one optimized for code size)
>

How about a shell sort? If the data is mostly sorted, shell sort beats qsort lots of times, and since the data sets are often small in-kernel, shell sort's O(n^2) behaviour won't harm it too much. Shell sort is also faster if the data is already completely sorted.

Shell sort is certainly not the simplest algorithm around, but I think (without having done any tests) that it would probably do pretty well for in-kernel use... Then again, I've been known to be wrong :)

--
Jesper Juhl
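For reference, a minimal shell sort sketch follows. It works on int arrays and uses the simple halving gap sequence, which is the variant with the O(n^2) worst case mentioned above; an in-kernel version would presumably take an element size and a comparison callback instead, and nothing here is taken from an actual patch.

```c
#include <stddef.h>

/*
 * Shell sort with gaps n/2, n/4, ..., 1. Each pass is an insertion sort
 * over elements that are `gap` apart; the final pass (gap == 1) is a
 * plain insertion sort over data that is by then nearly ordered.
 */
static void shell_sort(int *v, size_t n)
{
	size_t gap, i, j;

	for (gap = n / 2; gap > 0; gap /= 2) {
		for (i = gap; i < n; i++) {
			int tmp = v[i];

			/* shift larger elements of this gap-chain up */
			for (j = i; j >= gap && v[j - gap] > tmp; j -= gap)
				v[j] = v[j - gap];
			v[j] = tmp;
		}
	}
}
```

On already sorted input the inner loop never moves anything, which is the property the message relies on. The code is also non-recursive and needs no extra storage beyond one temporary.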
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
Con Kolivas <[EMAIL PROTECTED]> writes:

> Jack O'Quin wrote:
>> Neither run exhibits reliable audio performance. There is some low
>> latency performance problem with your system. Maybe ReiserFS is
>> causing trouble even with logging turned off. Perhaps the problem is
>> somewhere else. Maybe some device is misbehaving.
>>
>> Until you solve this problem, beware of drawing conclusions.

> Sigh.. I guess you want me to do all the benchmarking.

Not at all. I am willing to continue running audio benchmarks for you and Ingo both. I have been spending significant amounts of time doing that. I can't work on it full-time, but will continue doing tests as requested.

I just assumed you wanted to be able to produce similar results on your own system. I would, if I were in your place.

I think you misunderstood my comment. I was pointing out that your system currently has too low a signal/noise ratio to draw conclusions about scheduling and latency. How can we tell whether the scheduler is working or not when there are extremely long XRUNS (~20msec) even running SCHED_FIFO? Clearly, something is broken. We need to figure out what the latency problem is with your system before putting too much faith in those results.

> Well it's easy enough to get good results. I'll simply turn off all
> services and not run a desktop. This is all on ext3 on a fully laden
> desktop by the way, but if you want to get the results you're
> looking for I can easily drop down to a console and get perfect
> results.

That would prove absolutely nothing.

The whole purpose in requesting SCHED_FIFO (or an approximation, like SCHED_ISO) is for audio to work reliably in a loaded system. You were running your tests in the right environment. Your results showed it wasn't working, but did not necessarily indicate a scheduler problem.

My tests were all done with GNOME, metacity, xemacs, and galeon running. I still use ext2 because back when I tuned my system for audio, ext3 had very poor low latency behavior. Maybe that has changed. Or, maybe it hasn't and that is the cause of the latency spikes you're seeing.

I can't figure that out remotely, but I can suggest some things to try. First, make sure the JACK tmp directory is mounted on a tmpfs[1]. Then, try the test with ext2, instead of ext3.

[1] http://www.affenbande.org/~tapas/wiki/index.php?Jackd%20and%20tmpfs%20%28or%20shmfs%29

Tuning Linux PCs for low-latency audio is currently an art, not a science. But, a considerable body of experience has grown up over the past few years[2]. It can be done. (If nothing else, you may develop more sympathy for all the crap Linux audio developers have been putting up with for so long.)

[2] http://affenbande.org/~tapas/wiki/index.php?Low%20latency%20for%20audio%20work%20on%20linux%202.6.x

I recommend testing these schedulers on the best available low latency kernel for the clearest signal/noise ratio before drawing any final conclusions. Right now, that seems to be Ingo's RP version.

The original request that started this whole exercise was for 2.6.10 numbers, which morphed into 2.6.11-rc1. Working on the mainline of development makes sense. And, the mainline is getting a lot better. But, 2.6.10 is still far from clean as a vehicle for soft-RT. Your tests prove that.

--
joq
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
* Nick Piggin ([EMAIL PROTECTED]) wrote:
> Jack O'Quin wrote:
> >
> > Chris Wright and Arjan van de Ven have outlined a proposal to address
> > the privilege issue using rlimits. This is still the only workable
> > alternative to the realtime LSM on the table. If the decision were up
> > to me, I would choose the simplicity and better security of the LSM.
> > But their approach is adequate, if implemented in a timely fashion. I
> > would like to see some progress on this in addition to the scheduler
> > work. People still need SCHED_FIFO for some applications.
> >
>
> I think this is a pretty sane and minimally intrusive (for the kernel)
> way to support what you want.

Here's an untested respin against current bk.

thanks,
-chris

= include/asm-generic/resource.h 1.1 vs edited =
--- 1.1/include/asm-generic/resource.h 2005-01-20 21:00:51 -08:00
+++ edited/include/asm-generic/resource.h 2005-01-22 18:54:58 -08:00
@@ -20,8 +20,11 @@
 #define RLIMIT_LOCKS 10 /* maximum file locks held */
 #define RLIMIT_SIGPENDING 11 /* max number of pending signals */
 #define RLIMIT_MSGQUEUE 12 /* maximum bytes in POSIX mqueues */
-
-#define RLIM_NLIMITS 13
+#define RLIMIT_NICE 13 /* max nice prio allowed to raise to
+ 0-39 for nice level 19 .. -20 */
+#define RLIMIT_RTPRIO 14 /* maximum realtime priority */
+
+#define RLIM_NLIMITS 15
 #endif
 /*
@@ -53,6 +56,8 @@
 	[RLIMIT_LOCKS] = { RLIM_INFINITY, RLIM_INFINITY }, \
 	[RLIMIT_SIGPENDING] = { MAX_SIGPENDING, MAX_SIGPENDING }, \
 	[RLIMIT_MSGQUEUE] = { MQ_BYTES_MAX, MQ_BYTES_MAX }, \
+	[RLIMIT_NICE] = { 0, 0 }, \
+	[RLIMIT_RTPRIO] = { 0, 0 }, \
 }
 #endif /* __KERNEL__ */
= include/asm-alpha/resource.h 1.6 vs edited =
--- 1.6/include/asm-alpha/resource.h 2005-01-20 21:00:50 -08:00
+++ edited/include/asm-alpha/resource.h 2005-01-22 18:58:04 -08:00
@@ -18,8 +18,11 @@
 #define RLIMIT_LOCKS 10 /* maximum file locks held */
 #define RLIMIT_SIGPENDING 11 /* max number of pending signals */
 #define RLIMIT_MSGQUEUE 12 /* maximum bytes in POSIX mqueues */
-
-#define RLIM_NLIMITS 13
+#define RLIMIT_NICE 13 /* max nice prio allowed to raise to
+ 0-39 for nice level 19 .. -20 */
+#define RLIMIT_RTPRIO 14 /* maximum realtime priority */
+
+#define RLIM_NLIMITS 15
 #define __ARCH_RLIMIT_ORDER
 /*
= include/asm-mips/resource.h 1.8 vs edited =
--- 1.8/include/asm-mips/resource.h 2005-01-20 21:00:50 -08:00
+++ edited/include/asm-mips/resource.h 2005-01-22 18:59:29 -08:00
@@ -25,8 +25,11 @@
 #define RLIMIT_LOCKS 10 /* maximum file locks held */
 #define RLIMIT_SIGPENDING 11 /* max number of pending signals */
 #define RLIMIT_MSGQUEUE 12 /* maximum bytes in POSIX mqueues */
+#define RLIMIT_NICE 13 /* max nice prio allowed to raise to
+ 0-39 for nice level 19 .. -20 */
+#define RLIMIT_RTPRIO 14 /* maximum realtime priority */
-#define RLIM_NLIMITS 13 /* Number of limit flavors. */
+#define RLIM_NLIMITS 15 /* Number of limit flavors. */
 #define __ARCH_RLIMIT_ORDER
 /*
= include/asm-sparc/resource.h 1.6 vs edited =
--- 1.6/include/asm-sparc/resource.h 2005-01-20 21:00:50 -08:00
+++ edited/include/asm-sparc/resource.h 2005-01-22 19:00:07 -08:00
@@ -24,8 +24,11 @@
 #define RLIMIT_LOCKS 10 /* maximum file locks held */
 #define RLIMIT_SIGPENDING 11 /* max number of pending signals */
 #define RLIMIT_MSGQUEUE 12 /* maximum bytes in POSIX mqueues */
+#define RLIMIT_NICE 13 /* max nice prio allowed to raise to
+ 0-39 for nice level 19 .. -20 */
+#define RLIMIT_RTPRIO 14 /* maximum realtime priority */
-#define RLIM_NLIMITS 13
+#define RLIM_NLIMITS 15
 #define __ARCH_RLIMIT_ORDER
 /*
= include/asm-sparc64/resource.h 1.6 vs edited =
--- 1.6/include/asm-sparc64/resource.h 2005-01-20 21:00:50 -08:00
+++ edited/include/asm-sparc64/resource.h 2005-01-22 19:00:41 -08:00
@@ -24,8 +24,11 @@
 #define RLIMIT_LOCKS 10 /* maximum file locks held */
 #define RLIMIT_SIGPENDING 11 /* max number of pending signals */
 #define RLIMIT_MSGQUEUE 12 /* maximum bytes in POSIX mqueues */
+#define RLIMIT_NICE 13 /* max nice prio allowed to raise to
+ 0-39 for nice level 19 .. -20 */
+#define RLIMIT_RTPRIO 14 /* maximum realtime priority */
-#define RLIM_NLIMITS 13
+#define
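The comment on RLIMIT_NICE in the patch says the limit runs from 0 to 39 for nice levels 19 down to -20. The archived patch is cut off before the code that consumes the limit, so the following is only a sketch of what that comment implies, assuming a plain linear mapping. Both helper names are hypothetical and do not come from the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: map a nice level (19 .. -20) onto the 0 .. 39
 * scale described in the RLIMIT_NICE comment. The linear mapping is an
 * assumption; the posted patch text ends before the real check.
 */
static int nice_to_rlimit_scale(int nice)
{
    return 19 - nice;
}

/*
 * Hypothetical check: would a soft limit of rlim_cur allow a task to
 * raise its priority to the given nice level?
 */
static bool nice_allowed(int nice, unsigned long rlim_cur)
{
    if (nice < -20 || nice > 19)
        return false;
    return (unsigned long)nice_to_rlimit_scale(nice) <= rlim_cur;
}
```

Under this assumed mapping, the { 0, 0 } default in the patch would allow nothing above nice 19, and a limit of 24 would allow raising to nice -5 but not to nice -6.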
2.6 more picky about IDE drives than 2.4 ?
Hi,

I have many problems with kernel 2.6.10: it won't run stably with an IDE device. It's an internal IDE RAID subsystem. DMA is frequently disabled, and even reads/writes fail and the kernel reports I/O errors for many sectors. The RAID device doesn't report any errors in its own event log. You can have a closer look at the error messages below.

I'm mailing the LKML since I haven't been able to reproduce the problem with a kernel 2.4 based system, but it randomly happens with 2.6 kernels. Let's take the latest Knoppix as an example (it comes with both kernels):

- If I boot kernel 2.4, I can stress test the hard disk as much as I want. The kernel doesn't report any problem and it doesn't disable DMA either.
- If I boot kernel 2.6, after a while the error messages below appear in the log. "hdparm -k1" doesn't help; the kernel still disables DMA mode.

There was also a bigger problem, twice now, where the kernel refused to write to the device due to the I/O errors below. Unfortunately I don't have the log lines prior to the I/O errors.

I tested the RAID subsystem with two different PC systems. Always the same result: 2.4 works, 2.6 does not.

It's hard for me to reproduce the errors, though. I'm still writing an application to reliably reproduce them :-( Does anybody know a good stress test perhaps? Sequential reading doesn't seem to do the trick.

What changes have been applied to the IDE subsystem from kernel 2.4 to kernel 2.6? What may cause this different behaviour? What does "status=0x51" mean? And why is "error=0x00" although the error bit in the status byte has been set? (I guess this is what status=0x51 means.) How can the behaviour of kernel 2.6 be reverted to the behaviour of kernel 2.4? I already tried "hda=nowerr" in the append line, but it doesn't help either.

Is it a bug in kernel 2.6, or should I smash the manufacturer's doors to make them release a firmware update for the RAID subsystem, since it reports strange values to the OS?
The first kind of errors:

hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x00 { }
ide: failed opcode was: unknown
hda: recal_intr: status=0x51 { DriveReady SeekComplete Error }
hda: recal_intr: error=0x00 { }
ide: failed opcode was: unknown
hda: dma_intr: status=0x51 { DriveReady SeekComplete Error }
hda: dma_intr: error=0x00 { }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success

"dmesg" output after the bigger problems:

end_request: I/O error, dev hdc, sector 709679458
ReiserFS: hdc3: warning: vs-13070: reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1018017 1018816 0x0 SD]
ReiserFS: hdc3: warning: clm-6006: writing inode 283 on readonly FS
end_request: I/O error, dev hdc, sector 705275426
ReiserFS: hdc3: warning: vs-13070: reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1018017 1018708 0x0 SD]
ReiserFS: hdc3: warning: clm-6006: writing inode 283 on readonly FS
end_request: I/O error, dev hdc, sector 709687130
ReiserFS: hdc3: warning: vs-13070: reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1018017 1018817 0x0 SD]
ReiserFS: hdc3: warning: clm-6006: writing inode 283 on readonly FS
end_request: I/O error, dev hdc, sector 709695114
ReiserFS: hdc3: warning: vs-13070: reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [1018017 1018818 0x0 SD]
ReiserFS: hdc3: warning: clm-6006: writing inode 283 on readonly FS
end_request: I/O error, dev hdc, sector 709107250
...
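On the question of what status=0x51 means: the names inside the braces in the log are the decoded bits of the ATA status register. As a minimal sketch, assuming the standard ATA status bit layout, the byte can be decoded as below. The enum and function names are made up for illustration, and the strings for bits that do not appear in the log above are guesses at the kernel's wording. The sketch says nothing about why the error register reads 0x00.

```c
#include <stddef.h>
#include <string.h>

/* ATA status register bits (standard layout; identifiers are illustrative). */
enum {
    ATA_ST_ERR  = 0x01, /* error, details in the error register */
    ATA_ST_IDX  = 0x02, /* index mark */
    ATA_ST_CORR = 0x04, /* corrected data */
    ATA_ST_DRQ  = 0x08, /* data request */
    ATA_ST_DSC  = 0x10, /* seek complete */
    ATA_ST_DF   = 0x20, /* device fault */
    ATA_ST_DRDY = 0x40, /* drive ready */
    ATA_ST_BSY  = 0x80  /* busy */
};

/* Write the names of the set bits into buf, highest bit first. len >= 1. */
static void decode_ata_status(unsigned char status, char *buf, size_t len)
{
    static const struct { unsigned char bit; const char *name; } bits[] = {
        { ATA_ST_BSY,  "Busy" },
        { ATA_ST_DRDY, "DriveReady" },
        { ATA_ST_DF,   "DeviceFault" },
        { ATA_ST_DSC,  "SeekComplete" },
        { ATA_ST_DRQ,  "DataRequest" },
        { ATA_ST_CORR, "CorrectedError" },
        { ATA_ST_IDX,  "Index" },
        { ATA_ST_ERR,  "Error" },
    };
    size_t i;

    buf[0] = '\0';
    for (i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        if (!(status & bits[i].bit))
            continue;
        if (buf[0] != '\0')
            strncat(buf, " ", len - strlen(buf) - 1);
        strncat(buf, bits[i].name, len - strlen(buf) - 1);
    }
}
```

For 0x51 (binary 0101 0001) the set bits are 0x40, 0x10 and 0x01, which matches the "DriveReady SeekComplete Error" shown in the log.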
Re: [Bug 4081] New: OpenOffice crashes while starting due to a threading error
Martin J. Bligh wrote:
> Please contact bug submitter for more info, not myself.
>
> http://bugme.osdl.org/show_bug.cgi?id=4081
>
> Summary: OpenOffice crashes while starting due to a threading error
> Kernel Version: 2.6.11-rc2
> Status: NEW
> Severity: blocking
> Owner: [EMAIL PROTECTED]
> Submitter: [EMAIL PROTECTED]
>
> Distribution: Debian
> Hardware Environment: Pentium III 733 MHz
> Software Environment: Debian Sid
> Problem Description: OpenOffice crashes while starting. It did not happen on 2.6.10, but happens on 2.6.11-rc1 and -rc2. The only thing that has changed is the kernel. If I go back to 2.6.10, OpenOffice starts just fine.

OO works for me on 2.6.11-rc2, but my OO is 1.1.1. The bugzilla mentions 1.1.2yyy and 1.1.3zzz, so I'd see if you (diego) can try some 1.1.3 OO.

--
~Randy
Re: [PATCH 1/12] random pt4: Create new rol32/ror32 bitops
On Sat, Jan 22, 2005 at 09:10:40PM -0500, Chuck Ebbert wrote:
> On Fri, 21 Jan 2005 at 15:41:06 -0600 Matt Mackall wrote:
>
> > Add rol32 and ror32 bitops to bitops.h
>
> Can you test this patch on top of yours? I did it on 2.6.10-ac10 but it
> should apply OK. Compile tested and booted, but only random.c is using it
> in my kernel.

If I recall correctly from my testing of this about a year ago, compilers were already generating the appropriate rol/ror instructions. Have you checked the disassembly and found it wanting?

--
Mathematics is the supreme nostalgia of our time.
Re: [patch 1/13] Qsort
Felipe Alfaro Solana <[EMAIL PROTECTED]> writes:
>
> AFAIK, XOR is quite expensive on IA32 when compared to simple MOV
> operations. Also, since the original patch uses 3 MOVs to perform the
> swapping, and your version uses 3 XOR operations, I don't see any
> gains.

Both are one cycle latency for register<->register on all x86 cores I've looked at. What makes you think differently?

-Andi (who thinks the glibc qsort is vast overkill for kernel purposes where there are only small data sets and it would be better to use a simpler one optimized for code size)
Re: can't compile 2.6.11-rc2 on sparc64
Grzegorz Piotr Jaskiewicz wrote:
> On Sunday 23 January 2005 02:36, Randy.Dunlap wrote:
>> Please look for another error. Run 'make' again.
>> Those are all just warnings and don't cause a build error.
>
> all output past make:
>
> [EMAIL PROTECTED] linux-2.6.11-rc2]# make V=1
> if test ! /usr/src/linux-2.6.11-rc2 -ef /usr/src/linux-2.6.11-rc2; then \
> /bin/bash /usr/src/linux-2.6.11-rc2/scripts/mkmakefile \
> /usr/src/linux-2.6.11-rc2 /usr/src/linux-2.6.11-rc2 2 6 \
> > /usr/src/linux-2.6.11-rc2/Makefile; \
> echo ' GEN /usr/src/linux-2.6.11-rc2/Makefile'; \
> fi
> CHK include/linux/version.h
> rm -rf .tmp_versions
> mkdir -p .tmp_versions
> make -f scripts/Makefile.build obj=scripts/basic
> make -f scripts/Makefile.build obj=scripts
> make -f scripts/Makefile.build obj=scripts/genksyms
> make -f scripts/Makefile.build obj=scripts/mod
> make -f scripts/Makefile.build obj=init
> CHK include/linux/compile.h
> make -f scripts/Makefile.build obj=usr
> set -e; echo ' CHK usr/initramfs_list'; mkdir -p usr/; /bin/bash /usr/src/linux-2.6.11-rc2/scripts/gen_initramfs_list.sh> usr/initramfs_list.tmp; if [ -r usr/initramfs_list ] && cmp -s usr/initramfs_list usr/initramfs_list.tmp; then rm -f usr/initramfs_list.tmp; else echo ' UPD usr/initramfs_list'; mv -f usr/initramfs_list.tmp usr/initramfs_list; fi
> CHK usr/initramfs_list
> make -f scripts/Makefile.build obj=arch/sparc64/kernel
> /usr/bin/sparc64-pld-linux-gcc -Wp,-MD,arch/sparc64/kernel/.ioctl32.o.d -nostdinc -isystem /usr/lib/gcc/sparc64-pld-linux/3.4.2/include -D__KERNEL__ -Iinclude -Wall -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -ffreestanding -O2 -fomit-frame-pointer -m64 -pipe -mno-fpu -mcpu=ultrasparc -mcmodel=medlow -ffixed-g4 -fcall-used-g5 -fcall-used-g7 -Wno-sign-compare -Wa,--undeclared-regs -finline-limit=10 -Wdeclaration-after-statement -Werror -Ifs/ -DKBUILD_BASENAME=ioctl32 -DKBUILD_MODNAME=ioctl32 -c -o arch/sparc64/kernel/.tmp_ioctl32.o arch/sparc64/kernel/ioctl32.c
> include/asm/uaccess.h: In function `siocdevprivate_ioctl':
> fs/compat_ioctl.c:648: warning: ignoring return value of `copy_to_user', declared with attribute warn_unused_result
> fs/compat_ioctl.c: In function `put_dirent32':
> fs/compat_ioctl.c:2346: warning: ignoring return value of `copy_to_user', declared with attribute warn_unused_result
> fs/compat_ioctl.c: In function `serial_struct_ioctl':
> fs/compat_ioctl.c:2489: warning: ignoring return value of `copy_from_user', declared with attribute warn_unused_result
> fs/compat_ioctl.c:2502: warning: ignoring return value of `copy_to_user', declared with attribute warn_unused_result
> make[1]: *** [arch/sparc64/kernel/ioctl32.o] Error 1
> make: *** [arch/sparc64/kernel] Error 2
> [EMAIL PROTECTED] linux-2.6.11-rc2]#
>
> I have no idea what causes error here. What shall I input to get more
> info about that error ?

It's the '-Werror' option that is stopping you here: it turns warnings into fatal errors. You could edit arch/sparc64/kernel/Makefile and remove/comment that for now.

--
~Randy
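The warnings in the log come from callers that drop the return value of a function marked warn_unused_result, and -Werror then promotes them to errors. Removing -Werror is the workaround suggested above. The other way to silence this class of warning is to consume the result, sketched below as a user-space model. model_copy_to_user and put_value are hypothetical stand-ins, not the kernel functions, and only mimic the convention of returning the number of bytes that could not be copied.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for copy_to_user(): copies at most
 * `writable` bytes and returns how many bytes were NOT copied.
 */
__attribute__((warn_unused_result))
static size_t model_copy_to_user(void *dst, const void *src,
                                 size_t n, size_t writable)
{
    size_t ok = n < writable ? n : writable;

    memcpy(dst, src, ok);
    return n - ok;
}

/*
 * A caller that checks the result, so GCC has nothing to warn about.
 * Writing `model_copy_to_user(...);` as a bare statement instead would
 * trigger the "ignoring return value" warning seen in the log.
 */
static int put_value(void *dst, size_t writable, int value)
{
    if (model_copy_to_user(dst, &value, sizeof(value), writable))
        return -EFAULT;
    return 0;
}
```

Whether the real call sites in fs/compat_ioctl.c should return -EFAULT or handle the failure differently is a decision for that code; the sketch only shows the shape of a checked call.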
Re: [PATCH 1/12] random pt4: Create new rol32/ror32 bitops
On Fri, 21 Jan 2005 at 15:41:06 -0600 Matt Mackall wrote:

> Add rol32 and ror32 bitops to bitops.h

Can you test this patch on top of yours? I did it on 2.6.10-ac10 but it should apply OK. Compile tested and booted, but only random.c is using it in my kernel. x86-64 could use this too...

Add i386 bitops for rol32/ror32:

Signed-off-by: Chuck Ebbert <[EMAIL PROTECTED]>

--- 2.6.10-ac10/include/linux/bitops.h~orig 2005-01-22 11:31:20.130239000 -0500
+++ 2.6.10-ac10/include/linux/bitops.h 2005-01-22 11:34:55.740239000 -0500
@@ -129,6 +129,7 @@
 	return sizeof(w) == 4 ? generic_hweight32(w) : generic_hweight64(w);
 }
 
+#ifndef __HAVE_ARCH_ROTATE_32
 /*
  * rol32 - rotate a 32-bit value left
  *
@@ -150,5 +151,6 @@
 {
 	return (word >> shift) | (word << (32 - shift));
 }
+#endif /* ndef __HAVE_ARCH_ROTATE_32 */
 
 #endif
--- 2.6.10-ac10/include/asm-i386/bitops.h~orig 2004-08-24 05:08:39.0 -0400
+++ 2.6.10-ac10/include/asm-i386/bitops.h 2005-01-22 11:42:12.010239000 -0500
@@ -431,6 +431,41 @@
 #define hweight16(x) generic_hweight16(x)
 #define hweight8(x) generic_hweight8(x)
 
+#define __HAVE_ARCH_ROTATE_32
+/*
+ * rol32 - rotate a 32-bit value left
+ *
+ * @word: value to rotate
+ * @shift: bits to roll
+ */
+static inline __u32 rol32(__u32 word, int shift)
+{
+	__u32 res;
+
+	asm("roll %%cl,%0"
+	    : "=r" (res)
+	    : "0" (word), "c" (shift)
+	    : "cc");
+	return res;
+}
+
+/*
+ * ror32 - rotate a 32-bit value right
+ *
+ * @word: value to rotate
+ * @shift: bits to roll
+ */
+static inline __u32 ror32(__u32 word, int shift)
+{
+	__u32 res;
+
+	asm("rorl %%cl,%0"
+	    : "=r" (res)
+	    : "0" (word), "c" (shift)
+	    : "cc");
+	return res;
+}
+
 #endif /* __KERNEL__ */
 
 #ifdef __KERNEL__

Chuck
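For comparison with the inline assembly, here is a sketch of the rotate written in plain C. The generic fallback visible in the patch context computes `word << (32 - shift)`, which C leaves undefined when shift is 0, so this variant masks both shift counts. The `_portable` names are made up to avoid implying this is the code from the patch. Whether a given compiler turns this into a single rol/ror instruction is the question raised in the thread and would need checking in the disassembly.

```c
#include <stdint.h>

/* Rotate left by shift bits; any shift value is reduced modulo 32. */
static inline uint32_t rol32_portable(uint32_t word, unsigned int shift)
{
    shift &= 31;
    /* (32 - shift) & 31 keeps the second shift count in 0..31,
     * so shift == 0 never produces a shift by 32. */
    return (word << shift) | (word >> ((32 - shift) & 31));
}

/* Rotate right by shift bits; any shift value is reduced modulo 32. */
static inline uint32_t ror32_portable(uint32_t word, unsigned int shift)
{
    shift &= 31;
    return (word >> shift) | (word << ((32 - shift) & 31));
}
```

With these definitions, rotating 0x12345678 left by 8 gives 0x34567812, and rotating by 0 returns the input unchanged.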
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
Jack O'Quin wrote:
> Chris Wright and Arjan van de Ven have outlined a proposal to address
> the privilege issue using rlimits. This is still the only workable
> alternative to the realtime LSM on the table. If the decision were up
> to me, I would choose the simplicity and better security of the LSM.
> But their approach is adequate, if implemented in a timely fashion. I
> would like to see some progress on this in addition to the scheduler
> work. People still need SCHED_FIFO for some applications.

I think this is a pretty sane and minimally intrusive (for the kernel) way to support what you want.
Re: [Bug 4081] New: OpenOffice crashes while starting due to a threading error
On 22 Jan 2005, at 18:33, Matthias-Christian Ott wrote:
> Hi!
> I'm suing Arch Linux and the Kernel 2.6.11-rc2 -- it works great. Try to recompile your
      ^ suing? My God! More legal trouble. Didn't you mean "using"? ;-)
Re: [patch 1/13] Qsort
On 22 Jan 2005, at 22:00, vlobanov wrote:

> Hi,
>
> I was just reading over the patch, and had a quick question/comment upon
> the SWAP macro defined below. I think it's possible to do a tiny bit
> better (better, of course, being subjective), as follows:
>
> #define SWAP(a, b, size) \
> do { \
> 	register size_t __size = (size); \
> 	register char * __a = (a), * __b = (b); \
> 	do { \
> 		*__a ^= *__b; \
> 		*__b ^= *__a; \
> 		*__a ^= *__b; \
> 		__a++; \
> 		__b++; \
> 	} while ((--__size) > 0); \
> } while (0)
>
> What do you think? :)

AFAIK, XOR is quite expensive on IA32 when compared to simple MOV operations. Also, since the original patch uses 3 MOVs to perform the swapping, and your version uses 3 XOR operations, I don't see any gains. Am I missing something?
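Apart from the instruction-cost question, the two swap styles differ in behaviour when both pointers refer to the same bytes. The sketch below writes both as plain functions rather than macros; the function names are invented for illustration. It shows that a swap through a temporary leaves the data intact when a == b, while the XOR form clears it.

```c
#include <stddef.h>

/* Byte-wise swap through a temporary. Safe when a and b are the same. */
static void swap_bytes_tmp(char *a, char *b, size_t size)
{
    while (size-- > 0) {
        char t = *a;

        *a++ = *b;
        *b++ = t;
    }
}

/*
 * Byte-wise XOR swap, the same three operations as the quoted macro.
 * If a == b, the first XOR zeroes the byte and the rest keep it zero.
 */
static void swap_bytes_xor(char *a, char *b, size_t size)
{
    while (size-- > 0) {
        *a ^= *b;
        *b ^= *a;
        *a ^= *b;
        a++;
        b++;
    }
}
```

A sort that can end up swapping an element with itself would therefore need an explicit `a != b` check before using the XOR form. Note also that these functions use a `while` loop and do nothing for a size of 0, whereas the do/while in the quoted macro runs its body once before testing the count.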
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
Paul Davis wrote:
>> The idea is to get equivalent performance to SCHED_FIFO. The results
>> show that much, and it is 100 times better than unprivileged
>> SCHED_NORMAL. The fact that this is an unoptimised normal desktop
>> environment means that the conclusion we _can_ draw is that SCHED_ISO is
>> as good as SCHED_FIFO for audio on the average desktop. I need someone
>
> no, this isn't true. the performance you are getting isn't as good as
> SCHED_FIFO on a tuned system (h/w and s/w). the difference might be
> the fact that you have "an average desktop", or it might be that your
> desktop is just fine and SCHED_ISO actually is not as good as
> SCHED_FIFO.

On my desktop, whatever that is, SCHED_FIFO and SCHED_ISO results were the same.

>> with optimised hardware setup to see if it's as good as SCHED_FIFO in
>> the critical setup.
>
> agreed. i have every confidence that Lee and/or Jack will be
> forthcoming :)

Good stuff :). Meanwhile, I have the priority support working (but not bug free), and the preliminary results suggest that the results are better. Do I recall someone mentioning jackd uses threads at different priority?

Cheers,
Con

P.S. If you read any emotion in my emails without a smiley or frowny face it's unintentional and is the limited emotional range the email format is allowed to convey. Hmm.. perhaps I should make this my sig ;)
[PATCH] e100 locking up netconsole.
I'm currently working with Ingo's RT patched kernel, but I believe this affects the mainline too.

If the transmit buffer of the e100 overflows, the system hangs. The e100 driver stops the queue, and find_skb in netpoll.c then loops forever, because the e100 net_poll never starts the queue again after the transmits have completed.

For those that use the e100 and netconsole, all you need to do is a SysRq 't' to lock up the system.

Here's the patch: (from Ingo's linux-2.6.11-rc2-V0.7.36-02, but should be OK with 2.6.11-rc2)

Index: drivers/net/e100.c
===
--- drivers/net/e100.c (revision 60)
+++ drivers/net/e100.c (working copy)
@@ -1630,6 +1630,7 @@
 	struct nic *nic = netdev_priv(netdev);
 	e100_disable_irq(nic);
 	e100_intr(nic->pdev->irq, netdev, NULL);
+	e100_tx_clean(nic);
 	e100_enable_irq(nic);
 }
 #endif

-- Steve
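The hang described above can be illustrated with a toy model. This is not driver code: the ring size, the struct and every function name below are invented, and the model simply assumes, as the report does, that the poll path on its own never reclaims completed transmits. A bounded retry count stands in for the loop that never ends in the real case.

```c
#include <stdbool.h>

#define TOY_TX_RING_SIZE 4 /* hypothetical ring size */

struct toy_nic {
    int in_flight;      /* descriptors handed to the hardware */
    int completed;      /* of those, already finished by the hardware */
    bool queue_stopped; /* set when the ring fills up */
};

/* Queue one packet; refuses while the queue is stopped. */
static bool toy_xmit(struct toy_nic *nic)
{
    if (nic->queue_stopped)
        return false;
    if (++nic->in_flight == TOY_TX_RING_SIZE)
        nic->queue_stopped = true;
    return true;
}

/* The hardware finishes everything it was given. */
static void toy_hw_complete_all(struct toy_nic *nic)
{
    nic->completed = nic->in_flight;
}

/* Reclaim finished descriptors and restart the queue if there is room. */
static void toy_tx_clean(struct toy_nic *nic)
{
    nic->in_flight -= nic->completed;
    nic->completed = 0;
    if (nic->in_flight < TOY_TX_RING_SIZE)
        nic->queue_stopped = false;
}

/* Poll hook; clean_tx models whether the one-line fix is applied. */
static void toy_poll(struct toy_nic *nic, bool clean_tx)
{
    if (clean_tx)
        toy_tx_clean(nic);
}

/* Retry sending, polling between attempts, up to max_tries times. */
static bool toy_netpoll_send(struct toy_nic *nic, bool clean_tx, int max_tries)
{
    for (int i = 0; i < max_tries; i++) {
        if (toy_xmit(nic))
            return true;
        toy_hw_complete_all(nic);
        toy_poll(nic, clean_tx);
    }
    return false;
}
```

In this model, once the ring is full a send with clean_tx set to false fails on every retry, while a send with clean_tx set to true succeeds on the second attempt.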
Odd little line-break problem (on fb console) during boot with recent kernels.
I'm using a vesafb framebuffer console, and recently I've noticed an odd little issue. During boot recent kernels (at least 2.6.11-rc1-bk4 and later - I don't have earlier kernels to test atm) will break the line

vesafb: framebuffer at 0xf000, mapped to 0xe088, using 937k, total 65536k

into two lines like this

vesafb: framebuffer at 0xf000, mapped to 0xe088, using 937k, total 6553
6k

the last two chars "6k" are put on a line by themselves. This is odd since there are both shorter and longer lines printed to the screen during boot, so it's not a "wrap at right edge of screen" thing. It's also odd since the code that prints this text is from drivers/video/vesafb.c :

printk(KERN_INFO "vesafb: framebuffer at 0x%lx, mapped to 0x%p, "
"using %dk, total %dk\n",

That simple printk should result in a nice single line of output. If I take a look at the boot messages with dmesg after boot, then the message is printed as a single line as expected. I see nothing in the source for printk, vprintk or vscnprintf that would cause the line to be split. No other messages during boot seem to get split in odd places, this is the only one.

So I'm wondering, what is causing this odd behaviour?

Some info that might be relevant:

- My graphics card is a ASUS V8200 Deluxe (nvidia geforce3) :

01:05.0 VGA compatible controller: nVidia Corporation NV20 [GeForce3] (rev a3) (prog-if 00 [VGA])
Subsystem: Asustek Computer, Inc. AGP-V8200 DDR
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
Status: Cap+ 66Mhz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- SERR-

My lilo version is 22.5.9.

The video mode I use is 1024x768x64k (vga=791 in lilo.conf - just for kicks I tried booting with vga=771, but that doesn't change anything).

My distribution is Slackware Linux 10.0 (upgraded to -current as of today).

Output from scripts/ver_linux is :

If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.

Linux dragon 2.6.11-rc2 #1 Sat Jan 22 23:04:44 CET 2005 i686 unknown unknown GNU/Linux

Gnu C                  3.4.2
Gnu make               3.80
binutils               2.15.92.0.2
util-linux             2.12p
mount                  2.12p
module-init-tools      3.1
e2fsprogs              1.35
jfsutils               1.1.6
reiserfsprogs          3.6.18
reiser4progs           line
xfsprogs               2.6.13
pcmcia-cs              3.2.8
quota-tools            3.12.
PPP                    2.4.2
nfs-utils              1.0.7
Linux C Library        2.3.3
Dynamic linker (ldd)   2.3.3
Linux C++ Library      6.0.2
Procps                 3.2.3
Net-tools              1.60
Kbd                    1.12
Sh-utils               5.2.1
udev                   050
Modules Loaded         snd_pcm_oss snd_mixer_oss via_rhine snd_emu10k1 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd_util_mem snd_hwdep evdev agpgart

If you would like me to test older kernel versions to determine when this began to happen, then just let me know. If there's any other piece of info you need, ask and I'll provide it.

--
Jesper Juhl
Re: can't compile 2.6.11-rc2 on sparc64
On Sunday 23 January 2005 02:36, Randy.Dunlap wrote:
> Please look for another error. Run 'make' again.
> Those are all just warnings and don't cause a build error.

all output past make:

[EMAIL PROTECTED] linux-2.6.11-rc2]# make V=1
if test ! /usr/src/linux-2.6.11-rc2 -ef /usr/src/linux-2.6.11-rc2; then \
/bin/bash /usr/src/linux-2.6.11-rc2/scripts/mkmakefile \
/usr/src/linux-2.6.11-rc2 /usr/src/linux-2.6.11-rc2 2 6 \
> /usr/src/linux-2.6.11-rc2/Makefile; \
echo ' GEN /usr/src/linux-2.6.11-rc2/Makefile'; \
fi
CHK include/linux/version.h
rm -rf .tmp_versions
mkdir -p .tmp_versions
make -f scripts/Makefile.build obj=scripts/basic
make -f scripts/Makefile.build obj=scripts
make -f scripts/Makefile.build obj=scripts/genksyms
make -f scripts/Makefile.build obj=scripts/mod
make -f scripts/Makefile.build obj=init
CHK include/linux/compile.h
make -f scripts/Makefile.build obj=usr
set -e; echo ' CHK usr/initramfs_list'; mkdir -p usr/; /bin/bash /usr/src/linux-2.6.11-rc2/scripts/gen_initramfs_list.sh> usr/initramfs_list.tmp; if [ -r usr/initramfs_list ] && cmp -s usr/initramfs_list usr/initramfs_list.tmp; then rm -f usr/initramfs_list.tmp; else echo ' UPD usr/initramfs_list'; mv -f usr/initramfs_list.tmp usr/initramfs_list; fi
CHK usr/initramfs_list
make -f scripts/Makefile.build obj=arch/sparc64/kernel
/usr/bin/sparc64-pld-linux-gcc -Wp,-MD,arch/sparc64/kernel/.ioctl32.o.d -nostdinc -isystem /usr/lib/gcc/sparc64-pld-linux/3.4.2/include -D__KERNEL__ -Iinclude -Wall -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -ffreestanding -O2 -fomit-frame-pointer -m64 -pipe -mno-fpu -mcpu=ultrasparc -mcmodel=medlow -ffixed-g4 -fcall-used-g5 -fcall-used-g7 -Wno-sign-compare -Wa,--undeclared-regs -finline-limit=10 -Wdeclaration-after-statement -Werror -Ifs/ -DKBUILD_BASENAME=ioctl32 -DKBUILD_MODNAME=ioctl32 -c -o arch/sparc64/kernel/.tmp_ioctl32.o arch/sparc64/kernel/ioctl32.c
include/asm/uaccess.h: In function `siocdevprivate_ioctl':
fs/compat_ioctl.c:648: warning: ignoring return value of `copy_to_user', declared with attribute warn_unused_result
fs/compat_ioctl.c: In function `put_dirent32':
fs/compat_ioctl.c:2346: warning: ignoring return value of `copy_to_user', declared with attribute warn_unused_result
fs/compat_ioctl.c: In function `serial_struct_ioctl':
fs/compat_ioctl.c:2489: warning: ignoring return value of `copy_from_user', declared with attribute warn_unused_result
fs/compat_ioctl.c:2502: warning: ignoring return value of `copy_to_user', declared with attribute warn_unused_result
make[1]: *** [arch/sparc64/kernel/ioctl32.o] Error 1
make: *** [arch/sparc64/kernel] Error 2
[EMAIL PROTECTED] linux-2.6.11-rc2]#

I have no idea what causes error here. What shall I input to get more info about that error ?

--
GJ
Re: can't compile 2.6.11-rc2 on sparc64
On Sunday 23 January 2005 02:36, Randy.Dunlap wrote:
> Please look for another error. Run 'make' again.
> Those are all just warnings and don't cause a build error.

That's just all I get on console.

--
GJ
Re: can't compile 2.6.11-rc2 on sparc64
Grzegorz Piotr Jaskiewicz wrote:
> I get this error :
> CC arch/sparc64/kernel/ioctl32.o
> include/asm/uaccess.h: In function `siocdevprivate_ioctl':
> fs/compat_ioctl.c:648: warning: ignoring return value of `copy_to_user', declared with attribute warn_unused_result
> fs/compat_ioctl.c: In function `put_dirent32':
> fs/compat_ioctl.c:2346: warning: ignoring return value of `copy_to_user', declared with attribute warn_unused_result
> fs/compat_ioctl.c: In function `serial_struct_ioctl':
> fs/compat_ioctl.c:2489: warning: ignoring return value of `copy_from_user', declared with attribute warn_unused_result
> fs/compat_ioctl.c:2502: warning: ignoring return value of `copy_to_user', declared with attribute warn_unused_result
> make[1]: *** [arch/sparc64/kernel/ioctl32.o] Error 1
>
> gcc is 3.4, 64bit. That's ultra5.

Please look for another error. Run 'make' again.
Those are all just warnings and don't cause a build error.

--
~Randy
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
>The idea is to get equivalent performance to SCHED_FIFO. The results >show that much, and it is 100 times better than unprivileged >SCHED_NORMAL. The fact that this is an unoptimised normal desktop >environment means that the conclusion we _can_ draw is that SCHED_ISO is >as good as SCHED_FIFO for audio on the average desktop. I need someone no, this isn't true. the performance you are getting isn't as good as SCHED_FIFO on a tuned system (h/w and s/w). the difference might be the fact that you have "an average desktop", or it might be that your desktop is just fine and SCHED_ISO actually is not as good as SCHED_FIFO. >with optimised hardware setup to see if it's as good as SCHED_FIFO in >the critical setup. agreed. i have every confidence that Lee and/or Jack will be forthcoming :) --p - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
can't compile 2.6.11-rc2 on sparc64
I get this error :

CC arch/sparc64/kernel/ioctl32.o
include/asm/uaccess.h: In function `siocdevprivate_ioctl':
fs/compat_ioctl.c:648: warning: ignoring return value of `copy_to_user', declared with attribute warn_unused_result
fs/compat_ioctl.c: In function `put_dirent32':
fs/compat_ioctl.c:2346: warning: ignoring return value of `copy_to_user', declared with attribute warn_unused_result
fs/compat_ioctl.c: In function `serial_struct_ioctl':
fs/compat_ioctl.c:2489: warning: ignoring return value of `copy_from_user', declared with attribute warn_unused_result
fs/compat_ioctl.c:2502: warning: ignoring return value of `copy_to_user', declared with attribute warn_unused_result
make[1]: *** [arch/sparc64/kernel/ioctl32.o] Error 1

gcc is 3.4, 64bit. That's ultra5.

--
GJ
Re: [PATCH] Avoiding fragmentation through different allocator
On Sat, Jan 22, 2005 at 09:48:20PM +, Mel Gorman wrote:
> On Fri, 21 Jan 2005, Marcelo Tosatti wrote:
>
> > On Thu, Jan 20, 2005 at 10:13:00AM +, Mel Gorman wrote:
> > >
> >
> > Hi Mel,
> >
> > I was thinking that it would be nice to have a set of high-order
> > intensive workloads, and I wonder what are the most common high-order
> > allocation paths which fail.
> >
>
> Agreed. As I am not fully sure what workloads require high-order
> allocations, I updated VMRegress to keep track of the count of
> allocations and released 0.11
> (http://www.csn.ul.ie/~mel/projects/vmregress/vmregress-0.11.tar.gz). To
> use it to track allocations, do the following
>
> 1. Download and unpack vmregress
> 2. Patch a kernel with kernel_patches/v2.6/trace_pagealloc-count.diff .
>    The patch currently requires the modified allocator but I can fix that up
>    if people want it. Build and deploy the kernel
> 3. Build vmregress by
>    ./configure --with-linux=/usr/src/linux-2.6.11-rc1-mbuddy
>    (or whatever path is appropriate)
>    make
> 4. Load the modules with;
>    insmod src/code/vmregress_core.ko
>    insmod src/sense/trace_alloccount.ko
>
> This will create a proc entry /proc/vmregress/trace_alloccount that looks
> something like;
>
> Allocations (V1)
> ---
> KernNoRclm 997453 370 500000 >0000
> KernRclm 35279000000 >0000
> UserRclm9870808000000 >0000
> Total 10903540 370 500000 >0000
>
> Frees
> -
> KernNoRclm 590965 244 280000 >0000
> KernRclm 227100 6050000 >0000
> UserRclm7974200 73 170000 >0000
> Total 19695805 747 1000000 >0000
>
> To blank the counters, use
>
> echo 0 > /proc/vmregress/trace_alloccount
>
> Whatever workload we come up with, this proc entry will tell us if it is
> exercising high-order allocations right now.

Great, excellent! Thanks.

I plan to spend some time testing and trying to understand the vmregress
package this week.

> > It mostly depends on hardware because most high-order allocations happen
> > inside device drivers?
> > What are the kernel codepaths which try to do
> > high-order allocations and fallback if failed?
> >
>
> I'm not sure. I think that the paths we exercise right now will be largely
> artificial. For example, you can force order-2 allocations by scping a
> large file through localhost (because of the large MTU in that interface).
> I have not come up with another meaningful workload that guarantees
> high-order allocations yet.

Thoughts and criticism of the following ideas are very much appreciated:

In private conversation with wli (who helped me by providing this
information) we can conjecture the following:

Modern IO devices are capable of doing scatter/gather IO.

There is overhead associated with setting up and managing the
scatter/gather tables.

The benefit of large physically contiguous blocks is the ability to
avoid the SG management overhead.

Now the question is: does the added overhead of allocating high-order
blocks through migration outweigh the SG IO overhead it avoids?

Quantifying that is interesting.

This depends on the driver implementation (how efficiently it is able to
manage the SG IO tables) and on device/IO subsystem characteristics.

Also, filesystems benefit from big physically contiguous blocks. Quoting
wli: "they want bigger blocks and contiguous memory to match bigger
blocks..."

I completely agree that your simplified allocator decreases fragmentation,
which in turn benefits the system overall.

This is an area which can be further improved - ie efficiency in
reducing fragmentation is excellent.
I sincerely appreciate the work you are doing!

> > To measure whether the cost of page migration offsets the ability to be
> > able to deliver high-order allocations we want a set of meaningful
> > performance tests?
> >
>
> Bear in mind, there are more considerations. The allocator potentially
> makes hotplug problems easier and could be easily tied into any
> page-zeroing system.
> Some of your own benchmarks also implied that the
> modified allocator helped some types of workloads which is beneficial in
> itself. The last consideration is HugeTLB pages, which I am hoping William
> will weigh in on.
>
> Right now, I believe that the pool of huge pages is of a fixed size
> because of fragmentation difficulties. If we knew we could allocate huge
> pages, this pool would not have to be fixed. Some
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
Jack O'Quin wrote:
> Neither run exhibits reliable audio performance. There is some low
> latency performance problem with your system. Maybe ReiserFS is
> causing trouble even with logging turned off. Perhaps the problem is
> somewhere else. Maybe some device is misbehaving.
>
> Until you solve this problem, beware of drawing conclusions.

The idea is to get equivalent performance to SCHED_FIFO. The results
show that much, and it is 100 times better than unprivileged
SCHED_NORMAL. The fact that this is an unoptimised normal desktop
environment means that the conclusion we _can_ draw is that SCHED_ISO is
as good as SCHED_FIFO for audio on the average desktop. I need someone
with optimised hardware setup to see if it's as good as SCHED_FIFO in
the critical setup.

I'm actually not an audio person and have no need for such a setup, but
I can see how linux would benefit from such support... ;)

Cheers,
Con
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
Jack O'Quin wrote:
> Con Kolivas <[EMAIL PROTECTED]> writes:
>
> > So let's try again, sorry about the noise:
> >
> > ==> jack_test4-2.6.11-rc1-mm2-fifo.log <==
> > *
> > XRUN Count . . . . . . . . . : 3
> > Delay Maximum . . . . . . . . : 20161 usecs
> > *
> >
> > ==> jack_test4-2.6.11-rc1-mm2-iso.log <==
> > *
> > XRUN Count . . . . . . . . . : 6
> > Delay Maximum . . . . . . . . : 4604 usecs
> > *
> >
> > Pretty pictures:
> > http://ck.kolivas.org/patches/SCHED_ISO/iso2-benchmarks/
>
> Neither run exhibits reliable audio performance. There is some low
> latency performance problem with your system. Maybe ReiserFS is
> causing trouble even with logging turned off. Perhaps the problem is
> somewhere else. Maybe some device is misbehaving.
>
> Until you solve this problem, beware of drawing conclusions.

Sigh.. I guess you want me to do all the benchmarking. Well it's easy
enough to get good results. I'll simply turn off all services and not
run a desktop. This is all on ext3 on a fully laden desktop by the way,
but if you want to get the results you're looking for I can easily drop
down to a console and get perfect results.

Con
Re: seccomp for 2.6.11-rc1-bk8
On Sat, Jan 22, 2005 at 07:43:26PM -0500, Rik van Riel wrote:
> On Sun, 23 Jan 2005, Andrea Arcangeli wrote:
>
> >I'm doing something that requires the maximum level of
> >security ever,
>
> You're kidding, right ?

Why should I be kidding? The client code I'm writing has to be at least
as secure as ssh and the firewall code; what else has to be more secure
than that? Neither ssh nor the firewall code depends on ptrace for its
security.

The nice thing is that I can embed all the security in the kernel with
seccomp, and I'd be a fool not to try to get it merged instead of
complicating my life with ptrace.

Once seccomp is in, I believe there's a chance that security people will
use it for more than Cpushare, while I don't think there's a chance
you'll see security people using ptrace_syscall and hardcoding the
syscall numbers in every userland app out there that may have to parse
untrusted data with potentially buggy bytecode (i.e. decompression
bytecode etc..).
Re: BUG: 2.6.11-rc2 and -rc1 hang during boot on PowerMacs
On Sat, 2005-01-22 at 18:23 +0100, Mikael Pettersson wrote:
> Kernels 2.6.11-rc2 and -rc1 hang during boot on my Beige PowerMac G3.
> The last kernel message on the console is:
>
> adb: starting probe task...
>
> At this point the kernel hangs and doesn't respond to any attempt
> to invoke SYSRQ or XMON. Normally the subsequent messages would be:
>
> adb devices: [2]: 2 5 [3]: 3 1
> ADB keyboard at 2, handler set to 3
> Detected ADB keyboard, type ISO, swapping keys.
> input: ADB keyboard on adb2:2.05/input
> ADB HID on ID 3 not yet registered
> ADB mouse at 3, handler set to 2
> input: ADB mouse on adb3:3.01/input
> adb: finished probe task...
>
> The 2.6.11-rc1 kernel also hung on an eMac (G4). On that machine it
> appears the hang occurred in radeonfb: the screen flickers off/on
> during radeonfb initialisation, but with 2.6.11-rc1 the screen just
> went black. Afterwards the eMac did not respond to any keyboard or
> network activity, so I have to assume it hung hard.
>
> I've traced the cause of the hangs to a local_irq_disable() added to
> init/main.c:rest_init() in 2.6.10-bk12. Removing it eliminates the
> hangs on both the G3 and the eMac:

I know about this problem, I'm working on a proper fix. Thanks for your
report.

Ben.
Re: seccomp for 2.6.11-rc1-bk8
On Sun, Jan 23, 2005 at 01:07:04AM +0100, Pavel Machek wrote:
> Adding code is easy, but in the long term would lead to maintenance
> nightmare. Adding seccomp code that does subset of ptrace, just
> because ptrace audit is lot of work, seems like a wrong thing to
> do. Sorry.

Even if I do the ptrace audit right now, within 6 months something can
change, and the implications of the changes won't be as trivial to
evaluate as they would be if entry.S or seccomp.c had changed. The
userland side will also be a lot more complicated to implement.

Do you want compressed video streams to be played securely and
efficiently? I can't see a better solution than seccomp. ptrace would be
slower and it'd require ugly code to be written in userland. Streams are
going to pump some stuff into the pipes and this will avoid quite a
number of schedules per second (regardless of buffering).

The seccomp API is tricky enough without having to hardcode the syscall
numbers into every userland app. Seccomp at least gives a slight chance
to write arch-independent code while still providing low-level security
from the OS; there's no way to use ptrace_syscall in an arch-independent
manner.

In the last patch I sent privately to Andrew I made it a config option,
but I recommend not disabling it, or you won't be able to run the
Cpushare client. Andrew's right that seccomp.o would waste precious
bytes (not kbytes) on embedded systems, so it has to be a config option
for that.

You can still modify it to use ptrace freely, but then I will have
nothing to do with the problems that may arise over time from using
ptrace within the GPL'd Cpushare client code, and I personally do not
approve of the use of ptrace there (but it's GPL so you can modify it).

I'm doing something that I can trust to run on my own desktop system,
and personally seccomp is the only thing I'm comfortable depending on.
Plus the userland gets so much simpler as well. It's not only a problem
of trusting the kernel space of ptrace.
Re: seccomp for 2.6.11-rc1-bk8
On Sun, 23 Jan 2005, Andrea Arcangeli wrote:

> I'm doing something that requires the maximum level of
> security ever,

You're kidding, right ?

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
Re: Linux 2.6.11-rc2
On Sat, 2005-01-22 at 18:54 -0500, sean wrote:

> I'm compiling with NAT, and get a different problem:
>
>   LD      net/ipv4/netfilter/built-in.o
> net/ipv4/netfilter/ip_nat_tftp.o(.bss+0x0): multiple definition of
> `ip_nat_tftp_hook'
> net/ipv4/netfilter/ip_conntrack_tftp.o(.bss+0x0): first defined here
> make[3]: *** [net/ipv4/netfilter/built-in.o] Error 1
> make[2]: *** [net/ipv4/netfilter] Error 2

Ok, another problem introduced by the recent patches... sigh

Try this patch:

diff -X dontdiff.ny -urNp linux-2.6.11-rc2.orig/include/linux/netfilter_ipv4/ip_conntrack_tftp.h linux-2.6.11-rc2/include/linux/netfilter_ipv4/ip_conntrack_tftp.h
--- linux-2.6.11-rc2.orig/include/linux/netfilter_ipv4/ip_conntrack_tftp.h 2005-01-22 15:23:45.0 +0100
+++ linux-2.6.11-rc2/include/linux/netfilter_ipv4/ip_conntrack_tftp.h 2005-01-23 01:31:25.0 +0100
@@ -13,7 +13,7 @@ struct tftphdr {
 #define TFTP_OPCODE_ACK		4
 #define TFTP_OPCODE_ERROR	5
 
-unsigned int (*ip_nat_tftp_hook)(struct sk_buff **pskb,
+extern unsigned int (*ip_nat_tftp_hook)(struct sk_buff **pskb,
 				 enum ip_conntrack_info ctinfo,
 				 struct ip_conntrack_expect *exp);
 

--
/Martin
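For readers following along, the reason a single `extern` fixes the link error can be shown outside the kernel. What follows is only a minimal userspace sketch, not the netfilter code: the names `nat_hook`, `demo_nat_handler`, `demo_register_hook` and `demo_conntrack_help` are made up for illustration, and everything is squeezed into one file. The assumption is that the header is included by more than one .c file; without `extern`, each of those files gets its own definition of the pointer, and when common symbols are disabled (as the .bss collision in the error above suggests) the linker reports "multiple definition". With `extern` in the header, the pointer is only declared there and is defined in exactly one .c file.

```c
#include <stddef.h>

/*
 * What the shared header should contain: a declaration only.
 * (Hypothetical name; stands in for ip_nat_tftp_hook.)
 */
extern unsigned int (*nat_hook)(unsigned int pkt_len);

/*
 * Exactly one .c file provides the definition. It starts out NULL,
 * meaning "no NAT helper registered".
 */
unsigned int (*nat_hook)(unsigned int pkt_len) = NULL;

/* Hypothetical NAT-side handler, used only for this illustration. */
static unsigned int demo_nat_handler(unsigned int pkt_len)
{
	return pkt_len + 1;
}

/* The NAT side registers its handler through the shared pointer. */
void demo_register_hook(void)
{
	nat_hook = demo_nat_handler;
}

/* The conntrack side calls the hook only if one is registered. */
unsigned int demo_conntrack_help(unsigned int pkt_len)
{
	if (nat_hook)
		return nat_hook(pkt_len);
	return pkt_len;
}
```

Before `demo_register_hook()` is called, `demo_conntrack_help()` passes the value through unchanged; afterwards it goes through the registered handler. Both sides see the same pointer because there is only one definition of it.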
Re: [patch 1/13] Qsort
On Sat, Jan 22, 2005 at 03:28:14PM -0800, Matt Mackall wrote:
> On Sat, Jan 22, 2005 at 09:34:01PM +0100, Andreas Gruenbacher wrote:
> > Add a quicksort from glibc as a kernel library function, and switch
> > xfs over to using it. The implementations are equivalent. The nfsacl
> > protocol also requires a sort function, so it makes more sense in
> > the common code.
>
> Please update this to kernel formatting standards and try to modernize
> it a bit.

I started working on this with an eye to doing some performance
testing of the insertion sort threshold in userspace, but I'm about to
head out for the day. Here's what I've got so far, compiles but
untested. Note the insertion sort at the end really ought to be using
memmove as well.

/* Copyright (C) 1991, 1992, 1996, 1997, 1999 Free Software Foundation, Inc.
   Written by Douglas C. Schmidt ([EMAIL PROTECTED]).

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, write to the Free
   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
   02111-1307 USA.  */

/* If you consider tuning this algorithm, you should consult first:
   Engineering a sort function; Jon Bentley and M. Douglas McIlroy;
   Software - Practice and Experience; Vol. 23 (11), 1249-1265, 1993.  */

#include
#include

#define min(x,y) ({ \
	typeof(x) _x = (x);	\
	typeof(y) _y = (y);	\
	(void) (&_x == &_y);	\
	_x < _y ? _x : _y; })

/* Byte-wise swap two items of size SIZE. */
#define SWAP(a, b, size)			\
	do					\
	{					\
		size_t __size = (size);		\
		char *__a = (a), *__b = (b);	\
		do				\
		{				\
			char __tmp = *__a;	\
			*__a++ = *__b;		\
			*__b++ = __tmp;		\
		} while (--__size > 0);		\
	} while (0)

/* Discontinue quicksort algorithm when partition gets below this size.
   This particular magic number was chosen to work best on a Sun 4/260. */
#define MAX_THRESH 4

/* Stack node declarations used to store unfulfilled partition obligations. */
typedef struct {
	void *lo;
	void *hi;
} stack_node;

/* Order size using quicksort.  This implementation incorporates
   four optimizations discussed in Sedgewick:

   1. Non-recursive, using an explicit stack of pointer that store the
      next array partition to sort.  To save time, this maximum amount
      of space required to store an array of SIZE_MAX is allocated on the
      stack.  Assuming a 32-bit (64 bit) integer for size_t, this needs
      only 32 * sizeof(stack_node) == 256 bytes (for 64 bit: 1024 bytes).
      Pretty cheap, actually.

   2. Chose the pivot element using a median-of-three decision tree.
      This reduces the probability of selecting a bad pivot value and
      eliminates certain extraneous comparisons.

   3. Only quicksorts TOTAL_ELEMS / MAX_THRESH partitions, leaving
      insertion sort to order the MAX_THRESH items within each partition.
      This is a big win, since insertion sort is faster for small, mostly
      sorted array segments.

   4. The larger of the two sub-partitions is always pushed onto the
      stack first, with the algorithm then concentrating on the
      smaller partition.  This *guarantees* no more than log (total_elems)
      stack size is needed (actually O(1) in this case)!  */

void qsort(void *base, size_t num, size_t size,
	   int (*cmp) (const void *, const void *))
{
	const size_t max_thresh = MAX_THRESH * size;
	void *hi, *lo, *mid, *left, *right;
	void *end = base + (size * (num - 1));
	void *tmp = base;
	void *thresh = min(end, base + max_thresh);
	void *run, *trav;
	stack_node *stack, *top;

	if (num == 0)
		return;

	lo = base;
	hi = lo + size * (num - 1);

	if (num > MAX_THRESH) {
Re: seccomp for 2.6.11-rc1-bk8
Hi!

> > Well, then you can help auditing ptrace()... It is probably also true
> > that more people audited ptrace() than seccomp :-).
>
> Why should I spend time auditing ptrace when I have a superior solution
> that doesn't require me any auditing at all? I've an huge pile of
> work,

Adding code is easy, but in the long term it would lead to a maintenance
nightmare. Adding seccomp code that does a subset of ptrace, just
because a ptrace audit is a lot of work, seems like the wrong thing to
do. Sorry.

								Pavel
--
People were complaining that M$ turns users into beta-testers...
...jr ghea gurz vagb qrirybcref, naq gurl frrz gb yvxr vg gung jnl!
2.6.11-rc2-kj
Another release from kernel janitors (http://kerneljanitors.org/)

This one contains 194 patches. Mail me if I missed any patches.
Feedback is appreciated, especially for some not so trivial patches like
wait_event ones.

Patchset is at http://coderock.org/kj/2.6.11-rc2-kj/

new in this release:

msleep-drivers_ieee1394_sbp2.patch
  From: Nishanth Aravamudan <[EMAIL PROTECTED]>
  Subject: [KJ] [UPDATE PATCH] ieee1394/sbp2: use ssleep() instead of schedule_timeout()

msleep-drivers_net_cs89x0.patch
  From: Nishanth Aravamudan <[EMAIL PROTECTED]>
  Subject: [KJ] [PATCH 5/28] net/cs89x0: replace schedule_timeout() with msleep()

msleep-drivers_net_wan_cosa.patch
  From: Nishanth Aravamudan <[EMAIL PROTECTED]>
  Subject: [KJ] [PATCH 4/28] net/cosa: replace schedule_timeout() with msleep()

msleep_ssleep-drivers_net_wireless_airo.patch
  From: Nishanth Aravamudan <[EMAIL PROTECTED]>
  Subject: [KJ] [PATCH 3/28] net/airo: replace schedule_timeout() with msleep()/ssleep()

typo_suppport-bttv_dvb.patch
  From: Carlo Perassi <[EMAIL PROTECTED]>
  Subject: [KJ] [patch] trivial typos

vfree-arch_s390_kernel_module.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [18/29] module.c - vfree() checking cleanups

vfree-drivers_char_agp_backend.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [23/29] backend.c - vfree() checking cleanups

vfree-drivers_char_agp_generic.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [28/29] generic.c - vfree() checking cleanups

vfree-drivers_ieee1394_dma.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [9/29] dma.c - vfree() checking cleanups

vfree-drivers_isdn_hardware_eicon_platform.h.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [2/29] platform.h - vfree() checking cleanups

vfree-drivers_isdn_i4l_isdn_bsdcomp.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [1/29] isdn_bsdcomp.c - vfree() checking cleanups

vfree-drivers_media_dvb_dvb-core_dmxdev.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [12/29] dmxdev.c - vfree() checking cleanups

vfree-drivers_media_dvb_dvb-core_dvb_ca_en50221.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [4/29] dvb_ca_en50221.c - vfree() checking cleanups

vfree-drivers_media_dvb_dvb-core_dvb_demux.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [20/29] dvb_demux.c - vfree() checking cleanups

vfree-drivers_media_dvb_ttpci_av7110.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [5/29] av7110.c - vfree() checking cleanups

vfree-drivers_media_dvb_ttpci_av7110_ipack.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [8/29] av7110_ipack.c - vfree() checking cleanups

vfree-drivers_media_dvb_ttpci_budget-core.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [27/29] budget-core.c - vfree() checking cleanups

vfree-drivers_media_video_stradis.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [24/29] stradis.c - vfree() checking cleanups

vfree-drivers_scsi_qla2xxx_qla_os.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [15/29] qla_os.c - vfree() checking cleanups

vfree-drivers_video_sis_sis_main.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [22/29] sis_main.c - vfree() checking cleanups

vfree-fs_reiserfs_super.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [19/29] super.c - vfree() checking cleanups

vfree-net_bridge_netfilter_ebtables.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [14/29] ebtables.c - vfree() checking cleanups

vfree-sound_oss_gus_wave.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [21/29] gus_wave.c - vfree() checking cleanups

vfree-sound_oss_pss.patch
  From: [EMAIL PROTECTED]
  Subject: [KJ] [PATCH] [RESEND] [29/29] pss.c - vfree() checking cleanups

extern-include_linux_generic_serial.h.old.patch
  From: Adrian Bunk <[EMAIL PROTECTED]>
  Subject: [KJ] [2.6 patch] generic_serial.h: kill incorrect gs_debug reference

msleep-arch_arm_mach-sa1100_cpu-sa1110.patch
  From: Nishanth Aravamudan <[EMAIL PROTECTED]>
  Subject: [KJ] [PATCH 1/22] arm/cpu-sa1110: replace schedule_timeout() with msleep()

msleep-arch_ia64_kernel_smpboot.patch
  From: Nishanth Aravamudan <[EMAIL PROTECTED]>
  Subject: [KJ] [PATCH 13/21] ia64/smpboot: replace schedule_timeout() with msleep()

msleep-arch_ppc64_kernel_iSeries_pci_reset.patch
  From: Nishanth Aravamudan <[EMAIL PROTECTED]>
  Subject: [KJ] [PATCH 16/21] ppc64/iSeries_pci_reset: replace schedule_timeout() with msleep()

msleep-arch_ppc64_kernel_pSeries_smp.patch
  From: Nishanth Aravamudan <[EMAIL PROTECTED]>
  Subject: [KJ] [PATCH 17/21] ppc64/pSeries_smp: replace schedule_timeout() with msleep()

msleep-arch_ppc64_kernel_smp.patch
  From: Nishanth Aravamudan <[EMAIL PROTECTED]>
  Subject: [KJ] [PATCH 19/21] ppc64/smp: replace
Re: Linux 2.6.11-rc2
Martin Josefsson wrote:
> On Fri, 2005-01-21 at 22:32 -0800, Udo A. Steinberg wrote:
> > On Fri, 21 Jan 2005 18:13:55 -0800 (PST) Linus Torvalds (LT) wrote:
> >
> > LT> Ok, trying to calm things down again for a 2.6.11 release.
> >
> > Connection tracking does not compile...
> >
> >   CC      net/ipv4/netfilter/ip_conntrack_standalone.o
> > In file included from net/ipv4/netfilter/ip_conntrack_standalone.c:34:
> > include/linux/netfilter_ipv4/ip_conntrack.h:135: warning: "struct ip_conntrack" declared inside parameter list
> > include/linux/netfilter_ipv4/ip_conntrack.h:135: warning: its scope is only this definition or declaration, which is probably not what you want
> > include/linux/netfilter_ipv4/ip_conntrack.h:305: warning: "enum ip_nat_manip_type" declared inside parameter list
> > include/linux/netfilter_ipv4/ip_conntrack.h:306: error: parameter `manip' has incomplete type
> > include/linux/netfilter_ipv4/ip_conntrack.h: In function `ip_nat_initialized':
> > include/linux/netfilter_ipv4/ip_conntrack.h:307: error: `IP_NAT_MANIP_SRC' undeclared (first use in this function)
> > include/linux/netfilter_ipv4/ip_conntrack.h:307: error: (Each undeclared identifier is reported only once
> > include/linux/netfilter_ipv4/ip_conntrack.h:307: error: for each function it appears in.)
>
> The problem is when compiling without NAT...
> The patch below should fix it, I can compile both with and without NAT
> now.

I'm compiling with NAT, and get a different problem:

  LD      net/ipv4/netfilter/built-in.o
net/ipv4/netfilter/ip_nat_tftp.o(.bss+0x0): multiple definition of `ip_nat_tftp_hook'
net/ipv4/netfilter/ip_conntrack_tftp.o(.bss+0x0): first defined here
make[3]: *** [net/ipv4/netfilter/built-in.o] Error 1
make[2]: *** [net/ipv4/netfilter] Error 2

sean
kernel oops!
Jan 22 13:27:59 warsheep Unable to handle kernel NULL pointer dereference at virtual address
Jan 22 13:27:59 warsheep printing eip:
Jan 22 13:27:59 warsheep
Jan 22 13:27:59 warsheep *pgd = cde9ddb4
Jan 22 13:27:59 warsheep *pmd = cde9ddb4
Jan 22 13:27:59 warsheep Oops: [#1]
Jan 22 13:27:59 warsheep SMP
Jan 22 13:27:59 warsheep CPU:0
Jan 22 13:27:59 warsheep EIP:0060:[<>]Not tainted VLI
Jan 22 13:27:59 warsheep EFLAGS: 00010282 (2.6.10-hardened-r2-warsheep62)
Jan 22 13:27:59 warsheep EIP is at 0x0
Jan 22 13:27:59 warsheep eax: ebx: de455000 ecx: c02c60e0 edx: c6b41000
Jan 22 13:27:59 warsheep esi: de455000 edi: ebp: dd0a2680 esp: cde9de9c
Jan 22 13:27:59 warsheep ds: 007b es: 007b ss: 0068
Jan 22 13:27:59 warsheep Process pptpctrl (pid: 16689, threadinfo=cde9c000 task=d112ca20)
Jan 22 13:27:59 warsheep Stack: c02c97bc c6b41000 c02c895c de455000 04949168 c03d0106 de455000
Jan 22 13:27:59 warsheep de45500c dd0a2680 c02c4141 de455000 dd0a2680 c01c7d49
Jan 22 13:27:59 warsheep dd0a2680 0020 0005 0005 c01da72f dd0a2680
Jan 22 13:27:59 warsheep Call Trace:
Jan 22 13:27:59 warsheep [] pty_chars_in_buffer+0x2c/0x50
Jan 22 13:27:59 warsheep [] normal_poll+0xfc/0x16b
Jan 22 13:27:59 warsheep [] schedule_timeout+0x76/0xc0
Jan 22 13:27:59 warsheep [] tty_poll+0xa1/0xc0
Jan 22 13:27:59 warsheep [] fget+0x49/0x60
Jan 22 13:27:59 warsheep [] do_select+0x26f/0x2e0
Jan 22 13:27:59 warsheep [] __pollwait+0x0/0xd0
Jan 22 13:27:59 warsheep [] sys_select+0x2db/0x4f0
Jan 22 13:27:59 warsheep [] sysenter_past_esp+0x52/0x79
Jan 22 13:27:59 warsheep Code: Bad EIP value.

The oops occurs only when the kernel is built with SMP and HT support;
in UP mode the oops doesn't occur!

I have a 2.6.10 kernel compiled with SMP and HT support on a P4 3GHz
with HT. I run a VPN server with pppd and pptpd (poptop) and an average
of 130 simultaneous connections; the oops doesn't occur at any
particular number of simultaneous VPN connections. I can build a kernel
with debugging enabled or something to help track down the source of the
problem.

Please CC me as I am not subscribed to this mailing list.

--
ierdnah <[EMAIL PROTECTED]>
Re: seccomp for 2.6.11-rc1-bk8
On Sat, Jan 22, 2005 at 08:42:42PM +0100, Pavel Machek wrote:
> Well, then you can help auditing ptrace()... It is probably also true
> that more people audited ptrace() than seccomp :-).

Why should I spend time auditing ptrace when I have a superior solution
that doesn't require any auditing from me at all? I've a huge pile of
work, I'm not doing this for fun, and the thought of wasting time
auditing a single line of ptrace code is insane as far as I'm concerned
(if I can avoid it with a more robust, less likely to break and simpler
approach).

If the l-k community forces me to use ptrace, I'll be forced to do that
indeed (and you should be ready to take the blame if something goes
wrong), but be sure I'll try as much as I can to stay away from ptrace
completely.

ptrace is a debugging knob, uml itself is a debugging tool that depends
on a debugging knob and that's fine. I'm not doing a debugging tool, I'm
doing something that requires the maximum level of security ever, and
using ptrace is dead wrong for that IMHO.
Re: [patch 1/13] Qsort
On Sat, Jan 22, 2005 at 09:34:01PM +0100, Andreas Gruenbacher wrote:
> Add a quicksort from glibc as a kernel library function, and switch
> xfs over to using it. The implementations are equivalent. The nfsacl
> protocol also requires a sort function, so it makes more sense in
> the common code.

Please update this to kernel formatting standards and try to modernize
it a bit.

> +/* Byte-wise swap two items of size SIZE. */
> +#define SWAP(a, b, size)					\
> +  do								\
> +    {								\
> +      register size_t __size = (size);			\
> +      register char *__a = (a), *__b = (b);			\
> +      do							\
> +	{							\
> +	  char __tmp = *__a;					\
> +	  *__a++ = *__b;					\
> +	  *__b++ = __tmp;					\
> +	} while (--__size > 0);				\
> +    } while (0)

Inline, please? Register keyword?!

> +typedef struct
> +  {
> +    char *lo;
> +    char *hi;
> +  } stack_node;

void *, please

> +
> +/* The next 5 #defines implement a very fast in-line stack abstraction. */
> +/* The stack needs log (total_elements) entries (we could even subtract
> +   log(MAX_THRESH)).  Since total_elements has type size_t, we get as
> +   upper bound for log (total_elements):
> +   bits per byte (CHAR_BIT) * sizeof(size_t).  */
> +#define CHAR_BIT 8
> +#define STACK_SIZE	(CHAR_BIT * sizeof(size_t))

So the stack is going to be either 256 or 1024 bytes. Seems like we
ought to kmalloc it.

> +#define PUSH(low, high)	((void) ((top->lo = (low)), (top->hi = (high)), ++top))
> +#define	POP(low, high)	((void) (--top, (low = top->lo), (high = top->hi)))
> +#define	STACK_NOT_EMPTY	(stack < top)

There's only one usage of POP, one of STACK_NOT_EMPTY and two of PUSH
that can trivially be made one. Please kill these macros.

> +   3. Only quicksorts TOTAL_ELEMS / MAX_THRESH partitions, leaving
> +      insertion sort to order the MAX_THRESH items within each partition.
> +      This is a big win, since insertion sort is faster for small, mostly
> +      sorted array segments.

This observation may be dated, instruction cache issues may dominate
now.

> +	  char *mid = lo + size * ((hi - lo) / size >> 1);

Get rid of all this char* stuff, please. It makes for lots of ugly and
unnecessary casting.

> +	  if ((*cmp) ((void *) mid, (void *) lo) < 0)
> +	    SWAP (mid, lo, size);

cmp(mid, lo)

> +	  if ((*cmp) ((void *) hi, (void *) mid) < 0)
> +	    SWAP (mid, hi, size);
> +	  else
> +	    goto jump_over;
> +	  if ((*cmp) ((void *) mid, (void *) lo) < 0)
> +	    SWAP (mid, lo, size);
> +	jump_over:;

?!

> +  /* Once the BASE_PTR array is partially sorted by quicksort the rest
> +     is completely sorted using insertion sort, since this is efficient
> +     for partitions below MAX_THRESH size. BASE_PTR points to the beginning
> +     of the array to sort, and END_PTR points at the very last element in
> +     the array (*not* one beyond it!). */
> +
> +  {
> +    char *end_ptr = &base_ptr[size * (total_elems - 1)];
> +    char *tmp_ptr = base_ptr;
> +    char *thresh = min(end_ptr, base_ptr + max_thresh);
> +    register char *run_ptr;

Move these vars to the top or better yet, split this into two
functions.

--
Mathematics is the supreme nostalgia of our time.
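As a rough illustration of two of the review comments above (the SWAP macro turned into an inline function, and the median-of-three step written without the goto), a userspace sketch might look like the following. This is not taken from the posted patch or from any follow-up: the names `swap_bytes`, `median_of_three` and `cmp_int` are made up here, and the code assumes the two items passed to the swap never partially overlap.

```c
#include <stddef.h>

/* Byte-wise swap of two items of `size` bytes; a size of 0 is a no-op. */
static inline void swap_bytes(void *a, void *b, size_t size)
{
	char *pa = a, *pb = b;

	while (size--) {
		char tmp = *pa;

		*pa++ = *pb;
		*pb++ = tmp;
	}
}

/*
 * Order the three items so that *lo <= *mid <= *hi, using at most
 * three comparisons. Same decision tree as the quoted code, with the
 * "else goto jump_over" folded into a nested if.
 */
static void median_of_three(void *lo, void *mid, void *hi, size_t size,
			    int (*cmp)(const void *, const void *))
{
	if (cmp(mid, lo) < 0)
		swap_bytes(mid, lo, size);
	if (cmp(hi, mid) < 0) {
		swap_bytes(mid, hi, size);
		if (cmp(mid, lo) < 0)
			swap_bytes(mid, lo, size);
	}
}

/* Hypothetical comparison callback for ints, used only for illustration. */
static int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}
```

Taking `void *` arguments means callers need no casts, which is what the "void *, please" and "cmp(mid, lo)" remarks are asking for. Whether an inline function is as fast as the macro on a given compiler is something that would have to be measured.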
reiserfs bug
Hi, Im having a segfault when trying to rm -rf a directory in my hard disk. If I run fsck.reiserfs on that partition it tells me to --rebuild-tree. I wont yet, because I cant make a backup and, well, Im scared of losing my data ;) Ah, kernel is a 2.6.10, and Im running debian unstable on an athlon... hope it helps! I attach the kernel output: --- ReiserFS: sda4: found reiserfs format "3.6" with standard journal ReiserFS: sda4: using ordered data mode ReiserFS: sda4: journal params: device sda4, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30 ReiserFS: sda4: checking transaction log (sda4) ReiserFS: sda4: replayed 3 transactions in 5 seconds ReiserFS: sda4: Using r5 hash to sort names ReiserFS: sda4: warning: vs-2180: finish_unfinished: iget failed for [245831 247149 0x0 SD] agpgart: Found an AGP 3.5 compliant device at :00:00.0. agpgart: XFree86 passes broken AGP3 flags (1f00080f). Fixed. agpgart: Putting AGP V3 device at :00:00.0 into 8x mode agpgart: Putting AGP V3 device at :01:00.0 into 8x mode ReiserFS: warning: is_tree_node: node level 54786 does not match to the expected one 1 ReiserFS: sda4: warning: vs-5150: search_by_key: invalid format found in block 2037090. Fsck? ReiserFS: sda4: warning: vs-13070: reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [257451 257464 0x0 SD] ReiserFS: warning: is_tree_node: node level 54786 does not match to the expected one 1 ReiserFS: sda4: warning: vs-5150: search_by_key: invalid format found in block 2037090. Fsck? ReiserFS: sda4: warning: vs-13070: reiserfs_read_locked_inode: i/o failure occurred trying to find stat data of [257451 257465 0x0 SD] ReiserFS: warning: is_tree_node: node level 54786 does not match to the expected one 1 ReiserFS: sda4: warning: vs-5150: search_by_key: invalid format found in block 2037090. Fsck? 
ReiserFS: sda4: warning: vs-5657: reiserfs_do_truncate: i/o failure occurred trying to truncate [257451 257463 0xfff DIRECT]
ReiserFS: sda4: warning: clm-2100: nesting info a different FS
ReiserFS: sda4: warning: clm-2100: nesting info a different FS
[ cut here ]
kernel BUG at fs/reiserfs/journal.c:2825!
invalid operand: [#1]
PREEMPT
Modules linked in: ppp_deflate zlib_deflate bsd_comp thermal fan button processor ac battery ppp_async crc_ccitt ipv6 ehci_hcd sn9c102 videodev uhci_hcd usbcore sd_mod sata_via libata scsi_mod pci_hotplug analog parport_pc parport pcspkr reiserfs dm_mod capability commoncap it87 eeprom i2c_sensor i2c_isa nvidia rtc ppp_generic slhc ide_cd cdrom tsdev evdev psmouse mousedev i2c_viapro i2c_core snd_pcm_oss snd_mixer_oss snd_via82xx snd_ac97_codec snd_pcm snd_timer snd_page_alloc gameport snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore via_agp agpgart sk98lin af_packet ext2 ext3 jbd mbcache ide_generic via82cxxx trm290 triflex slc90e66 sis5513 siimage serverworks sc1200 rz1000 piix pdc202xx_old opti621 ns87415 hpt366 ide_disk hpt34x generic cy82c693 cs5530 cs5520 cmd64x atiixp amd74xx alim15x3 aec62xx pdc202xx_new ide_core unix fbcon font bitblit vesafb cfbcopyarea cfbimgblt cfbfillrect
CPU:    0
EIP:    0060:[]    Tainted: P      VLI
EFLAGS: 00010246   (2.6.10-1-686)
EIP is at journal_begin+0xee/0x100 [reiserfs]
eax: ebx: d9c9c000 ecx: d9c9df00 edx: d9c9df00
esi: d9c9deb0 edi: d8882800 ebp: d9c9deb0 esp: d9c9de74
ds: 007b es: 007b ss: 0068
Process rm (pid: 6562, threadinfo=d9c9c000 task=da778020)
Stack: 0012 d6b8cd6c e0bc1bfa d9c9deb0 d8882800 0012 02c5 e0c83000 d8882800 d6b8cd6c d9c9df70
Call Trace:
 [] remove_save_link+0x3a/0x110 [reiserfs]
 [] journal_end+0xaa/0x100 [reiserfs]
 [] reiserfs_delete_inode+0xe5/0x110 [reiserfs]
 [] permission+0x6d/0x80
 [] reiserfs_delete_inode+0x0/0x110 [reiserfs]
 [] generic_delete_inode+0xa5/0x170
 [] iput+0x63/0x90
 [] sys_unlink+0xd3/0x130
 [] sys_write+0x51/0x80
 [] syscall_call+0x7/0xb
Code: 24 08 89 54 24 04 89
34 24 e8 1f a5 5e df 83 7e 04 01 7e 04 31 c0 eb bd 89 3c 24 b8 c0 d8 bd e0 89 44 24 04 e8 04 fb fe ff eb e9 <0f> 0b 09 0b 6a ed bd e0 eb c0 90 8d b4 26 00 00 00 00 55 57 56
Re: device-mapper: fix TB stripe data corruption
On Fri, Jan 21, 2005 at 03:57:38PM -0600, Kevin Corry wrote:
> On Friday 21 January 2005 3:20 pm, Roland Dreier wrote:
> > If I understand you correctly, do_div() (defined in )

I went for the simplest and safest fix first as this is a data corruption problem and I want assurance that this fixes device-mapper striping.

I didn't want to change it to do_div() without first checking it would not slow down the code on the main architectures: on the contrary I would hope that use of an optimised library inline speeds it up, but want to be sure. You don't need the 64-bit mod until you have hundreds of TB in a single logical volume block device, filesystem...

So far, I've only seen two test reports, both of which say they are still seeing data corruption in a filesystem on top of dm-stripe after applying this patch. But none of this information so far is specific enough to say whether the remaining problem(s) is/are in device-mapper.

Alasdair
--
[EMAIL PROTECTED]
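As background to the striping discussion above, here is a rough userspace sketch of mapping a sector offset onto a striped set with 64-bit arithmetic throughout, next to a variant that holds an intermediate in 32 bits. It only illustrates the general class of hazard on multi-TB volumes; it is not the dm-stripe code and does not claim to show the specific defect the patch addresses. The struct and function names are hypothetical, the chunk size is assumed to be a power of two, and plain / and % stand in for the kernel's do_div(), which divides a 64-bit value in place and returns the remainder.

```c
#include <stdint.h>

/* Hypothetical description of a striped target. */
struct stripe_target {
	uint32_t stripes;	/* number of underlying devices */
	uint32_t chunk_shift;	/* log2(chunk size in sectors) */
};

struct stripe_loc {
	uint32_t stripe;	/* index of the device holding the sector */
	uint64_t sector;	/* sector offset on that device */
};

/* Map a sector offset to (device, sector) using 64-bit values only. */
struct stripe_loc map_sector(const struct stripe_target *t, uint64_t offset)
{
	uint64_t chunk_mask = ((uint64_t)1 << t->chunk_shift) - 1;
	uint64_t chunk = offset >> t->chunk_shift;	/* chunk index over all devices */
	struct stripe_loc loc;

	loc.stripe = (uint32_t)(chunk % t->stripes);
	chunk /= t->stripes;				/* chunk index on that device */
	loc.sector = (chunk << t->chunk_shift) + (offset & chunk_mask);
	return loc;
}

/*
 * Same mapping, but the per-device chunk index and the shift are
 * done in 32 bits, so the result wraps once the shifted value no
 * longer fits.
 */
uint64_t map_sector_narrow(const struct stripe_target *t, uint64_t offset)
{
	uint64_t chunk_mask = ((uint64_t)1 << t->chunk_shift) - 1;
	uint32_t chunk = (uint32_t)((offset >> t->chunk_shift) / t->stripes);

	return (uint64_t)(uint32_t)(chunk << t->chunk_shift) + (offset & chunk_mask);
}
```

With two stripes and 128-sector chunks, an offset a little past 2^34 sectors still maps correctly in the first function, while the narrow variant silently lands near the start of the device.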
RE: A scrub daemon (prezeroing)
Hello Christoph,

In this part of your patch:

[...]
Index: linux-2.6.10/include/linux/gfp.h
===
--- linux-2.6.10.orig/include/linux/gfp.h	2005-01-21 10:43:59.0 -0800
+++ linux-2.6.10/include/linux/gfp.h	2005-01-21 11:56:07.0 -0800
@@ -131,4 +131,5 @@ extern void FASTCALL(free_cold_page(stru
 void page_alloc_init(void);
+void prep_zero_page(struct page *, unsigned int order);
 #endif /* __LINUX_GFP_H */

imho this would be better:

+void prep_zero_page(struct page *page, unsigned int order, unsigned int gfp_flags);

hth,
Joel
Re: [patch to 2.6.10-rc2] ext3_find_goal
[EMAIL PROTECTED] wrote:
> We found a strange block layout on our mail server. After careful
> study, we found the reason and tried to fix it.
>
> When loading an inode from buffer/disk (ext2/3_read_inode) and then
> allocating the second block (block == 1) of the corresponding file,
> i_next_alloc_block and i_next_alloc_goal are both zero and are in fact
> not valid, but they (i_next_alloc_block/goal) still take effect in the
> existing code. This causes a non-contiguous file.
>
> The patch below adds a check and fixes this.

Good catch!
Re: [PATCH 00/11] Get rid of verify_area() - convert to access_ok() and deprecate.
On Tue, 18 Jan 2005, Jesper Juhl wrote:
>
> Here's a series of patches to convert all (or rather almost all) in-kernel
> users of verify_area() to access_ok(), and then deprecate verify_area().
>
[...]

Just a small followup to say that this series of patches still applies to 2.6.11-rc2 (the first one with a little fuzziness, though). If wanted, I can re-diff these against 2.6.11-rc2. If I get no feedback (I have had none so far), I'll wait until 2.6.11 is out the door, then re-diff and re-submit against that.

--
Jesper Juhl
Re: [patch 1/13] Qsort
On Sat, 2005-01-22 at 23:06, Arjan van de Ven wrote:
> since you took the glibc one.. the glibc authors have repeatedly asked
> if glibc code that goes into the kernel will be export_symbol_gpl only
> due to their view of the gpl and lgpl

Sure, no big deal. We could equally well take the xfs one instead.

Cheers,
--
Andreas Gruenbacher <[EMAIL PROTECTED]>
SUSE Labs, SUSE LINUX GMBH
[PATCH] make loglevels in init/main.c a little more sane.
This patch modifies a few of the printk() loglevels used in init/main.c in an attempt to make them a bit more appropriate.

The default loglevel is KERN_WARNING, but a few printk's without an explicit loglevel are not (in my opinion) warnings, so add proper loglevels. For instance, telling the user how many CPUs were brought up is hardly a warning; make it KERN_INFO instead.

The initial printing of linux_banner is not a warning condition. I'd say it's more of a NOTICE or even INFO condition. I've made it KERN_NOTICE, just like the printing of the kernel command line.

A few printk's without an explicit loglevel do match the default one, but I've made them explicit (the default could change in the future, and if it does then explicitly setting the proper loglevel is a nice thing).

Please consider applying. Patch compiles and boots fine on my box.

Signed-off-by: Jesper Juhl <[EMAIL PROTECTED]>

diff -up linux-2.6.11-rc2-orig/init/main.c linux-2.6.11-rc2/init/main.c
--- linux-2.6.11-rc2-orig/init/main.c	2005-01-22 22:00:02.0 +0100
+++ linux-2.6.11-rc2/init/main.c	2005-01-22 22:45:23.0 +0100
@@ -347,7 +347,7 @@ static void __init smp_init(void)
 	}
 	/* Any cleanup work */
-	printk("Brought up %ld CPUs\n", (long)num_online_cpus());
+	printk(KERN_INFO "Brought up %ld CPUs\n", (long)num_online_cpus());
 	smp_cpus_done(max_cpus);
 #if 0
 	/* Get other processors into their bootup holding patterns. */
@@ -428,6 +428,7 @@ asmlinkage void __init start_kernel(void
  */
 	lock_kernel();
 	page_address_init();
+	printk(KERN_NOTICE);
 	printk(linux_banner);
 	setup_arch(&command_line);
 	setup_per_cpu_areas();
@@ -451,7 +452,7 @@ asmlinkage void __init start_kernel(void
 	preempt_disable();
 	build_all_zonelists();
 	page_alloc_init();
-	printk("Kernel command line: %s\n", saved_command_line);
+	printk(KERN_NOTICE "Kernel command line: %s\n", saved_command_line);
 	parse_early_param();
 	parse_args("Booting kernel", command_line, __start___param,
 		   __stop___param - __start___param,
@@ -558,7 +559,7 @@ static void __init do_initcalls(void)
 			local_irq_enable();
 		}
 		if (msg) {
-			printk("error in initcall at 0x%p: "
+			printk(KERN_WARNING "error in initcall at 0x%p: "
 				"returned with %s\n", *call, msg);
 		}
 	}
@@ -677,7 +678,7 @@ static int init(void * unused)
 	numa_default_policy();
 	if (sys_open((const char __user *) "/dev/console", O_RDWR, 0) < 0)
-		printk("Warning: unable to open an initial console.\n");
+		printk(KERN_WARNING "Warning: unable to open an initial console.\n");
 	(void) sys_dup(0);
 	(void) sys_dup(0);
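For readers less familiar with how these loglevels travel, here is a small userspace sketch. In 2.6-era kernels the KERN_* macros are string literals of the form "<N>" that the compiler concatenates onto the format string, and a message without such a prefix gets the default level (4, i.e. KERN_WARNING, as the patch description says). The parse_loglevel() helper below is hypothetical and only mimics that convention; it is not the kernel's printk implementation.

```c
#include <stddef.h>

/* Same string-literal convention as the 2.6 kernel headers. */
#define KERN_WARNING	"<4>"
#define KERN_NOTICE	"<5>"
#define KERN_INFO	"<6>"

/* Level assumed for messages without an explicit prefix. */
#define DEFAULT_MESSAGE_LOGLEVEL 4

/*
 * Return the loglevel encoded at the start of msg, or the default
 * if there is no "<N>" prefix.  *text is set to the message body
 * with any prefix skipped.
 */
int parse_loglevel(const char *msg, const char **text)
{
	if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>') {
		*text = msg + 3;
		return msg[1] - '0';
	}
	*text = msg;
	return DEFAULT_MESSAGE_LOGLEVEL;
}
```

Because the prefix is just part of the string, KERN_INFO "Brought up %ld CPUs\n" is a single literal, which is why the patch can add levels without touching the printk() argument lists.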
Re: [PATCH] Avoiding fragmentation through different allocator
On Fri, 21 Jan 2005, Marcelo Tosatti wrote:
> On Thu, Jan 20, 2005 at 10:13:00AM +, Mel Gorman wrote:
> >
>
> Hi Mel,
>
> I was thinking that it would be nice to have a set of high-order
> intensive workloads, and I wonder what are the most common high-order
> allocation paths which fail.
>

Agreed. As I am not fully sure what workloads require high-order allocations, I updated VMRegress to keep track of the count of allocations and released 0.11 (http://www.csn.ul.ie/~mel/projects/vmregress/vmregress-0.11.tar.gz). To use it to track allocations, do the following

1. Download and unpack vmregress
2. Patch a kernel with kernel_patches/v2.6/trace_pagealloc-count.diff . The patch currently requires the modified allocator but I can fix that up if people want it. Build and deploy the kernel
3. Build vmregress by
   ./configure --with-linux=/usr/src/linux-2.6.11-rc1-mbuddy
   (or whatever path is appropriate)
   make
4. Load the modules with;
   insmod src/code/vmregress_core.ko
   insmod src/sense/trace_alloccount.ko

This will create a proc entry /proc/vmregress/trace_alloccount that looks something like;

Allocations (V1)
---
KernNoRclm 997453 370 500000 0000
KernRclm 35279000000 0000
UserRclm9870808000000 0000
Total 10903540 370 500000 0000

Frees
-
KernNoRclm 590965 244 280000 0000
KernRclm 227100 6050000 0000
UserRclm7974200 73 170000 0000
Total 19695805 747 1000000 0000

To blank the counters, use

echo 0 > /proc/vmregress/trace_alloccount

Whatever workload we come up with, this proc entry will tell us if it is exercising high-order allocations right now.

> It mostly depends on hardware because most high-order allocations happen
> inside device drivers? What are the kernel codepaths which try to do
> high-order allocations and fallback if failed?
>

I'm not sure. I think that the paths we exercise right now will be largely artificial. For example, you can force order-2 allocations by scping a large file through localhost (because of the large MTU in that interface).
I have not come up with another meaningful workload that guarantees high-order allocations yet.

> To measure whether the cost of page migration offsets the ability to be
> able to deliver high-order allocations we want a set of meaningful
> performance tests?
>

Bear in mind, there are more considerations. The allocator potentially makes hotplug problems easier and could be easily tied into any page-zeroing system. Some of your own benchmarks also implied that the modified allocator helped some types of workloads, which is beneficial in itself. The last consideration is HugeTLB pages, on which I am hoping William will weigh in.

Right now, I believe that the pool of huge pages is of a fixed size because of fragmentation difficulties. If we knew we could allocate huge pages, this pool would not have to be fixed. Some applications will heavily benefit from this. While databases are the obvious one, applications with large heaps, like Java Virtual Machines, will also benefit. I can dig up papers that measured this on Solaris, although I don't have them at hand right now.

We know right now that the overhead of this allocator is fairly low (anyone got benchmarks to disagree?) but I understand that page migration is relatively expensive. The allocator also does not have adverse CPU+cache effects like migration, and the concept is fairly simple.

> Its quite possible that not all unsatisfiable high-order allocations
> want to force page migration (which is quite expensive in terms of
> CPU/cache). Only migrate on __GFP_NOFAIL ?
>

I still believe that with the allocator, we will only have to migrate in exceptional circumstances.

> William, that same tradeoff exists for the zone balancing through
> migration idea you propose...
>

--
Mel Gorman
CELERON D Prescott Step C-0
Sirs,

There is a bug in the CELERON D Prescott C-0 that prevents the machine from rebooting: the processor hangs when you try to reboot.

ECS already released a BIOS update on 2005/01/11 that covers this problem for the 648FX-A2 (PCB:1.0) motherboard. Here is the update description of the BIOS:

"1. Add support Prescott E0, C0 stepping CPU
2. Update Prescott CPU Micro code
3. Patch some Prescott 2.4G CPU (FSB533/1M) cannot restart under WinXP"

And Microsoft has released a workaround for this problem too:

"(KB885626) This non-security critical update helps resolve an issue where a limited number of systems running a BIOS without production support for Intel Pentium 4 and Intel Celeron D processors based on Prescott C-0 stepping can potentially hang on Windows XP Service Pack 2 installation."

Maybe someone from Intel can point us to the fix or erratum for this problem? I want to create a patch to work around this issue.

I'm having some problems with this processor on MSI 661FM-L motherboards.

Thanks.

Camilo
Re: negative diskspace usage
On Fri, Jan 21, 2005 at 03:11:06PM +0100, Wichert Akkerman wrote:
> After cleaning up a bit df suddenly showed interesting results:
>
> Filesystem  Size   Used  Avail  Use%  Mounted on
> /dev/md4    1019M  -64Z  1.1G   101%  /tmp
>
> Filesystem  1K-blocks  Used                   Available  Use%  Mounted on
> /dev/md4    1043168    -73786976294838127736  1068904    101%  /tmp
>
> This is on a ext3 filesystem on a 2.6.10-ac10 kernel.

Funny. The Used column is total-free, so free was 2^66 + 964440. That 2^66 no doubt was 2^64 in a computation counting 4K-blocks, and arose at some point where a negative number was considered unsigned.

But having available=1068904 larger than free=964440 is strange.

I assume this was produced by statfs or statfs64 or so. You can check using "strace -e statfs64 df /dev/md4" that these really are the values returned by the kernel, so that we can partition the blame between df and the kernel.

The values are computed by

	buf->f_blocks = es->s_blocks_count - overhead;
	buf->f_bfree = ext3_count_free_blocks (sb);
	buf->f_bavail = buf->f_bfree - es->s_r_blocks_count;

that is: blocks = total - overhead, and available = free - reserved. strace shows three values, and I expect tune2fs or so will show 2 more. More available than free sounds like a negative count of reserved blocks.

Are you still able to examine the situation?

Andries
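The arithmetic in the reply above can be replayed in a few lines. Setting the spurious 2^66 term aside, the quoted figures are total = 1043168, free = 964440 and available = 1068904 (all in 1K-blocks), and 78728 - 2^66 is exactly the -73786976294838127736 that df printed. The struct and helper names below are hypothetical; they mirror the two relations stated in the message (used = total - free, available = free - reserved) and are not taken from df or ext3.

```c
#include <stdint.h>

/* The three statfs-style counters discussed in the thread. */
struct fs_counts {
	int64_t blocks;	/* f_blocks: total minus overhead */
	int64_t bfree;	/* f_bfree: free blocks */
	int64_t bavail;	/* f_bavail: free minus reserved */
};

/* What df shows in the Used column: total - free. */
int64_t used_blocks(const struct fs_counts *c)
{
	return c->blocks - c->bfree;
}

/*
 * Reserved-block count implied by available = free - reserved.
 * A negative result means "more available than free".
 */
int64_t implied_reserved(const struct fs_counts *c)
{
	return c->bfree - c->bavail;
}
```

Fed the figures from the report, implied_reserved() comes out at -104464, which is the "negative count of reserved blocks" the reply suspects.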
Re: [PATCH]sched: Isochronous class v2 for unprivileged soft rt scheduling
Ingo Molnar <[EMAIL PROTECTED]> writes: > thanks for the testing. The important result is that nice--20 > performance is roughly the same as SCHED_ISO. This somewhat > reduces the urgency of the introduction of SCHED_ISO. I can see why you feel that way, but don't share your conclusion. First, only SCHED_FIFO worked reliably in my tests. In Con's tests even that did not work. My system is probably better tuned for low latency than his. Until we can determine why there were so many xruns, it is premature to declare victory for either scheduler. Preferably, we should compare them on a well-tuned low-latency system running your Realtime Preemption kernel. Second, the nice(-20) scheduler provides no clear way to support multiple realtime priorities. This is necessary for some audio applications, but not jack_test3.2. Third, your prototype denies SCHED_FIFO to privileged threads. This is a serious problem, even for testing (though perhaps easy to fix). Most important, let's not forget that this long discussion started because ordinary users need access to realtime scheduling. Con's scheduler provides a solution for that problem. Your prototype does not. Chris Wright and Arjan van de Ven have outlined a proposal to address the privilege issue using rlimits. This is still the only workable alternative to the realtime LSM on the table. If the decision were up to me, I would choose the simplicity and better security of the LSM. But their approach is adequate, if implemented in a timely fashion. I would like to see some progress on this in addition to the scheduler work. People still need SCHED_FIFO for some applications. Right now, SCHED_ISO still looks better than nice(-20) for audio. It works without special permissions. The throttling threshold is adjustable with appropriate privileges. It has the potential to support multiple priorities. 
Being less entangled with SCHED_NORMAL makes me worry less about someone coming along later and messing it up while working on some unrelated problem. Right now for example, mounting an encrypted filesystem starts a `loop0' kernel thread at nice -20.

--
joq
Re: [patch] Real-Time Preemption, -RT-2.6.11-rc2-V0.7.36-00
On Saturday 22 January 2005 07:29, Ingo Molnar wrote:
>i have released the -V0.7.36-00 Real-Time Preemption patch, which
> can be downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
>this is mainly a merge to 2.6.11-rc2.

Humm, by the time I went after the patch it was up to -02. And I'm getting a couple of error exits:

---
net/sched/sch_generic.c: In function `qdisc_restart':
net/sched/sch_generic.c:128: error: label `requeue' used but not defined
  CC      drivers/pci/setup-bus.o
make[2]: *** [net/sched/sch_generic.o] Error 1
make[1]: *** [net/sched] Error 2
make[1]: *** Waiting for unfinished jobs
---

And

---
  LD      net/sunrpc/built-in.o
make: *** [net] Error 2
make: *** Waiting for unfinished jobs
---

So obviously I'm not running it. :-)

One other item, which I don't think is related: in the last version (35-01) I had svn'd a new ieee1394 sub-directory from that ieee1394.org site into the drivers tree, and since it was less than a week old and worked right well, I just copied it over into the new kernel tree & reran the configs after renaming the existing ieee1394 to ieee1394-orig.

>There was alot of merging to be done due to Thomas Gleixner's
>spinlock/rwlock cleanups making it into upstream and due to the
> upstream spinlock changes, and there were some networking related
> conflicts as well, so these areas might introduce new regressions.
>
>the patch includes a fix that should resolve the microcode-update
>related boot-time crash reported by K.R. Foley. It also includes a
>verify_mm_writelocked() fix from Daniel Walker.
>
>to create a -V0.7.36-00 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.10.tar.bz2
> http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.11-rc2.bz2
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.11-rc2-V0.7.36-00
>
> Ingo

--
Cheers, Gene
"There are four boxes to be used in defense of liberty: soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.32% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attorneys please note, additions to this message by Gene Heskett are: Copyright 2005 by Maurice Eugene Heskett, all rights reserved.
Re: Linux 2.6.11-rc2
On Sat, 22 Jan 2005, Udo A. Steinberg wrote:
>
> Linus, please apply the following patch from Martin.

Please go through Davem, he's quite responsive, but prefers things like this to be sent to the netdev mailing list too if it hasn't been there already (netdev@oss.sgi.com).

Linus
Re: negative diskspace usage
I think the 101% usage is the interesting point here. You are using more disk space than you have available. I missed the first mail though, so what filesystem is this and which kernel version?

On Saturday 22 January 2005 11:09, Wichert Akkerman wrote:
> Previously [EMAIL PROTECTED] wrote:
> > Wichert Akkerman wrote:
> > > After cleaning up a bit df suddenly showed interesting results:
> > >
> > > Filesystem  Size   Used  Avail  Use%  Mounted on
> > > /dev/md4    1019M  -64Z  1.1G   101%  /tmp
> > >
> > > Filesystem  1K-blocks  Used                   Available  Use%  Mounted on
> > > /dev/md4    1043168    -73786976294838127736  1068904    101%  /tmp
> >
> > It looks like Windows 95's FDISK
> > command created the partitions.
>
> There is no way you can see that from the output I gave, and it is also
> incorrect.
>
> > The partition boundaries still remain where Windows 95 put them, and
> > you have overlapping partitions.
>
> fdisk does not create overlapping partitions.
>
> Wichert.

--
EduSupport (http://www.edusupport.nl): Linux Desktop for schools and small to medium business in The Netherlands and Belgium
Re: Linux 2.6.11-rc2
On Sat, 22 Jan 2005 15:04:29 +0100 Martin Josefsson (MJ) wrote:

MJ> On Fri, 2005-01-21 at 22:32 -0800, Udo A. Steinberg wrote:
MJ> >
MJ> > Connection tracking does not compile...
MJ> The problem is when compiling without NAT...
MJ> The patch below should fix it, I can compile both with and without NAT
MJ> now.

Thanks, this fixes my problem, too.

Linus, please apply the following patch from Martin.

-Udo.

diff -X /home/gandalf/dontdiff.ny -urNp linux-2.6.11-rc2.orig/include/linux/netfilter_ipv4/ip_conntrack.h linux-2.6.11-rc2/include/linux/netfilter_ipv4/ip_conntrack.h
--- linux-2.6.11-rc2.orig/include/linux/netfilter_ipv4/ip_conntrack.h	2005-01-22 12:17:39.0 +0100
+++ linux-2.6.11-rc2/include/linux/netfilter_ipv4/ip_conntrack.h	2005-01-22 13:55:25.0 +0100
@@ -122,33 +122,6 @@ do { \
 #define IP_NF_ASSERT(x)
 #endif
-struct ip_conntrack_expect
-{
-	/* Internal linked list (global expectation list) */
-	struct list_head list;
-
-	/* We expect this tuple, with the following mask */
-	struct ip_conntrack_tuple tuple, mask;
-
-	/* Function to call after setup and insertion */
-	void (*expectfn)(struct ip_conntrack *new,
-			 struct ip_conntrack_expect *this);
-
-	/* The conntrack of the master connection */
-	struct ip_conntrack *master;
-
-	/* Timer function; deletes the expectation. */
-	struct timer_list timeout;
-
-#ifdef CONFIG_IP_NF_NAT_NEEDED
-	/* This is the original per-proto part, used to map the
-	 * expected connection the way the recipient expects. */
-	union ip_conntrack_manip_proto saved_proto;
-	/* Direction relative to the master connection. */
-	enum ip_conntrack_dir dir;
-#endif
-};
-
 struct ip_conntrack_counter
 {
 	u_int64_t packets;
@@ -206,6 +179,33 @@ struct ip_conntrack
 	struct ip_conntrack_tuple_hash tuplehash[IP_CT_DIR_MAX];
 };
+struct ip_conntrack_expect
+{
+	/* Internal linked list (global expectation list) */
+	struct list_head list;
+
+	/* We expect this tuple, with the following mask */
+	struct ip_conntrack_tuple tuple, mask;
+
+	/* Function to call after setup and insertion */
+	void (*expectfn)(struct ip_conntrack *new,
+			 struct ip_conntrack_expect *this);
+
+	/* The conntrack of the master connection */
+	struct ip_conntrack *master;
+
+	/* Timer function; deletes the expectation. */
+	struct timer_list timeout;
+
+#ifdef CONFIG_IP_NF_NAT_NEEDED
+	/* This is the original per-proto part, used to map the
+	 * expected connection the way the recipient expects. */
+	union ip_conntrack_manip_proto saved_proto;
+	/* Direction relative to the master connection. */
+	enum ip_conntrack_dir dir;
+#endif
+};
+
 static inline struct ip_conntrack *
 tuplehash_to_ctrack(const struct ip_conntrack_tuple_hash *hash)
 {
@@ -301,6 +301,7 @@ struct ip_conntrack_stat
 #define CONNTRACK_STAT_INC(count) (__get_cpu_var(ip_conntrack_stat).count++)
+#ifdef CONFIG_IP_NF_NAT_NEEDED
 static inline int ip_nat_initialized(struct ip_conntrack *conntrack,
 				     enum ip_nat_manip_type manip)
 {
@@ -308,5 +309,7 @@ static inline int ip_nat_initialized(str
 		return test_bit(IPS_SRC_NAT_DONE_BIT, &conntrack->status);
 	return test_bit(IPS_DST_NAT_DONE_BIT, &conntrack->status);
 }
+#endif /* CONFIG_IP_NF_NAT_NEEDED */
+
 #endif /* __KERNEL__ */
 #endif /* _IP_CONNTRACK_H */
diff -X /home/gandalf/dontdiff.ny -urNp linux-2.6.11-rc2.orig/net/ipv4/netfilter/ipt_CLUSTERIP.c linux-2.6.11-rc2/net/ipv4/netfilter/ipt_CLUSTERIP.c
--- linux-2.6.11-rc2.orig/net/ipv4/netfilter/ipt_CLUSTERIP.c	2005-01-22 12:17:40.0 +0100
+++ linux-2.6.11-rc2/net/ipv4/netfilter/ipt_CLUSTERIP.c	2005-01-22 13:55:49.0 +0100
@@ -29,6 +29,7 @@
 #include
 #include
 #include
+#include
 #define CLUSTERIP_VERSION "0.6"