Re: [RFC PATCH v2 3/6] sched: pack small tasks
On 12/14/2012 05:33 PM, Vincent Guittot wrote:
> On 14 December 2012 02:46, Alex Shi wrote:
>> On 12/13/2012 11:48 PM, Vincent Guittot wrote:
>>> On 13 December 2012 15:53, Vincent Guittot
>>> wrote:
>>>> On 13 December 2012 15:25, Alex Shi wrote:
>>>>> On 12/13/2012 06:11 PM, Vincent Guittot wrote:
>>>>>> On 13 December 2012 03:17, Alex Shi wrote:
>>>>>>> On 12/12/2012 09:31 PM, Vincent Guittot wrote:
>>>>>>>> During the creation of sched_domain, we define a pack buddy CPU
>>>>>>>> for each CPU when one is available. We want to pack at all
>>>>>>>> levels where a group of CPU can be power gated independently
>>>>>>>> from others.
>>>>>>>> On a system that can't power gate a group of CPUs independently,
>>>>>>>> the flag is set at all sched_domain level and the buddy is set
>>>>>>>> to -1. This is the default behavior.
>>>>>>>> On a dual clusters / dual cores system which can power gate each
>>>>>>>> core and cluster independently, the buddy configuration will be :
>>>>>>>>
>>>>>>>>       | Cluster 0   | Cluster 1   |
>>>>>>>>       | CPU0 | CPU1 | CPU2 | CPU3 |
>>>>>>>> -----------------------------------
>>>>>>>> buddy | CPU0 | CPU0 | CPU0 | CPU2 |
>>>>>>>>
>>>>>>>> Small tasks tend to slip out of the periodic load balance so the
>>>>>>>> best place to choose to migrate them is during their wake up.
>>>>>>>> The decision is in O(1) as we only check again one buddy CPU
>>>>>>>
>>>>>>> Just have a little worry about the scalability on a big machine, like on
>>>>>>> a 4 sockets NUMA machine * 8 cores * HT machine, the buddy cpu in whole
>>>>>>> system need care 64 LCPUs. and in your case cpu0 just care 4 LCPU. That
>>>>>>> is different on task distribution decision.
>>>>>>
>>>>>> The buddy CPU should probably not be the same for all 64 LCPU it
>>>>>> depends on where it's worth packing small tasks
>>>>>
>>>>> Do you have further ideas for buddy cpu on such example?
>>>>
>>>> yes, I have several ideas which were not really relevant for small
>>>> system but could be interesting for larger system
>>>>
>>>> We keep the same algorithm in a socket but we could either use
>>>> another LCPU in the targeted socket (conf0) or chain the socket
>>>> (conf1) instead of packing directly in one LCPU
>>>>
>>>> The scheme below tries to summarize the idea:
>>>>
>>>> Socket      | socket 0 | socket 1   | socket 2   | socket 3   |
>>>> LCPU        | 0 | 1-15 | 16 | 17-31 | 32 | 33-47 | 48 | 49-63 |
>>>> buddy conf0 | 0 | 0    | 1  | 16    | 2  | 32    | 3  | 48    |
>>>> buddy conf1 | 0 | 0    | 0  | 16    | 16 | 32    | 32 | 48    |
>>>> buddy conf2 | 0 | 0    | 16 | 16    | 32 | 32    | 48 | 48    |
>>>>
>>>> But, I don't know how this can interact with NUMA load balance and
>>>> the better might be to use conf3.
>>>
>>> I mean conf2 not conf3
>>
>> So, it has 4 levels 0/16/32/ for socket 3 and 0 level for socket 0, it
>> is unbalanced for different socket.
>
> That the target because we have decided to pack the small tasks in
> socket 0 when we have parsed the topology at boot.
> We don't have to loop into sched_domain or sched_group anymore to find
> the best LCPU when a small tasks wake up.

Iterating over domains and groups is an advantage for power efficiency,
not a shortcoming. If some CPUs are already idle before forking, letting
another waking CPU check their load/util and then decide which CPU is
best can reduce late migrations, which saves both performance and power.

On the contrary, moving a task by walking the buddies at each level is
bad not only for performance but also for power. Consider the quite big
latency of waking a deep idle CPU; we lose too much.

>
>>
>> And the ground level has just one buddy for 16 LCPUs - 8 cores, that's
>> not a good design, consider my previous examples: if there are 4 or 8
>> tasks in one socket, you just has 2 choices: spread them into all cores,
>> or pack them into one LCPU. Actually, moving them just into 2 or 4 cores
>> maybe a better solution. but the design missed this.
>
> You speak about tasks without any notion of load. This patch only care
> of small tasks and light LCPU load, but it falls back to default
> behavior for other situation. So if there are 4 or 8 small tasks, they
> will migrate to the socket 0 after 1 or up to 3 migration (it depends
> of the conf and the LCPU they come from).

According to your patch, what you mean by 'notion of load' is the
utilization of the cpu, not the load weight of tasks, right?

Yes, I just talked about task numbers, but it naturally extends to the
task utilization on a cpu. For example, 8 tasks with 25% util can just
fill 2 CPUs, but that is clearly beyond the capacity of the buddy, so
you need to wake up another CPU socket while the local socket has some
LCPUs idle...

>
> Then, if too much small tasks wake up simultaneously on the same LCPU,
> the default load balance will spread them in the core/cluster/socket
>
>>
>> Obviously, more and more cores is the trend on any
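
The wake-up decision described in the quoted patch text (check one buddy CPU, fall back to default placement otherwise) can be pictured with a small userspace sketch. This is only an illustration under stated assumptions: the table mirrors the dual-cluster example above, and `pack_buddy`, `select_wake_cpu`, `task_is_small` and `buddy_is_busy` are hypothetical names, not identifiers from the patch.

```c
#include <stdbool.h>

#define NR_CPUS   4
#define NO_BUDDY  (-1)	/* value used when no group can be power gated */

/* Hypothetical static table for the dual-cluster example:
 * CPU0..CPU3 map to buddies CPU0, CPU0, CPU0, CPU2. */
static const int pack_buddy[NR_CPUS] = { 0, 0, 0, 2 };

/*
 * Pick the CPU a waking task should run on. Only one table entry is
 * consulted, so the cost does not depend on the number of CPUs.
 */
static int select_wake_cpu(int prev_cpu, bool task_is_small, bool buddy_is_busy)
{
	int buddy = pack_buddy[prev_cpu];

	if (buddy == NO_BUDDY || buddy == prev_cpu)
		return prev_cpu;		/* nothing to pack onto */
	if (task_is_small && !buddy_is_busy)
		return buddy;			/* pack the small task */
	return prev_cpu;			/* default placement */
}
```

With this table a small task waking on CPU3 is sent to CPU2, and one waking on CPU1 or CPU2 is sent to CPU0, provided the buddy is not busy.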
Re: [PATCH] clk: factor: calculate rate by do_div
On Sun, Dec 16, 2012 at 4:54 AM, Mike Turquette wrote:
> On Sat, Dec 15, 2012 at 8:41 AM, Haojian Zhuang
> wrote:
>> On Tue, Dec 4, 2012 at 9:32 AM, Haojian Zhuang
>> wrote:
>>> On Mon, Dec 3, 2012 at 4:14 PM, Haojian Zhuang
>>> wrote:
>>>> clk->rate = parent->rate / div * mult
>>>>
>>>> The formula is OK. But it may overflow while we do operate with
>>>> unsigned long. So use do_div instead.
>>>>
>>>> Signed-off-by: Haojian Zhuang
>>>> ---
>>>>  drivers/clk/clk-fixed-factor.c |    5 ++++-
>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/clk/clk-fixed-factor.c b/drivers/clk/clk-fixed-factor.c
>>>> index a489985..1ef271e 100644
>>>> --- a/drivers/clk/clk-fixed-factor.c
>>>> +++ b/drivers/clk/clk-fixed-factor.c
>>>> @@ -28,8 +28,11 @@ static unsigned long clk_factor_recalc_rate(struct clk_hw *hw,
>>>>  		unsigned long parent_rate)
>>>>  {
>>>>  	struct clk_fixed_factor *fix = to_clk_fixed_factor(hw);
>>>> +	unsigned long long int rate;
>>>>
>>>> -	return parent_rate * fix->mult / fix->div;
>>>> +	rate = (unsigned long long int)parent_rate * fix->mult;
>>>> +	do_div(rate, fix->div);
>>>> +	return (unsigned long)rate;
>>>>  }
>>>>
>>>>  static long clk_factor_round_rate(struct clk_hw *hw, unsigned long rate,
>>>> --
>>>> 1.7.10.4
>>>
>>> Correct Mike's email address.
>>
>> Any comments? Does it mean that nobody want to fix the bug?
>
> Thanks for the patch. My apologies for letting this one slip through
> the cracks but my normal email workflow was unavoidably disrupted and
> I find myself playing catch-up with pending patches.
>
> The patch looks good to me but I'll change the $SUBJECT to "clk:
> fixed-factor: round_rate should use do_div" and do some testing before
> taking it in.
>
> Regards,
> Mike

It's nice. Thank you.

Best Regards
Haojian
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
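
The overflow the patch addresses can be reproduced in userspace. The sketch below is an illustration, not kernel code: `uint32_t` stands in for a 32-bit `unsigned long`, plain 64-bit division stands in for `do_div()`, and the function names `rate_narrow` and `rate_wide` are made up for the example.

```c
#include <stdint.h>

/* Multiply first in 32 bits: the product wraps modulo 2^32 before the divide. */
static uint32_t rate_narrow(uint32_t parent_rate, uint32_t mult, uint32_t div)
{
	return parent_rate * mult / div;
}

/* Widen to 64 bits before multiplying, then divide and narrow the result. */
static uint32_t rate_wide(uint32_t parent_rate, uint32_t mult, uint32_t div)
{
	uint64_t rate = (uint64_t)parent_rate * mult;

	return (uint32_t)(rate / div);	/* the kernel patch uses do_div() here */
}
```

For a 3 GHz parent with mult = 2 and div = 2, `rate_wide()` returns 3000000000 while `rate_narrow()` returns 852516352, because 6000000000 does not fit in 32 bits.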
[PATCH 3/3] cpufreq: Don't use cpu removed during cpufreq_driver_unregister
This is how the core works:

cpufreq_driver_unregister()
 - subsys_interface_unregister()
   - for_each_cpu() call cpufreq_remove_dev(), i.e. 0,1,2,3,4 when we
     unregister.

cpufreq_remove_dev():
 - Remove policy node
 - Call cpufreq_add_dev() for next cpu, sharing mask with removed cpu.
   i.e. When cpu 0 is removed, we call it for cpu 1. And when called for
   cpu 2, we call it for cpu 3.
   - cpufreq_add_dev() would call cpufreq_driver->init()
   - init would return mask as AND of 2, 3 and 4 for cluster A7.
   - cpufreq core would do online_cpu && policy->cpus
     Here is the BUG(). Because cpu hasn't died but we have just
     unregistered the cpufreq driver, online cpu would still have cpu 2
     in it. And so things go bad again.

Solution: Keep cpumask of cpus that are registered with cpufreq core and
clear cpus when we get a call from subsys_interface_unregister() via
cpufreq_remove_dev().

Signed-off-by: Viresh Kumar
---
 drivers/cpufreq/cpufreq.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index a0a33bd..271d3be 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -47,6 +47,9 @@ static DEFINE_PER_CPU(char[CPUFREQ_NAME_LEN], cpufreq_cpu_governor);
 #endif
 static DEFINE_SPINLOCK(cpufreq_driver_lock);
 
+/* Used when we unregister cpufreq driver */
+struct cpumask cpufreq_online_mask;
+
 /*
  * cpu_policy_rwsem is a per CPU reader-writer semaphore designed to cure
  * all cpufreq/hotplug/workqueue/etc related lock issues.
@@ -981,6 +984,7 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 	 * managing offline cpus here.
 	 */
 	cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
+	cpumask_and(policy->cpus, policy->cpus, &cpufreq_online_mask);
 
 	policy->user_policy.min = policy->min;
 	policy->user_policy.max = policy->max;
@@ -1064,7 +1068,6 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	}
 	per_cpu(cpufreq_cpu_data, cpu) = NULL;
 
-
 #ifdef CONFIG_SMP
 	/* if this isn't the CPU which is the parent of the kobj, we
 	 * only need to unlink, put and exit
@@ -1185,6 +1188,7 @@ static int cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif)
 	if (unlikely(lock_policy_rwsem_write(cpu)))
 		BUG();
 
+	cpumask_clear_cpu(cpu, &cpufreq_online_mask);
 	retval = __cpufreq_remove_dev(dev, sif);
 	return retval;
 }
@@ -1903,6 +1907,8 @@ int cpufreq_register_driver(struct cpufreq_driver *driver_data)
 	cpufreq_driver = driver_data;
 	spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
+	cpumask_setall(&cpufreq_online_mask);
+
 	ret = subsys_interface_register(&cpufreq_interface);
 	if (ret)
 		goto err_null_driver;
--
1.7.12.rc2.18.g61b472e
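
The failure and the fix described in this patch can be walked through with plain bitmasks. The sketch below is a userspace illustration only: `mask_t`, `CPU_BIT` and `effective_policy_cpus` are hypothetical stand-ins for `struct cpumask` and the two `cpumask_and()` calls, and the CPU numbering (0 to 4, with 2, 3 and 4 sharing a policy) is taken from the description above.

```c
#include <stdint.h>

typedef uint32_t mask_t;			/* one bit per CPU */
#define CPU_BIT(n) ((mask_t)1u << (n))

/*
 * CPUs a policy may manage: those the driver reported, that are online,
 * and that are still registered with the cpufreq core.
 */
static mask_t effective_policy_cpus(mask_t driver_mask, mask_t online_mask,
				    mask_t registered_mask)
{
	return driver_mask & online_mask & registered_mask;
}
```

During driver unregistration CPU 2 is still online, so `driver_mask & online_mask` alone keeps bit 2 set. Once the remove path clears bit 2 in the registered mask, the result for the policy re-created on CPU 3 contains only CPUs 3 and 4.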
[PATCH 2/3] cpufreq: Notify governors when cpus are hot-[un]plugged
Because cpufreq core and governors worry only about the online cpus, if a
cpu is hot [un]plugged, we must notify governors about it, otherwise be
ready to expect something unexpected.

We already have notifiers in the form of
CPUFREQ_GOV_START/CPUFREQ_GOV_STOP, we just need to call them now.

Signed-off-by: Viresh Kumar
---
 drivers/cpufreq/cpufreq.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index de99517..a0a33bd 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -751,11 +751,16 @@ static int cpufreq_add_dev_policy(unsigned int cpu,
 				return -EBUSY;
 			}
 
+			__cpufreq_governor(managed_policy, CPUFREQ_GOV_STOP);
+
 			spin_lock_irqsave(&cpufreq_driver_lock, flags);
 			cpumask_copy(managed_policy->cpus, policy->cpus);
 			per_cpu(cpufreq_cpu_data, cpu) = managed_policy;
 			spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
 
+			__cpufreq_governor(managed_policy, CPUFREQ_GOV_START);
+			__cpufreq_governor(managed_policy, CPUFREQ_GOV_LIMITS);
+
 			pr_debug("CPU already managed, adding link\n");
 			ret = sysfs_create_link(&dev->kobj,
 						&managed_policy->kobj,
@@ -1066,8 +1071,13 @@ static int __cpufreq_remove_dev(struct device *dev, struct subsys_interface *sif
 	 */
 	if (unlikely(cpu != data->cpu)) {
 		pr_debug("removing link\n");
+		__cpufreq_governor(data, CPUFREQ_GOV_STOP);
 		cpumask_clear_cpu(cpu, data->cpus);
 		spin_unlock_irqrestore(&cpufreq_driver_lock, flags);
+
+		__cpufreq_governor(data, CPUFREQ_GOV_START);
+		__cpufreq_governor(data, CPUFREQ_GOV_LIMITS);
+
 		kobj = &dev->kobj;
 		cpufreq_cpu_put(data);
 		unlock_policy_rwsem_write(cpu);
--
1.7.12.rc2.18.g61b472e
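
The stop, update, restart sequence added by this patch can be modelled in a few lines. This is a toy model under stated assumptions, not the cpufreq implementation: the governor is reduced to a snapshot of the CPU mask taken at start time, and `struct toy_policy`, `toy_governor` and `toy_remove_cpu` are hypothetical names.

```c
enum toy_event { TOY_GOV_STOP, TOY_GOV_START, TOY_GOV_LIMITS };

struct toy_policy {
	unsigned int cpus;		/* CPUs managed by the policy, one bit each */
	unsigned int governor_cpus;	/* snapshot the governor works from */
	int governor_running;
};

/* Deliver an event to the toy governor. */
static void toy_governor(struct toy_policy *p, enum toy_event ev)
{
	switch (ev) {
	case TOY_GOV_STOP:
		p->governor_running = 0;
		break;
	case TOY_GOV_START:
		p->governor_cpus = p->cpus;	/* governor re-reads the mask */
		p->governor_running = 1;
		break;
	case TOY_GOV_LIMITS:
		break;				/* limits are not modelled here */
	}
}

/* Remove one CPU from the policy, bracketing the change with STOP/START. */
static void toy_remove_cpu(struct toy_policy *p, unsigned int cpu)
{
	toy_governor(p, TOY_GOV_STOP);
	p->cpus &= ~(1u << cpu);
	toy_governor(p, TOY_GOV_START);
	toy_governor(p, TOY_GOV_LIMITS);
}
```

Without the STOP/START pair the snapshot in `governor_cpus` would keep the removed CPU, which is the kind of stale state the commit message warns about.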
[PATCH 1/3] cpufreq: Manage only online cpus
cpufreq core doesn't manage offline cpus and if driver->init() has
returned mask including offline cpus, it may result in unwanted behavior
by cpufreq core or governors.

We need to get only online cpus in this mask. There are two places to fix
this mask, cpufreq core and cpufreq driver. It makes sense to do this at
common place and hence is done in core.

Signed-off-by: Viresh Kumar
---
 drivers/cpufreq/cpufreq.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 1f93dbd..de99517 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -970,6 +970,13 @@ static int cpufreq_add_dev(struct device *dev, struct subsys_interface *sif)
 		pr_debug("initialization failed\n");
 		goto err_unlock_policy;
 	}
+
+	/*
+	 * affected cpus must always be the one, which are online. We aren't
+	 * managing offline cpus here.
+	 */
+	cpumask_and(policy->cpus, policy->cpus, cpu_online_mask);
+
 	policy->user_policy.min = policy->min;
 	policy->user_policy.max = policy->max;
 
--
1.7.12.rc2.18.g61b472e
Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue
Dave Chinner wrote:
> On Sun, Dec 16, 2012 at 03:35:49AM +0000, Eric Wong wrote:
> > Dave Chinner wrote:
> > > On Sun, Dec 16, 2012 at 12:25:49AM +0000, Eric Wong wrote:
> > > > Alan Cox wrote:
> > > > > On Sat, 15 Dec 2012 00:54:48 +0000
> > > > > Eric Wong wrote:
> > > > >
> > > > > > Applications streaming large files may want to reduce disk spinups
> > > > > > and
> > > > > > I/O latency by performing large amounts of readahead up front
>
> > This could also be a use case for an audio/video player.
>
> Sure, but this can all be handled by a userspace application. If you
> want to avoid/batch IO to enable longer spindown times, then you
> have to load the file into RAM somewhere, and you don't need special
> kernel support for that.

From userspace, I don't know when/if I'm caching too much and possibly
getting the userspace cache itself swapped out.

> > So no, there's no difference that matters between the approaches.
> > But I think doing this in the kernel is easier for userspace users.
>
> The kernel provides mechanisms for applications to use. You have not
> mentioned anything new that requires a new kernel mechanism to
> achieve - you just need to have the knowledge to put the pieces
> together properly. People have been solving this same problem for
> the last 20 years without needing to tweak fadvise(). Or even having
> an fadvise() syscall...

fadvise() is fairly new, and AFAIK few apps use it. Perhaps if it were
improved, more people would use it and not have to reinvent the wheel.

> Nothing about low latency IO or streaming IO is simple or easy, and
> changing how readahead works doesn't change that fact. All it does
> is change the behaviour of every other application that uses
> fadvise() to minimise IO latency

I don't want to introduce regressions, either. Perhaps if part of the
FADV_WILLNEED read-ahead were handled synchronously (maybe 2M?) and
humongous large readaheads (like mine) went to the background, that
would be a good trade off?
Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue
Dave Chinner wrote:
> On Sun, Dec 16, 2012 at 03:59:53AM +0000, Eric Wong wrote:
> > I want the first read() to happen sooner than it would under current
> > fadvise.
>
> You're not listening. You do not need the kernel to be modified to
> avoid the latency of issuing 1GB of readahead on a file.
>
> You don't need to do readahead before the first read. Nor do you
> need to wait for 1GB of readahead to be issued before you do the
> first read.
>
> You could do readahead *concurrently* with the first read, so the
> first read only blocks until the readahead of the first part of the
> file completes. i.e. just do readahead() in a background thread and
> don't wait for it to complete before doing the first read.

What you describe with concurrent readahead() is _exactly_ what my test
program (in other email) does with the RA environment variable set.

I know I do not _need_ fadvise + background WILLNEED support in the
kernel. But I think the kernel can make life easier and allow us to
avoid doing background threads or writing our own (inferior) caching in
userspace.
Re: [tip:x86/microcode] x86/microcode_intel_early.c: Early update ucode on Intel's CPU
On Sat, Dec 15, 2012 at 6:09 PM, Yinghai Lu wrote:
> On Sat, Dec 15, 2012 at 1:40 PM, H. Peter Anvin wrote:
>> On 12/15/2012 12:55 PM, Yinghai Lu wrote:
>>>
>>> BTW, did you look at smp boot problem with early_level4_pgt version?
>>
>>
>> No, I have been busy with non-Linux stuff today.
>>
>
> ok, i sorted it out. I will split it to small pieces and post them.

I updated the for-x86-boot branch with it, and it is based on
linus:master tip:x86/mm tip:x86/urgent tip:x86/mm2.

Also attached are the 7 new ones that were just added to that branch.

Thanks

Yinghai

0003-x86-call-copy_bootdata-early.patch
Description: Binary data

0004-x86-mm-add-early-kernel-mapping-in-c.patch
Description: Binary data

0005-x86-realmode-use-init_level4_pgt-to-set-trapmoline_p.patch
Description: Binary data

0006-x86-mm-increase-BRK-area-for-early-page-table.patch
Description: Binary data

0007-x86-64bit-early-PF-handler-set-page-table.patch
Description: Binary data

0008-x86-64bit-PF-handler-set-page-to-cover-2M-only.patch
Description: Binary data

0009-x86-64bit-Print-init-kernel-lowmap-correctly.patch
Description: Binary data
Re: [ANNOUNCE] Multiple run-queues for BFS
On Sun, Dec 16, 2012 at 1:16 AM, Matthias Kohler wrote:
> I'm doing a CPU-Scheduler based on BFS by Con Kolivas with support for
> multiple run-queues. BFS in itself uses only one run-queue for all
> CPU's. This avoids the load-balancing overhead, but does not scale well.
> One run-queue per CPU does scale well, but then the scheduler has
> load-balancing overhead. The scheduler I'm developing supports every
> possible run-queues configuration. You can have one single run-queue
> like in BFS, or you can have one run-queue per CPU, or something
> completely different like one run-queue every two CPU's. This, in theory
> would allow the scheduler to be fine-tuned to the hardware and the
> workload.

Cannot see the reason to install wings on horse back.

Is it developed to schedule apps on advanced servers?
Notebook, or smart phone?

Good Weekend
Hillf
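
The run-queue layouts listed in the quoted announcement (one shared queue, one queue per CPU, one queue per pair of CPUs) amount to a CPU-to-queue mapping. A minimal sketch, assuming CPUs are grouped contiguously; `cpu_to_rq` and `nr_runqueues` are hypothetical names and nothing here is taken from BFS or the announced scheduler.

```c
/* Index of the run-queue serving a CPU when cpus_per_rq CPUs share a queue. */
static int cpu_to_rq(int cpu, int cpus_per_rq)
{
	return cpu / cpus_per_rq;
}

/* Number of run-queues needed for nr_cpus CPUs, rounding up. */
static int nr_runqueues(int nr_cpus, int cpus_per_rq)
{
	return (nr_cpus + cpus_per_rq - 1) / cpus_per_rq;
}
```

On a 4-CPU machine, `cpus_per_rq = 4` gives the single BFS-style queue, `cpus_per_rq = 1` gives one queue per CPU, and `cpus_per_rq = 2` gives two queues with CPUs 0-1 and 2-3 sharing.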
Re: [PATCH] make CONFIG_EXPERIMENTAL invisible and default
On Wednesday 2012-10-03 18:17, Greg Kroah-Hartman wrote:
>>
>> OK, I will bite... How should I flag an option that is initially only
>> intended for those willing to take some level of risk?
>
>In the text say "You really don't want to enable this option, use at
>your own risk!" Or something like that :)

You know that won't work, just like "everybody is encouraged to
upgrade" for -stable. It needs to say "All users must disable this!"
Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue
On Sun, Dec 16, 2012 at 03:59:53AM +0000, Eric Wong wrote:
> Dave Chinner wrote:
> > On Sun, Dec 16, 2012 at 03:04:42AM +0000, Eric Wong wrote:
> > > Dave Chinner wrote:
> > > > On Sat, Dec 15, 2012 at 12:54:48AM +0000, Eric Wong wrote:
> > > > >
> > > > > Before: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 <2.484832>
> > > > > After: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 <0.000061>
> > > >
> > > > You've basically asked fadvise() to readahead the entire file if it
> > > > can. That means it is likely to issue enough readahead to fill the
> > > > IO queue, and that's where all the latency is coming from. If all
> > > > you are trying to do is reduce the latency of the first read, then
> > > > only readahead the initial range that you are going to need to read...
> > >
> > > Yes, I do want to read the whole file, eventually. So I want to put
> > > the file into the page cache ASAP and allow the disk to spin down.
> >
> > Issuing readahead is not going to speed up the first read. Either
> > you will spend more time issuing all the readahead, or you block
> > waiting for the first read to complete. And the way you are issuing
> > readahead does not guarantee the entire file is brought into the
> > page cache
>
> I'm not relying on readahead to speed up the first read.
>
> By using fadvise/readahead, I want a _best-effort_ attempt to
> keep the file in cache.
>
> > > But I also want the first read() to be fast.
> >
> > You can't have a pony, sorry.
>
> I want the first read() to happen sooner than it would under current
> fadvise.

You're not listening. You do not need the kernel to be modified to
avoid the latency of issuing 1GB of readahead on a file.

You don't need to do readahead before the first read. Nor do you
need to wait for 1GB of readahead to be issued before you do the
first read.

You could do readahead *concurrently* with the first read, so the
first read only blocks until the readahead of the first part of the
file completes. i.e. just do readahead() in a background thread and
don't wait for it to complete before doing the first read.

You could even do readahead *after* the first read, when the time it
takes *doesn't matter* to the processing of the incoming data...

> I want "less-bad" initial latency than I was getting.

And you can do that by changing how you issue readahead from
userspace.

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue
On Sun, Dec 16, 2012 at 03:35:49AM +0000, Eric Wong wrote:
> Dave Chinner wrote:
> > On Sun, Dec 16, 2012 at 12:25:49AM +0000, Eric Wong wrote:
> > > Alan Cox wrote:
> > > > On Sat, 15 Dec 2012 00:54:48 +0000
> > > > Eric Wong wrote:
> > > >
> > > > > Applications streaming large files may want to reduce disk spinups and
> > > > > I/O latency by performing large amounts of readahead up front

> This could also be a use case for an audio/video player.

Sure, but this can all be handled by a userspace application. If you
want to avoid/batch IO to enable longer spindown times, then you
have to load the file into RAM somewhere, and you don't need special
kernel support for that.

> So no, there's no difference that matters between the approaches.
> But I think doing this in the kernel is easier for userspace users.

The kernel provides mechanisms for applications to use. You have not
mentioned anything new that requires a new kernel mechanism to
achieve - you just need to have the knowledge to put the pieces
together properly. People have been solving this same problem for
the last 20 years without needing to tweak fadvise(). Or even having
an fadvise() syscall...

Nothing about low latency IO or streaming IO is simple or easy, and
changing how readahead works doesn't change that fact. All it does
is change the behaviour of every other application that uses
fadvise() to minimise IO latency

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue
Dave Chinner wrote:
> On Sun, Dec 16, 2012 at 03:04:42AM +0000, Eric Wong wrote:
> > Dave Chinner wrote:
> > > On Sat, Dec 15, 2012 at 12:54:48AM +0000, Eric Wong wrote:
> > > >
> > > > Before: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 <2.484832>
> > > > After: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 <0.000061>
> > >
> > > You've basically asked fadvise() to readahead the entire file if it
> > > can. That means it is likely to issue enough readahead to fill the
> > > IO queue, and that's where all the latency is coming from. If all
> > > you are trying to do is reduce the latency of the first read, then
> > > only readahead the initial range that you are going to need to read...
> >
> > Yes, I do want to read the whole file, eventually. So I want to put
> > the file into the page cache ASAP and allow the disk to spin down.
>
> Issuing readahead is not going to speed up the first read. Either
> you will spend more time issuing all the readahead, or you block
> waiting for the first read to complete. And the way you are issuing
> readahead does not guarantee the entire file is brought into the
> page cache

I'm not relying on readahead to speed up the first read.

By using fadvise/readahead, I want a _best-effort_ attempt to
keep the file in cache.

> > But I also want the first read() to be fast.
>
> You can't have a pony, sorry.

I want the first read() to happen sooner than it would under current
fadvise. If it's slightly slower than w/o fadvise, that's fine. The
1-2s slower with current fadvise is what bothers me.

> > > Also, Pushing readahead off to a workqueue potentially allows
> > > someone to DOS the system because readahead won't ever get throttled
> > > in the syscall context...
> >
> > Yes, I'm a little worried about this, too.
> > Perhaps squashing something like the following will work?
> >
> > diff --git a/mm/readahead.c b/mm/readahead.c
> > index 56a80a9..51dc58e 100644
> > --- a/mm/readahead.c
> > +++ b/mm/readahead.c
> > @@ -246,16 +246,18 @@ void wq_page_cache_readahead(struct address_space
> > *mapping, struct file *filp,
> >  {
> >  	struct wq_ra_req *req;
> >
> > +	nr_to_read = max_sane_readahead(nr_to_read);
> > +	if (!nr_to_read)
> > +		goto skip_ra;
>
> You do realise that anything you read ahead will be accounted as
> inactive pages, so nr_to_read doesn't decrease at all as you fill
> memory with readahead pages...

Ah, ok, I'll see if I can rework it.

> >  	req = kzalloc(sizeof(*req), GFP_ATOMIC);
>
> GFP_ATOMIC? Really?

Sorry, I'm really new at this.

> In reality, I think you are looking in the wrong place to fix your
> "first read" latency problem. No matter what you do, there is going
> to be IO latency on the first read. And readahead doesn't guarantee
> that the pages are brought into the page cache (ever heard of
> readahead thrashing?) so the way you are doing your readahead is not
> going to result in you being able to spin the disk down after
> issuing a readahead command...

Right, I want a _best-effort_ readahead (which seems to be what an
advisory interface should offer).

> You've really got two problems - minimal initial latency, and
> reading the file quickly and pinning it in memory until you get
> around to needing it. The first can't be made faster by using
> readahead, and the second can not be guaranteed by using readahead.

Agreed. I think I overstated the requirements. I want "less-bad"
initial latency than I was getting.

So I don't mind if open()+fadvise()+read() is a couple of milliseconds
slower than just open()+read(), but I do mind if fadvise() takes 1-2
seconds.

> IOWs, readahead is the wrong tool for solving your problems. Minimal
> IO latency from the first read will come from just issuing pread()
> after open(), and ensuring that the file is read quickly and pinned
> in memory can really only be done by allocating RAM in the
> application to hold it until it is needed

I definitely only want a best-effort method to put a file into
memory. I want the kernel to decide whether or not to cache it.

Thanks for looking at this!
Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue
On Sun, Dec 16, 2012 at 03:04:42AM +, Eric Wong wrote: > Dave Chinner wrote: > > On Sat, Dec 15, 2012 at 12:54:48AM +, Eric Wong wrote: > > > Applications streaming large files may want to reduce disk spinups and > > > I/O latency by performing large amounts of readahead up front. > > > Applications also tend to read files soon after opening them, so waiting > > > on a slow fadvise may cause unpleasant latency when the application > > > starts reading the file. > > > > > > As a userspace hacker, I'm sometimes tempted to create a background > > > thread in my app to run readahead(). However, I believe doing this > > > in the kernel will make life easier for other userspace hackers. > > > > > > Since fadvise makes no guarantees about when (or even if) readahead > > > is performed, this change should not hurt existing applications. > > > > > > "strace -T" timing on an uncached, one gigabyte file: > > > > > > Before: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 <2.484832> > > > After: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 <0.61> > > > > You've basically asked fadvise() to readahead the entire file if it > > can. That means it is likely to issue enough readahead to fill the > > IO queue, and that's where all the latency is coming from. If all > > you are trying to do is reduce the latency of the first read, then > > only readahead the initial range that you are going to need to read... > > Yes, I do want to read the whole file, eventually. So I want to put > the file into the page cache ASAP and allow the disk to spin down. Issuing readahead is not going to speed up the first read. Either you will spend more time issuing all the readahead, or you block waiting for the first read to complete. And the way you are issuing readahead does not guarantee the entire file is brought into the page cache > But I also want the first read() to be fast. You can't have a pony, sorry. 
> > Also, Pushing readahead off to a workqueue potentially allows > > someone to DOS the system because readahead won't ever get throttled > > in the syscall context... > > Yes, I'm a little worried about this, too. > Perhaps squashing something like the following will work? > > diff --git a/mm/readahead.c b/mm/readahead.c > index 56a80a9..51dc58e 100644 > --- a/mm/readahead.c > +++ b/mm/readahead.c > @@ -246,16 +246,18 @@ void wq_page_cache_readahead(struct address_space > *mapping, struct file *filp, > { > struct wq_ra_req *req; > > + nr_to_read = max_sane_readahead(nr_to_read); > + if (!nr_to_read) > + goto skip_ra; You do realise that anything you read ahead will be accounted as inactive pages, so nr_to_read doesn't decrease at all as you fill memory with readahead pages... > + > req = kzalloc(sizeof(*req), GFP_ATOMIC); GFP_ATOMIC? Really? In reality, I think you are looking in the wrong place to fix your "first read" latency problem. No matter what you do, there is going to be IO latency on the first read. And readahead doesn't guarantee that the pages are brought into the page cache (ever heard of readahead thrashing?) so the way you are doing your readahead is not going to result in you being able to spin the disk down after issuing a readahead command... You've really got two problems - minimal initial latency, and reading the file quickly and pinning it in memory until you get around to needing it. The first can't be made faster by using readahead, and the second can not be guaranteed by using readahead. IOWs, readahead is the wrong tool for solving your problems. Minimal IO latency from the first read will come from just issuing pread() after open(), and ensuring that the file is read quickly and pinned in memory can really only be done by allocating RAM in the application to hold it until it is needed Cheers, Dave. 
-- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
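As a rough userspace illustration of the closing suggestion in the message above (issue pread() right after open(), and hold the data in application memory rather than relying on readahead), here is a minimal sketch. The helper name slurp_file() is hypothetical and does not come from the thread, and error handling is deliberately simple.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read the whole file at `path` into a malloc()ed buffer.
 * Returns the buffer (caller frees) and stores the number of bytes
 * read in *len_out, or returns NULL on error.
 * slurp_file() is a hypothetical helper, not an existing API.
 */
static char *slurp_file(const char *path, size_t *len_out)
{
	struct stat st;
	char *buf;
	size_t done = 0;
	int fd;

	assert(path != NULL && len_out != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return NULL;
	if (fstat(fd, &st) != 0 || st.st_size < 0) {
		close(fd);
		return NULL;
	}

	/* Allocate at least one byte so an empty file is not reported as failure. */
	buf = malloc(st.st_size ? (size_t)st.st_size : 1);
	if (!buf) {
		close(fd);
		return NULL;
	}

	/* The first pread() returns as soon as the first chunk is available. */
	while (done < (size_t)st.st_size) {
		ssize_t r = pread(fd, buf + done, (size_t)st.st_size - done,
				  (off_t)done);
		if (r < 0) {
			if (errno == EINTR)
				continue;
			free(buf);
			close(fd);
			return NULL;
		}
		if (r == 0)	/* file shrank underneath us */
			break;
		done += (size_t)r;
	}

	close(fd);
	*len_out = done;
	return buf;
}
```

Once the loop finishes, the data lives in memory owned by the application, so it is not subject to page cache reclaim in the way readahead pages are. An application that must also avoid swap could additionally consider mlock(), which is not shown here.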
Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue
Dave Chinner wrote: > On Sun, Dec 16, 2012 at 12:25:49AM +, Eric Wong wrote: > > Alan Cox wrote: > > > On Sat, 15 Dec 2012 00:54:48 + > > > Eric Wong wrote: > > > > > > > Applications streaming large files may want to reduce disk spinups and > > > > I/O latency by performing large amounts of readahead up front > > > > > > How does it compare benchmark wise with a user thread or using the > > > readahead() call ? > > > > Very well. > > > > My main concern is for the speed of the initial pread()/read() call > > after open(). > > > > Setting EARLY_EXIT means my test program _exit()s immediately after the > > first pread(). In my test program (below), I wait for the background > > thread to become ready before open() so I would not take overhead from > > pthread_create() into account. > > > > RA=1 uses a pthread + readahead() > > Not setting RA uses fadvise (with my patch) > > And if you don't use fadvise/readahead at all? Sorry for the confusion. I believe my other reply to you summarized what I wanted to say in my commit message and also reply to Alan. I want all the following things: - I want the first read to be fast. - I want to read the whole file eventually (probably slowly, as processing takes a while). - I want to let my disk spin down for as long as possible. This could also be a use case for an audio/video player. > You're not timing how long the first pread() takes at all. You're > timing the entire set of operations, including cloning a thread and > for the readahead(2) call and messages to be passed back and forth > through the eventfd interface to read the entire file. You're right, I screwed up the measurement. Using clock_gettime(), there's hardly a difference between the approaches and I can't get consistent timings between them. So no, there's no difference that matters between the approaches. But I think doing this in the kernel is easier for userspace users. 
-- 8<
/* gcc -O2 -Wall -lpthread -lrt -o first_read first_read.c */
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/eventfd.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

static int efd1;
static int efd2;

/* a -= b, keeping tv_nsec in [0, 1e9) */
static void clock_diff(struct timespec *a, const struct timespec *b)
{
	a->tv_sec -= b->tv_sec;
	a->tv_nsec -= b->tv_nsec;
	if (a->tv_nsec < 0) {
		--a->tv_sec;
		a->tv_nsec += 1000000000;
	}
}

static void * start_ra(void *unused)
{
	struct stat st;
	eventfd_t val;
	int fd;

	/* tell parent to open() */
	assert(eventfd_write(efd1, 1) == 0);

	/* wait for parent to tell us fd is ready */
	assert(eventfd_read(efd2, &val) == 0);
	fd = (int)val;

	assert(fstat(fd, &st) == 0);
	assert(readahead(fd, 0, st.st_size) == 0);

	return NULL;
}

int main(int argc, char *argv[])
{
	char buf[16384];
	pthread_t thr;
	int fd;
	struct timespec start;
	struct timespec finish;
	char *do_ra = getenv("RA");

	if (argc != 2) {
		fprintf(stderr, "Usage: strace -T %s LARGE_FILE\n", argv[0]);
		return 1;
	}

	if (do_ra) {
		eventfd_t val;

		efd1 = eventfd(0, 0);
		efd2 = eventfd(0, 0);
		assert(efd1 >= 0 && efd2 >= 0 && "eventfd failed");
		assert(pthread_create(&thr, NULL, start_ra, NULL) == 0);

		/* wait for child thread to spawn */
		assert(eventfd_read(efd1, &val) == 0);
	}

	fd = open(argv[1], O_RDONLY);
	assert(fd >= 0 && "open failed");

	/* time only the readahead hint plus the first pread() */
	assert(clock_gettime(CLOCK_MONOTONIC, &start) == 0);

	if (do_ra) {
		/* wake up the child thread, give it a chance to run */
		assert(eventfd_write(efd2, fd) == 0);
		sched_yield();
	} else
		assert(posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED) == 0);

	assert(pread(fd, buf, sizeof(buf), 0) == sizeof(buf));
	assert(clock_gettime(CLOCK_MONOTONIC, &finish) == 0);
	clock_diff(&finish, &start);
	fprintf(stderr, "elapsed: %lu.%09lu\n",
		(unsigned long)finish.tv_sec, (unsigned long)finish.tv_nsec);

	if (getenv("FULL_READ")) {
		ssize_t r;

		do {
			r = read(fd, buf, sizeof(buf));
		} while (r > 0);
		assert(r == 0 && "EOF not reached");
	}

	if (getenv("EARLY_EXIT"))
		_exit(0);

	if (do_ra) {
		assert(pthread_join(thr, NULL) == 0);
		assert(close(efd1) == 0);
		assert(close(efd2) == 0);
	}

	assert(close(fd) == 0);

	return 0;
}
Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue
Eric Wong wrote:
> Perhaps squashing something like the following will work?

Last hunk should've had a return before skip_ra:

--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -264,6 +266,10 @@ void wq_page_cache_readahead(struct address_space *mapping, struct file *filp,
 	req->nr_to_read = nr_to_read;
 
 	queue_work(readahead_wq, &req->work);
+
+	return;
+skip_ra:
+	fput(filp);
 }
 
 /*
-- 
Eric Wong
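The reason the return before skip_ra matters can be shown with a small userspace analogy. This is only a sketch: struct ref, ref_put(), the single-slot queue and run_queued() are hypothetical stand-ins for struct file, fput() and the workqueue, and it assumes the work function drops the file reference when it finishes, which the quoted hunks do not show.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct file / fput() and the workqueue. */
struct ref {
	int count;
};

struct request {
	struct ref *file;
	unsigned long nr_to_read;
};

/* Pretend single-slot work queue; a real queue would hold many requests. */
static struct request *queued;

static void ref_put(struct ref *r)
{
	r->count--;
}

/* Takes ownership of one reference on `file`. */
static void submit_readahead(struct ref *file, unsigned long nr_to_read)
{
	struct request *req;

	assert(file != NULL);

	if (!nr_to_read)
		goto skip;

	req = calloc(1, sizeof(*req));
	if (!req)
		goto skip;

	req->file = file;
	req->nr_to_read = nr_to_read;
	queued = req;	/* the reference now belongs to the queued request */

	return;		/* without this we would fall through and drop it early */
skip:
	ref_put(file);
}

/* Worker side: consume the request, then drop the reference it owns. */
static void run_queued(void)
{
	if (!queued)
		return;
	ref_put(queued->file);
	free(queued);
	queued = NULL;
}
```

With the return in place, exactly one side drops the reference on every path: the submitter on the skip paths, the worker on the success path. Removing the return makes the success path drop it twice.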
Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue
Dave Chinner wrote: > On Sat, Dec 15, 2012 at 12:54:48AM +, Eric Wong wrote: > > Applications streaming large files may want to reduce disk spinups and > > I/O latency by performing large amounts of readahead up front. > > Applications also tend to read files soon after opening them, so waiting > > on a slow fadvise may cause unpleasant latency when the application > > starts reading the file. > > > > As a userspace hacker, I'm sometimes tempted to create a background > > thread in my app to run readahead(). However, I believe doing this > > in the kernel will make life easier for other userspace hackers. > > > > Since fadvise makes no guarantees about when (or even if) readahead > > is performed, this change should not hurt existing applications. > > > > "strace -T" timing on an uncached, one gigabyte file: > > > > Before: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 <2.484832> > > After: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 <0.61> > > You've basically asked fadvise() to readahead the entire file if it > can. That means it is likely to issue enough readahead to fill the > IO queue, and that's where all the latency is coming from. If all > you are trying to do is reduce the latency of the first read, then > only readahead the initial range that you are going to need to read... Yes, I do want to read the whole file, eventually. So I want to put the file into the page cache ASAP and allow the disk to spin down. But I also want the first read() to be fast. > Also, Pushing readahead off to a workqueue potentially allows > someone to DOS the system because readahead won't ever get throttled > in the syscall context... Yes, I'm a little worried about this, too. Perhaps squashing something like the following will work? 
diff --git a/mm/readahead.c b/mm/readahead.c
index 56a80a9..51dc58e 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -246,16 +246,18 @@ void wq_page_cache_readahead(struct address_space *mapping, struct file *filp,
 {
 	struct wq_ra_req *req;
 
+	nr_to_read = max_sane_readahead(nr_to_read);
+	if (!nr_to_read)
+		goto skip_ra;
+
 	req = kzalloc(sizeof(*req), GFP_ATOMIC);
 
 	/*
 	 * we are fire-and-forget, not having enough memory means readahead
 	 * is not worth doing anyways
 	 */
-	if (!req) {
-		fput(filp);
-		return;
-	}
+	if (!req)
+		goto skip_ra;
 
 	INIT_WORK(&req->work, wq_ra_req_fn);
 	req->mapping = mapping;
@@ -264,6 +266,9 @@ void wq_page_cache_readahead(struct address_space *mapping, struct file *filp,
 	req->nr_to_read = nr_to_read;
 
 	queue_work(readahead_wq, &req->work);
+
+skip_ra:
+	fput(filp);
 }
 
 /*
Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue
On Sun, Dec 16, 2012 at 12:25:49AM +0000, Eric Wong wrote:
> Alan Cox wrote:
> > On Sat, 15 Dec 2012 00:54:48 +0000
> > Eric Wong wrote:
> >
> > > Applications streaming large files may want to reduce disk spinups and
> > > I/O latency by performing large amounts of readahead up front
> >
> > How does it compare benchmark wise with a user thread or using the
> > readahead() call ?
>
> Very well.
>
> My main concern is for the speed of the initial pread()/read() call
> after open().
>
> Setting EARLY_EXIT means my test program _exit()s immediately after the
> first pread().  In my test program (below), I wait for the background
> thread to become ready before open() so I would not take overhead from
> pthread_create() into account.
>
> RA=1 uses a pthread + readahead()
> Not setting RA uses fadvise (with my patch)

And if you don't use fadvise/readahead at all?

> # readahead + pthread.
> $ EARLY_EXIT=1 RA=1 time ./first_read 1G
> 0.00user 0.05system 0:01.37elapsed 3%CPU (0avgtext+0avgdata 600maxresident)k
> 0inputs+0outputs (1major+187minor)pagefaults 0swaps
>
> # patched fadvise
> $ EARLY_EXIT=1 time ./first_read 1G
> 0.00user 0.00system 0:00.01elapsed 0%CPU (0avgtext+0avgdata 564maxresident)k
> 0inputs+0outputs (1major+178minor)pagefaults 0swaps

You're not timing how long the first pread() takes at all. You're
timing the entire set of operations, including cloning a thread, the
readahead(2) call, and the messages passed back and forth through the
eventfd interface, to read the entire file.

Why even bother with another thread for readahead()? It implements
*exactly* the same operation as fadvise(WILL_NEED) (ie.
force_page_cache_readahead), so should perform identically when
called in exactly the same manner...

But again, you are interested in the latency of the first read of
16k from the file, but you are asking to readahead 1GB of data.
Perhaps you should be asking for readahead of something more
appropriate to what you care about - the first read

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
[PATCH 1/1] ARM: OMAP2+: common: remove use of vram
commit 966458f "OMAP: remove vram allocator" removed the OMAP specific
vram allocator, but OMAP2 common was still trying to use it and this
led to the following build error:

  CC      arch/arm/mach-omap2/common.o
arch/arm/mach-omap2/common.c:19:23: fatal error: plat/vram.h: No such file or directory
compilation terminated.
make[1]: *** [arch/arm/mach-omap2/common.o] Error 1
make: *** [arch/arm/mach-omap2] Error 2

Signed-off-by: Javier Martinez Canillas
---
 arch/arm/mach-omap2/common.c |    3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mach-omap2/common.c b/arch/arm/mach-omap2/common.c
index 5c2fd48..2dabb9e 100644
--- a/arch/arm/mach-omap2/common.c
+++ b/arch/arm/mach-omap2/common.c
@@ -16,8 +16,6 @@
 #include
 #include
 
-#include <plat/vram.h>
-
 #include "common.h"
 #include "omap-secure.h"
 
@@ -32,7 +30,6 @@ int __weak omap_secure_ram_reserve_memblock(void)
 
 void __init omap_reserve(void)
 {
-	omap_vram_reserve_sdram_memblock();
 	omap_dsp_reserve_sdram_memblock();
 	omap_secure_ram_reserve_memblock();
 	omap_barrier_reserve_memblock();
-- 
1.7.7.6
Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue
On Sat, Dec 15, 2012 at 12:54:48AM +, Eric Wong wrote: > Applications streaming large files may want to reduce disk spinups and > I/O latency by performing large amounts of readahead up front. > Applications also tend to read files soon after opening them, so waiting > on a slow fadvise may cause unpleasant latency when the application > starts reading the file. > > As a userspace hacker, I'm sometimes tempted to create a background > thread in my app to run readahead(). However, I believe doing this > in the kernel will make life easier for other userspace hackers. > > Since fadvise makes no guarantees about when (or even if) readahead > is performed, this change should not hurt existing applications. > > "strace -T" timing on an uncached, one gigabyte file: > > Before: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 <2.484832> > After: fadvise64(3, 0, 0, POSIX_FADV_WILLNEED) = 0 <0.61> You've basically asked fadvise() to readahead the entire file if it can. That means it is likely to issue enough readahead to fill the IO queue, and that's where all the latency is coming from. If all you are trying to do is reduce the latency of the first read, then only readahead the initial range that you are going to need to read... Also, Pushing readahead off to a workqueue potentially allows someone to DOS the system because readahead won't ever get throttled in the syscall context... Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] vfs: update atimes over one day in the past or future
[ please place patches inline, not as attachments. ] On Sat, Dec 15, 2012 at 11:25:23PM +0800, ys wrote: > From 3d56c131b58a21c05bcd677b9d2ba915abcbf195 Mon Sep 17 00:00:00 2001 > From: yangsheng > Date: Sat, 15 Dec 2012 21:46:22 +0800 > Subject: [PATCH] vfs: update atimes over one day in the past or future > > Relatime should update the inode atime if it is more than one day > in the future. The original problem seen was a tarball that had > a bad atime in the distant future, but could also happen if someone > fat-fingers a "touch". The future atime will never be fixed. > > Without relatime enabled, a future atime is updated to the current > kernel time on access. Relatime is meant to reduce the frequency > of atime updates, not decide if whether the system clock or the > inode timestamp is correct or not. > > Signed-off-by: Yang Sheng > Signed-off-by: Andreas Dilger > Acked-by: David Chinner No I didn't. Please don't add tags that someone has not added directly in a reply to the original patch. > CC: sta...@vger.kernel.org > --- > fs/inode.c | 7 --- > 1 ??? 4 ???(+)? 3 ???(-) There's something wrong with the character encoding you are using... > > diff --git a/fs/inode.c b/fs/inode.c > index 14084b7..8713dc8 100644 > --- a/fs/inode.c > +++ b/fs/inode.c > @@ -1488,10 +1488,11 @@ static int relatime_need_update(struct vfsmount *mnt, > struct inode *inode, > return 1; > > /* > - * Is the previous atime value older than a day? If yes, > - * update atime: > + * Update atime if it's older than a day or more than a day > + * in the future, which we assume is corrupt. > + * A time in the future is not a corruption - the comment should reflect exactly what you've put in the commit message. i.e. that relatime is for reducing updates, not preventing atime from ever moving backwards. Also, you've added an extra line of whitespace damage that doesn't need to be there. 
FWIW, could you write a test for xfstests for this behaviour so we can confirm that we don't break it in future? Cheers, Dave. -- Dave Chinner da...@fromorbit.com -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
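For readers following the review above, the relatime rule under discussion can be approximated in plain C. This is a simplified userspace sketch, not the kernel function and not the submitted patch: relatime_wants_update() and its time_t parameters are assumptions made for illustration, and the mtime/ctime comparisons reflect the usual relatime behaviour rather than anything quoted in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define ONE_DAY (24 * 60 * 60)

/*
 * Decide whether an access at `now` should rewrite the stored atime.
 * Hypothetical helper operating on whole seconds.
 */
static bool relatime_wants_update(time_t atime, time_t mtime, time_t chg,
				  time_t now)
{
	/* File was modified or changed since it was last accessed. */
	if (mtime >= atime)
		return true;
	if (chg >= atime)
		return true;

	/* Stored atime is at least a day old. */
	if (now - atime >= ONE_DAY)
		return true;

	/*
	 * Stored atime is at least a day in the future. relatime exists to
	 * reduce update frequency, not to stop atime from moving backwards.
	 */
	if (atime - now >= ONE_DAY)
		return true;

	return false;
}

/* Usage example: a few representative cases, checked with assert(). */
static void relatime_examples(void)
{
	time_t now = 1000000;

	/* Recently accessed, nothing changed since: skip the update. */
	assert(!relatime_wants_update(now - 10, now - 20, now - 20, now));
	/* A "touch" two days into the future gets corrected on access. */
	assert(relatime_wants_update(now + 2 * ONE_DAY, now - 5, now - 5, now));
}
```

The future-atime branch is the only part that corresponds to the change being reviewed; the xfstests case requested above would need to exercise exactly that branch on a real filesystem.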
Re: resend--[PATCH] improve read ahead in kernel
xtu4 wrote: > resend it, due to format error > > Subject: [PATCH] when system in low memory scenario, imaging there is a mp3 > play, ora video play, we need to read mp3 or video file > from memory to page cache,but when system lack of memory, > page cache of mp3 or video file will be reclaimed.once read > in memory, then reclaimed, it will cause audio or video > glitch,and it will increase the io operation at the same > time. To me, this basically describes how POSIX_FADV_NOREUSE should work. I would like to have this ability via fadvise (and not CONFIG_). Also, I think your patch has too many #ifdefs to be accepted. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [tip:x86/microcode] x86/microcode_intel_early.c: Early update ucode on Intel's CPU
On Sat, Dec 15, 2012 at 1:40 PM, H. Peter Anvin wrote: > On 12/15/2012 12:55 PM, Yinghai Lu wrote: >> >> BTW, did you look at smp boot problem with early_level4_pgt version? > > > No, I have been busy with non-Linux stuff today. > ok, i sorted it out. I will split it to small pieces and post them. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: /usr/include/linux/errno.h:1:23: fatal error: asm/errno.h: No such file or directory
On Sun, Dec 16, 2012 at 9:53 AM, Al Viro wrote: > On Sun, Dec 16, 2012 at 09:39:01AM +0800, Jeff Chua wrote: >> On Sun, Dec 16, 2012 at 9:28 AM, Al Viro wrote: >> > On Sun, Dec 16, 2012 at 09:23:38AM +0800, Jeff Chua wrote: >> >> How should the symbolic links be setup to compile the latest kernel? >> >> >> >> >> >> Currently I had these links and kernels compiled fine until 2 days ago. >> >> >> >> asm -> /usr/src/linux/include/uapi/asm-generic/ >> >> asm-generic -> /usr/src/linux/include/uapi/asm-generic >> >> linux -> /usr/src/linux/include/uapi/linux >> > >> > What symlinks? /usr/include/* should not contain any symlinks into >> > the kernel source. At all. >> >> Al, >> >> Oh, perhaps I'm having the right setup. Where should I get the kernel >> headers. > > From your libc. Which ought to have its own copies, normally coming from > make headers_install in kernel source. And yes, it had been that way > for many years by now. Userland should *not* blindly grab the kernel > headers. > > Incidentally, your 'asm' is obviously bogus - the headers that should end > up there ought to come from arch//include/uapi/asm (and _not_ > by pointing a symlink to it); yours points to the place where asm-generic > ones ought to have been copied from. Al, Thanks for the pointers. Will try as what you suggested:) Merry Christmas. Jeff -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: /usr/include/linux/errno.h:1:23: fatal error: asm/errno.h: No such file or directory
On Sun, Dec 16, 2012 at 09:39:01AM +0800, Jeff Chua wrote:
> On Sun, Dec 16, 2012 at 9:28 AM, Al Viro wrote:
> > On Sun, Dec 16, 2012 at 09:23:38AM +0800, Jeff Chua wrote:
> >> How should the symbolic links be setup to compile the latest kernel?
> >>
> >>
> >> Currently I had these links and kernels compiled fine until 2 days ago.
> >>
> >> asm -> /usr/src/linux/include/uapi/asm-generic/
> >> asm-generic -> /usr/src/linux/include/uapi/asm-generic
> >> linux -> /usr/src/linux/include/uapi/linux
> >
> > What symlinks?  /usr/include/* should not contain any symlinks into
> > the kernel source.  At all.
>
> Al,
>
> Oh, perhaps I'm having the right setup. Where should I get the kernel
> headers.

From your libc.  Which ought to have its own copies, normally coming from
make headers_install in kernel source.  And yes, it had been that way
for many years by now.  Userland should *not* blindly grab the kernel
headers.

Incidentally, your 'asm' is obviously bogus - the headers that should end
up there ought to come from arch//include/uapi/asm (and _not_
by pointing a symlink to it); yours points to the place where asm-generic
ones ought to have been copied from.
Re: [PATCH 21/21] documentation: drop vmtruncate
On 12/15/2012 05:00:38 AM, Marco Stornelli wrote: Removed vmtruncate Signed-off-by: Marco Stornelli Acked-by: Rob Landley (I can't help thinking there should have been some sort of feature-removal-schedule entry for this. Is there any sort of trailing record of major stuff that happened and when? The kernelnewbies http://kernelnewbies.org/LinuxVersions page is the best I've found, but it's a bit clumsy to use as a reference to find which version a change happened in. The https://lwn.net/Articles/2.6-kernel-api/ page was great but it stalled in 2009. Maybe I just miss kernel-traffic...) Rob-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: /usr/include/linux/errno.h:1:23: fatal error: asm/errno.h: No such file or directory
On Sun, Dec 16, 2012 at 9:39 AM, Jeff Chua wrote: > On Sun, Dec 16, 2012 at 9:28 AM, Al Viro wrote: >> On Sun, Dec 16, 2012 at 09:23:38AM +0800, Jeff Chua wrote: >>> How should the symbolic links be setup to compile the latest kernel? >>> >>> >>> Currently I had these links and kernels compiled fine until 2 days ago. >>> >>> asm -> /usr/src/linux/include/uapi/asm-generic/ >>> asm-generic -> /usr/src/linux/include/uapi/asm-generic >>> linux -> /usr/src/linux/include/uapi/linux >> >> What symlinks? /usr/include/* should not contain any symlinks into >> the kernel source. At all. > > Al, > > Oh, perhaps I'm having the right setup. Where should I get the kernel > headers. After removing the links to the kernel source, here what I > got ... > > make[1]: Nothing to be done for `all'. > HOSTCC scripts/basic/fixdep > In file included from /usr/include/bits/posix1_lim.h:160:0, > from /usr/include/limits.h:144, > from scripts/basic/fixdep.c:114: > /usr/include/bits/local_lim.h:38:26: fatal error: linux/limits.h: No > such file or directory > compilation terminated. > Oh, perhaps I'm having the right setup. Where should I get the kernel NOT having the right setup. Jeff. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: /usr/include/linux/errno.h:1:23: fatal error: asm/errno.h: No such file or directory
On Sun, Dec 16, 2012 at 9:28 AM, Al Viro wrote: > On Sun, Dec 16, 2012 at 09:23:38AM +0800, Jeff Chua wrote: >> How should the symbolic links be setup to compile the latest kernel? >> >> >> Currently I had these links and kernels compiled fine until 2 days ago. >> >> asm -> /usr/src/linux/include/uapi/asm-generic/ >> asm-generic -> /usr/src/linux/include/uapi/asm-generic >> linux -> /usr/src/linux/include/uapi/linux > > What symlinks? /usr/include/* should not contain any symlinks into > the kernel source. At all. Al, Oh, perhaps I'm having the right setup. Where should I get the kernel headers. After removing the links to the kernel source, here what I got ... make[1]: Nothing to be done for `all'. HOSTCC scripts/basic/fixdep In file included from /usr/include/bits/posix1_lim.h:160:0, from /usr/include/limits.h:144, from scripts/basic/fixdep.c:114: /usr/include/bits/local_lim.h:38:26: fatal error: linux/limits.h: No such file or directory compilation terminated. Thanks, Jeff. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: /usr/include/linux/errno.h:1:23: fatal error: asm/errno.h: No such file or directory
On Sun, Dec 16, 2012 at 09:23:38AM +0800, Jeff Chua wrote: > How should the symbolic links be setup to compile the latest kernel? > > > Currently I had these links and kernels compiled fine until 2 days ago. > > asm -> /usr/src/linux/include/uapi/asm-generic/ > asm-generic -> /usr/src/linux/include/uapi/asm-generic > linux -> /usr/src/linux/include/uapi/linux What symlinks? /usr/include/* should not contain any symlinks into the kernel source. At all. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] target updates for v3.8-rc1
On Sat, 2012-12-15 at 14:28 -0800, Linus Torvalds wrote: > On Fri, Dec 14, 2012 at 3:53 PM, Nicholas A. Bellinger > wrote: > > > > Here are the target updates for v3.8-rc1 merge window code. Please go > > ahead and pull from: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git > > for-next > > > > Just a heads up that there is a minor merge conflict that you'll > > encounter in target_handle_task_attr() code, that sfr has been carrying > > a fix for recently within -next. After dropping the HEAD section, the > > resolution should look like: > > Hmm. This is *not* how I resolved that conflict - that seems to drop the new > > complete(&cmd->t_transport_stop_comp); > > added by Roland in commit 3ea160b3e8f0 ("target: Fix handling of > aborted commands"). You are most certainly correct. > > So my conflict resolution looks different. > > Which may be a bug, of course. Nicholas, Roland, please check my end result, > Including the complete() from commit 3ea160b3e8f0 in the exception path for transport_check_aborted_status() within target_execute_cmd() code was/is the proper merge resolution. Thank you for taking care of this. --nab -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Resend][PATCH] PM: Move disabling/enabling runtime PM to late suspend/early resume
On Saturday, December 15, 2012 10:16:29 PM Jiri Kosina wrote: > On Sat, 15 Dec 2012, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki > > > > Currently, the PM core disables runtime PM for all devices right > > after executing subsystem/driver .suspend() callbacks for them > > and re-enables it right before executing subsystem/driver .resume() > > callbacks for them. This may lead to problems when there are > > two devices such that the .suspend() callback executed for one of > > them depends on runtime PM working for the other. In that case, > > if runtime PM has already been disabled for the second device, > > the first one's .suspend() won't work correctly (and analogously > > for resume). > > > > To make those issues go away, make the PM core disable runtime PM > > for devices right before executing subsystem/driver .suspend_late() > > callbacks for them and enable runtime PM for them right after > > executing subsystem/driver .resume_early() callbacks for them. This > > way the potential conflitcs between .suspend_late()/.resume_early() > > and their runtime PM counterparts are still prevented from happening, > > but the subtle ordering issues related to disabling/enabling runtime > > PM for devices during system suspend/resume are much easier to avoid. > > > > Reported-and-tested-by: Jan-Matthias Braun > > Signed-off-by: Rafael J. Wysocki > > Hi Rafael, > > just curious what is the reason for resend? Do you want to gather more > Acks before pushing this upstream? Well, I thought that some people might actually look at it when they found it again in their mailboxes. :-) Thanks, Rafael -- I speak only for myself. Rafael J. Wysocki, Intel Open Source Technology Center. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [patch 1/8] mm: memcg: only evict file pages when we have plenty
On 12/13/2012 10:55 PM, Michal Hocko wrote: On Wed 12-12-12 17:28:44, Johannes Weiner wrote: On Wed, Dec 12, 2012 at 04:53:36PM -0500, Rik van Riel wrote: On 12/12/2012 04:43 PM, Johannes Weiner wrote: dc0422c "mm: vmscan: only evict file pages when we have plenty" makes a point of not going for anonymous memory while there is still enough inactive cache around. The check was added only for global reclaim, but it is just as useful for memory cgroup reclaim. Signed-off-by: Johannes Weiner --- mm/vmscan.c | 19 ++- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 157bb11..3874dcb 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -1671,6 +1671,16 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc, denominator = 1; goto out; } + /* +* There is enough inactive page cache, do not reclaim +* anything from the anonymous working set right now. +*/ + if (!inactive_file_is_low(lruvec)) { + fraction[0] = 0; + fraction[1] = 1; + denominator = 1; + goto out; + } anon = get_lru_size(lruvec, LRU_ACTIVE_ANON) + get_lru_size(lruvec, LRU_INACTIVE_ANON); @@ -1688,15 +1698,6 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc, fraction[1] = 0; denominator = 1; goto out; - } else if (!inactive_file_is_low_global(zone)) { - /* -* There is enough inactive page cache, do not -* reclaim anything from the working set right now. -*/ - fraction[0] = 0; - fraction[1] = 1; - denominator = 1; - goto out; } } I believe the if() block should be moved to AFTER the check where we make sure we actually have enough file pages. You are absolutely right, this makes more sense. Although I'd figure the impact would be small because if there actually is that little file cache, it won't be there for long with force-file scanning... :-) Yes, I think that the result would be worse (more swapping) so the change can only help. I moved the condition, but it throws conflicts in the rest of the series. 
Will re-run tests, wait for Michal and Mel, then resend. Yes the patch makes sense for memcg as well. I guess you have tested this primarily with memcg. Do you have any numbers? Would be nice to put them into the changelog if you have (it should help to reduce swapping with heavy streaming IO load). Acked-by: Michal Hocko Hi Michal, I still can't understand why "The goto out means that it should be fine either way.", could you explain to me, sorry for my stupid. :-) Regards, Simon -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] avoid entropy starvation due to stack protection
On Fri, Dec 14, 2012 at 06:36:41PM +0100, Stephan Mueller wrote:
> >> That patch is about one week from a mainline merge, btw.
> >
> > Initially I was also thinking about get_random_int. But stack protection
> > depends on non-predictable numbers to ensure it cannot be defeated. As
> > get_random_int depends on MD5 which is assumed to be broken now, I
> > discarded the idea of using get_random_int.

The original use of get_random_int() was for applications where the
speed impact of using a heavier-weight cryptographic primitive was not
something which could be tolerated. However, the strength of
get_random_int() is actually pretty good. Note that we never expose
the full MD5 hash; we only export the first 32 bits of the hash. So
even if you ignore the effects of:

        hash[0] += current->pid + jiffies + get_cycles();

What we effectively have is a deterministic RNG which is using MD5,
where the secret "key" is an initially seeded random value, and the
state counter is the MD5 hash accumulator, where we only expose the
first 32 bits with each turn of the crank.

Now, MD5 has been cracked, but it's been cracked as a cryptographic
checksum --- that is, given a particular MD5 hash, it is possible to
find an input value which will result in that hash. That doesn't
necessarily mean that it can be possible to take a stream of numbers
produced by using the MD5 core in this particular RNG configuration,
and determine the secret value used for the RNG (a collision attack
allows you to find a possible input value; that value may not be the
one used as the secret). That being said, it's not a question which
has been studied extensively by cryptographers, and so I can easily
see how people might be paranoid about whether this approach is good
enough.

In the case of initializing 16 bytes of randomness passed to userspace
after an exec(), performance is presumably not as important, so if
someone wanted to use something that was stronger from a
certificational point of view than get_random_int(), that's certainly
understandable. However, it's not clear to me that replicating the
full /dev/random pool infrastructure if you're never going to mix in
any additional randomness is the best way to go about things.

What I would do instead is use an AES-based cryptographic random
number generator. That is, at boot time, grab enough randomness to
form an AES key, and then use that key to create a cryptographic
random number generator by encrypting a counter with said AES key.
This is a cryptographic primitive which has been very carefully
studied, and for architectures where you have hardware support for
AES (including ARMv8, Power 7, Sparc T4, as well as x86 processors
with the AES-NI instructions), this will be much faster and require
much less memory and CPU resources than replicating the /dev/urandom
infrastructure.

Whether or not we really need this level of paranoia for hardening
stack randomization I'll leave for someone else to decide. Personally,
my philosophy is if someone has managed to get unprivileged shell
access, trying to protect against a privilege escalation attack is
largely hopeless on most Linux systems. The name of the game is to
protect against someone who does not yet have the ability to run
arbitrary unprivileged code on the system of interest. In that case,
the attacker isn't going to be able to get access to the output of
get_random_int(), so even if there was a cryptographic weakness where
an attacker who had access to the get_random_int() output stream could
guess the internal state of the MD5-based RNG, in the case of a remote
attacker, they wouldn't have access to the output of the RNG in the
first place.
Regards, - Ted -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
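A minimal sketch of the construction described in the message above:
seed a key once, then produce output by encrypting an incrementing
counter. Everything here is an assumption made for illustration and
none of it comes from the thread: XTEA stands in for AES only so that
the fragment stays self-contained (a real implementation would use
AES, ideally with hardware support), and the names ctr_drng,
ctr_drng_seed and ctr_drng_next are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical generator state: a fixed key plus a counter. */
struct ctr_drng {
	uint32_t key[4];	/* 128-bit key, seeded exactly once */
	uint64_t counter;	/* incremented for every output block */
};

/* XTEA block encryption (64-bit block, 128-bit key); stand-in for AES. */
static void xtea_encrypt(uint32_t v[2], const uint32_t key[4])
{
	uint32_t v0 = v[0], v1 = v[1], sum = 0;
	const uint32_t delta = 0x9E3779B9;
	int i;

	for (i = 0; i < 32; i++) {
		v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
		sum += delta;
		v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
	}
	v[0] = v0;
	v[1] = v1;
}

/* Seed once with 16 bytes of good entropy; there is no later reseeding. */
void ctr_drng_seed(struct ctr_drng *g, const uint8_t seed[16])
{
	memcpy(g->key, seed, sizeof(g->key));
	g->counter = 0;
}

/* Return 64 output bits: the encryption of the current counter value. */
uint64_t ctr_drng_next(struct ctr_drng *g)
{
	uint32_t block[2];

	block[0] = (uint32_t)g->counter;
	block[1] = (uint32_t)(g->counter >> 32);
	g->counter++;
	xtea_encrypt(block, g->key);
	return ((uint64_t)block[1] << 32) | block[0];
}
```

Because a block cipher is a permutation, two different counter values
can never yield the same output block under one key; predicting the
stream without the key is then as hard as breaking the cipher, which
is the property the proposal relies on.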
[ANNOUNCE] 3.7-ck1, BFS 426 for linux-3.7
These are patches designed to improve system responsiveness and
interactivity with specific emphasis on the desktop, but suitable to
any commodity hardware workload.

Apply to 3.7.x:
http://ck.kolivas.org/patches/3.0/3.7/3.7-ck1/patch-3.7-ck1.bz2
or
http://ck.kolivas.org/patches/3.0/3.7/3.7-ck1/patch-3.7-ck1.lrz

Broken out tarball:
http://ck.kolivas.org/patches/3.0/3.7/3.7-ck1/3.7-ck1-broken-out.tar.bz2
or
http://ck.kolivas.org/patches/3.0/3.7/3.7-ck1/3.7-ck1-broken-out.tar.lrz

Discrete patches:
http://ck.kolivas.org/patches/3.0/3.7/3.7-ck1/patches/

Latest BFS by itself:
http://ck.kolivas.org/patches/bfs/3.0/3.7/3.7-sched-bfs-426.patch

Web:
http://kernel.kolivas.org

Code blog when I feel like it:
http://ck-hack.blogspot.com/

--
-ck
Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue
Alan Cox wrote:
> On Sat, 15 Dec 2012 00:54:48 +
> Eric Wong wrote:
>
> > Applications streaming large files may want to reduce disk spinups and
> > I/O latency by performing large amounts of readahead up front
>
> How does it compare benchmark wise with a user thread or using the
> readahead() call ?

Very well.

My main concern is for the speed of the initial pread()/read() call
after open().

Setting EARLY_EXIT means my test program _exit()s immediately after the
first pread(). In my test program (below), I wait for the background
thread to become ready before open() so I would not take overhead from
pthread_create() into account.

RA=1 uses a pthread + readahead()
Not setting RA uses fadvise (with my patch)

# readahead + pthread.
$ EARLY_EXIT=1 RA=1 time ./first_read 1G
0.00user 0.05system 0:01.37elapsed 3%CPU (0avgtext+0avgdata 600maxresident)k
0inputs+0outputs (1major+187minor)pagefaults 0swaps

# patched fadvise
$ EARLY_EXIT=1 time ./first_read 1G
0.00user 0.00system 0:00.01elapsed 0%CPU (0avgtext+0avgdata 564maxresident)k
0inputs+0outputs (1major+178minor)pagefaults 0swaps

Perhaps I screwed up my readahead() + threads path badly, but there
seems to be a huge benefit in using fadvise with my patch. I'm not sure
why readahead() + thread does so badly, even...

Even if I badly screwed up my use of readahead(), the benefit of my
patch spares others from screwing up when using threads+readahead() :)

FULL_READ
---------
While full, fast reads are not my target use case, there's no
noticeable regression here, either. Results for doing a full, fast
read on the file are closer and fluctuate more between runs.

# readahead + pthread.
$ FULL_READ=1 EARLY_EXIT=1 RA=1 time ./first_read 1G
0.01user 1.10system 0:09.24elapsed 12%CPU (0avgtext+0avgdata 596maxresident)k
0inputs+0outputs (1major+186minor)pagefaults 0swaps

# patched fadvise
$ FULL_READ=1 EARLY_EXIT=1 time ./first_read 1G
0.01user 1.04system 0:09.22elapsed 11%CPU (0avgtext+0avgdata 564maxresident)k
0inputs+0outputs (1major+178minor)pagefaults 0swaps

- 8< --
/* gcc -O2 -Wall -pthread -o first_read first_read.c */
#define _GNU_SOURCE
/* headers required by the calls used below */
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/eventfd.h>
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static int efd1;
static int efd2;

static void * start_ra(void *unused)
{
	struct stat st;
	eventfd_t val;
	int fd;

	/* tell parent to open() */
	assert(eventfd_write(efd1, 1) == 0);

	/* wait for parent to tell us fd is ready */
	assert(eventfd_read(efd2, &val) == 0);
	fd = (int)val;

	assert(fstat(fd, &st) == 0);
	assert(readahead(fd, 0, st.st_size) == 0);

	return NULL;
}

int main(int argc, char *argv[])
{
	char buf[16384];
	pthread_t thr;
	int fd;
	char *do_ra = getenv("RA");

	if (argc != 2) {
		fprintf(stderr, "Usage: strace -T %s LARGE_FILE\n", argv[0]);
		return 1;
	}

	if (do_ra) {
		eventfd_t val;

		efd1 = eventfd(0, 0);
		efd2 = eventfd(0, 0);
		assert(efd1 >= 0 && efd2 >= 0 && "eventfd failed");
		assert(pthread_create(&thr, NULL, start_ra, NULL) == 0);

		/* wait for child thread to spawn */
		assert(eventfd_read(efd1, &val) == 0);
	}

	fd = open(argv[1], O_RDONLY);
	assert(fd >= 0 && "open failed");

	if (do_ra) {
		/* wake up the child thread, give it a chance to run */
		assert(eventfd_write(efd2, fd) == 0);
		sched_yield();
	} else
		assert(posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED) == 0);

	assert(pread(fd, buf, sizeof(buf), 0) == sizeof(buf));

	if (getenv("FULL_READ")) {
		ssize_t r;

		do {
			r = read(fd, buf, sizeof(buf));
		} while (r > 0);
		assert(r == 0 && "EOF not reached");
	}

	/* same variable name as in the command lines above */
	if (getenv("EARLY_EXIT"))
		_exit(0);

	if (do_ra) {
		assert(pthread_join(thr, NULL) == 0);
		assert(close(efd1) == 0);
		assert(close(efd2) == 0);
	}

	assert(close(fd) == 0);

	return 0;
}
- 8< --

Thanks for your interest in this!

--
Eric Wong
Re: [GIT PULL] x86/uapi for 3.8
On 12/15/2012 01:37 PM, Dave Jones wrote:
> On Sat, Dec 15, 2012 at 11:58:00AM -0800, Linus Torvalds wrote:
>
>  > It might also be that it causes some massive corruption at boot time,
>  > but it then requires that that particular memory is actually used. So
>  > maybe it's not so much about the memory map except indirectly.
>
> I wonder if this might explain the XFS corruption I've been seeing
> the last couple days. Won't be able to get at the affected laptop
> until Monday to find out..

It seems somewhat unlikely, but not implausible, since the trampoline
page table is only in use for very brief moments and usually not very
often at all, but if it is just completely screwed and we do fandango
on memory... yes we could have problems.

	-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.
Re: [GIT PULL] x86/uapi for 3.8
> Anybody see anything else? And why do we have to call the get-time
> calls so early? Couldn't we move them later and avoid all the crazy
> "let's create silly magical page tables just for the idiotic EFI
> problems".

We need them anyway... actually the whole point of that patch is to
try to *remove* silly magical page tables just for EFI and use another
set of silly magical page tables we need anyway (for S3 resume, SMP
bootup and so on.)

Reducing the sheer number of silly magical page tables has been a
priority for some time -- I want to get it down to one if we can.

	-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.
Re: [GIT PULL] x86/uapi for 3.8
On Sat, Dec 15, 2012 at 2:05 PM, Yinghai Lu wrote:
> On Sat, Dec 15, 2012 at 1:06 PM, Linus Torvalds wrote:
>>
>> I've reverted the commit.
>
> more than that, 3 commits just after that commit should be reverted at
> the same time.
> they all depend on that commit.

Thanks for pointing that out, and just to make sure I verified that on
my Macbook Air which does use EFI. It was broken by the single revert,
and fixed by the additional three reverts.

Sadly:

> and first checking of that commit, it would have problem with system
> more than 512g
...

That particular bug isn't the cause for my non-EFI problems, since I
don't have that kind of memory.. So there is something else going on
in addition to the bug you found. But good eye.

Anybody see anything else? And why do we have to call the get-time
calls so early? Couldn't we move them later and avoid all the crazy
"let's create silly magical page tables just for the idiotic EFI
problems".

And while I'm asking, why the f*ck did Intel do that crazy EFI thing
in the first place again?

	Linus
Re: [GIT PULL] f2fs: request for tree inclusion
Hi Linus,

I'm seeing that f2fs has not been merged yet.
Could you give me any notice for this? Management priority, or
something else?

BTW, I have added a couple of bug fixes since "for-3.8-merge".
Which is better: sending a [GIT PULL v2], or an additional pull
request after the merge?

Thanks,
Jaegeuk Kim

2012-12-11 (Tue), 16:58 +0900, Jaegeuk Kim:
> Hi Linus,
>
> This is the first pull request for tree inclusion of Flash-Friendly File
> System (F2FS) towards the 3.8 merge window.
>
> http://lwn.net/Articles/518718/
> http://lwn.net/Articles/518988/
> http://en.wikipedia.org/wiki/F2FS
>
> The f2fs has been in the linux-next tree for a while, and several issues
> have been cleared as described in the signed tag below.
> And also, I've done testing f2fs successfully based on Linux 3.7 with
> the following test scenarios.
>
> - Reliability test:
>   Run fsstress on an SSD partition.
>
> - Robustness test:
>   Conduct sudden-power-off and examine the fs consistency repeatedly,
>   while running a reliability test.
>
> So, please pull the f2fs filesystem.
> If I'm missing any issues or made mistakes, please let me know.
>
> Thanks,
> Jaegeuk Kim
>
> The following changes since commit
> 29594404d7fe73cd80eaa4ee8c43dcc53970c60e:
>
>   Linux 3.7 (2012-12-10 19:30:57 -0800)
>
> are available in the git repository at:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git
>   tags/for-3.8-merge
>
> for you to fetch changes up to e6aa9f36b2bfd6b30072c07b34f2a24becf1:
>
>   f2fs: fix tracking parent inode number (2012-12-11 13:43:45 +0900)
>
> Introduce a new file system, Flash-Friendly File System (F2FS), to Linux
> 3.8.
>
> Highlights:
> - Add initial f2fs source codes
> - Fix an endian conversion bug
> - Fix build failures on random configs
> - Fix the power-off-recovery routine
> - Minor cleanup, coding style, and typos patches
>
> Greg Kroah-Hartman (1):
>       f2fs: move proc files to debugfs
>
> Huajun Li (1):
>       f2fs: fix a typo in f2fs documentation
>
> Jaegeuk Kim (22):
>       f2fs: add document
>       f2fs: add on-disk layout
>       f2fs: add superblock and major in-memory structure
>       f2fs: add super block operations
>       f2fs: add checkpoint operations
>       f2fs: add node operations
>       f2fs: add segment operations
>       f2fs: add file operations
>       f2fs: add address space operations for data
>       f2fs: add core inode operations
>       f2fs: add inode operations for special inodes
>       f2fs: add core directory operations
>       f2fs: add xattr and acl functionalities
>       f2fs: add garbage collection functions
>       f2fs: add recovery routines for roll-forward
>       f2fs: update Kconfig and Makefile
>       f2fs: update the f2fs document
>       f2fs: fix endian conversion bugs reported by sparse
>       f2fs: adjust kernel coding style
>       f2fs: resolve build failures
>       f2fs: cleanup the f2fs_bio_alloc routine
>       f2fs: fix tracking parent inode number
>
> Namjae Jeon (10):
>       f2fs: fix the compiler warning for uninitialized use of variable
>       f2fs: show error in case of invalid mount arguments
>       f2fs: remove unneeded memset from init_once
>       f2fs: check read only condition before beginning write out
>       f2fs: remove unneeded initialization
>       f2fs: move error condition for mkdir at proper place
>       f2fs: rewrite f2fs_bio_alloc to make it simpler
>       f2fs: make use of GFP_F2FS_ZERO for setting gfp_mask
>       f2fs: remove redundant call to f2fs_put_page in delete entry
>       f2fs: introduce accessor to retrieve number of dentry slots
>
> Sachin Kamat (1):
>       f2fs: remove unneeded version.h header file from f2fs.h
>
> Wei Yongjun (1):
>       f2fs: remove unused variable
>
>  Documentation/filesystems/00-INDEX |    2 +
>  Documentation/filesystems/f2fs.txt |  421 +
>  fs/Kconfig                         |    1 +
>  fs/Makefile                        |    1 +
>  fs/f2fs/Kconfig                    |   53 ++
>  fs/f2fs/Makefile                   |    7 +
>  fs/f2fs/acl.c                      |  414 +
>  fs/f2fs/acl.h                      |   57 ++
>  fs/f2fs/checkpoint.c               |  794
>  fs/f2fs/data.c                     |  702 ++
>  fs/f2fs/debug.c                    |  361
>  fs/f2fs/dir.c                      |  672 ++
>  fs/f2fs/f2fs.h                     | 1083 ++
>  fs/f2fs/file.c                     |  636 +
>  fs/f2fs/gc.c                       |  742 +++
>  fs/f2fs/gc.h                       |  117 +++
>  fs/f2fs/hash.c                     |   97 ++
>  fs/f2fs/inode.c                    |  268 ++
>  fs/f2fs/namei.c                    |  503 ++
>  fs/f2fs/node.c                     | 1764 ++
Re: [tip:x86/microcode] x86/microcode_intel_early.c: Early update ucode on Intel's CPU
On 12/15/2012 03:15 PM, Yinghai Lu wrote:
>>
>> That is for the kernel region itself (that code is actually unchanged from
>> the current code), and yes, we could cap that one to _end if there are
>> systems which have bugs in that area. The dynamic page tables map 1G
>> aligned at a time.
>
> dynamic should be 2M too.
>
> AMD system:
>
> http://git.kernel.org/?p=linux/kernel/git/tip/tip.git;a=commitdiff;h=66520ebc2df3fe52eb4792f8101fac573b766baf
>
> BIOS-e820: [mem 0x0001-0x00e037ff] usable
> BIOS-e820: [mem 0x00e03800-0x00fc] reserved
> BIOS-e820: [mem 0x0100-0x011ffeff] usable
>
> the hole is not 1G aligned.
>
> or HT region is from e04000 ?

The HT region starts at 0xfd -- after that reserved region, so I have
no idea what that particular system is trying to do or what its
requirements are (nor what its MTRR setup is, since you didn't post
it.)

	-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.
Re: [tip:x86/microcode] x86/microcode_intel_early.c: Early update ucode on Intel's CPU
On Sat, Dec 15, 2012 at 2:17 PM, H. Peter Anvin wrote:
> On 12/15/2012 02:13 PM, Yinghai Lu wrote:
>>
>> AMD system could have all mem between TOLM and TOHM all WB, and don
>> need to set them in MTRRs entries.
>>
>
> I include the TOM2 mechanism in the overall umbrella of MTRRs for this
> purpose.
>
>> and also your switchover change that handle cross 1G, and 512g, and it
>> is not 1G aligned.
>> for example, if kernel at 4095G+512M, it will map from 4095G+512M to
>> 4096G + 512M.
>
> That is for the kernel region itself (that code is actually unchanged from
> the current code), and yes, we could cap that one to _end if there are
> systems which have bugs in that area. The dynamic page tables map 1G
> aligned at a time.

dynamic should be 2M too.

AMD system:

http://git.kernel.org/?p=linux/kernel/git/tip/tip.git;a=commitdiff;h=66520ebc2df3fe52eb4792f8101fac573b766baf

BIOS-e820: [mem 0x0001-0x00e037ff] usable
BIOS-e820: [mem 0x00e03800-0x00fc] reserved
BIOS-e820: [mem 0x0100-0x011ffeff] usable

the hole is not 1G aligned.

or HT region is from e04000 ?
Re: [PATCH] avoid entropy starvation due to stack protection
On 15.12.2012 at 20:15, Ondřej Bílka wrote:
> Why not use nonblocking pool and seed nonblocking pool only with half of
> collected entropy to get /dev/random in almost all practical scenarios
> nonblocking?

I would not recommend changing /dev/urandom. First, we would change the
characteristic of a kernel interface a lot of user space cryptographic
components rely on. According to Linus that is typically a no-go.
Moreover, the question can be raised, where do we pick the number of
50%, why not 30% or 70%, why (re)seeding it at all?

Also, let us assume we pick 50% and we leave the create_elf_tables
function as is (i.e. it pulls from get_random_bytes), I fear that we do
not win at all. Our discussed problem is the depletion of the entropy
via nonblocking_pool because every execve() syscall requires 128 bits
of data from nonblocking_pool. Even if we seed nonblocking_pool more
rarely, we still deplete the entropy of the input_pool and thus deplete
the entropy a particular user wants for cryptographic purposes.

Thus, my recommendation is to disconnect the system entropy
requirements from the user entropy requirements as much as possible. I
am aware that there are in-kernel cryptographic users that must seed
themselves via the good entropy. And those users shall rather be left
untouched -- i.e. they should still call get_random_bytes. But for
users that do not require cryptographic strength, but a strength
against guessing of a random number on the local system for a decent
time (like the stack protection or ASLR), we can use a slightly less
perfect DRNG which is seeded with good entropy once and never
thereafter.

Ciao
Stephan

> On Thu, Dec 13, 2012 at 08:44:36AM +0100, Stephan Mueller wrote:
>> On 13.12.2012 01:43:21, +0100, Andrew Morton wrote:
>>
>> Hi Andrew,
>>> On Tue, 11 Dec 2012 13:33:04 +0100
>>> Stephan Mueller wrote:
>>>
>>>> Some time ago, I noticed the fact that for every newly
>>>> executed process, the function create_elf_tables requests 16 bytes of
>>>> randomness from get_random_bytes. This is easily visible when calling
>>>>
>>>> while [ 1 ]
>>>> do
>>>>     cat /proc/sys/kernel/random/entropy_avail
>>>>     sleep 1
>>>> done
>>> Please see
>>> http://ozlabs.org/~akpm/mmotm/broken-out/binfmt_elfc-use-get_random_int-to-fix-entropy-depleting.patch
>>>
>>> That patch is about one week from a mainline merge, btw.
>>
>> Initially I was also thinking about get_random_int. But stack protection
>> depends on non-predictable numbers to ensure it cannot be defeated. As
>> get_random_int depends on MD5 which is assumed to be broken now, I
>> discarded the idea of using get_random_int.
>>
>> Moreover, please consider that get_cycles is an architecture-specific
>> function that on some architectures only returns 0 (For all
>> architectures where this is implemented, you have no guarantee that it
>> increments as a high-resolution timer). So, the quality of
>> get_random_int is questionable IMHO for the use as a stack protector.
>>
>> Also note, that other in-kernel users of get_random_bytes may be
>> converted to using the proposed kernel pool to avoid more entropy drainage.
>>
>> Please note that the suggested approach of fully seeding a deterministic
>> RNG never followed by a re-seeding is used elsewhere (e.g. the OpenSSL
>> RNG). Therefore, I think the suggested approach is viable.
>>
>> Ciao
>> Stephan
Re: [GIT PULL] target updates for v3.8-rc1
On Fri, Dec 14, 2012 at 3:53 PM, Nicholas A. Bellinger wrote:
>
> Here are the target updates for v3.8-rc1 merge window code. Please go
> ahead and pull from:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git
>   for-next
>
> Just a heads up that there is a minor merge conflict that you'll
> encounter in target_handle_task_attr() code, that sfr has been carrying
> a fix for recently within -next. After dropping the HEAD section, the
> resolution should look like:

Hmm. This is *not* how I resolved that conflict - that seems to drop
the new

    complete(&cmd->t_transport_stop_comp);

added by Roland in commit 3ea160b3e8f0 ("target: Fix handling of
aborted commands").

So my conflict resolution looks different. Which may be a bug, of
course. Nicholas, Roland, please check my end result,

	Linus
Re: [PATCH] fadvise: perform WILLNEED readahead in a workqueue
On Sat, 15 Dec 2012 00:54:48 +
Eric Wong wrote:

> Applications streaming large files may want to reduce disk spinups and
> I/O latency by performing large amounts of readahead up front

How does it compare benchmark wise with a user thread or using the
readahead() call ?
Re: [GIT PULL] fbdev changes for 3.8
On Sat, Dec 15, 2012 at 01:11:04PM -0800, Linus Torvalds wrote:
 > On Fri, Dec 14, 2012 at 2:22 AM, Tomi Valkeinen wrote:
 > > Hi Linus,
 > >
 > > Florian, the fbdev maintainer, has been very busy lately, so I offered
 > > to send the pull request for fbdev for this merge window.
 >
 > Pulled. However, with this I get the Kconfig question
 >
 >    OMAP2+ Display Subsystem support (OMAP2_DSS) [N/m/y/?] (NEW)
 >
 > which doesn't make a whole lot of sense on x86-64, unless there's
 > something about OMAP2 that I don't know.
 >
 > So I'd suggest making that OMAP2_DSS be dependent on OMAP2. Or at
 > least ARM. Because showing it to anybody else seems insane.
 >
 > Same goes for FB_OMAP2 for that matter. I realize that it's likely
 > nice to get compile testing for this on x86-64 too, but if that's the
 > intent, we need to think about it some more. I don't think it's good
 > to ask actual normal users questions like this just for compile
 > coverage.

This OMAP stuff has been creeping into x86 builds for a while.
Grep from my current build config ..

# CONFIG_OMAP_OCP2SCP is not set
# CONFIG_KEYBOARD_OMAP4 is not set
# CONFIG_OMAP2_DSS is not set
# CONFIG_OMAP_USB2 is not set

There was some other arm-ism that does the same that I'm currently
forgetting, or maybe that got fixed..

	Dave
Re: [tip:x86/microcode] x86/microcode_intel_early.c: Early update ucode on Intel's CPU
On 12/15/2012 02:13 PM, Yinghai Lu wrote:
>
> AMD system could have all mem between TOLM and TOHM all WB, and don
> need to set them in MTRRs entries.
>

I include the TOM2 mechanism in the overall umbrella of MTRRs for this
purpose.

> and also your switchover change that handle cross 1G, and 512g, and it
> is not 1G aligned.
> for example, if kernel at 4095G+512M, it will map from 4095G+512M to
> 4096G + 512M.

That is for the kernel region itself (that code is actually unchanged
from the current code), and yes, we could cap that one to _end if there
are systems which have bugs in that area. The dynamic page tables map
1G aligned at a time.

	-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.
Re: DMAR and DRHD errors[DMAR:[fault reason 06] PTE Read access is not set] Vt-d & intel_iommu
On 12/14/2012 03:32 PM, Don Dutile wrote:
> On 12/13/2012 04:50 AM, Jason Gao wrote:
>> Dear List:
>>
>> Description of problem:
>> After installing CentOS 6.3 (RHEL 6.3) on my Dell R710 server (latest
>> BIOS: version 6.3.0, release date 07/24/2012) and updating to the
>> latest kernel "2.6.32-279.14.1.el6.x86_64", I want to use the Intel
>> 82576 ET Dual Port NIC's SR-IOV feature, assigning VFs to KVM guests.
>> I appended the kernel boot parameter intel_iommu=on; after boot I get
>> the following messages:
>>
>> Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 2
>> Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe65000
>> Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
>> Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 102
>> Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe8a000
>> Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
>> Dec 13 16:58:15 2 kernel: scsi 0:0:32:0: Enclosure DP BACKPLANE1.07 PQ: 0 ANSI: 5
>> Dec 13 16:58:15 2 kernel: DRHD: handling fault status reg 202
>> Dec 13 16:58:15 2 kernel: DMAR:[DMA Read] Request device [03:00.0] fault addr ffe89000
>> Dec 13 16:58:15 2 kernel: DMAR:[fault reason 06] PTE Read access is not set
>>
>> full dmesg detail: http://pastebin.com/BzFQV0jU
>> lspci -vvv full detail: http://pastebin.com/9rP2d1br
>>
>> It's a production server, and I'm not sure if this is a critical
>> problem or how to fix it; any help would be greatly appreciated.
>
> DMAR table does not have an entry for this device to this region.
> Once the driver reconfigs/resets the device to stop polling bios-boot
> cmd rings and use (new) OS (dma-mapped) rings, there's a period of time
> during this transition that the hw is babbling away to an area that is
> no longer mapped.

Maybe some kind of boot PCI quirk is needed to stop the device DMA
activity before enabling the IOMMU?
Re: [PATCH 3.7.0 5/9] i82975x_edac: optimise mode detection
On Sun, 16 Dec 2012 02:12:51 +0530 Arvind R wrote:
>
> Subject: [PATCH 3.7.0 5/9] i82975x_edac: optimise mode detection
>
> Minor optimisation of dual channel symmetric operation. Return
> value changed to bool.

And you moved the function for no reason that is obvious from the patch.

> +/* Return 1 if dual channel mode is active. Else return 0. */

The comment is no longer correct, i.e. you are now returning
true/false, not 0/1.

> +static bool dual_channel_active(void __iomem *mch_window)
> +{
> +	/*
> +	 * We treat interleaved-symmetric configuration as dual-channel.
> +	 * All other configurations are virtual single channel mode.
> +	 * bit-0 of EAP always provides the real channel in error.
> +	 */
> +	u8	drb[2];
> +	int	row;
> +	bool	dualch;
> +
> +	for (dualch = 1, row = 0; dualch &&

um dualch = true

> +			(row < I82975X_NR_CSROWS_PER_CHANNEL); row++) {
> +		drb[0] = readb(mch_window + I82975X_DRB + row);
> +		drb[1] = readb(mch_window + I82975X_DRB + row + 0x80);
> +		dualch &= (drb[0] == drb[1]);

Don't do bit operations on a bool.

> +	}
> +	return dualch;
> +}

--
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
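Both review comments on the loop (initialise the bool with true, and
do not use &= on a bool) go away if the function simply returns at the
first mismatch. A rough userspace sketch of that shape, under stated
assumptions: fake_readb() replaces the kernel's readb() so the fragment
compiles on its own, the window pointer is taken to point at the DRB
base, and NR_ROWS (with its value 4), DRB_CH1_OFFSET and fake_readb
are hypothetical stand-ins, not names from the driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_ROWS        4	/* assumed stand-in for I82975X_NR_CSROWS_PER_CHANNEL */
#define DRB_CH1_OFFSET 0x80	/* channel 1 DRBs sit 0x80 above channel 0, as in the patch */

/* Stand-in for readb(): read one byte from a fake register window. */
static uint8_t fake_readb(const uint8_t *window, size_t offset)
{
	return window[offset];
}

/* Return true only if every row has the same DRB value on both channels. */
bool dual_channel_active(const uint8_t *window)
{
	int row;

	for (row = 0; row < NR_ROWS; row++) {
		uint8_t ch0 = fake_readb(window, row);
		uint8_t ch1 = fake_readb(window, row + DRB_CH1_OFFSET);

		if (ch0 != ch1)
			return false;	/* asymmetric: not dual channel */
	}
	return true;
}
```

The early return keeps the same short-circuit behaviour as the
"dualch &&" loop condition in the patch without needing a flag
variable at all.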
Re: [tip:x86/microcode] x86/microcode_intel_early.c: Early update ucode on Intel's CPU
On Sat, Dec 15, 2012 at 1:40 PM, H. Peter Anvin wrote:
> On 12/15/2012 12:55 PM, Yinghai Lu wrote:
>> Also if we set map too large, could have chance to cover mem hole near
>> 1T for AMD HT system.
>
> Again, should not be cachable in the MTRRs, and even so, is 1G aligned
> already.

AMD system could have all mem between TOLM and TOHM all WB, and don't
need to set them in MTRRs entries.

and also your switchover change that handle cross 1G, and 512g, and it
is not 1G aligned.
for example, if kernel at 4095G+512M, it will map from 4095G+512M to
4096G + 512M.
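The arithmetic behind that example can be checked with a small helper.
This is only an illustrative sketch: crosses_boundary() and the MiB/GiB
macros are hypothetical names, not kernel code. It tests whether a
range [start, start + size) straddles a multiple of a given alignment.

```c
#include <stdbool.h>
#include <stdint.h>

#define MiB (1ULL << 20)
#define GiB (1ULL << 30)

/* True if [start, start + size) straddles a multiple of align. */
bool crosses_boundary(uint64_t start, uint64_t size, uint64_t align)
{
	uint64_t last = start + size - 1;	/* last byte inside the range */

	return (start / align) != (last / align);
}
```

For a 1G mapping starting at 4095G+512M the range ends just below
4096G+512M, so it contains 4096G, which is a multiple of both 1G and
512G (4096 = 8 * 512); a 1G mapping starting at exactly 4095G crosses
neither.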
Re: [PATCH 3.7.0 1/9] i82975x_edac.c: fix style errors
Hi Arvind,

On Sun, 16 Dec 2012 02:08:50 +0530 Arvind R wrote:
>
> Subject: [PATCH 3.7.0 1/9] i82975x_edac.c: fix style errors
>
> splits or shortens extra long lines in source.

Don't do this; except for the one marked below, these add no value.
The line length is a guide.

> -	snprintf(csrow->channels[chan]->dimm->label, EDAC_MC_LABEL_LEN, "DIMM %c%d",
> +	snprintf(csrow->channels[chan]->dimm->label,
> +		EDAC_MC_LABEL_LEN, "DIMM %c%d",

Only keep this one.

--
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au
Re: [GIT PULL] x86/uapi for 3.8
On Sat, Dec 15, 2012 at 1:06 PM, Linus Torvalds wrote:
> On Sat, Dec 15, 2012 at 1:04 PM, Markus Trippelsdorf wrote:
>>
>> So I wonder if the following simple patch might be enough?
>> It fixes the issue for me at least.
>
> Not enough.
>
> It presumably fixes the issue for you by hiding the problem. But if
> you were to boot a kernel with EFI support, it would re-surface.
> Including in any distro kernel that obviously will include EFI support
> in order to handle the generic case.
>
> I've reverted the commit.

more than that, 3 commits just after that commit should be reverted at
the same time.
they all depend on that commit.

and first checking of that commit, it would have problem with system
more than 512g
...
static int insert_identity_mapping(resource_size_t paddr, unsigned long vaddr,
                                   unsigned long size)
...
        pgd_t *vpgd, *ppgd;

        ppgd = __va(real_mode_header->trampoline_pgd) + pgd_index(paddr);

it missed one . we should use

        ppgd = (pgd_t *)__va(real_mode_header->trampoline_pgd) + pgd_index(paddr);

Yinghai
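Why the missing cast only bites above 512G: __va() yields a void *,
and GCC steps a void * one byte at a time, so adding pgd_index(paddr)
moves by that many bytes instead of that many table entries. With
4-level paging each top-level entry covers 512 GiB, so the index is 0
(and the bug invisible) for any physical address below that. A
userspace sketch of the effect, where entry_t, entry_index,
slot_untyped and slot_typed are hypothetical stand-ins for pgd_t,
pgd_index() and the two expressions quoted above:

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { uint64_t val; } entry_t;	/* stand-in for pgd_t (8 bytes) */

#define ENTRY_SHIFT 39		/* one top-level entry covers 2^39 bytes = 512 GiB */
#define ENTRIES     512

/* Stand-in for pgd_index(): which top-level slot covers this address. */
size_t entry_index(uint64_t paddr)
{
	return (size_t)((paddr >> ENTRY_SHIFT) & (ENTRIES - 1));
}

/* Buggy form: arithmetic on an untyped pointer advances in bytes. */
void *slot_untyped(void *table, uint64_t paddr)
{
	return (char *)table + entry_index(paddr);
}

/* Fixed form: cast first, so the arithmetic advances in whole entries. */
entry_t *slot_typed(void *table, uint64_t paddr)
{
	return (entry_t *)table + entry_index(paddr);
}
```

The (char *) cast in slot_untyped() models what GCC does for void *
arithmetic while staying within standard C. For an address of 600 GiB
the typed version lands 8 bytes into the table and the untyped one only
1 byte in, i.e. in the middle of entry 0.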
Re: [tip:x86/microcode] x86/microcode_intel_early.c: Early update ucode on Intel's CPU
On 12/15/2012 12:55 PM, Yinghai Lu wrote:
> On Sat, Dec 15, 2012 at 11:30 AM, H. Peter Anvin wrote:
>> What is the point of only managing 2M at a time? Now you have to have
>> more conditionals and you don't get any more memory efficiency.
>
> We don't need to, because real_data is less than 2M, and ramdisk is
> about 16M.

In other words, you make magic assumptions (some of which are very
wrong in many real-life scenarios -- people can and do use
gigabyte-plus initramfs). That is exactly the wrong thing to do.

Furthermore it doesn't buy you anything, because you still have to
allocate the PMDs.

> Also if we set map too large, could have chance to cover mem hole near
> 1T for AMD HT system.

Again, should not be cachable in the MTRRs, and even so, is 1G aligned
already.

>> Filling arbitrarily into the brk is not acceptable... the brk is an O(1)
>> area and all brk allocations need to be reserved at compile time, so the
>> overflow handling is still necessary.
>
> if run out of BRK, we will get panic, because early_make_pgtable will
> return -1.

And you consider that panic an acceptable failure mode?

> and current BRK already have 64 slop space.
>
> BTW, did you look at smp boot problem with early_level4_pgt version?

No, I have been busy with non-Linux stuff today.

	-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.
Re: [GIT PULL] x86/uapi for 3.8
On Sat, Dec 15, 2012 at 11:58:00AM -0800, Linus Torvalds wrote: > It might also be that it causes some massive corruption at boot time, > but it then requires that that particular memory is actually used. So > maybe it's not so much about the memory map except indirectly. I wonder if this might explain the XFS corruption I've been seeing the last couple days. Won't be able to get at the affected laptop until Monday to find out.. Dave -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] x86/uapi for 3.8
On 12/15/2012 01:06 PM, Linus Torvalds wrote: On Sat, Dec 15, 2012 at 1:04 PM, Markus Trippelsdorf wrote: So I wonder if the following simple patch might be enough? It fixes the issue for me at least. Not enough. It presumably fixes the issue for you by hiding the problem. But if you were to boot a kernel with EFI support, it would re-surface. Including in any distro kernel that obviously will include EFI support in order to handle the generic case. I've reverted the commit. Right... we'll work on fixing it properly. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [tip:x86/microcode] x86/microcode_intel_early.c: Early update ucode on Intel's CPU
The mem hole at 1T should not be marked cachable in the MTRRs. Yinghai Lu wrote: >On Sat, Dec 15, 2012 at 11:30 AM, H. Peter Anvin >wrote: >> What is the point of only managing 2M at a time? Now you have to >have >> more conditionals and you don't get any more memory efficiency. > >We don't need to, because real_data is less than 2M, and ramdisk is >about 16M. > >Also if we set map too large, could have chance to cover mem hole near >1T for AMD HT system. > >> >> Filling arbitrarily into the brk is not acceptable... the brk is an >O(1) >> area and all brk allocations need to be reserved at compile time, so >the >> overflow handling is still necessary. > >if run out of BRK, we will get panic, because early_make_pgtable will >return -1. > >and current BRK already have 64 slop space. > >BTW, did you look at smp boot problem with early_level4_pgt version? > >Yinghai -- Sent from my mobile phone. Please excuse brevity and lack of formatting. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [Resend][PATCH] PM: Move disabling/enabling runtime PM to late suspend/early resume
On Sat, 15 Dec 2012, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki
>
> Currently, the PM core disables runtime PM for all devices right
> after executing subsystem/driver .suspend() callbacks for them
> and re-enables it right before executing subsystem/driver .resume()
> callbacks for them. This may lead to problems when there are
> two devices such that the .suspend() callback executed for one of
> them depends on runtime PM working for the other. In that case,
> if runtime PM has already been disabled for the second device,
> the first one's .suspend() won't work correctly (and analogously
> for resume).
>
> To make those issues go away, make the PM core disable runtime PM
> for devices right before executing subsystem/driver .suspend_late()
> callbacks for them and enable runtime PM for them right after
> executing subsystem/driver .resume_early() callbacks for them. This
> way the potential conflicts between .suspend_late()/.resume_early()
> and their runtime PM counterparts are still prevented from happening,
> but the subtle ordering issues related to disabling/enabling runtime
> PM for devices during system suspend/resume are much easier to avoid.
>
> Reported-and-tested-by: Jan-Matthias Braun
> Signed-off-by: Rafael J. Wysocki

Hi Rafael,

just curious what is the reason for resend? Do you want to gather more
Acks before pushing this upstream?

Thanks.

--
Jiri Kosina
SUSE Labs
Re: [GIT PULL] fbdev changes for 3.8
On Fri, Dec 14, 2012 at 2:22 AM, Tomi Valkeinen wrote: > Hi Linus, > > Florian, the fbdev maintainer, has been very busy lately, so I offered to send > the pull request for fbdev for this merge window. Pulled. However, with this I get the Kconfig question OMAP2+ Display Subsystem support (OMAP2_DSS) [N/m/y/?] (NEW) which doesn't make a whole lot of sense on x86-64, unless there's something about OMAP2 that I don't know. So I'd suggest making that OMAP2_DSS be dependent on OMAP2. Or at least ARM. Because showing it to anybody else seems insane. Same goes for FB_OMAP2 for that matter. I realize that it's likely nice to get compile testing for this on x86-64 too, but if that's the intent, we need to think about it some more. I don't think it's good to ask actual normal users questions like this just for compile coverage. Hmm? Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] x86/uapi for 3.8
On Sat, Dec 15, 2012 at 1:04 PM, Markus Trippelsdorf wrote: > > So I wonder if the following simple patch might be enough? > It fixes the issue for me at least. Not enough. It presumably fixes the issue for you by hiding the problem. But if you were to boot a kernel with EFI support, it would re-surface. Including in any distro kernel that obviously will include EFI support in order to handle the generic case. I've reverted the commit. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [GIT PULL] x86/uapi for 3.8
On 2012.12.15 at 11:58 -0800, Linus Torvalds wrote:
> On Sat, Dec 15, 2012 at 11:41 AM, H. Peter Anvin wrote:
> >
> > Matt is on vacation, and I'm partly offline for the weekend, but that
> > definitely seems suspicious. Do we have a memory map of the affected
> > machine(s)?
>
> but as mentioned, there's bound to be some particular kernel layout
> that triggers this, because I definitely ran a few kernels with that
> commit in it without problems (and clearly other people are too).
> Looking at my boot log, I had successful boots with both 6a57d104c8cb
> and c2714334b944, which contains that commit.
>
> It might also be that it causes some massive corruption at boot time,
> but it then requires that that particular memory is actually used. So
> maybe it's not so much about the memory map except indirectly.
>
> But that commit *does* look a lot more likely than the things I looked at.

The commit message says that only some broken implementations of EFI
firmware require the mapping for the physical I/O device addresses. So I
wonder if the following simple patch might be enough?
It fixes the issue for me at least.

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index e190f7b..402e4ca 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -50,7 +50,7 @@ int ioremap_change_attr(unsigned long vaddr, unsigned long size,
 	return err;
 }
 
-#ifdef CONFIG_X86_64
+#ifdef CONFIG_EFI
 static void ident_pte_range(unsigned long paddr, unsigned long vaddr,
 		pmd_t *ppmd, pmd_t *vpmd, unsigned long end)
 {

--
Markus
Re: [tip:x86/microcode] x86/microcode_intel_early.c: Early update ucode on Intel's CPU
On Sat, Dec 15, 2012 at 11:30 AM, H. Peter Anvin wrote: > What is the point of only managing 2M at a time? Now you have to have > more conditionals and you don't get any more memory efficiency. We don't need to, because real_data is less than 2M, and ramdisk is about 16M. Also if we set map too large, could have chance to cover mem hole near 1T for AMD HT system. > > Filling arbitrarily into the brk is not acceptable... the brk is an O(1) > area and all brk allocations need to be reserved at compile time, so the > overflow handling is still necessary. if run out of BRK, we will get panic, because early_make_pgtable will return -1. and current BRK already have 64 slop space. BTW, did you look at smp boot problem with early_level4_pgt version? Yinghai -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[ANNOUNCE] Git v1.8.1-rc2
A release candidate Git v1.8.1-rc2 is now available for testing at the usual places. The release tarballs are found at: http://code.google.com/p/git-core/downloads/list and their SHA-1 checksums are: 0a65a3d203b8d6e320f15abb040e1137e333c967 git-1.8.1.rc2.tar.gz e6bc111686e6864cc3f078b9523ef1057a7fff8f git-htmldocs-1.8.1.rc2.tar.gz 2c97472ae861454ff445868c40b49db66fa09f50 git-manpages-1.8.1.rc2.tar.gz Also the following public repositories all have a copy of the v1.8.1-rc2 tag and the master branch that the tag points at: url = git://repo.or.cz/alt-git.git url = https://code.google.com/p/git-core/ url = git://git.sourceforge.jp/gitroot/git-core/git.git url = git://git-core.git.sourceforge.net/gitroot/git-core/git-core url = https://github.com/gitster/git Git v1.8.1 Release Notes (draft) Backward compatibility notes In the next major release (not *this* one), we will change the behavior of the "git push" command. When "git push [$there]" does not say what to push, we have used the traditional "matching" semantics so far (all your branches were sent to the remote as long as there already are branches of the same name over there). We will use the "simple" semantics that pushes the current branch to the branch with the same name, only when the current branch is set to integrate with that remote branch. There is a user preference configuration variable "push.default" to change this, and "git push" will warn about the upcoming change until you set this variable in this release. "git branch --set-upstream" is deprecated and may be removed in a relatively distant future. "git branch [-u|--set-upstream-to]" has been introduced with a saner order of arguments to replace it. Updates since v1.8.0 UI, Workflows & Features * Command-line completion scripts for tcsh and zsh have been added. * A new remote-helper interface for Mercurial has been added to contrib/remote-helpers. 
* We used to have a workaround for a bug in ancient "less" that causes it to exit without any output when the terminal is resized. The bug has been fixed in "less" version 406 (June 2007), and the workaround has been removed in this release. * Some documentation pages that used to ship only in the plain text format are now formatted in HTML as well. * "git-prompt" scriptlet (in contrib/completion) can be told to paint pieces of the hints in the prompt string in colors. * A new configuration variable "diff.context" can be used to give the default number of context lines in the patch output, to override the hardcoded default of 3 lines. * When "git checkout" checks out a branch, it tells the user how far behind (or ahead) the new branch is relative to the remote tracking branch it builds upon. The message now also advises how to sync them up by pushing or pulling. This can be disabled with the advice.statusHints configuration variable. * "git config --get" used to diagnose presence of multiple definitions of the same variable in the same configuration file as an error, but it now applies the "last one wins" rule used by the internal configuration logic. Strictly speaking, this may be an API regression but it is expected that nobody will notice it in practice. * "git log -p -S" now looks for the after applying the textconv filter (if defined); earlier it inspected the contents of the blobs without filtering. * "git format-patch" learned the "--notes=" option to give notes for the commit after the three-dash lines in its output. * "git log --grep=" learned to honor the "grep.patterntype" configuration set to "perl". * "git replace -d " now interprets as an extended SHA-1 (e.g. HEAD~4 is allowed), instead of only accepting full hex object name. * "git rm $submodule" used to punt on removing a submodule working tree to avoid losing the repository embedded in it. 
Because recent git uses a mechanism to separate the submodule repository from the submodule working tree, "git rm" learned to detect this case and removes the submodule working tree when it is safe to do so. * "git send-email" used to prompt for the sender address, even when the committer identity is well specified (e.g. via user.name and user.email configuration variables). The command no longer gives this prompt when not necessary. * "git send-email" did not allow non-address garbage strings to appear after addresses on Cc: lines in the patch files (and when told to pick them up to find more recipients), e.g. Cc: Stable Kernel # for v3.2 and up The command now strips " # for v3.2 and up" part before adding the remainder of this line to the list of recipients. * "git submodule add" learned to add a new submodule at the same path as the path where an unrelated submodule was bound to in an existing revision via the "--name" option. * "gi
Re: [PATCH] clk: factor: calculate rate by do_div
On Sat, Dec 15, 2012 at 8:41 AM, Haojian Zhuang wrote:
> On Tue, Dec 4, 2012 at 9:32 AM, Haojian Zhuang wrote:
>> On Mon, Dec 3, 2012 at 4:14 PM, Haojian Zhuang wrote:
>>> clk->rate = parent->rate / div * mult
>>>
>>> The formula is OK. But it may overflow while we do operate with
>>> unsigned long. So use do_div instead.
>>>
>>> Signed-off-by: Haojian Zhuang
>>> ---
>>>  drivers/clk/clk-fixed-factor.c |    5 ++++-
>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/clk/clk-fixed-factor.c b/drivers/clk/clk-fixed-factor.c
>>> index a489985..1ef271e 100644
>>> --- a/drivers/clk/clk-fixed-factor.c
>>> +++ b/drivers/clk/clk-fixed-factor.c
>>> @@ -28,8 +28,11 @@ static unsigned long clk_factor_recalc_rate(struct clk_hw *hw,
>>>  		unsigned long parent_rate)
>>>  {
>>>  	struct clk_fixed_factor *fix = to_clk_fixed_factor(hw);
>>> +	unsigned long long int rate;
>>>
>>> -	return parent_rate * fix->mult / fix->div;
>>> +	rate = (unsigned long long int)parent_rate * fix->mult;
>>> +	do_div(rate, fix->div);
>>> +	return (unsigned long)rate;
>>>  }
>>>
>>>  static long clk_factor_round_rate(struct clk_hw *hw, unsigned long rate,
>>> --
>>> 1.7.10.4
>>>
>>
>> Correct Mike's email address.
>
> Any comments? Does it mean that nobody want to fix the bug?

Thanks for the patch. My apologies for letting this one slip through
the cracks but my normal email workflow was unavoidably disrupted and I
find myself playing catch-up with pending patches.

The patch looks good to me but I'll change the $SUBJECT to "clk:
fixed-factor: round_rate should use do_div" and do some testing before
taking it in.

Regards,
Mike
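The overflow the patch addresses can be sketched in userspace as follows. This assumes a 32-bit unsigned long (modelled with uint32_t), and a plain 64-bit division stands in for the kernel's do_div() helper. The function names rate_narrow() and rate_wide() are hypothetical, chosen for this illustration only.

```c
#include <stdint.h>

/*
 * Models the original expression on a machine where unsigned long is
 * 32 bits: the product wraps modulo 2^32 before the division happens.
 */
static uint32_t rate_narrow(uint32_t parent_rate, uint32_t mult, uint32_t div)
{
	return parent_rate * mult / div;
}

/*
 * Models the patched code: widen to 64 bits before multiplying, then
 * divide. Plain '/=' stands in for do_div() in this sketch.
 */
static uint32_t rate_wide(uint32_t parent_rate, uint32_t mult, uint32_t div)
{
	uint64_t rate = (uint64_t)parent_rate * mult;

	rate /= div;
	return (uint32_t)rate;
}
```

With a 400 MHz parent, mult = 12 and div = 2, the intermediate product (4.8e9) does not fit in 32 bits, so rate_narrow() returns a wrapped value while rate_wide() returns the expected 2.4 GHz. In the kernel, do_div() divides its first argument in place and returns the remainder, which is why the patch uses it as a statement rather than an expression.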
Re: [ 00/22] 3.0.57-stable review
On Sat, Dec 15, 2012 at 7:25 AM, Shuah Khan wrote: > On Fri, Dec 14, 2012 at 3:25 PM, Greg Kroah-Hartman > wrote: >> This is the start of the stable review cycle for the 3.0.57 release. >> There are 22 patches in this series, all will be posted as a response >> to this one. If anyone has any issues with these being applied, please >> let me know. >> >> Responses should be made by Sun Dec 16 22:16:57 UTC 2012. >> Anything received after that time might be too late. >> >> The whole patch series can be found in one patch at: >> kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.57-rc1.gz >> and the diffstat can be found below. >> >> thanks, >> >> greg k-h > > Patches applied cleanly to 3.0.y, 3.4.y, 3.6.y, and 3.7.y. > Compiled and booted on the following systems: > HP EliteBook 6930p Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz > HP ProBook 6475b AMD A10-4600M APU with Radeon(tm) HD Graphics > > Started cross-compile tests and will report the status. > > -- Shuah Cross-compile tests: alpha: defconfig passed on all arm: defconfig passed on all c6x: not applicable to 3.0.y, defconfig passed on the rest three mips: defconfig passed on all mipsel: defconfig passed on all powerpc: wii_defconfig failed on 3.0.y (known issue), passed on the rest three sh: defconfig passed on all sparc: defconfig passed on all tile: tilegx_defconfig passed on all -- Shuah -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ 00/28] 3.4.24-stable review
On Sat, Dec 15, 2012 at 7:27 AM, Shuah Khan wrote: > On Fri, Dec 14, 2012 at 3:26 PM, Greg Kroah-Hartman > wrote: >> This is the start of the stable review cycle for the 3.4.24 release. >> There are 28 patches in this series, all will be posted as a response >> to this one. If anyone has any issues with these being applied, please >> let me know. >> >> Responses should be made by Sun Dec 16 22:16:59 UTC 2012. >> Anything received after that time might be too late. >> >> The whole patch series can be found in one patch at: >> kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.4.24-rc1.gz >> and the diffstat can be found below. >> >> thanks, >> >> greg k-h >> > > Patches applied cleanly to 3.0.y, 3.4.y, 3.6.y, and 3.7.y. > Compiled and booted on the following systems: > HP EliteBook 6930p Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz > HP ProBook 6475b AMD A10-4600M APU with Radeon(tm) HD Graphics > > Started cross-compile tests and will report status. > > -- Shuah Cross-compile tests: alpha: defconfig passed on all arm: defconfig passed on all c6x: not applicable to 3.0.y, defconfig passed on the rest three mips: defconfig passed on all mipsel: defconfig passed on all powerpc: wii_defconfig failed on 3.0.y (known issue), passed on the rest three sh: defconfig passed on all sparc: defconfig passed on all tile: tilegx_defconfig passed on all -- Shuah -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ 00/37] 3.6.11-stable review
On Sat, Dec 15, 2012 at 7:24 AM, Shuah Khan wrote: > On Fri, Dec 14, 2012 at 4:00 PM, Greg Kroah-Hartman > wrote: >> Note: This is going to be the last 3.6.y kernel release, unless >> something major comes up, everyone should be moving to the 3.7.y kernel >> at this point in time. >> >> This is the start of the stable review cycle for the 3.6.11 release. >> There are 37 patches in this series, all will be posted as a response >> to this one. If anyone has any issues with these being applied, please >> let me know. >> >> Responses should be made by Sun Dec 16 22:16:49 UTC 2012. >> Anything received after that time might be too late. >> >> The whole patch series can be found in one patch at: >> kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.6.11-rc1.gz >> and the diffstat can be found below. >> >> thanks, >> >> greg k-h > > Patches applied cleanly to 3.0.y, 3.4.y, 3.6.y, and 3.7.y. > Compiled and booted on the following systems: > HP EliteBook 6930p Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz > HP ProBook 6475b AMD A10-4600M APU with Radeon(tm) HD Graphics > > I started cross-compile tests and will report the status. > > -- Shuah Cross-compile tests: alpha: defconfig passed on all arm: defconfig passed on all c6x: not applicable to 3.0.y, defconfig passed on the rest three mips: defconfig passed on all mipsel: defconfig passed on all powerpc: wii_defconfig failed on 3.0.y (known issue), passed on the rest three sh: defconfig passed on all sparc: defconfig passed on all tile: tilegx_defconfig passed on all -- Shuah -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [ 00/27] 3.7.1-stable review
On Sat, Dec 15, 2012 at 7:22 AM, Shuah Khan wrote:
> On Fri, Dec 14, 2012 at 4:01 PM, Greg Kroah-Hartman wrote:
>> This is the start of the stable review cycle for the 3.7.1 release.
>> There are 27 patches in this series, all will be posted as a response
>> to this one. If anyone has any issues with these being applied, please
>> let me know.
>>
>> Responses should be made by Sun Dec 16 22:16:56 UTC 2012.
>> Anything received after that time might be too late.
>>
>> The whole patch series can be found in one patch at:
>> kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.7.1-rc1.gz
>> and the diffstat can be found below.
>>
>> thanks,
>>
>> greg k-h
>
> Patches applied cleanly to 3.0.y, 3.4.y, 3.6.y, and 3.7.y.
> Compiled and booted on the following systems:
> HP EliteBook 6930p Intel(R) Core(TM)2 Duo CPU T9400 @ 2.53GHz
> HP ProBook 6475b AMD A10-4600M APU with Radeon(tm) HD Graphics
>
> I started cross-compile tests and will report the status
>

Cross-compile tests:
alpha: defconfig passed on all
arm: defconfig passed on all
c6x: not applicable to 3.0.y, defconfig passed on the rest three
mips: defconfig passed on all
mipsel: defconfig passed on all
powerpc: wii_defconfig failed on 3.0.y (known issue), passed on the rest three
sh: defconfig passed on all
sparc: defconfig passed on all
tile: tilegx_defconfig passed on all

-- Shuah
[PATCH 3.7.0 9/9] i82975x_edac: set sw-scrub mode, bump rev.
Subject: [PATCH 3.7.0 9/9] i82975x_edac: set sw-scrub mode, bump rev.

update revision number and enable software scrub mode.

Signed-off-by: Arvind R.
---
 i82975x_edac.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/edac/i82975x_edac.c	2012-12-15 23:08:28.0 +0530
+++ b/drivers/edac/i82975x_edac.c	2012-12-15 23:09:29.0 +0530
@@ -16,7 +16,7 @@
 #include
 #include "edac_core.h"
 
-#define I82975X_REVISION	" Ver: 1.0.0"
+#define I82975X_REVISION	" Ver: 2.0.0"
 #define EDAC_MOD_STR	"i82975x_edac"
 
 #define i82975x_printk(level, fmt, arg...) \
@@ -586,7 +586,7 @@ static int i82975x_probe1(struct pci_dev
 	pvt = (struct i82975x_pvt *) mci->pvt_info;
 	pvt->chip = dev_idx;
 	i82975x_init_csrows(mci, mch_window);
-	mci->scrub_mode = SCRUB_HW_SRC;
+	mci->scrub_mode = SCRUB_SW_SRC;
 	i82975x_get_error_info(mci, &discard); /* clear counters */
 
 	/* finalize this instance of memory controller with edac core */
[PATCH 3.7.0 8/9] i82975x_edac: fix wrong offset reported
Subject: [PATCH 3.7.0 8/9] i82975x_edac: fix wrong offset reported

Cleanup error reporting function. This also corrects the wrong
calculation of the offset mask.

Signed-off-by: Arvind R.
---
 i82975x_edac.c | 59 +-
 1 file changed, 25 insertions(+), 34 deletions(-)

--- a/drivers/edac/i82975x_edac.c	2012-12-15 23:55:02.0 +0530
+++ b/drivers/edac/i82975x_edac.c	2012-12-15 23:54:00.0 +0530
@@ -34,6 +34,8 @@
 #define	I82975X_NR_CSROWS_PER_CHANNEL	4
 #define	I82975X_NR_CSROWS_PER_DIMM	2
 
+#define	I82975X_ECC_GRAIN	(1 << 7)
+
 /* Intel 82975X register addresses - device 0 function 0 - DRAM Controller */
 #define I82975X_EAP	0x58	/* Dram Error Address Pointer (32b)
 				 *
@@ -205,6 +207,10 @@ NOTE: Only ONE of the three must be enab
 #define I82975X_DRC_CH0M1	0x124
 #define I82975X_DRC_CH1M1	0x1A4
 
+#define	I82975X_BIT_ERROR_CE	0x01
+#define	I82975X_BIT_ERROR_UE	0x02
+#define	I82975X_BITS_ERROR	0x03
+
 enum i82975x_chips {
 	I82975X_chip = 0,
 };
@@ -239,7 +245,7 @@ static struct pci_dev *mci_pdev;	/* init
 
 static int i82975x_registered = 1;
 
-static void i82975x_get_error_info(struct mem_ctl_info *mci,
+static bool i82975x_get_error_info(struct mem_ctl_info *mci,
 		struct i82975x_error_info *info)
 {
 	struct pci_dev *pdev;
@@ -258,7 +264,8 @@ static void i82975x_get_error_info(struc
 	pci_read_config_byte(pdev, I82975X_DERRSYN, &info->derrsyn);
 	pci_read_config_word(pdev, I82975X_ERRSTS, &info->errsts2);
 
-	pci_write_bits16(pdev, I82975X_ERRSTS, 0x0003, 0x0003);
+	pci_write_bits16(pdev, I82975X_ERRSTS, I82975X_BITS_ERROR,
+			I82975X_BITS_ERROR);
 
 	/*
 	 * If the error is the same then we can for both reads then
@@ -266,31 +273,30 @@ static void i82975x_get_error_info(struc
 	 * there is a CE no info and the second set of reads is valid
 	 * and should be UE info.
 	 */
-	if (!(info->errsts2 & 0x0003))
-		return;
+	if (!(info->errsts2 & I82975X_BITS_ERROR))
+		return false;
 
-	if ((info->errsts ^ info->errsts2) & 0x0003) {
+	if ((info->errsts ^ info->errsts2) & I82975X_BITS_ERROR) {
 		pci_read_config_dword(pdev, I82975X_EAP, &info->eap);
 		pci_read_config_byte(pdev, I82975X_XEAP, &info->xeap);
 		pci_read_config_byte(pdev, I82975X_DES, &info->des);
 		pci_read_config_byte(pdev, I82975X_DERRSYN, &info->derrsyn);
 	}
+	return true;
 }
 
 static int i82975x_process_error_info(struct mem_ctl_info *mci,
 		struct i82975x_error_info *info, int handle_errors)
 {
+	enum hw_event_mc_err_type err_type;
 	int row, chan;
 	unsigned long offst, page;
 
-	if (!(info->errsts2 & 0x0003))
-		return 0;
-
 	if (!handle_errors)
 		return 1;
 
-	if ((info->errsts ^ info->errsts2) & 0x0003) {
+	if ((info->errsts ^ info->errsts2) & I82975X_BITS_ERROR) {
 		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 1, 0, 0, 0,
 				-1, -1, -1, "UE overwrote CE", "");
 		info->errsts = info->errsts2;
@@ -302,30 +308,15 @@ static int i82975x_process_error_info(st
 	page |= 0x8000;
 	page >>= (PAGE_SHIFT - 1);
 	row = edac_mc_find_csrow_by_page(mci, page);
+	chan = (mci->num_cschannel == 1) ? 0 : info->eap & 1;
+	offst = info->eap & ((1 << PAGE_SHIFT) - I82975X_ECC_GRAIN);
 
-	if (row == -1) {
-		i82975x_mc_printk(mci, KERN_ERR, "error processing EAP:\n"
-			"\tXEAP=%u\n"
-			"\t EAP=0x%08x\n"
-			"\tPAGE=0x%08x\n",
-			(info->xeap & 1) ? 1 : 0, info->eap, (unsigned) page);
-		return 0;
-	}
-	chan = (mci->csrows[row]->nr_channels == 1) ? 0 : info->eap & 1;
-	offst = info->eap
-			& ((1 << PAGE_SHIFT) -
-			(1 << mci->csrows[row]->channels[chan]->dimm->grain));
-
-	if (info->errsts & 0x0002)
-		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 1,
-				page, offst, 0,
-				row, -1, -1,
-				"i82975x UE", "");
-	else
-		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 1,
+	err_type = (info->errsts & I82975X_BIT_ERROR_UE) ?
+			HW_EVENT_ERR_UNCORRECTED : HW_EVENT_ERR_CORRECTED;
+	edac_mc_handle_error(err_type, mci, 1,
 				page, offst, info->derrsyn,
-				row, chan ? chan : 0, -1,
-
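As a rough illustration of the offset mask the patch above introduces, here is a small userspace sketch. It assumes 4 KiB pages (PAGE_SHIFT of 12), which the patch itself does not state, and the SKETCH_* names and error_offset() are hypothetical.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12		/* assumption: 4 KiB pages */
#define SKETCH_ECC_GRAIN  (1u << 7)	/* 128 bytes, as in I82975X_ECC_GRAIN */

/*
 * Offset of the error within its page, rounded down to the ECC grain.
 * (1 << 12) - 128 = 0xF80, which keeps address bits 7..11.
 */
static uint32_t error_offset(uint32_t eap)
{
	return eap & ((1u << SKETCH_PAGE_SHIFT) - SKETCH_ECC_GRAIN);
}
```

The removed lines built the mask from 1 << dimm->grain. Since patch 7/9 in this series shows dimm->grain being set to 1 << 7, that expression would shift by 128 rather than by 7, which is presumably the "wrong calculation" the changelog refers to.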
[PATCH 3.7.0 7/9] i82975x_edac: correct dimm label initialisation
Subject: [PATCH 3.7.0 7/9] i82975x_edac: correct dimm label initialisation

DIMM label are the legends on the mobo. Fix their initialisation to
correspond to the legends. Channels are designated A/B. A single DIMM
occupies 2 ranks. And the first DIMM is 1, not 0. This is as found in
Asus P5WDG2 family of mobos. This patch maps to that.

Signed-off-by: Arvind R.
---
 i82975x_edac.c | 21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

--- a/drivers/edac/i82975x_edac.c	2012-12-15 22:32:00.0 +0530
+++ b/drivers/edac/i82975x_edac.c	2012-12-15 22:42:00.0 +0530
@@ -32,6 +32,7 @@
 #define I82975X_NR_DIMMS	8
 #define I82975X_NR_CSROWS(nr_chans)	(I82975X_NR_DIMMS / (nr_chans))
 #define	I82975X_NR_CSROWS_PER_CHANNEL	4
+#define	I82975X_NR_CSROWS_PER_DIMM	2
 
 /* Intel 82975X register addresses - device 0 function 0 - DRAM Controller */
 #define I82975X_EAP	0x58	/* Dram Error Address Pointer (32b)
@@ -339,13 +340,13 @@ static void i82975x_check(struct mem_ctl
 }
 
 static void i82975x_init_csrows(struct mem_ctl_info *mci,
-		struct pci_dev *pdev, void __iomem *mch_window)
+		void __iomem *mch_window)
 {
 	struct csrow_info *csrow;
 	unsigned long last_cumul_size;
 	u8 value;
 	u32 cumul_size, nr_pages;
-	int index, chan;
+	unsigned index, chan;
 	struct dimm_info *dimm;
 
 	last_cumul_size = 0;
@@ -370,7 +371,8 @@ static void i82975x_init_csrows(struct m
 		 * Adjust cumul_size w.r.t number of channels
 		 *
 		 */
-		if (csrow->nr_channels > 1)
+		if (mci->num_cschannel > 1)
+			/* dual_channel symmetric */
 			cumul_size <<= 1;
 		edac_dbg(3, "(%d) cumul_size 0x%x\n", index, cumul_size);
 
@@ -384,15 +386,18 @@ static void i82975x_init_csrows(struct m
 		 * [0-7] for single-channel; i.e. csrow->nr_channels = 1
 		 * [0-3] for dual-channel; i.e. csrow->nr_channels = 2
 		 */
-		for (chan = 0; chan < csrow->nr_channels; chan++) {
+		for (chan = 0; chan < mci->num_cschannel; chan++) {
 			dimm = mci->csrows[index]->channels[chan]->dimm;
-			dimm->nr_pages = nr_pages / csrow->nr_channels;
+			dimm->nr_pages = nr_pages / mci->num_cschannel;
 			snprintf(csrow->channels[chan]->dimm->label,
 				 EDAC_MC_LABEL_LEN, "DIMM %c%d",
-				 (chan == 0) ? 'A' : 'B',
-				 index);
+				 ((mci->num_cschannel <= 1) ?
+					index / I82975X_NR_CSROWS_PER_CHANNEL :
+					chan) + 'A',
+				 ((index % I82975X_NR_CSROWS_PER_CHANNEL) /
+					I82975X_NR_CSROWS_PER_DIMM) + 1);
 			dimm->grain = 1 << 7;	/* always */
 			dimm->dtype = DEV_X8;	/* only with ECC */
 			dimm->mtype = MEM_DDR2;	/* only supported */
@@ -589,7 +594,7 @@ static int i82975x_probe1(struct pci_dev
 	mci->ctl_page_to_phys = NULL;
 	pvt = (struct i82975x_pvt *) mci->pvt_info;
 	pvt->chip = dev_idx;
-	i82975x_init_csrows(mci, pdev, mch_window);
+	i82975x_init_csrows(mci, mch_window);
 	mci->scrub_mode = SCRUB_HW_SRC;
 	i82975x_get_error_info(mci, &discard); /* clear counters */
 
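The label arithmetic in the patch above can be tried out in isolation with a small userspace sketch. The constants mirror I82975X_NR_CSROWS_PER_CHANNEL and I82975X_NR_CSROWS_PER_DIMM from the patch; the function name dimm_label() and its parameter list are hypothetical and only serve this illustration.

```c
#include <stdio.h>

#define SKETCH_CSROWS_PER_CHANNEL 4
#define SKETCH_CSROWS_PER_DIMM    2

/*
 * Format a label such as "DIMM A1" into buf.
 * num_channels: 1 for single-channel, 2 for dual-channel symmetric
 * index: chip-select row index, chan: channel within that row
 */
static void dimm_label(char *buf, size_t len, unsigned num_channels,
		       unsigned index, unsigned chan)
{
	/* Single-channel: rows 0-3 get letter A, rows 4-7 get letter B. */
	unsigned channel = (num_channels <= 1) ?
		index / SKETCH_CSROWS_PER_CHANNEL : chan;
	/* Two rows (ranks) per DIMM; DIMM numbering starts at 1. */
	unsigned slot = (index % SKETCH_CSROWS_PER_CHANNEL) /
		SKETCH_CSROWS_PER_DIMM + 1;

	snprintf(buf, len, "DIMM %c%u", (char)('A' + channel), slot);
}
```

In single-channel mode, row index 5 maps to "DIMM B1"; in dual-channel mode, row index 2 on channel 1 maps to "DIMM B2". Whether these match the silkscreen on a given board is something only the hardware can confirm; the changelog says the mapping follows the Asus P5WDG2 family.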
[PATCH 3.7.0 6/9] i82975x_edac: unmap pcibar after init
Subject: [PATCH 3.7.0 6/9] i82975x_edac: unmap pcibar after init

Remove the unnecessary mapped window in private data structure. Then
the window can be unmapped right after driver initialisation is done.

Signed-off-by: Arvind R.
---
 i82975x_edac.c | 24 ++++++++----------------
 1 file changed, 8 insertions(+), 16 deletions(-)

--- a/drivers/edac/i82975x_edac.c	2012-12-16 00:09:23.0 +0530
+++ b/drivers/edac/i82975x_edac.c	2012-12-16 00:08:52.0 +0530
@@ -205,11 +205,11 @@ NOTE: Only ONE of the three must be enab
 #define I82975X_DRC_CH1M1	0x1A4
 
 enum i82975x_chips {
-	I82975X = 0,
+	I82975X_chip = 0,
 };
 
 struct i82975x_pvt {
-	void __iomem *mch_window;
+	enum i82975x_chips chip;
 };
 
 struct i82975x_dev_info {
@@ -227,7 +227,7 @@ struct i82975x_error_info {
 };
 
 static const struct i82975x_dev_info i82975x_devs[] = {
-	[I82975X] = {
+	[I82975X_chip] = {
 		.ctl_name = "i82975x"
 	},
 };
@@ -588,7 +588,7 @@ static int i82975x_probe1(struct pci_dev
 	mci->edac_check = i82975x_check;
 	mci->ctl_page_to_phys = NULL;
 	pvt = (struct i82975x_pvt *) mci->pvt_info;
-	pvt->mch_window = mch_window;
+	pvt->chip = dev_idx;
 	i82975x_init_csrows(mci, pdev, mch_window);
 	mci->scrub_mode = SCRUB_HW_SRC;
 	i82975x_get_error_info(mci, &discard); /* clear counters */
@@ -596,15 +596,13 @@ static int i82975x_probe1(struct pci_dev
 	/* finalize this instance of memory controller with edac core */
 	if (edac_mc_add_mc(mci)) {
 		edac_dbg(0, "MC%d failed add_mc()\n", dev_idx);
-		goto fail2;
+		edac_mc_free(mci);
+		goto fail1;
 	}
 
 	/* get this far and it's successful */
 	edac_dbg(3, "MC%d success\n", dev_idx);
-	return 0;
-
-fail2:
-	edac_mc_free(mci);
+	rc = 0;
 
 fail1:
 	iounmap(mch_window);
@@ -632,23 +630,17 @@ static int __devinit i82975x_init_one(st
 static void __devexit i82975x_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
-	struct i82975x_pvt *pvt;
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (mci == NULL)
 		return;
-
-	pvt = mci->pvt_info;
-	if (pvt->mch_window)
-		iounmap( pvt->mch_window );
-
 	edac_mc_free(mci);
 }
 
 static DEFINE_PCI_DEVICE_TABLE(i82975x_pci_tbl) = {
 	{
 		PCI_VEND_DEV(INTEL, 82975_0), PCI_ANY_ID, PCI_ANY_ID, 0, 0,
-		I82975X
+		I82975X_chip
 	},
 	{
 		0,
[PATCH 3.7.0 5/9] i82975x_edac: optimise mode detection
Subject: [PATCH 3.7.0 5/9] i82975x_edac: optimise mode detection

Minor optimisation of dual channel symmetric operation.
Return value changed to bool.

Signed-off-by: Arvind R.
---
 i82975x_edac.c |   45 --
 1 file changed, 22 insertions(+), 23 deletions(-)

--- a/drivers/edac/i82975x_edac.c	2012-12-15 22:14:18.0 +0530
+++ b/drivers/edac/i82975x_edac.c	2012-12-15 22:17:12.0 +0530
@@ -31,6 +31,7 @@
 
 #define I82975X_NR_DIMMS		8
 #define I82975X_NR_CSROWS(nr_chans)	(I82975X_NR_DIMMS / (nr_chans))
+#define I82975X_NR_CSROWS_PER_CHANNEL	4
 
 /* Intel 82975X register addresses - device 0 function 0 - DRAM Controller */
 #define I82975X_EAP		0x58	/* Dram Error Address Pointer (32b)
@@ -337,29 +338,6 @@ static void i82975x_check(struct mem_ctl
 	i82975x_process_error_info(mci, &info, 1);
 }
 
-/* Return 1 if dual channel mode is active.  Else return 0. */
-static int dual_channel_active(void __iomem *mch_window)
-{
-	/*
-	 * We treat interleaved-symmetric configuration as dual-channel - EAP's
-	 * bit-0 giving the channel of the error location.
-	 *
-	 * All other configurations are treated as single channel - the EAP's
-	 * bit-0 will resolve ok in symmetric area of mixed
-	 * (symmetric/asymmetric) configurations
-	 */
-	u8	drb[4][2];
-	int	row;
-	int	dualch;
-
-	for (dualch = 1, row = 0; dualch && (row < 4); row++) {
-		drb[row][0] = readb(mch_window + I82975X_DRB + row);
-		drb[row][1] = readb(mch_window + I82975X_DRB + row + 0x80);
-		dualch = dualch && (drb[row][0] == drb[row][1]);
-	}
-	return dualch;
-}
-
 static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		struct pci_dev *pdev, void __iomem *mch_window)
 {
@@ -527,6 +505,27 @@ static void i82975x_print_dram_timings(v
 }
 #endif
 
+/* Return 1 if dual channel mode is active.  Else return 0. */
+static bool dual_channel_active(void __iomem *mch_window)
+{
+	/*
+	 * We treat interleaved-symmetric configuration as dual-channel.
+	 * All other configurations are virtual single channel mode.
+	 * bit-0 of EAP always provides the real channel in error.
+	 */
+	u8	drb[2];
+	int	row;
+	bool	dualch;
+
+	for (dualch = 1, row = 0; dualch &&
+			(row < I82975X_NR_CSROWS_PER_CHANNEL); row++) {
+		drb[0] = readb(mch_window + I82975X_DRB + row);
+		drb[1] = readb(mch_window + I82975X_DRB + row + 0x80);
+		dualch &= (drb[0] == drb[1]);
+	}
+	return dualch;
+}
+
 static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	int rc = -ENODEV;
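For readers without the hardware, the symmetry test in dual_channel_active() can be tried in userspace. The following is only a sketch, not driver code: it assumes the DRB register area is modelled as a plain byte array, and the names drb_symmetric(), NR_CSROWS_PER_CHANNEL and CH1_OFFSET are made up for illustration. The 0x80 channel stride and the four rows per channel are taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CSROWS_PER_CHANNEL	4
#define CH1_OFFSET		0x80	/* channel-1 DRBs sit 0x80 above channel 0 */

/*
 * regs stands in for the memory at mch_window + I82975X_DRB.
 * The configuration counts as symmetric (dual-channel) only if every
 * channel-0 rank boundary equals its channel-1 counterpart.
 */
static bool drb_symmetric(const uint8_t *regs, size_t len)
{
	int row;

	assert(len >= CH1_OFFSET + NR_CSROWS_PER_CHANNEL);
	for (row = 0; row < NR_CSROWS_PER_CHANNEL; row++)
		if (regs[row] != regs[row + CH1_OFFSET])
			return false;	/* first mismatch decides */
	return true;
}
```

Returning at the first mismatch has the same effect as the `dualch &&` term in the loop condition of the patch: no further registers are read once the answer is known.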
Re: [PATCH 162/241] Bluetooth: ath3k: Add support for VAIO VPCEH [0489:e027]
On Sat, Dec 15, 2012 at 07:59:18PM +, Ben Hutchings wrote:
> On Thu, 2012-12-13 at 11:58 -0200, Herton Ronaldo Krzesinski wrote:
> > 3.5.7.2 -stable review patch.  If anyone has any objections, please let me
> > know.
> >
> > --
> >
> > From: Marcos Chaparro
> >
> > commit acd9454433e28c1a365d8b069813c35c1c3a8ac3 upstream.
> >
> > Added Atheros AR3011 internal bluetooth device found in Sony VAIO VPCEH to
> > the devices list.
> > Before this, the bluetooth module was identified as an Foxconn / Hai
> > bluetooth device [0489:e027], now it claims to be an AtherosAR3011
> > Bluetooth [0cf3:3005].
> [...]
>
> This seems to be applicable to 3.{0,2,4,6}.y as well...

While we're here, you may also want to consider adding the following to
the other stable trees (where applicable; I didn't verify exactly which
versions need them):

[163/241] drm/i915: EBUSY status handling added to i915_gem_fault().
(reference: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1087302)

[164/241] MISC: hpilo, remove pci_disable_device
(reference: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1087860)

These are the bugfixes I recall being requested from me directly that
haven't yet come through as a stable mailing list request.

>
> Ben.
>
> --
> Ben Hutchings
> Theory and practice are closer in theory than in practice.
> - John Levine, moderator of comp.compilers

--
[]'s
Herton
[PATCH 3.7.0 4/9] i82975x_edac.c: remove unnecessary function
Subject: [PATCH 3.7.0 4/9] i82975x_edac.c: remove unnecessary function

remove function that returns a constant value and variable to
hold the returned value.

Signed-off-by: Arvind R.
---
 i82975x_edac.c |   12 +---
 1 file changed, 1 insertion(+), 11 deletions(-)

--- a/drivers/edac/i82975x_edac.c	2012-12-15 20:19:16 +0530
+++ b/drivers/edac/i82975x_edac.c	2012-12-15 20:19:02 +0530
@@ -360,14 +360,6 @@ static int dual_channel_active(void __io
 	return dualch;
 }
 
-static enum dev_type i82975x_dram_type(void __iomem *mch_window, int rank)
-{
-	/*
-	 * ECC is possible on i92975x ONLY with DEV_X8
-	 */
-	return DEV_X8;
-}
-
 static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		struct pci_dev *pdev, void __iomem *mch_window)
 {
@@ -377,7 +369,6 @@ static void i82975x_init_csrows(struct m
 	u32 cumul_size, nr_pages;
 	int index, chan;
 	struct dimm_info *dimm;
-	enum dev_type dtype;
 
 	last_cumul_size = 0;
 
@@ -415,7 +406,6 @@ static void i82975x_init_csrows(struct m
 		 *   [0-7] for single-channel; i.e. csrow->nr_channels = 1
 		 *   [0-3] for dual-channel; i.e. csrow->nr_channels = 2
 		 */
-		dtype = i82975x_dram_type(mch_window, index);
 		for (chan = 0; chan < csrow->nr_channels; chan++) {
 			dimm = mci->csrows[index]->channels[chan]->dimm;
 
@@ -426,7 +416,7 @@ static void i82975x_init_csrows(struct m
 					(chan == 0) ? 'A' : 'B',
 					index);
 			dimm->grain = 1 << 7;	/* always */
-			dimm->dtype = i82975x_dram_type(mch_window, index);
+			dimm->dtype = DEV_X8;	/* only with ECC */
 			dimm->mtype = MEM_DDR2;	/* only supported */
 			dimm->edac_mode = EDAC_SECDED;	/* only supported */
 		}
[PATCH 3.7.0 3/9] i82975x_edac.c: cleanup debug code
Subject: [PATCH 3.7.0 3/9] i82975x_edac.c: cleanup debug code

modify debug levels to sane levels. Also move random debug code
into CONFIG_EDAC_DEBUG sections.

Signed-off-by: Arvind R.
---
 i82975x_edac.c |  171 +++---
 1 file changed, 97 insertions(+), 74 deletions(-)

--- a/drivers/edac/i82975x_edac.c	2012-12-15 20:18:24 +0530
+++ b/drivers/edac/i82975x_edac.c	2012-12-15 20:17:58 +0530
@@ -167,7 +167,8 @@ NOTE: Only ONE of the three must be enab
 #define I82975X_C0BNKARC	0x10e
 #define I82975X_C1BNKARC	0x18e
 
-
+#define I82975X_C0DRT1		0x114
+#define I82975X_C1DRT1		0x194
 
 #define I82975X_DRC		0x120 /* DRAM Controller Mode0 (32b)
 					*
@@ -331,7 +332,7 @@ static void i82975x_check(struct mem_ctl
 {
 	struct i82975x_error_info info;
 
-	edac_dbg(1, "MC%d\n", mci->mc_idx);
+	edac_dbg(4, "MC%d\n", mci->mc_idx);
 	i82975x_get_error_info(mci, &info);
 	i82975x_process_error_info(mci, &info, 1);
 }
@@ -436,27 +437,93 @@ static void i82975x_init_csrows(struct m
 	}
 }
 
-/* #define i82975x_DEBUG_IOMEM */
-
-#ifdef i82975x_DEBUG_IOMEM
-static void i82975x_print_dram_timings(void __iomem *mch_window)
-{
-	/*
-	 * The register meanings are from Intel specs;
-	 * (shows 13-5-5-5 for 800-DDR2)
-	 * Asus P5W Bios reports 15-5-4-4
-	 * What's your religion?
-	 */
+#ifdef CONFIG_EDAC_DEBUG
+static void i82975x_print_dram_settings(void __iomem *mch_window,
+		u32 mchbar, u32 *drc, bool is_symmetric)
+{
+	static const char *refresh_modes[8] = {
+		"disabled"
+		"15.6 uSec", "7.8 uSec", "3.9 uSec", "1.95 uSec",
+		"reserved", "reserved",
+		"fast refresh (64 clocks)"
+	};
+	static const char *rank_attr[8] = {
+		"empty ", "reserved",
+		"4 Kb", "8 Kb", "16 Kb ",
+		"reserved", "reserved", "reserved"
+	};
 	static const int caslats[4] = { 5, 4, 3, 6 };
 	u32 dtreg[2];
+	u8 drb[4];
+	u8 dra[2][2];
+
+	/* Show memory config if debug level is 1 or upper */
+	if (!edac_debug_level)
+		return;
+
+	i82975x_printk(KERN_INFO, "MCHBAR real = %0x, remapped = %p\n",
+			mchbar, mch_window);
+
+	drb[0] = readb(mch_window + I82975X_DRB_CH0R0);
+	drb[1] = readb(mch_window + I82975X_DRB_CH0R1);
+	drb[2] = readb(mch_window + I82975X_DRB_CH0R2);
+	drb[3] = readb(mch_window + I82975X_DRB_CH0R3);
+	i82975x_printk(KERN_INFO, "DRBCH0R0 = 0x%02x\n", drb[0]);
+	i82975x_printk(KERN_INFO, "DRBCH0R1 = 0x%02x\n", drb[1]);
+	i82975x_printk(KERN_INFO, "DRBCH0R2 = 0x%02x\n", drb[2]);
+	i82975x_printk(KERN_INFO, "DRBCH0R3 = 0x%02x\n\n", drb[3]);
+	drb[0] = readb(mch_window + I82975X_DRB_CH1R0);
+	drb[1] = readb(mch_window + I82975X_DRB_CH1R1);
+	drb[2] = readb(mch_window + I82975X_DRB_CH1R2);
+	drb[3] = readb(mch_window + I82975X_DRB_CH1R3);
+	i82975x_printk(KERN_INFO, "DRBCH1R0 = 0x%02x\n", drb[0]);
+	i82975x_printk(KERN_INFO, "DRBCH1R1 = 0x%02x\n", drb[1]);
+	i82975x_printk(KERN_INFO, "DRBCH1R2 = 0x%02x\n", drb[2]);
+	i82975x_printk(KERN_INFO, "DRBCH1R3 = 0x%02x\n", drb[3]);
+	i82975x_printk(KERN_INFO, "Memory in %ssymmetric mode\n",
+			is_symmetric ? "" : "as");
 
-	dtreg[0] = readl(mch_window + 0x114);
-	dtreg[1] = readl(mch_window + 0x194);
+	i82975x_printk(KERN_INFO, "DRC_CH0 = %0x, %s\n", drc[0],
+			((drc[0] >> 21) & 3) == 1 ?
+			"ECC enabled" : "ECC disabled");
+	i82975x_printk(KERN_INFO, "DRC_CH1 = %0x, %s\n", drc[1],
+			((drc[1] >> 21) & 3) == 1 ?
+			"ECC enabled" : "ECC disabled");
+
+	dra[0][0] = readb(mch_window + I82975X_DRA_CH0R01);
+	dra[0][1] = readb(mch_window + I82975X_DRA_CH0R23);
+	dra[1][0] = readb(mch_window + I82975X_DRA_CH1R01);
+	dra[1][1] = readb(mch_window + I82975X_DRA_CH1R23);
+	i82975x_printk(KERN_INFO, "Rank Attribute:\n"
+		" Rank: 0123\n"
+		" Ch0: %s %s %s %s\n"
+		" Ch1: %s %s %s %s\n",
+		rank_attr[dra[0][0] & 7],
+		rank_attr[(dra[0][0] >> 4) & 7],
+		rank_attr[dra[0][1] & 7],
+		rank_attr[(dra[0][1] >> 4) & 7],
+		rank_attr[dra[1][0] & 7],
+		rank_attr[(dra[1][0] >> 4) & 7],
+		rank_attr[dra[1][1] & 7],
+		rank_attr[(dra[1][1] >> 4) & 7]);
+
+	i82975x_printk(KERN_INFO, "Bank Architecture:\n"
+		"
[PATCH 3.7.0 2/9] i82975x_edac.c: fix layers initialisation
Subject: [PATCH 3.7.0 2/9] i82975x_edac.c: fix layers initialisation

correct the absolutely wrong initialisation of memory layout.

Signed-off-by: Arvind R.
---
 i82975x_edac.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/edac/i82975x_edac.c	2012-12-15 16:17:28 +0530
+++ b/drivers/edac/i82975x_edac.c	2012-12-15 16:16:51 +0530
@@ -544,10 +544,10 @@ static int i82975x_probe1(struct pci_dev
 
 	/* assuming only one controller, index thus is 0 */
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
-	layers[0].size = I82975X_NR_DIMMS;
+	layers[0].size = I82975X_NR_CSROWS(chans);
 	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
-	layers[1].size = I82975X_NR_CSROWS(chans);
+	layers[1].size = chans;
 	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, sizeof(*pvt));
 	if (!mci) {
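The arithmetic behind the two changed lines is easy to check outside the kernel. This is a sketch under stated assumptions: struct layer_sizes, old_layer_sizes(), fixed_layer_sizes() and total_ranks() are hypothetical helpers, not EDAC API; only the two macros are copied from the driver. With the old assignments the layer sizes multiply out to 64 (single channel) or 32 (dual channel) positions, while the corrected ones always give 8.

```c
#include <assert.h>

#define I82975X_NR_DIMMS		8
#define I82975X_NR_CSROWS(nr_chans)	(I82975X_NR_DIMMS / (nr_chans))

/* Hypothetical stand-in for the two layers[].size values set in probe1. */
struct layer_sizes {
	unsigned int chip_selects;	/* layers[0].size */
	unsigned int channels;		/* layers[1].size */
};

/* Sizes as assigned before the patch. */
static struct layer_sizes old_layer_sizes(unsigned int chans)
{
	struct layer_sizes s;

	assert(chans == 1 || chans == 2);
	s.chip_selects = I82975X_NR_DIMMS;
	s.channels = I82975X_NR_CSROWS(chans);
	return s;
}

/* Sizes as assigned after the patch. */
static struct layer_sizes fixed_layer_sizes(unsigned int chans)
{
	struct layer_sizes s;

	assert(chans == 1 || chans == 2);
	s.chip_selects = I82975X_NR_CSROWS(chans);
	s.channels = chans;
	return s;
}

/* Number of (csrow, channel) positions the layer sizes describe. */
static unsigned int total_ranks(struct layer_sizes s)
{
	return s.chip_selects * s.channels;
}
```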
[PATCH 3.7.0 1/9] i82975x_edac.c: fix style errors
Subject: [PATCH 3.7.0 1/9] i82975x_edac.c: fix style errors

splits or shortens extra long lines in source.

Signed-off-by: Arvind R.
---
 i82975x_edac.c |   15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

--- a/drivers/edac/i82975x_edac.c	2012-12-11 09:00:57 +0530
+++ b/drivers/edac/i82975x_edac.c	2012-12-15 16:01:29 +0530
@@ -48,7 +48,7 @@
 #define I82975X_DES		0x5d	/* Dram ERRor DeSTination (8b)
 					 * 0h:    Processor Memory Reads
 					 * 1h:7h  reserved
-					 * More - See Page 65 of Intel DocSheet.
+					 * More - See Pg.65 of Intel DocSheet.
 					 */
 
 #define I82975X_ERRSTS		0xc8	/* Error Status Register (16b)
@@ -98,7 +98,7 @@ NOTE: Only ONE of the three must be enab
 #define I82975X_XEAP	0xfc	/* Extended Dram Error Address Pointer (8b)
 					 *
 					 * 7:1   reserved
-					 * 0     Bit32 of the Dram Error Address
+					 * 0     Bit32 of Dram Error Address
 					 */
 
 #define I82975X_MCHBAR		0x44	/*
@@ -305,13 +305,13 @@ static int i82975x_process_error_info(st
 			"\tXEAP=%u\n"
 			"\t EAP=0x%08x\n"
 			"\tPAGE=0x%08x\n",
-			(info->xeap & 1) ? 1 : 0, info->eap, (unsigned int) page);
+			(info->xeap & 1) ? 1 : 0, info->eap, (unsigned) page);
 		return 0;
 	}
 
 	chan = (mci->csrows[row]->nr_channels == 1) ? 0 : info->eap & 1;
 	offst = info->eap
 			& ((1 << PAGE_SHIFT) -
-			   (1 << mci->csrows[row]->channels[chan]->dimm->grain));
+			(1 << mci->csrows[row]->channels[chan]->dimm->grain));
 
 	if (info->errsts & 0x0002)
 		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 1,
@@ -420,12 +420,13 @@ static void i82975x_init_csrows(struct m
 
 			dimm->nr_pages = nr_pages / csrow->nr_channels;
 
-			snprintf(csrow->channels[chan]->dimm->label, EDAC_MC_LABEL_LEN, "DIMM %c%d",
+			snprintf(csrow->channels[chan]->dimm->label,
+					EDAC_MC_LABEL_LEN, "DIMM %c%d",
 					(chan == 0) ? 'A' : 'B',
 					index);
-			dimm->grain = 1 << 7;	/* 128Byte cache-line resolution */
+			dimm->grain = 1 << 7;	/* always */
 			dimm->dtype = i82975x_dram_type(mch_window, index);
-			dimm->mtype = MEM_DDR2; /* I82975x supports only DDR2 */
+			dimm->mtype = MEM_DDR2;	/* only supported */
 			dimm->edac_mode = EDAC_SECDED;	/* only supported */
 		}
Re: [PATCH] omap_vout: find_vma() needs ->mmap_sem held
On Sat, Dec 15, 2012 at 08:12:37PM +, Al Viro wrote:
> Walking rbtree while it's modified is a Bad Idea(tm); besides,
> the result of find_vma() can be freed just as it's getting returned
> to caller.  Fortunately, it's easy to fix - just take ->mmap_sem a bit
> earlier (and don't bother with find_vma() at all if virtp >= PAGE_OFFSET -
> in that case we don't even look at its result).

While we are at it, what prevents VIDIOC_PREPARE_BUF calling
v4l_prepare_buf() -> (e.g) vb2_ioctl_prepare_buf() -> vb2_prepare_buf() ->
__buf_prepare() -> __qbuf_userptr() -> vb2_vmalloc_get_userptr() -> find_vma(),
AFAICS without having taken ->mmap_sem anywhere in process?  The code flow
is bloody convoluted and depends on a bunch of things done by initialization,
so I certainly might've missed something...
[PATCH 3.7.0 0/9] i82975x_edac: driver cleanup
Subject: [PATCH 3.7.0 0/9] i82975x_edac: driver cleanup

This patchset cleans up the accumulated mess the driver has become.
Currently, it does not crash, but serves no other purpose. This patch-set
gets it to print correct DIMM labels on errors, and sync with the core
w.r.t. memory layout.

It consists of 9 patches as follows:

1. fix style errors:
   clean up the source w.r.t. long lines.

2. fix layers initialisation:
   the wrong initialisation caused the fatal 3.6 crash that has been
   temporarily fixed. The csrow_init func did not handle rows exceeding
   8 in number. This patch sets channels as 1 or 2 and chip_select
   accordingly, so that there are always 8 ranks of memory.

3. cleanup debug code:
   Remove / modify debug levels to sane values. Fixes log flooding when
   CONFIG_EDAC_DEBUG is set. Move local ifdef debug code to print DRAM
   settings into a CONFIG_EDAC_DEBUG section.

4. remove unnecessary function:
   a function returning a constant DEV_X8 value and used only once is
   removed.

5. optimise mode detection:
   dual_channel_active now returns bool, with minor optimisations.

6. unmap pcibar after init:
   remove the unused iomapped mch_window in the private data structure,
   and unmap the window after initialisation. Now that __devinit is
   deprecated, this is needed.

7. correct dimm label initialisation:
   fix dimm labels to correspond to mobo legends. Assumes 2 ranks per
   DIMM and DIMMs not spanning channels, as it is on the Asus P5WDG2
   family at least. Currently, the label bears no correspondence to
   mobo legends.

8. fix wrong offset reported:
   fixes the offset sometimes being reported as greater than PAGESIZE,
   caused by a mixup of grain-size and grain-shift.

9. set SW-SCRUB mode and bump revision:
   With the error address being correct in the core's view, enable
   SCRUB_SW_SRC so CEs can be written back. And update driver revision.

Signed-off-by: Arvind R.
---
Total changes:
 i82975x_edac.c |  349 +-
 1 file changed, 175 insertions(+), 174 deletions(-)
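Item 8 is easier to follow with numbers. Patch 8 itself is not shown in this part of the thread, so the following is only an illustration of the size-versus-shift distinction, not the actual fix. It assumes 4 KiB pages; GRAIN_SHIFT, GRAIN_SIZE and offset_in_page() are names invented for the example. The driver stores `1 << 7` (a size of 128 bytes) in dimm->grain, so shifting by that stored value again, as in `1 << dimm->grain`, treats a size as a shift count.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12			/* assumption: 4 KiB pages */
#define GRAIN_SHIFT	7
#define GRAIN_SIZE	(1u << GRAIN_SHIFT)	/* 128 bytes, what dimm->grain holds */

/*
 * Offset of the error within its page, rounded down to the grain.
 * The mask must be built from the grain *size*; using the size as a
 * shift count (1 << 128) is not a valid 32-bit shift.
 */
static uint32_t offset_in_page(uint32_t eap)
{
	return eap & ((1u << PAGE_SHIFT) - GRAIN_SIZE);
}
```

With these values the mask is 0xF80, so the result can never reach PAGE_SIZE and always falls on a 128-byte boundary.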
[PATCH] asm-generic, mm: pgtable: fix include for my_zero_pfn()
From: Deng-Cheng Zhu

A MIPS build showed:

In file included from arch/mips/include/asm/pgtable.h:388,
                 from mm/init-mm.c:9:
include/asm-generic/pgtable.h: In function 'my_zero_pfn':
include/asm-generic/pgtable.h:462: error: 'mem_map' undeclared (first use in this function)
include/asm-generic/pgtable.h:462: error: (Each undeclared identifier is reported only once
include/asm-generic/pgtable.h:462: error: for each function it appears in.)

This was caused by the following commit:

816422ad76 asm-generic, mm: pgtable: consolidate zero page helpers

Changing my_zero_pfn from a #define to an inline function requires the
include fix. I believe s390 has the same problem as mips.

Although Ralf has added "#include " in arch/mips/include/asm/pgtable.h in
his "MIPS: Transparent Huge Pages support" commit, and this error went
away, I think this fix is needed since asm-generic/pgtable.h is now the
home of the function my_zero_pfn(), which requires the definition of
mem_map, and this header could be included by others.

Cc: Kirill A. Shutemov
Cc: Ralf Baechle
Cc: Steven J. Hill
Signed-off-by: Deng-Cheng Zhu
---
 include/asm-generic/pgtable.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/asm-generic/pgtable.h b/include/asm-generic/pgtable.h
index 284e808..628dbbb 100644
--- a/include/asm-generic/pgtable.h
+++ b/include/asm-generic/pgtable.h
@@ -4,7 +4,7 @@
 #ifndef __ASSEMBLY__
 #ifdef CONFIG_MMU
 
-#include 
+#include 
 #include 
 
 #ifndef __HAVE_ARCH_PTEP_SET_ACCESS_FLAGS
--
1.7.1
[PATCH] omap_vout: find_vma() needs ->mmap_sem held
Walking rbtree while it's modified is a Bad Idea(tm); besides,
the result of find_vma() can be freed just as it's getting returned
to caller.  Fortunately, it's easy to fix - just take ->mmap_sem a bit
earlier (and don't bother with find_vma() at all if virtp >= PAGE_OFFSET -
in that case we don't even look at its result).

Cc: sta...@vger.kernel.org [2.6.35]
Signed-off-by: Al Viro
---
diff --git a/drivers/media/platform/omap/omap_vout.c b/drivers/media/platform/omap/omap_vout.c
index 9935040..984512f 100644
--- a/drivers/media/platform/omap/omap_vout.c
+++ b/drivers/media/platform/omap/omap_vout.c
@@ -207,19 +207,21 @@ static u32 omap_vout_uservirt_to_phys(u32 virtp)
 	struct vm_area_struct *vma;
 	struct mm_struct *mm = current->mm;
 
-	vma = find_vma(mm, virtp);
 	/* For kernel direct-mapped memory, take the easy way */
-	if (virtp >= PAGE_OFFSET) {
-		physp = virt_to_phys((void *) virtp);
+	if (virtp >= PAGE_OFFSET)
+		return virt_to_phys((void *) virtp);
+
+	down_read(&current->mm->mmap_sem);
+	vma = find_vma(mm, virtp);
 	} else if (vma && (vma->vm_flags & VM_IO) && vma->vm_pgoff) {
 		/* this will catch, kernel-allocated, mmaped-to-usermode
 		   addresses */
 		physp = (vma->vm_pgoff << PAGE_SHIFT) + (virtp - vma->vm_start);
+		up_read(&current->mm->mmap_sem);
 	} else {
 		/* otherwise, use get_user_pages() for general userland pages */
 		int res, nr_pages = 1;
 		struct page *pages;
-		down_read(&current->mm->mmap_sem);
 
 		res = get_user_pages(current, current->mm, virtp, nr_pages, 1,
 				0, &pages, NULL);
Re: [PATCH 167/241] SUNRPC: Set alloc_slot for backchannel tcp ops
On Thu, 2012-12-13 at 11:58 -0200, Herton Ronaldo Krzesinski wrote:
> 3.5.7.2 -stable review patch.  If anyone has any objections, please let me
> know.
>
> --
>
> From: Bryan Schumaker
>
> commit 84e28a307e376f271505af65a7b7e212dd6f61f4 upstream.
>
> f39c1bfb5a03e2d255451bff05be0d7255298fa4 (SUNRPC: Fix a UDP transport
> regression) introduced the "alloc_slot" function for xprt operations,
> but never created one for the backchannel operations.  This patch fixes
> a null pointer dereference when mounting NFS over v4.1.
[...]

Greg, you missed this in 3.4.y.  It might need a context fix; I'm
attaching the version I used for 3.2.y.

Ben.

--
Ben Hutchings
Theory and practice are closer in theory than in practice.
- John Levine, moderator of comp.compilers

From: Bryan Schumaker
Date: Mon, 24 Sep 2012 13:39:01 -0400
Subject: SUNRPC: Set alloc_slot for backchannel tcp ops

commit 84e28a307e376f271505af65a7b7e212dd6f61f4 upstream.

f39c1bfb5a03e2d255451bff05be0d7255298fa4 (SUNRPC: Fix a UDP transport
regression) introduced the "alloc_slot" function for xprt operations,
but never created one for the backchannel operations.  This patch fixes
a null pointer dereference when mounting NFS over v4.1.

Call Trace:
 [] ? xprt_reserve+0x47/0x50 [sunrpc]
 [] call_reserve+0x34/0x60 [sunrpc]
 [] __rpc_execute+0x90/0x400 [sunrpc]
 [] rpc_async_schedule+0x2a/0x40 [sunrpc]
 [] process_one_work+0x139/0x500
 [] ? alloc_worker+0x70/0x70
 [] ? __rpc_execute+0x400/0x400 [sunrpc]
 [] worker_thread+0x15e/0x460
 [] ? preempt_schedule+0x49/0x70
 [] ? rescuer_thread+0x230/0x230
 [] kthread+0x93/0xa0
 [] kernel_thread_helper+0x4/0x10
 [] ? kthread_freezable_should_stop+0x70/0x70
 [] ? gs_change+0x13/0x13

Signed-off-by: Bryan Schumaker
Signed-off-by: Trond Myklebust
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings
---
 net/sunrpc/xprtsock.c |    1 +
 1 file changed, 1 insertion(+)

--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2488,6 +2488,7 @@ static struct rpc_xprt_ops xs_tcp_ops =
 static struct rpc_xprt_ops bc_tcp_ops = {
 	.reserve_xprt		= xprt_reserve_xprt,
 	.release_xprt		= xprt_release_xprt,
+	.alloc_slot		= xprt_alloc_slot,
 	.buf_alloc		= bc_malloc,
 	.buf_free		= bc_free,
 	.send_request		= bc_send_request,
Re: [GIT PULL] x86/uapi for 3.8
On 2012.12.15 at 11:58 -0800, Linus Torvalds wrote:
> On Sat, Dec 15, 2012 at 11:41 AM, H. Peter Anvin wrote:
> >
> > Matt is on vacation, and I'm partly offline for the weekend, but that
> > definitely seems suspicious.  Do we have a memory map of the affected
> > machine(s)?
>
> Here's mine.
>
> e820: BIOS-provided physical RAM map:
> BIOS-e820: [mem 0x-0x0009e7ff] usable
> BIOS-e820: [mem 0x0009e800-0x0009] reserved
> BIOS-e820: [mem 0x000e4000-0x000f] reserved
> BIOS-e820: [mem 0x0010-0xbdc6] usable
> BIOS-e820: [mem 0xbdc7-0xbdc87fff] ACPI data
> BIOS-e820: [mem 0xbdc88000-0xbdcdbfff] ACPI NVS
> BIOS-e820: [mem 0xbdcdc000-0xbfff] reserved
> BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
> BIOS-e820: [mem 0xff80-0x] reserved
> BIOS-e820: [mem 0x0001-0x0001fbff] usable
> BIOS-e820: [mem 0x0001fc00-0x0001] reserved
> BIOS-e820: [mem 0x0002-0x00023fff] usable
>
> but as mentioned, there's bound to be some particular kernel layout
> that triggers this, because I definitely ran a few kernels with that
> commit in it without problems (and clearly other people are too).
> Looking at my boot log, I had successful boots with both 6a57d104c8cb
> and c2714334b944, which contains that commit.
>
> It might also be that it causes some massive corruption at boot time,
> but it then requires that that particular memory is actually used. So
> maybe it's not so much about the memory map except indirectly.
>
> But that commit *does* look a lot more likely than the things I looked at.
>
> Markus, how did you happen to pinpoint that particular commit? Is it
> entirely repeatable for you?

Yes, although at one point during bisecting the BUG disappeared and the
screen went simply black during boot and X never started. I marked this
as bad and continued the bisection.

Here is my mem-map:

e820: BIOS-provided physical RAM map:
BIOS-e820: [mem 0x0100-0x0009fbff] usable
BIOS-e820: [mem 0x0009fc00-0x0009] reserved
BIOS-e820: [mem 0x000e6000-0x000f] reserved
BIOS-e820: [mem 0x0010-0xdfe8] usable
BIOS-e820: [mem 0xdfe9-0xdfea7fff] ACPI data
BIOS-e820: [mem 0xdfea8000-0xdfec] ACPI NVS
BIOS-e820: [mem 0xdfed-0xdfef] reserved
BIOS-e820: [mem 0xfff0-0x] reserved
BIOS-e820: [mem 0x0001-0x00021fff] usable

--
Markus
Re: [PATCH 162/241] Bluetooth: ath3k: Add support for VAIO VPCEH [0489:e027]
On Thu, 2012-12-13 at 11:58 -0200, Herton Ronaldo Krzesinski wrote:
> 3.5.7.2 -stable review patch.  If anyone has any objections, please let me
> know.
>
> --
>
> From: Marcos Chaparro
>
> commit acd9454433e28c1a365d8b069813c35c1c3a8ac3 upstream.
>
> Added Atheros AR3011 internal bluetooth device found in Sony VAIO VPCEH to the
> devices list.
> Before this, the bluetooth module was identified as an Foxconn / Hai bluetooth
> device [0489:e027], now it claims to be an AtherosAR3011 Bluetooth
> [0cf3:3005].
[...]

This seems to be applicable to 3.{0,2,4,6}.y as well...

Ben.

--
Ben Hutchings
Theory and practice are closer in theory than in practice.
- John Levine, moderator of comp.compilers
Re: [GIT PULL] x86/uapi for 3.8
On Sat, Dec 15, 2012 at 11:41 AM, H. Peter Anvin wrote:
>
> Matt is on vacation, and I'm partly offline for the weekend, but that
> definitely seems suspicious.  Do we have a memory map of the affected
> machine(s)?

Here's mine.

e820: BIOS-provided physical RAM map:
BIOS-e820: [mem 0x-0x0009e7ff] usable
BIOS-e820: [mem 0x0009e800-0x0009] reserved
BIOS-e820: [mem 0x000e4000-0x000f] reserved
BIOS-e820: [mem 0x0010-0xbdc6] usable
BIOS-e820: [mem 0xbdc7-0xbdc87fff] ACPI data
BIOS-e820: [mem 0xbdc88000-0xbdcdbfff] ACPI NVS
BIOS-e820: [mem 0xbdcdc000-0xbfff] reserved
BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
BIOS-e820: [mem 0xff80-0x] reserved
BIOS-e820: [mem 0x0001-0x0001fbff] usable
BIOS-e820: [mem 0x0001fc00-0x0001] reserved
BIOS-e820: [mem 0x0002-0x00023fff] usable

but as mentioned, there's bound to be some particular kernel layout
that triggers this, because I definitely ran a few kernels with that
commit in it without problems (and clearly other people are too).
Looking at my boot log, I had successful boots with both 6a57d104c8cb
and c2714334b944, which contains that commit.

It might also be that it causes some massive corruption at boot time,
but it then requires that that particular memory is actually used. So
maybe it's not so much about the memory map except indirectly.

But that commit *does* look a lot more likely than the things I looked at.

Markus, how did you happen to pinpoint that particular commit? Is it
entirely repeatable for you?

Linus
Re: [PATCH 152/241] mm: vmscan: fix endless loop in kswapd balancing
On Thu, 2012-12-13 at 11:58 -0200, Herton Ronaldo Krzesinski wrote:
> 3.5.7.2 -stable review patch.  If anyone has any objections, please let me
> know.
>
> --
>
> From: Johannes Weiner
>
> commit 60cefed485a02bd99b6299dad70666fe49245da7 upstream.
[...]

Greg, you missed this in 3.{0,4}.y.  I'm attaching the version I used
for 3.2.y, which seems to be applicable to 3.0.y.  One or other of these
should work for 3.4.y.

Ben.

--
Ben Hutchings
Theory and practice are closer in theory than in practice.
- John Levine, moderator of comp.compilers

From 39d18dc4b8b0c000fa681cbae10ac3f8a132814b Mon Sep 17 00:00:00 2001
From: Johannes Weiner
Date: Thu, 29 Nov 2012 13:54:23 -0800
Subject: [PATCH] mm: vmscan: fix endless loop in kswapd balancing

commit 60cefed485a02bd99b6299dad70666fe49245da7 upstream.

Kswapd does not in all places have the same criteria for a balanced
zone.  Zones are only being reclaimed when their high watermark is
breached, but compaction checks loop over the zonelist again when the
zone does not meet the low watermark plus two times the size of the
allocation.  This gets kswapd stuck in an endless loop over a small
zone, like the DMA zone, where the high watermark is smaller than the
compaction requirement.

Add a function, zone_balanced(), that checks the watermark, and, for
higher order allocations, if compaction has enough free memory.  Then
use it uniformly to check for balanced zones.

This makes sure that when the compaction watermark is not met, at least
reclaim happens and progress is made - or the zone is declared
unreclaimable at some point and skipped entirely.

Signed-off-by: Johannes Weiner
Reported-by: George Spelvin
Reported-by: Johannes Hirte
Reported-by: Tomas Racek
Tested-by: Johannes Hirte
Reviewed-by: Rik van Riel
Cc: Mel Gorman
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings
---
 mm/vmscan.c |   27 ++-
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 313381c..1e4ee1a 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2492,6 +2492,19 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
 }
 #endif
 
+static bool zone_balanced(struct zone *zone, int order,
+			  unsigned long balance_gap, int classzone_idx)
+{
+	if (!zone_watermark_ok_safe(zone, order, high_wmark_pages(zone) +
+				    balance_gap, classzone_idx, 0))
+		return false;
+
+	if (COMPACTION_BUILD && order && !compaction_suitable(zone, order))
+		return false;
+
+	return true;
+}
+
 /*
  * pgdat_balanced is used when checking if a node is balanced for high-order
  * allocations. Only zones that meet watermarks and are in a zone allowed
@@ -2551,8 +2564,7 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
 			continue;
 		}
 
-		if (!zone_watermark_ok_safe(zone, order, high_wmark_pages(zone),
-							i, 0))
+		if (!zone_balanced(zone, order, 0, i))
 			all_zones_ok = false;
 		else
 			balanced += zone->present_pages;
@@ -2655,8 +2667,7 @@ loop_again:
 				shrink_active_list(SWAP_CLUSTER_MAX, zone,
 							&sc, priority, 0);
 
-			if (!zone_watermark_ok_safe(zone, order,
-					high_wmark_pages(zone), 0, 0)) {
+			if (!zone_balanced(zone, order, 0, 0)) {
 				end_zone = i;
 				break;
 			} else {
@@ -2717,9 +2728,8 @@ loop_again:
 				(zone->present_pages +
 					KSWAPD_ZONE_BALANCE_GAP_RATIO-1) /
 				KSWAPD_ZONE_BALANCE_GAP_RATIO);
-			if (!zone_watermark_ok_safe(zone, order,
-					high_wmark_pages(zone) + balance_gap,
-					end_zone, 0)) {
+			if (!zone_balanced(zone, order,
+					   balance_gap, end_zone)) {
 				shrink_zone(priority, zone, &sc);
 
 				reclaim_state->reclaimed_slab = 0;
@@ -2746,8 +2756,7 @@ loop_again:
 				continue;
 			}
 
-			if (!zone_watermark_ok_safe(zone, order,
-					high_wmark_pages(zone), end_zone, 0)) {
+			if (!zone_balanced(zone, order, 0, end_zone)) {
 				all_zones_ok = 0;
 				/*
 				 * We are still under min water mark.  This
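The mismatch described in the commit message can be reproduced with a toy model. This is a userspace sketch under loud assumptions: struct zone_model and all three helpers are invented for illustration, watermarks are plain page counts, and the compaction test is reduced to the wording of the commit message ("low watermark plus two times the size of the allocation"). The real zone_watermark_ok_safe() and compaction_suitable() do considerably more.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for struct zone (page counts). */
struct zone_model {
	unsigned long free_pages;
	unsigned long low_wmark;
	unsigned long high_wmark;
};

/* Old per-call-site test: only the high watermark, plus an optional gap. */
static bool high_wmark_ok(const struct zone_model *z, unsigned long balance_gap)
{
	return z->free_pages >= z->high_wmark + balance_gap;
}

/* Simplified compaction requirement: low watermark + 2 * allocation size. */
static bool compaction_ok(const struct zone_model *z, int order)
{
	return z->free_pages >= z->low_wmark + (2UL << order);
}

/* Unified test in the spirit of zone_balanced(): both criteria must hold. */
static bool zone_balanced_model(const struct zone_model *z, int order,
				unsigned long balance_gap)
{
	if (!high_wmark_ok(z, balance_gap))
		return false;
	if (order && !compaction_ok(z, order))
		return false;
	return true;
}
```

For a small zone with 40 free pages, a low watermark of 20 and a high watermark of 30, an order-4 request passes the high-watermark test but fails the compaction test (it would need 52 free pages). That is the disagreement the commit message blames for the endless loop; with a single predicate the zone is simply reported as not balanced, so reclaim runs.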
Re: [RFC v2 1/1] RTL8712 alignment bug in 3.6 and up on ARMV5
Thanks for fixing this bug. Your patch works but it's not the right way
to do it.

The original code here adds 4 to pointers which are currently aligned
instead of leaving them as is. We have a kernel ALIGN() macro which
works correctly, but actually, it's not needed.

On arm, the pointer returned from kmalloc() is already aligned at the
8 byte boundary because "#define ARCH_SLAB_MINALIGN 8". The original
code always adds 4 to the pointer so everything is misaligned. Your
patch adds another 4 bytes so it is now aligned at the 8 byte boundary
again.

That works, of course, but it's better to remove the whole mess.

	pstapriv->pallocated_stainfo_buf = kmalloc(sizeof(struct sta_info) * NUM_STA);

Get rid of the ->pstainfo_buf pointer which is only used to store the
"aligned" version of ->pallocated_stainfo_buf.

Please send a version which applies with "git am" and has the proper
sign-off. Send it to yourself first. Save the raw email (including
headers and everything).

	cat raw_email.txt | git am

Type "git log -p" to verify that the commit message looks good. Then
resend it to the list.

Thanks again. This is a good bugfix.

regards,
dan carpenter

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
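The arithmetic described in this review can be checked in user space. What follows is only a sketch under stated assumptions: align_up() computes what a round-up-to-power-of-two macro such as ALIGN() computes, and buggy_align4() is a hypothetical model of the "always moves an aligned pointer forward by 4" behaviour described above, not the driver's actual code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr up to the next multiple of a, where a is a power of two.
 * An address that is already aligned is returned unchanged.
 */
static uintptr_t align_up(uintptr_t addr, uintptr_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (addr + a - 1) & ~(a - 1);
}

/*
 * Hypothetical model of the behaviour criticised in the review: the
 * result is 4-byte aligned, but an already aligned address is still
 * pushed forward by 4 bytes.
 */
static uintptr_t buggy_align4(uintptr_t addr)
{
	return addr + 4 - (addr & 3);
}
```

With an 8-byte aligned input such as 0x1000, align_up(0x1000, 8) leaves the address alone, while buggy_align4(0x1000) yields 0x1004, which is no longer 8-byte aligned. Adding a further 4 bytes lands on 0x1008, which is why the submitted patch happened to work.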
Re: [GIT PULL] x86/uapi for 3.8
On 12/15/2012 10:35 AM, Linus Torvalds wrote:
> On Sat, Dec 15, 2012 at 8:33 AM, Markus Trippelsdorf
> wrote:
>> On 2012.12.14 at 17:47 -0800, Linus Torvalds wrote:
>>>
>>> Ho humm. Anybody else see anything strange?
>>
>> Yes. I'm seeing a BUG early during boot on my machine (RIP=NULL):
>>
>> BUG: unable to handle kernel NULL pointer dereference at (null)
>>
>> This is caused by commit 53b87cf088e2 ("x86, mm: Include the
>> entire kernel memory map in trampoline_pgd")
>
> Hmm. That reverts cleanly, and the result boots fine for me. And the
> commit looks like exactly the kind of thing that could result in
> problems with exactly the right memory layout, so it could explain why
> the bisect failed and some kernels randomly worked for me and others
> didn't.
>
> So this at least looks like a very possible candidate.
>
> Does anybody have an explanation for the problem?
>
> Btw. the machine in question does not have EFI, and is a bog-standard
> PC (DMI string: "P7H57D-V EVO, BIOS 0999 01/19/2010")
>
> Matt? Jan?
>

Matt is on vacation, and I'm partly offline for the weekend, but that
definitely seems suspicious. Do we have a memory map of the affected
machine(s)?

	-hpa
Re: [PATCH 139/241] block: Don't access request after it might be freed
On Thu, 2012-12-13 at 11:58 -0200, Herton Ronaldo Krzesinski wrote:
> 3.5.7.2 -stable review patch. If anyone has any objections, please let me
> know.
>
> --
>
> From: Roland Dreier
>
> commit 893d290f1d7496db97c9471bc352ad4a11dc8a25 upstream.
>
> After we've done __elv_add_request() and __blk_run_queue() in
> blk_execute_rq_nowait(), the request might finish and be freed
> immediately. Therefore checking if the type is REQ_TYPE_PM_RESUME
> isn't safe afterwards, because if it isn't, rq might be gone.
> Instead, check beforehand and stash the result in a temporary.
>
> This fixes crashes in blk_execute_rq_nowait() I get occasionally when
> running with lots of memory debugging options enabled -- I think this
> race is usually harmless because the window for rq to be reallocated
> is so small.
>
> Signed-off-by: Roland Dreier
> Signed-off-by: Jens Axboe
> [ herton: adjust context ]
> Signed-off-by: Herton Ronaldo Krzesinski

This is missing from 3.{0,4} but I did apply it to 3.2, again with the
need to adjust context.

Perhaps the intermediate fixes to blk_execute_rq_nowait() that resulted
in the changed context should also be applied to stable updates. The
fixes in question are:

8ba61435d73f block: add missing blk_queue_dead() checks
- applied in 3.3; missing from 3.{0,2}.y
e81ca6fe85b7 [SCSI] block: Fix blk_execute_rq_nowait() dead queue handling
- applied in 3.6; missing from 3.{0,2,4,5}.y
893d290f1d74 block: Don't access request after it might be freed
- applied in 3.7, then to 3.{2,6}.y; missing from 3.{0,4}.y

Jens?

Ben.

> ---
> block/blk-exec.c | 10 +-
> 1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/block/blk-exec.c b/block/blk-exec.c
> index fb2cbd5..9925fbe 100644
> --- a/block/blk-exec.c
> +++ b/block/blk-exec.c
> @@ -49,8 +49,16 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
>  			   rq_end_io_fn *done)
>  {
>  	int where = at_head ? ELEVATOR_INSERT_FRONT : ELEVATOR_INSERT_BACK;
> +	bool is_pm_resume;
>
>  	WARN_ON(irqs_disabled());
> +
> +	/*
> +	 * need to check this before __blk_run_queue(), because rq can
> +	 * be freed before that returns.
> +	 */
> +	is_pm_resume = rq->cmd_type == REQ_TYPE_PM_RESUME;
> +
>  	spin_lock_irq(q->queue_lock);
>
>  	if (unlikely(blk_queue_dead(q))) {
> @@ -66,7 +74,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
>  	__elv_add_request(q, rq, where);
>  	__blk_run_queue(q);
>  	/* the queue is stopped so it won't be run */
> -	if (rq->cmd_type == REQ_TYPE_PM_RESUME)
> +	if (is_pm_resume)
>  		q->request_fn(q);
>  	spin_unlock_irq(q->queue_lock);
>  }

--
Ben Hutchings
Theory and practice are closer in theory than in practice.
- John Levine, moderator of comp.compilers
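The pattern in the quoted patch (read what you need from an object before handing it to code that may free it) can be modelled in user space. This is only an illustrative sketch: struct request, run_queue(), execute_nowait() and make_request() below are hypothetical stand-ins, not the block layer's types or functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the request types discussed above. */
enum req_type { REQ_NORMAL, REQ_PM_RESUME };

struct request {
	enum req_type cmd_type;
};

/*
 * Stand-in for handing the request to a consumer that may complete
 * and free it before returning. Here it always frees, which is the
 * worst case for the caller.
 */
static void run_queue(struct request *rq)
{
	free(rq);
}

/* Returns true if the caller would have to kick the queue by hand. */
static bool execute_nowait(struct request *rq)
{
	bool is_pm_resume;

	assert(rq != NULL);

	/* Stash the field while we still own rq. */
	is_pm_resume = (rq->cmd_type == REQ_PM_RESUME);

	run_queue(rq);	/* rq must not be dereferenced after this */

	return is_pm_resume;
}

/* Allocate a request of the given type; returns NULL on failure. */
static struct request *make_request(enum req_type type)
{
	struct request *rq = malloc(sizeof(*rq));

	if (rq)
		rq->cmd_type = type;
	return rq;
}
```

Reading rq->cmd_type after run_queue() in this model would be a use-after-free on every call, whereas in the kernel it only shows up when the request completes inside the window, which fits the report that it was seen mainly with memory debugging enabled.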
Re: [tip:x86/microcode] x86/microcode_intel_early.c: Early update ucode on Intel's CPU
On 12/14/2012 11:57 PM, Yinghai Lu wrote:
>
> I tailored your patch and made use 2M page increase to replace patch
> ioremap function.
>
>[PATCH v6 12/27] x86: use io_remap to access real_mode_data
>
> and it will extend init_level4_pgt to map extra range. that will limit
> affect to even others.
>
> please check if that is ok to you.
>

What is the point of only managing 2M at a time? Now you have to have
more conditionals and you don't get any more memory efficiency.

Filling arbitrarily into the brk is not acceptable... the brk is an O(1)
area and all brk allocations need to be reserved at compile time, so the
overflow handling is still necessary.

So no, this patch is not acceptable.

	-hpa
Re: [PATCH 135/241] md/raid10: close race that lose writes lost when replacement completes.
On Thu, 2012-12-13 at 11:58 -0200, Herton Ronaldo Krzesinski wrote:
> 3.5.7.2 -stable review patch. If anyone has any objections, please let me
> know.
>
> --
>
> From: NeilBrown
>
> commit e7c0c3fa29280d62aa5e11101a674bb3064bd791 upstream.
>
> When a replacement operation completes there is a small window
> when the original device is marked 'faulty' and the replacement
> still looks like a replacement. The faulty should be removed and
> the replacement moved in place very quickly, bit it isn't instant.
>
> So the code write out to the array must handle the possibility that
> the only working device for some slot in the replacement - but it
> doesn't. If the primary device is faulty it just gives up. This
> can lead to corruption.
>
> So make the code more robust: if either the primary or the
> replacement is present and working, write to them. Only when
> neither are present do we give up.
>
> This bug has been present since replacement was introduced in
> 3.3, so it is suitable for any -stable kernel since then.

This is missing from 3.4, so Greg will presumably want to apply this (if
the backport is correct).

Ben.

> Reported-by: "George Spelvin" > Signed-off-by: NeilBrown > [ herton: hairy code adjustment on 3rd hunk (conf->copies for loop) ] > Signed-off-by: Herton Ronaldo Krzesinski > --- > drivers/md/raid10.c | 113 > +++ > 1 file changed, 59 insertions(+), 54 deletions(-) > > diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c > index 17fae37..0920adf 100644 > --- a/drivers/md/raid10.c > +++ b/drivers/md/raid10.c > @@ -1267,18 +1267,21 @@ retry_write: > blocked_rdev = rrdev; > break; > } > + if (rdev && (test_bit(Faulty, &rdev->flags) > + || test_bit(Unmerged, &rdev->flags))) > + rdev = NULL; > if (rrdev && (test_bit(Faulty, &rrdev->flags) > || test_bit(Unmerged, &rrdev->flags))) > rrdev = NULL; > > r10_bio->devs[i].bio = NULL; > r10_bio->devs[i].repl_bio = NULL; > - if (!rdev || test_bit(Faulty, &rdev->flags) || > - test_bit(Unmerged, &rdev->flags)) { > + > + if (!rdev && !rrdev) { > set_bit(R10BIO_Degraded, &r10_bio->state); > continue; > } > - if (test_bit(WriteErrorSeen, &rdev->flags)) { > + if (rdev && test_bit(WriteErrorSeen, &rdev->flags)) { > sector_t first_bad; > sector_t dev_sector = r10_bio->devs[i].addr; > int bad_sectors; > @@ -1320,8 +1323,10 @@ retry_write: > max_sectors = good_sectors; > } > } > - r10_bio->devs[i].bio = bio; > - atomic_inc(&rdev->nr_pending); > + if (rdev) { > + r10_bio->devs[i].bio = bio; > + atomic_inc(&rdev->nr_pending); > + } > if (rrdev) { > r10_bio->devs[i].repl_bio = bio; > atomic_inc(&rrdev->nr_pending); > @@ -1377,58 +1382,58 @@ retry_write: > for (i = 0; i < conf->copies; i++) { > struct bio *mbio; > int d = r10_bio->devs[i].devnum; > - if (!r10_bio->devs[i].bio) > - continue; > - > - mbio = bio_clone_mddev(bio, GFP_NOIO, mddev); > - md_trim_bio(mbio, r10_bio->sector - bio->bi_sector, > - max_sectors); > - r10_bio->devs[i].bio = mbio; > - > - mbio->bi_sector = (r10_bio->devs[i].addr+ > -choose_data_offset(r10_bio, > - conf->mirrors[d].rdev)); > - mbio->bi_bdev = conf->mirrors[d].rdev->bdev; > - mbio->bi_end_io = 
raid10_end_write_request; > - mbio->bi_rw = WRITE | do_sync | do_fua; > - mbio->bi_private = r10_bio; > - > - atomic_inc(&r10_bio->remaining); > - spin_lock_irqsave(&conf->device_lock, flags); > - bio_list_add(&conf->pending_bio_list, mbio); > - conf->pending_count++; > - spin_unlock_irqrestore(&conf->device_lock, flags); > - if (!mddev_check_plugged(mddev)) > - md_wakeup_thread(mddev->thread); > - > - if (!r10_bio->devs[i].repl_bio) > - continue; > + if (r10_bio->devs[i].bio) { > + struct md_rdev *rdev = conf->mirrors[d].rdev; > + mbio = bio_clone_mddev(bio, GFP_NOIO, mddev); > + md_trim_bio(mbio, r10_bio->sector - bio->bi_sector, > + max_sectors); > +
Re: man page for s390_runtime_instr syscall
Hello Jan,

On Mon, Dec 10, 2012 at 12:34 PM, Jan Glauber wrote:
> Hi Michael,
>
> I've written a man page for the s390_runtime_instr syscall which was
> merged with 3.7 (e4b8b3f). Now the question is if you would like to
> include it in the man-pages although it is completely s390 specific and
> wont be available on any other arch? Or should it go into a
> different package?

Thanks for this page. The man-pages package is the right place for it,
but a few things need fixing. Could you see below and resubmit please?

> --- /dev/null 2012-12-04 10:52:46.657720288 +0100
> +++ s390_runtime_instr.2 2012-10-09 13:55:39.0 +0200
> @@ -0,0 +1,73 @@
> +.\" Copyright IBM Corp. 2012
> +.\" Author: Jan Glauber

You have provided no license here. Can you please add one. Please see
http://www.kernel.org/doc/man-pages/licenses.html. (My preference is the
"verbatim" license, but others are of course possible.)

> +.\"
> +.TH S390_RUNTIME_INSTR 2 2012-10-09 "Linux Programmer's Manual"

Update the date here.

> +.SH NAME
> +s390_runtime_instr \- enable/disable s390 CPU runtime instrumentation
> +.SH SYNOPSIS
> +.nf
> +.B #include
> +
> +.BI "int s390_runtime_instr(int " command ", int " signum ");
> +.fi
> +
> +.SH DESCRIPTION
> +The
> +.BR s390_runtime_instr ()
> +system call starts or stops CPU runtime instrumentation for the current
> thread.
> +
> +The
> +.IR command
> +argument controls whether runtime instumentation is started

Spelling: instrumentation

> +( 1 ) or stopped ( 2 ) for the current thread.
> +
> +The
> +.IR signum
> +argument specifies the number of a real-time signal. The

Please start new sentences on a new source line.

> +real-time signal is sent to the thread if the runtime instrumentation
> +buffer is full or if the runtime-instrumentation-halted interrupt
> +occured.

Spelling: occurred.

> +
> +.SH RETURN VALUE
> +On success
> +.BR s390_runtime_instr ()
> +returns 0 and enables the thread for
> +runtime instrumentation by assigning the thread a default runtime
> +instrumentation control block. The caller can then read and modify the

Start new sentence on a new source line.

> +control block and start the runtime instrumentation. On error, -1 is

Start new sentence on a new source line.

> +returned and
> +.IR errno
> +is set to one of the error codes listed below.
> +
> +.SH ERRORS
> +.TP
> +.B EOPNOTSUPP
> +The runtime instrumentation facility is not available.
> +.TP
> +.B EINVAL
> +The value specified in
> +.IR command
> +is not a valid command or the value specified in
> +.IR signum
> +is not a real-time signal number.
> +.TP
> +.B ENOMEM
> +Allocating memory for the runtime instrumentation control block
> +failed.
> +
> +.SH VERSIONS
> +This system call is available since Linux 3.7.
> +
> +.SH CONFORMING TO
> +This system call

This Linux-specific system call

> +is only available on the s390 architecture. The runtime instrumentation
> facility is available

http://www.kernel.org/doc/man-pages/licenses.html

> +beginning with System z EC12.
> +
> +.SH NOTES
> +Glibc does not provide a wrapper for this system call, use
> +.BR syscall (2)
> +to call it.

Somewhere around here it would be nice to have some notes on how one
uses this RI feature. The commit
e4b8b3f33fcaa0ed6e6b5482a606091d8cd20beb has a bit of info. I'd suggest
including that info in the page, with (possibly) an example.

> +
> +.SH SEE ALSO
> +.BR signal (7),
> +.BR syscall (2)

Order entries here by section number.

Thanks,

Michael

--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface"; http://man7.org/tlpi/
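Since the draft page says there is no glibc wrapper, a caller would go through syscall(2). The following is a minimal sketch, assuming the command values 1 (start) and 2 (stop) given in the draft above; RI_START, RI_STOP, runtime_instr() and is_rt_signal() are names invented for this example, and the ENOSYS fallback for non-s390 builds is a choice made here, not something the draft page specifies.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Command values as given in the draft page; the names are made up. */
#define RI_START 1
#define RI_STOP  2

/* The draft says signum must be a real-time signal number. */
static bool is_rt_signal(int signum)
{
	return signum >= SIGRTMIN && signum <= SIGRTMAX;
}

/*
 * Thin wrapper around the raw system call. On architectures whose
 * headers do not define the syscall number, fail with ENOSYS so the
 * example still builds (an assumption of this sketch).
 */
static int runtime_instr(int command, int signum)
{
#ifdef __NR_s390_runtime_instr
	return (int)syscall(__NR_s390_runtime_instr, command, signum);
#else
	(void)command;
	(void)signum;
	errno = ENOSYS;
	return -1;
#endif
}
```

A caller would typically check is_rt_signal() first, call runtime_instr(RI_START, signum), and inspect errno against the values listed in the ERRORS section of the draft when -1 comes back.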
Re: [GIT PULL] x86/uapi for 3.8
On Sat, Dec 15, 2012 at 8:33 AM, Markus Trippelsdorf wrote:
> On 2012.12.14 at 17:47 -0800, Linus Torvalds wrote:
>>
>> Ho humm. Anybody else see anything strange?
>
> Yes. I'm seeing a BUG early during boot on my machine (RIP=NULL):
>
> BUG: unable to handle kernel NULL pointer dereference at (null)
>
> This is caused by commit 53b87cf088e2 ("x86, mm: Include the
> entire kernel memory map in trampoline_pgd")

Hmm. That reverts cleanly, and the result boots fine for me. And the
commit looks like exactly the kind of thing that could result in
problems with exactly the right memory layout, so it could explain why
the bisect failed and some kernels randomly worked for me and others
didn't.

So this at least looks like a very possible candidate.

Does anybody have an explanation for the problem?

Btw. the machine in question does not have EFI, and is a bog-standard
PC (DMI string: "P7H57D-V EVO, BIOS 0999 01/19/2010")

Matt? Jan?

Linus
Re: [PATCH] avoid entropy starvation due to stack protection
Why not use the nonblocking pool, and seed the nonblocking pool with
only half of the collected entropy, so that /dev/random is nonblocking
in almost all practical scenarios?

On Thu, Dec 13, 2012 at 08:44:36AM +0100, Stephan Mueller wrote:
> On 13.12.2012 01:43:21, +0100, Andrew Morton
> wrote:
>
> Hi Andrew,
> > On Tue, 11 Dec 2012 13:33:04 +0100
> > Stephan Mueller wrote:
> >
> >> Some time ago, I noticed the fact that for every newly
> >> executed process, the function create_elf_tables requests 16 bytes of
> >> randomness from get_random_bytes. This is easily visible when calling
> >>
> >> while [ 1 ]
> >> do
> >>    cat /proc/sys/kernel/random/entropy_avail
> >>    sleep 1
> >> done
> > Please see
> > http://ozlabs.org/~akpm/mmotm/broken-out/binfmt_elfc-use-get_random_int-to-fix-entropy-depleting.patch
> >
> > That patch is about one week from a mainline merge, btw.
>
> Initially I was also thinking about get_random_int. But stack protection
> depends on non-predictable numbers to ensure it cannot be defeated. As
> get_random_int depends on MD5 which is assumed to be broken now, I
> discarded the idea of using get_random_int.
>
> Moreover, please consider that get_cycles is an architecture-specific
> function that on some architectures only returns 0 (For all
> architectures where this is implemented, you have no guarantee that it
> increments as a high-resolution timer). So, the quality of
> get_random_int is questionable IMHO for the use as a stack protector.
>
> Also note, that other in-kernel users of get_random_bytes may be
> converted to using the proposed kernel pool to avoid more entropy drainage.
>
> Please note that the suggested approach of fully seeding a deterministic
> RNG never followed by a re-seeding is used elsewhere (e.g. the OpenSSL
> RNG). Therefore, I think the suggested approach is viable.
>
> Ciao
> Stephan

--
It's those computer people in X {city of world}. They keep stuffing things up.
oopsable race in xen-gntdev (unsafe vma access)
1) find_vma() is *not* safe without ->mmap_sem and its result may very well
be freed just as it's returned to caller. IOW,
gntdev_ioctl_get_offset_for_vaddr() is racy and may end up with
dereferencing freed memory.

2) gntdev_vma_close() is putting NULL into map->vma with only
->mmap_sem held by caller. Things like
		if (!map->vma)
			continue;
		if (map->vma->vm_start >= end)
			continue;
		if (map->vma->vm_end <= start)
done with just priv->lock held are racy.

I'm not familiar with the code, but it looks like we need to protect
gntdev_vma_close() guts with the same spinlock and probably hold
->mmap_sem shared around the "find_vma()+get to map->{index,count}"
in the ioctl. Or replace the logics in ioctl with search through the
list of grant_map under the same spinlock...

Comments?
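The second point can be illustrated with a small user-space model. This is a sketch only: it uses a pthread mutex where the driver has a spinlock, the structure and function names are hypothetical, and it models just the rule that map->vma is cleared and tested under one and the same lock. It does not model the lifetime of the vma itself, which is what ->mmap_sem covers in point 1.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the structures discussed. */
struct vma {
	unsigned long vm_start, vm_end;
};

struct grant_map {
	struct vma *vma;	/* set to NULL when the mapping is closed */
};

static pthread_mutex_t priv_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer side: clear the pointer under the lock the readers take. */
static void map_close(struct grant_map *map)
{
	pthread_mutex_lock(&priv_lock);
	map->vma = NULL;
	pthread_mutex_unlock(&priv_lock);
}

/*
 * Reader side: the NULL test and both dereferences sit inside one
 * critical section, so map_close() cannot slip in between them.
 */
static bool map_overlaps(struct grant_map *map,
			 unsigned long start, unsigned long end)
{
	bool hit = false;

	assert(start < end);

	pthread_mutex_lock(&priv_lock);
	if (map->vma && map->vma->vm_start < end && map->vma->vm_end > start)
		hit = true;
	pthread_mutex_unlock(&priv_lock);

	return hit;
}
```

If map_close() took a different lock than map_overlaps(), the pointer could become NULL between the test and the dereference, which is the race described above.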
Re: [PATCH 12/12] edac: fix kernel panic on module unloading
On Fri, Dec 14, 2012 at 03:03:10PM +0400, Konstantin Khlebnikov wrote:
> This patch fixes use-after-free and double-free bugs in
> edac_mc_sysfs_exit(). mci_pdev has single reference and put_device()
> calls mc_attr_release() which calls kfree(), thus following
> device_del() works with already released memory. An another kfree() in
> edac_mc_sysfs_exit() releses the same memory again.

Great. Applied and tagged for 3.6 and 3.7 stable.

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
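The failure mode in the quoted description (dropping the last reference frees the object, so any later access or second free is a bug) can be modelled in a few lines. This is a user-space sketch with hypothetical names; it shows one teardown order that avoids touching freed memory, and it does not claim to be what the applied patch does.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified model of a reference-counted device. */
struct dev {
	int refcount;
	int registered;
};

static int release_calls;	/* instrumentation for this example only */

/* Release callback: runs when the last reference is dropped. */
static void dev_release(struct dev *d)
{
	release_calls++;
	free(d);
}

/* Drop one reference; the object may be gone when this returns. */
static void dev_put(struct dev *d)
{
	assert(d->refcount > 0);
	if (--d->refcount == 0)
		dev_release(d);
}

/* Unregister the device; needs the object to be alive. */
static void dev_del(struct dev *d)
{
	d->registered = 0;
}

/* Create a registered device holding a single reference. */
static struct dev *dev_create(void)
{
	struct dev *d = calloc(1, sizeof(*d));

	if (d) {
		d->refcount = 1;
		d->registered = 1;
	}
	return d;
}

/* Unregister first, then drop the reference, and never free by hand. */
static void dev_teardown(struct dev *d)
{
	dev_del(d);
	dev_put(d);
	/* no free(d) here: the release callback has already done it */
}
```

Swapping the two calls in dev_teardown(), or adding a free(d) after them, reproduces in this model the use-after-free and double free named in the quoted text.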
Panic at shutdown in x86-64 3.7 kernel under qemu 1.3.0?
Reasonably vanilla versions of both just did this. No idea why. Just did
it the once, haven't gotten it to reproduce...

Rob

Restarting system.
reboot: machine restart
general protection fault: fff2 [#1]
CPU 0
Pid: 8542, comm: oneit Not tainted 3.7.0 #1 Bochs Bochs
RIP: 0010:[] [] lapic_shutdown+0x29/0x2b
RSP: 0018:88000fb57e28 EFLAGS: 0202
RAX: 8130e2d0 RBX: 0202 RCX:
RDX: 81322a40 RSI: 00ff RDI: 00f0
RBP: 28121969 R08: 88000fb57fd8 R09:
R10: R11: 81015721 R12: fee1dead
R13: R14: 0004 R15: 00425e02
FS: () GS:81304000() knlGS:
CS: 0010 DS: ES: CR0: 8005003b
CR2: 00440f11 CR3: 0fb53000 CR4: 06b0
DR0: DR1: DR2:
DR3: DR6: DR7:
Process oneit (pid: 8542, threadinfo 88000fb56000, task 88000e848f90)
Stack:
 01234567 810136cb 28121969 810136a8 01234567 8102b451 0011 00040001 0023 0006 0001802a0027
Call Trace:
 [] ? native_machine_shutdown+0x9/0x1e
 [] ? native_machine_restart+0x20/0x29
 [] ? sys_reboot+0x11f/0x14a
 [] ? __kill_pgrp_info+0x37/0x5f
 [] ? do_exit+0x61f/0x623
 [] ? schedule_tail+0x20/0x46
 [] ? ret_from_fork+0xf/0xb0
 [] ? system_call_fastpath+0x16/0x1b
Code: c8 c3 48 8b 05 8a cc 31 00 53 f6 c4 02 75 12 83 3d c9 da 38 00 00 74 13 83 3d d8 ea 38 00 00 75 0a 9c 5b fa e8 3a ff ff ff 53 9d <5b> c3 48 83 ec 08 eb 02 f3 90 48 8b 05 eb db 31 00 bf 00 03 00
RIP [] lapic_shutdown+0x29/0x2b
 RSP
---[ end trace 0c69c9c16377bd9d ]---
Re: [PATCH v6 04/27] x86, boot: Move lldt/ltr out of 64bit code section
On Thu, Dec 13, 2012 at 02:01:58PM -0800, Yinghai Lu wrote:
> commit 08da5a2ca
>
> x86_64: Early segment setup for VT
>
> add lldt/ltr to clean more segments.
>
> Those code are put in code64, and it is using gdt that is only
> loaded from code32 path.
>
> That breaks booting with 64bit bootloader that does not go through
> code32 path. It get at startup_64 directly, and it has different
> gdt.
>
> Move those lines into code32 after their gdt is loaded.

Let me rewrite that commit message for ya, you tell me whether I got it
right:

"08da5a2ca479 ("x86_64: Early segment setup for VT") sets up LDT and TR
into a valid state in order to speed up boot decompression under VT. The
code which loads the GDT is executed in the 32-bit startup code while
the above change is in the 64-bit part.

However, this breaks 64-bit bootloaders which jump straight to the
64-bit startup entry point and thus skip LDT and TR setup because they
use a different GDT.

Fix this by moving the LDT and TR setup to the 32-bit section."

Is that correct?

If so, why not take the time and try to write your commits more
understandably so that bystanders like me don't have to look at the code
first and scramble to understand what you mean?

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--
Re: [PATCH 1/2 v3] Fix memory freeing issues
On Fri, Dec 14, 2012 at 11:33:50AM +0200, Vitalii Demianets wrote:
>
> Hans, why do you want to put in this patch, which is dealing with
> memory-freeing issues only, completely unrelated functional changes?

Because during review of your patch we happened to find another issue
a few lines up and down. Why not fix it on the way?

What I'd like is simply

[PATCH] Fix uio_pdrv_genirq issues

If you like, make it two patches, one with your memory-freeing issue and
one "Remove irq tracking" or something like that. That's just three or
four lines difference, I'd even accept it if it were only one patch.

I don't want to fix one thing now and leave the other one unresolved.
That would just be a waste of time.

To be clear, I have no objections regarding your memory freeing ideas.

Thanks,
Hans
Re: WARNING: at drivers/tty/tty_buffer.c:476 flush_to_ldisc+0x1de/0x1f0()
On Fri, Dec 14, 2012 at 10:53:16PM -0500, Peter Hurley wrote:
> On Fri, 2012-12-14 at 18:29 -0800, Greg Kroah-Hartman wrote:
> > On Tue, Dec 11, 2012 at 10:01:24PM -0500, Dave Jones wrote:
> > > Fuzz-testing fallout from post 3.7 tree as of commit
> > > 414a6750e59b0b687034764c464e9ddecac0f7a6
> > >
> > > [ 2181.230579] ------------[ cut here ]------------
> > > [ 2181.231277] WARNING: at drivers/tty/tty_buffer.c:476
> > > flush_to_ldisc+0x1de/0x1f0()
> > > [ 2181.232358] Hardware name: GA-MA78GM-S2H
> > > [ 2181.232925] tty is NULL
> > > [ 2181.233430] Modules linked in: l2tp_ppp l2tp_core fuse rfcomm
> > > binfmt_misc hidp bnep scsi_transport_iscsi ipt_ULOG nfnetlink rose ipx
> > > p8023 p8022 caif_socket caif af_rxrpc x25 irda af_key appletalk pppoe
> > > netrom pppox ppp_generic decnet phonet slhc psnap crc_ccitt ax25 llc2 rds
> > > atm llc nfc can nfsv3 nfs_acl nfs fscache lockd sunrpc ip6t_REJECT
> > > nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack
> > > ip6table_filter ip6_tables snd_hda_codec_realtek btusb snd_hda_intel
> > > bluetooth usb_debug snd_hda_codec microcode snd_pcm serio_raw pcspkr
> > > snd_page_alloc snd_timer edac_core snd soundcore r8169 mii vhost_net tun
> > > macvtap macvlan kvm_amd kvm
> > > [ 2181.245632] Pid: 29787, comm: kworker/0:1 Not tainted 3.7.0+ #12
> > > [ 2181.246503] Call Trace:
> > > [ 2181.246851] [] warn_slowpath_common+0x7f/0xc0
> > > [ 2181.247725] [] warn_slowpath_fmt+0x46/0x50
> > > [ 2181.248558] [] ? ___ratelimit+0x9a/0x120
> > > [ 2181.249347] [] flush_to_ldisc+0x1de/0x1f0
> > > [ 2181.250164] [] process_one_work+0x207/0x750
> > > [ 2181.251013] [] ? process_one_work+0x197/0x750
> > > [ 2181.251893] [] ? destroy_work_on_stack+0x20/0x20
> > > [ 2181.252809] [] ? tty_insert_flip_string_fixed_flag+0x110/0x110
> > > [ 2181.253993] [] worker_thread+0x156/0x440
> > > [ 2181.254815] [] ? rescuer_thread+0x240/0x240
> > > [ 2181.255638] [] kthread+0xed/0x100
> > > [ 2181.256374] [] ? put_lock_stats.isra.23+0xe/0x40
> > > [ 2181.257290] [] ? kthread_create_on_node+0x160/0x160
> > > [ 2181.258223] [] ret_from_fork+0x7c/0xb0
> > > [ 2181.259018] [] ? kthread_create_on_node+0x160/0x160
> > > [ 2181.259969] ---[ end trace 12dd9f01acd7e09f ]---
> >
> > Jiri, I thought we resolved these warnings in the linux-next tree, how
> > are they still showing up?
>
> Greg, that's what the series that I just sent v2 of fixes. Look for
> "[PATCH v2 0/11] tty: Fix buffer work access-after-free" et al.

Ah, ok, I was holding off on looking at those until after 3.8-rc1 is
out, I'll do so then, thanks.

greg k-h