Re: [PATCH] clocksource: Add heuristics to avoid switching away from TSC due to timer delay
> /*
>  * Proper multiline comments look like this not like
>  * the above.
>  */

Got it, will fix next time around.

> That aside. Why are you trying to do heuristics on the delta?
>
> We have way better information than that. The watchdog timer expiry time is
> known and we can determine the exact delay of the timer.
>
> The watchdog clocksource provides the maximum 'idle' time, i.e. the time
> between two reads, in clocksource::max_idle_ns. That value is filled in
> when the clocksource is configured.
>
> So without doing speculation we can make an informed decision:
>
>	elapsed = jiffies_to_nsec(jiffies - watchdog_timer->expires) +
>		  WATCHDOG_INTERVAL_NS;
>
>	if (elapsed > wdcs->max_idle_ns) {
>		Skip ..
>	}

Yes, that makes more sense than what I was doing, although I'm not
sure on the details.  Just missed that idea.

Why are you adding the watchdog interval to the calculated elapsed
time?  It seems we have an issue exactly if
jiffies - watchdog_timer->expires is too big, without adding the
interval we tried to wait in on top.

Also I think we might want to be careful that jiffies is >= the
expires time - or is it not possible that a timer fires one jiffy
early?

Also for full generality it seems we should check against the
clocksource max_idle_ns as well - for x86 TSC is wider than HPET but
there may be other architectures that could hit the same problem, just
with the clocksource being checked wrapping around instead of the
watchdog clocksource.  Right?

Thanks!
  Roland
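For readers following the thread, here is a rough user-space sketch of the check proposed in the quoted text, with a clamp added for the "timer fires one jiffy early" case raised in the reply. It is not kernel code: every name prefixed with sketch_ or SKETCH_ is hypothetical, the HZ value and the 0.5 s interval are assumed for illustration, and jiffies is modelled as a 32-bit counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for kernel values; not the kernel's definitions. */
#define SKETCH_HZ              250u
#define SKETCH_NSEC_PER_JIFFY  (1000000000ull / SKETCH_HZ)
#define SKETCH_WD_INTERVAL_NS  500000000ull /* assumed 0.5 s between runs */

/*
 * Estimate the nanoseconds since the previous watchdog read from how late
 * the timer fired. The signed cast keeps the subtraction meaningful across
 * counter wraparound, and an early expiry is clamped to zero lateness.
 */
static uint64_t sketch_elapsed_ns(uint32_t jiffies_now, uint32_t expires)
{
	int32_t late = (int32_t)(jiffies_now - expires);

	if (late < 0)
		late = 0;
	return (uint64_t)late * SKETCH_NSEC_PER_JIFFY + SKETCH_WD_INTERVAL_NS;
}

/* True when the elapsed time exceeds what the watchdog counter can span. */
static bool sketch_skip_check(uint32_t jiffies_now, uint32_t expires,
			      uint64_t wd_max_idle_ns)
{
	return sketch_elapsed_ns(jiffies_now, expires) > wd_max_idle_ns;
}
```

In this model the interval is added because the previous read happened roughly one interval before the programmed expiry, so lateness alone would understate the gap between the two reads. Whether that matches the intent of the quoted snippet is exactly the question asked above.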
[tip:x86/timers] x86/hpet: Remove unused FSEC_PER_NSEC define
Commit-ID:  d999c0ec2498e54b9328db6b2c1037710025add1
Gitweb:     https://git.kernel.org/tip/d999c0ec2498e54b9328db6b2c1037710025add1
Author:     Roland Dreier
AuthorDate: Fri, 30 Nov 2018 13:14:50 -0800
Committer:  Borislav Petkov
CommitDate: Tue, 4 Dec 2018 12:17:21 +0100

x86/hpet: Remove unused FSEC_PER_NSEC define

The FSEC_PER_NSEC macro has had zero users since commit

  ab0e08f15d23 ("x86: hpet: Cleanup the clockevents init and register code").

Remove it.

Signed-off-by: Roland Dreier
Signed-off-by: Borislav Petkov
Acked-by: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: x86-ml
Link: https://lkml.kernel.org/r/20181130211450.5200-1-rol...@purestorage.com
---
 arch/x86/kernel/hpet.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index b0acb22e5a46..dfd3aca82c61 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -21,10 +21,6 @@
 
 #define HPET_MASK			CLOCKSOURCE_MASK(32)
 
-/* FSEC = 10^-15
-   NSEC = 10^-9 */
-#define FSEC_PER_NSEC			1000000L
-
 #define HPET_DEV_USED_BIT		2
 #define HPET_DEV_USED			(1 << HPET_DEV_USED_BIT)
 #define HPET_DEV_VALID			0x8
[PATCH] x86/hpet: Remove unused FSEC_PER_NSEC define
The FSEC_PER_NSEC macro has had zero users since commit ab0e08f15d23
("x86: hpet: Cleanup the clockevents init and register code").

Signed-off-by: Roland Dreier
---
 arch/x86/kernel/hpet.c | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index b0acb22e5a46..dfd3aca82c61 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -21,10 +21,6 @@
 
 #define HPET_MASK			CLOCKSOURCE_MASK(32)
 
-/* FSEC = 10^-15
-   NSEC = 10^-9 */
-#define FSEC_PER_NSEC			1000000L
-
 #define HPET_DEV_USED_BIT		2
 #define HPET_DEV_USED			(1 << HPET_DEV_USED_BIT)
 #define HPET_DEV_VALID			0x8
-- 
2.19.1
[PATCH] clocksource: Add heuristics to avoid switching away from TSC due to timer delay
On a modern x86 system, the TSC is used as a clocksource, with HPET
used in the clocksource watchdog to make sure that the TSC is stable.

If the clocksource watchdog_timer is delayed for an extremely long
time (for example if softirqs are being serviced in ksoftirqd, and
realtime threads are starving ksoftirqd), then the 32-bit HPET counter
may wrap around.  For example, with an HPET running at 24 MHz, 2^32
cycles is about 179 seconds - a long time for timers to be starved,
but possible with a poorly behaved realtime thread.

If this happens, since the TSC is a 64-bit counter and won't wrap, the
watchdog will detect skew - the TSC interval will be 179 seconds
longer than the HPET interval - and will mark the TSC as unstable.
This causes the system to switch to the HPET as a clocksource, which
has a huge negative performance impact.

In this case, switching to the HPET just makes a bad situation (timers
starved) that the system might recover from turn permanently even
worse (more expensive clock_gettime() calls), due to a spurious false
positive detection of TSC instability.

To improve this, add some heuristics to detect cases where the
watchdog is delayed long enough for the instability detection to be
likely to be wrong:

 - If the clocksource being tested (eg TSC) has counted so many cycles
   that converting to nsecs will overflow multiplication, *AND* the
   watchdog clocksource (eg HPET) shows that the watchdog timer has
   missed its interval by at least a factor of 3, skip marking the
   clocksource as unstable for a timer iteration.  This is not perfect
   - for example it is possible for the watchdog clocksource to wrap
   around and show a small interval - but at least in the specific x86
   case it is unlikely, since the watchdog interval is a small
   fraction of the wraparound interval.

 - If there is a skew between the clocksource being tested and the
   watchdog clocksource that is at least as big as the wraparound
   interval for the watchdog clocksource, then don't mark the
   clocksource as unstable.  Again, this might fail to mark a
   clocksource as unstable for one iteration, but it is unlikely that
   the instability is bad enough that we will see a larger skew than
   the wraparound interval for many iterations.

These heuristics are imperfect but are chosen to make false detection
of instability much less likely, while leaving detection of true
instability very likely within a few clocksource watchdog iterations.

Signed-off-by: Roland Dreier
---
 kernel/time/clocksource.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
index ffe081623aec..f1b3d8ff2437 100644
--- a/kernel/time/clocksource.c
+++ b/kernel/time/clocksource.c
@@ -243,12 +243,47 @@ static void clocksource_watchdog(struct timer_list *unused)
 					     watchdog->shift);
 
 		delta = clocksource_delta(csnow, cs->cs_last, cs->mask);
+
+		/* If the cycle delta is beyond what we can safely
+		 * convert to nsecs, and the watchdog clocksource
+		 * suggests that we've overslept, skip checking this
+		 * iteration to avoid marking a clocksource as
+		 * unstable because of a severely delayed timer. */
+		if (delta > cs->max_cycles &&
+		    wd_nsec > 3 * jiffies_to_nsecs(WATCHDOG_INTERVAL)) {
+			pr_warn("timekeeping watchdog: Clocksource '%s' not checked due to apparent long timer delay:\n",
+				cs->name);
+			pr_warn(" Delta %llx > max_cycles %llx, wd_nsec %lld\n",
+				delta, cs->max_cycles, wd_nsec);
+			continue;
+		}
+
 		cs_nsec = clocksource_cyc2ns(delta, cs->mult, cs->shift);
 		wdlast = cs->wd_last; /* save these in case we print them */
 		cslast = cs->cs_last;
 		cs->cs_last = csnow;
 		cs->wd_last = wdnow;
 
+		/* If the clocksource interval is far off from the
+		 * watchdog clocksource interval but the interval is
+		 * big enough that the watchdog may have wrapped
+		 * around (again due to a severely delayed timer),
+		 * skip this iteration.  For example, this saves us
+		 * from marking the TSC as unstable just because the
+		 * 32-bit HPET wrapped around on x86. */
+		if (abs(cs_nsec - wd_nsec) >
+		    clocksource_cyc2ns(watchdog->max_cycles, watchdog->mult,
+				       watchdog->shift) - WATCHDOG_THRESHOLD) {
+			pr_warn("timekeeping watchdog: Clocksource '%s' not checked due to apparent t
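
The 179-second figure in the patch description can be checked with a few lines of arithmetic. The helper below is only an illustrative user-space sketch: sketch_wrap_seconds is a hypothetical name, and it assumes a free-running counter narrower than 64 bits.

```c
#include <stdint.h>

/*
 * Seconds until a free-running counter with counter_bits bits wraps when it
 * ticks at freq_hz. Only valid for counter_bits < 64 in this sketch.
 */
static double sketch_wrap_seconds(unsigned int counter_bits, double freq_hz)
{
	return (double)(1ull << counter_bits) / freq_hz;
}
```

For a 32-bit counter at 24 MHz this gives 2^32 / 24,000,000, a little under 179 seconds, which matches the number used above. A slower counter wraps later, so the window in which a starved timer causes a false positive depends on the HPET frequency of the machine.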
Re: [PATCH 0/3] Provide more fine grained control over multipathing
> The sensible thing to do in nvme is to use different paths for
> different queues.  That is e.g. in the RDMA case use the HCA closer
> to a given CPU by default.  We might allow to override this for
> cases where the is a good reason, but what I really don't want is
> configurability for configurabilities sake.

That makes sense but I'm not sure it covers everything.  Probably the
most common way to do NVMe/RDMA will be with a single HCA that has
multiple ports, so there's no sensible CPU locality.  On the other
hand we want to keep both ports to the fabric busy.  Setting different
paths for different queues makes sense, but there may be
single-threaded applications that want a different policy.

I'm not saying anything very profound, but we have to find the right
balance between too many and too few knobs.

 - R.
Re: [PATCH 0/3] Provide more fine grained control over multipathing
> Moreover, I also wanted to point out that fabrics array vendors are
> building products that rely on standard nvme multipathing (and probably
> multipathing over dispersed namespaces as well), and keeping a knob that
> will keep nvme users with dm-multipath will probably not help them
> educate their customers as well... So there is another angle to this.

As a vendor who is building an NVMe-oF storage array, I can say that
clarity around how Linux wants to handle NVMe multipath would
definitely be appreciated.  It would be great if we could all converge
around the upstream native driver but right now it doesn't look
adequate - having only a single active path is not the best way to use
a multi-controller storage system.  Unfortunately it looks like we're
headed to a world where people have to write separate "best practices"
documents to cover RHEL, SLES and other vendors.

We plan to implement all the fancy NVMe standards like ANA, but it
seems that there is still a requirement to let the host side choose
policies about how to use paths (round-robin vs least queue depth for
example).  Even in the modern SCSI world with VPD pages and ALUA,
there are still knobs that are needed.  Maybe NVMe will be different
and we can find defaults that work in all cases but I have to admit
I'm skeptical...

 - R.
Re: KASAN: use-after-free Read in __list_add_valid (5)
> Still reproducible on Linus' tree (commit 66e1c94db3cd4e) and on linux-next
> (next-20180511). Here's a simplified reproducer:

Thanks!  That's a fantastic test case.

The issue is a race where rdma_listen() sees invalid state in the
middle of an rdma_bind_addr() call that will ultimately fail.  I'll
send a proposed patch shortly.

 - R.
Re: [Patch v2 00/19] CIFS: Implement SMBDirect
> Starting with SMB2 dialect 3.0, Microsoft introduced SMBDirect transport
> protocol for transferring upper layer (SMB2) payload over RDMA via
> Infiniband, RoCE or iWARP. The prococol is published in [MS-SMBD]
> (https://msdn.microsoft.com/en-us/library/hh536346.aspx).

This is great to see.  Is there a Linux implementation of the server
side (in Samba?) so that the client can be tested without needing a
Windows server?

 - R.
Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation
On Fri, Jul 8, 2016 at 9:51 AM, Jason Gunthorpe wrote:

> So, it appears, the dst and neigh can be used for all performances cases.
>
> For the non performance dst == null case, can we just burn cycles and
> stuff the daddr in front of the packet at hardheader time, even if we
> have to copy?

OK, sounds interesting.

Unfortunately the scope of this work has gotten to the point where I
can't take it on right now.  My system is running 4.4.y for now
(before struct skb_gso_cb grew) so I think shrinking struct skb_gso_cb
to 8 bytes plus changing SKB_SGO_CB_OFFSET to 20 will work for now.
Hope someone is able to come up with a real fix before I need to
upgrade to 4.10.y...

 - R.
Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation
On Thu, Jul 7, 2016 at 4:14 PM, Jason Gunthorpe wrote:

> We have neighbour_priv, and ndo_neigh_construct/destruct now ..
>
> A first blush that would seem to be enough to let ipoib store the AH
> and other path information in the neigh and avoid the cb? At least the
> example in clip sure looks like what ipoib needs to do.

Do you think those new facilities let us go back to using the neigh
and still avoid the issues that led to commit b63b70d87741 ("IPoIB:
Use a private hash table for path lookup in xmit path")?

 - R.
Re: Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation
>> struct skb_gso_cb {
>>         int mac_offset;
>>         int encap_level;
>>         __u16 csum_start;
>> };

> This is based on an out-dated version of this struct.  The 4.7 RC
> kernel has a few more fields that were added to support local checksum
> offload for encapsulated frames.

Thanks for pointing that out.  I hit the perf regression on 4.4.y
(stable) and looked at the struct there.  I see that latest upstream
has changed, and I agree that this struct really can't shrink below 10
bytes.  Since IP needs 20 bytes, GSO needs 10 bytes and IPoIB needs 20
bytes, we're 2 bytes over the 48 that are available in cb[].

So this is harder to fix than just changing skb_gso_cb and
SKB_SGO_CB_OFFSET unfortunately.

>> What is the best way to keep the crash fix but not kill IPoIB performance?
>
> It seems like what would probably need to happen is to move where the
> IPoIB address is stored.  I'm not sure the control buffer is really
> the best place for it since the cb gets overwritten at various levels,
> and storing 20 bytes makes it hard to avoid bumping up against the
> size restrictions of the buffer.  Seeing as how the IPoIB hwaddr is
> generated around the same time we generate the L2 header for the
> frame, I wonder if you couldn't get away with using a bit of extra skb
> headroom to store it and then use a offset from the MAC header to
> access it.  An added bonus would be that with a few tricks with
> SKB_GSO_CB(skb)->mac_offset you might even be able to set things up so
> that you copy the hwaddr when you copy the header for each fragment
> instead of having to go and copy the hwaddr out of the cb and clone it
> for each frame.

Can we assume there are 20 bytes of skb headroom available?  What if
we're forwarding an skb received on an Ethernet device?

The reason we moved to the cb storage is that in the past, trying to
hide some data in the actual skb buffer that we don't actually send
led to some awkward-at-best code.  (As I recall GRO was difficult to
handle before commit 936d7de3d736 "IPoIB: Stop lying about
hard_header_len and use skb->cb to stash LL addresses")

But maybe there's a third way to handle this other than the old way
and the skb->cb way.

 - R.
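
The byte budget described in this message can be written out as a tiny user-space check. This is only a sketch of the arithmetic: sketch_cb_shortfall and SKETCH_CB_SIZE are hypothetical names, and the 48-byte buffer size and the per-user sizes are taken from the text above rather than from kernel headers.

```c
#include <stddef.h>

#define SKETCH_CB_SIZE 48 /* bytes available in cb[], per the message above */

/* Bytes by which the summed regions exceed the control buffer, 0 if they fit. */
static size_t sketch_cb_shortfall(const size_t *region_bytes, size_t n)
{
	size_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += region_bytes[i];
	return total > SKETCH_CB_SIZE ? total - SKETCH_CB_SIZE : 0;
}
```

With regions of 20, 10 and 20 bytes (IP, GSO, IPoIB) the shortfall is 2 bytes, which is the overrun mentioned above. With an 8-byte GSO block the three regions would fit exactly.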
Resurrecting due to huge ipoib perf regression - [BUG] skb corruption and kernel panic at forwarding with fragmentation
On Thu, Jan 7, 2016 at 3:00 AM, Konstantin Khlebnikov wrote:
> Or just shift GSO CB and add couple checks like
> BUILD_BUG_ON(sizeof(SKB_GSO_CB(skb)->room) < sizeof(*IPCB(skb)));

Resurrecting this old thread, because the patch that ultimately went upstream (commit 9207f9d45b0a / net: preserve IP control block during GSO segmentation) causes a huge IPoIB performance regression (to the point of being unusable):

https://bugzilla.kernel.org/show_bug.cgi?id=111921

I don't think anyone has explained what goes wrong or why IPoIB works the way it does.

The underlying difference that IPoIB has from other drivers is that there are two levels of address resolution. First, normal ARP / ND resolves an IP address to a "hardware" address. The difference is that in IPoIB, the hardware address is an IB GID (plus a QPN, but we can ignore that). To actually send data to that GID, the IPoIB driver has to do a second lookup - it needs to ask the IB subnet manager for a path record that tells it how to reach that GID.

In particular this means that "destination address" (as the IP / ARP layer understands it) actually isn't in the packet anywhere - there's nothing like an ethernet header as there is for "normal" network drivers. Instead, the driver stashes the address in skb->cb during hard_header_ops->create() and then looks at it in the xmit routine - this was designed way back around when commit a0417fa3a18a / net: Make qdisc_skb_cb upper size bound explicit. was merged. The expectation was that the part of the cb after sizeof (struct qdisc_skb_cb) would be preserved.

The problem with commit 9207f9d45b0a is that GSO operations now access cb after SKB_SGO_CB_OFFSET==32, which lands right in the middle of where IPoIB stashes its hwaddr.

It seems that the intent of the commit is to preserve the IP control block - struct inet_skb_parm (and presumably struct inet6_skb_parm) - even when using SKB_GSO_CB(). Seems like both inet_skb_parm and inet6_skb_parm are 20 bytes. IPoIB uses the part of cb after 28 bytes, so if we could squeeze struct skb_gso_cb down to 8 bytes and set SKB_SGO_CB_OFFSET to 20, then everything would work. The struct is

    struct skb_gso_cb {
            int mac_offset;
            int encap_level;
            __u16 csum_start;
    };

is it feasible to make encap_level a __u16 (which would make the overall struct exactly 8 bytes)? If I understand this correctly, 64K nested encapsulations seems like quite a bit for a packet...

Or, earlier in this thread, having the GSO in ip_output and other gso paths save and restore the IP/IP6 control block was suggested as an alternate approach. I don't know if there are performance implications to that.

What is the best way to keep the crash fix but not kill IPoIB performance?

Thanks!
- R.
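The size arithmetic in the message above can be checked with a small userspace sketch. Everything here is an illustration rather than kernel code: the struct names gso_cb_current and gso_cb_proposed and the helper gso_cb_fits() are hypothetical, uint16_t stands in for __u16, and the 20-byte and 28-byte figures are simply the ones quoted in the message. The struct sizes assume a common ABI with a 4-byte, 4-byte-aligned int.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout quoted in the message; uint16_t stands in for __u16. */
struct gso_cb_current {
	int mac_offset;
	int encap_level;
	uint16_t csum_start;
};

/* Hypothetical variant with encap_level narrowed to 16 bits. */
struct gso_cb_proposed {
	int mac_offset;
	uint16_t encap_level;
	uint16_t csum_start;
};

/* Figures taken from the message, not read from kernel headers. */
enum {
	IP_CB_BYTES = 20,        /* stated size of inet_skb_parm / inet6_skb_parm */
	IPOIB_STASH_OFFSET = 28  /* IPoIB uses the part of cb after this offset */
};

/*
 * Nonzero if a GSO control block of gso_size bytes placed at gso_offset
 * stays clear of both the IP control block and the IPoIB stash.
 */
int gso_cb_fits(size_t gso_offset, size_t gso_size)
{
	return gso_offset >= IP_CB_BYTES &&
	       gso_offset + gso_size <= IPOIB_STASH_OFFSET;
}
```

Under these assumptions only the narrowed struct placed at offset 20 fits in the gap; the current layout is too large for it, and any block placed at offset 32 runs into the area IPoIB uses.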
[PATCH] iommu/vt-d: Don't reject NTB devices due to scope mismatch
From: Roland Dreier <rol...@purestorage.com>

On a system with an Intel PCIe port configured as an NTB device, iommu initialization fails with

    DMAR: Device scope type does not match for :80:03.0

This is because the DMAR table reports this device as having scope 2 (ACPI_DMAR_SCOPE_TYPE_BRIDGE):

    [0A0h 0160 1] Device Scope Entry Type : 02
    [0A1h 0161 1] Entry Length : 08
    [0A2h 0162 2] Reserved :
    [0A4h 0164 1] Enumeration ID : 00
    [0A5h 0165 1] PCI Bus Number : 80
    [0A6h 0166 2] PCI Path : 03,00

but the device has a type 0 PCI header:

    80:03.0 Bridge [0680]: Intel Corporation Device [8086:2f0d] (rev 02)
    00: 86 80 0d 2f 00 00 10 00 02 00 80 06 10 00 80 00
    10: 0c 00 c0 00 c0 38 00 00 0c 00 00 00 80 38 00 00
    20: 00 00 00 c8 00 00 10 c8 00 00 00 00 86 80 00 00
    30: 00 00 00 00 60 00 00 00 00 00 00 00 ff 01 00 00

VT-d works perfectly on this system, so there's no reason to bail out on initialization due to this apparent scope mismatch. Use the class 0x0680 ("Other bridge device") as a heuristic for allowing DMAR initialization for non-bridge PCI devices listed with scope bridge.

Signed-off-by: Roland Dreier <rol...@purestorage.com>
---
 drivers/iommu/dmar.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 6a86b5d1defa..2eff7b6c6c98 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -241,8 +241,20 @@ int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
 		if (!dmar_match_pci_path(info, scope->bus, path, level))
 			continue;
 
-		if ((scope->entry_type == ACPI_DMAR_SCOPE_TYPE_ENDPOINT) ^
-		    (info->dev->hdr_type == PCI_HEADER_TYPE_NORMAL)) {
+		/*
+		 * We expect devices with endpoint scope to have normal PCI
+		 * headers, and devices with bridge scope to have bridge PCI
+		 * headers.  However PCI NTB devices may be listed in the
+		 * DMAR table with bridge scope, even though they have a
+		 * normal PCI header.  NTB devices are identified by class
+		 * "BRIDGE_OTHER" (0680h) - we don't declare a scope mismatch
+		 * for this special case.
+		 */
+		if ((scope->entry_type == ACPI_DMAR_SCOPE_TYPE_ENDPOINT &&
+		     info->dev->hdr_type != PCI_HEADER_TYPE_NORMAL) ||
+		    (scope->entry_type == ACPI_DMAR_SCOPE_TYPE_BRIDGE &&
+		     (info->dev->hdr_type == PCI_HEADER_TYPE_NORMAL &&
+		      info->dev->class >> 8 != PCI_CLASS_BRIDGE_OTHER))) {
 			pr_warn("Device scope type does not match for %s\n",
 				pci_name(info->dev));
 			return -EINVAL;
-- 
2.7.4
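To see which combinations the new condition accepts, the predicate from the patch can be pulled out into a standalone userspace sketch. The function name dmar_scope_mismatch() is hypothetical and does not exist in the kernel; the constants are copied by value from the kernel's ACPI and PCI headers so that the block builds on its own. The class argument is the 24-bit value (class, subclass, prog-if) that the patch shifts right by 8; for the device in the hexdump above that is 0x068000.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the kernel's ACPI and PCI headers for this sketch. */
#define ACPI_DMAR_SCOPE_TYPE_ENDPOINT 1
#define ACPI_DMAR_SCOPE_TYPE_BRIDGE   2
#define PCI_HEADER_TYPE_NORMAL        0
#define PCI_HEADER_TYPE_BRIDGE        1
#define PCI_CLASS_BRIDGE_OTHER        0x0680

/*
 * Hypothetical helper mirroring the condition in the patch: returns true
 * when the DMAR scope type and the PCI header type disagree, except for a
 * normal-header device of class "other bridge" listed with bridge scope.
 */
bool dmar_scope_mismatch(uint8_t entry_type, uint8_t hdr_type,
			 uint32_t dev_class)
{
	return (entry_type == ACPI_DMAR_SCOPE_TYPE_ENDPOINT &&
		hdr_type != PCI_HEADER_TYPE_NORMAL) ||
	       (entry_type == ACPI_DMAR_SCOPE_TYPE_BRIDGE &&
		(hdr_type == PCI_HEADER_TYPE_NORMAL &&
		 dev_class >> 8 != PCI_CLASS_BRIDGE_OTHER));
}
```

With this helper, the NTB case from the commit message (bridge scope, normal header, class 0x068000) is no longer a mismatch, while a normal-header device of any other class listed with bridge scope still is.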
Re: Regression in IO resource allocation
On Tue, May 31, 2016 at 3:31 PM, Rafael J. Wysocki <raf...@kernel.org> wrote:
> It may not be called at all if _PTC is used on that system, for example.

Yes, that's exactly the case on my system. So from my POV:

Tested-by: Roland Dreier <rol...@purestorage.com>

Thanks!
Re: Regression in IO resource allocation
On Tue, May 31, 2016 at 2:11 PM, Rafael J. Wysocki wrote:
> Can you please try the appended patch (untested)?

Thanks for the quick reply. Patch looks OK on my system... it boots (which is very good :) and I see

    system 00:01: [io 0x0400-0x047f] has been reserved

however I don't see the "ACPI CPU throttle" region reserved in /proc/ioports... haven't debugged why acpi_processor_get_throttling() isn't getting called or what is happening yet. Will dig a bit deeper and let you know.

- R.
Regression in IO resource allocation
Hi,

I recently updated one of my systems from 3.10.y to 4.4.11, and discovered a regression that stops it from booting. It's actually very similar to https://bugzilla.kernel.org/show_bug.cgi?id=99831 (which I reported about the same system last year).

The problem is that commit ac212b6980d8 ("ACPI / processor: Use common hotplug infrastructure") changes the order that the ACPI processor and PnP initialization run. pnp_system_init() is run at fs_initcall time, while acpi_processor_init() is run from acpi_scan_init(), earlier at subsys_initcall time. Pre-ac212b6980d8, the ACPI processor initialization all ran from acpi_processor_init() at module_init time. So the processor driver initialization has flipped from after to before pnp_system_init().

Just as before, the failure is that the resource allocation code puts some AHCI IO BARs around 0x400, and reservation fails because some other ACPI stuff is also there.

The problem is that when acpi_processor_init() runs, it reserves a range 0x410 - 0x415 for "ACPI CPU throttle", and if that happens before pnp_system_init(), then I get

    system 00:01: [io 0x0400-0x047f] could not be reserved

because that overlaps the already-reserved range. Then the PCI resource allocation code is free to put PCI resources into that range and tons of things go south after that.

For now I've worked around it by commenting out the request_region() in acpi_processor.c but that doesn't seem like a very good long-term solution.

Does it make sense to resurrect the patches you had to let ACPI and PnP coexist in resource reservation? Or could we move the request_region() for CPU throttle into the still-modular initialization done from acpi_processor_driver_init()?

Thanks!
Roland
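The ordering argument in the report can be written down as a tiny model. This is only a sketch: the enum and both helper names are hypothetical, and the level numbers are assumed to follow the usual numbering of initcall levels in include/linux/init.h (subsys before fs before device, with built-in module_init() mapping to the device level). Ordering within a single level depends on link order and is not modelled.

```c
#include <stdbool.h>

/* Initcall levels, assumed numbering; lower levels run first. */
enum initcall_level {
	LEVEL_SUBSYS = 4,  /* subsys_initcall() */
	LEVEL_FS     = 5,  /* fs_initcall(), where pnp_system_init() runs */
	LEVEL_DEVICE = 6   /* device_initcall() / built-in module_init() */
};

/* True if code at level a runs before code at level b. */
bool runs_before(enum initcall_level a, enum initcall_level b)
{
	return a < b;
}

/*
 * Per the report, PnP can reserve [io 0x0400-0x047f] only if the
 * "ACPI CPU throttle" region is claimed after pnp_system_init().
 */
bool pnp_range_reservable(enum initcall_level throttle_level)
{
	return runs_before(LEVEL_FS, throttle_level);
}
```

In this model the pre-ac212b6980d8 behaviour corresponds to pnp_range_reservable(LEVEL_DEVICE), which holds, and the post-commit behaviour to pnp_range_reservable(LEVEL_SUBSYS), which does not.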
Re: Running out of IO space because of innocuous-looking DSDT change
On Mon, Oct 19, 2015 at 10:00 AM, Yinghai Lu wrote:
> I would suggest to expand standard_io_resources[] to include all
> possible conflict that we should avoid, like the io port for serial and
> cf8/cf9.
>
> Then we could just set PCIBIOS_MIN_IO to 0 for x86.

That would work on my system, which is a well-behaved standard server. But I thought the issue was weird vendor-specific stuff (Sony laptops?) where there are undocumented nonstandard IO resources that also aren't reserved in ACPI?

- R.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Running out of IO space because of innocuous-looking DSDT change
I recently ran into an interesting issue with IO space allocation, and I'm looking for opinions on whether this is a BIOS issue, a kernel issue, both, or neither ;)

What happened is that a BIOS update for my system changed the DSDT from having three ranges in PCI0:

    WordIO (ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange,
        0x,     // Granularity
        0x,     // Range Minimum
        0x03AF, // Range Maximum
        0x,     // Translation Offset
        0x03B0, // Length
        ,, , TypeStatic)
    WordIO (ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange,
        0x,     // Granularity
        0x03E0, // Range Minimum
        0x0CF7, // Range Maximum
        0x,     // Translation Offset
        0x0918, // Length
        ,, , TypeStatic)
    WordIO (ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange,
        0x,     // Granularity
        0x03B0, // Range Minimum
        0x03DF, // Range Maximum
        0x,     // Translation Offset
        0x0030, // Length
        ,, , TypeStatic)

to a single range:

    WordIO (ResourceProducer, MinFixed, MaxFixed, PosDecode, EntireRange,
        0x,     // Granularity
        0x,     // Range Minimum
        0x0CF7, // Range Maximum
        0x,     // Translation Offset
        0x0CF8, // Length
        ,, , TypeStatic)

Naively it seems like this shouldn't make a difference, since in the end we've covered the space 0...0xCF7. However because of the code

    min = (res->flags & IORESOURCE_IO) ? PCIBIOS_MIN_IO : PCIBIOS_MIN_MEM;

    /* First, try exact prefetching match.. */
    ret = pci_bus_alloc_resource(bus, res, size, align, min,
                                 IORESOURCE_PREFETCH,
                                 pcibios_align_resource, dev);

in pci_bus_alloc_resource(), the single range ultimately means we end up running out of IO space for our devices (we have various devices asking for IO space as well as quite a few downstream PCI switch ports that get allocated IO space).

What happens is that PCIBIOS_MIN_IO is 0x1000, so that code means with the new BIOS we can't allocate any IO in the range 0...0xCF7; with the old BIOS we only ruled out the range 0...0x3AF and happily put small IO resources (for SMBus controller devices etc) at places like 0x480 etc.

Looking at the code and history, I see that the code with PCIBIOS_MIN_IO is there to deal with systems where not all resources are declared and the kernel might accidentally allocate something that clashes with strange hardware. However in my case I'm pretty confident there isn't anything in the range we used to use (since my system didn't blow up, and I know there isn't any weird proprietary stuff anyway).

Would it make sense to change the kernel to reduce PCIBIOS_MIN_IO in my case? I could make it generic and send it upstream, or just hack it locally. Or (given my ignorance of ACPI in the real world) is this a broken BIOS change that I should ask my BIOS vendor to revert? Or... ?

Thanks!
Roland
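The difference between the two DSDT layouts can be sketched in a few lines. The rule modelled here is an assumption chosen to match the behaviour described in the message, namely that a window with a nonzero start overrides the PCIBIOS_MIN_IO floor while a window starting at 0 is subject to it; it is not copied from the kernel source. The helper names effective_min() and window_has_usable_io() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCIBIOS_MIN_IO 0x1000u  /* x86 value quoted in the message */

/*
 * Assumed rule: a host bridge window that starts above 0 overrides the
 * floor; a window that starts at 0 falls back to the floor.
 */
uint32_t effective_min(uint32_t window_start, uint32_t floor_addr)
{
	return window_start ? window_start : floor_addr;
}

/* True if some address in [window_start, window_end] can be allocated. */
bool window_has_usable_io(uint32_t window_start, uint32_t window_end)
{
	return effective_min(window_start, PCIBIOS_MIN_IO) <= window_end;
}
```

Under this assumption the old 0x03E0...0x0CF7 window stays usable, so an address like 0x480 is available, while the merged 0...0x0CF7 window lies entirely below the 0x1000 floor and yields nothing.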
Re: [PATCH] target/iscsi: fix digest computation for chained SGs
On Tue, Jul 21, 2015 at 1:57 AM, Sagi Grimberg wrote:
> How were you able to get a chained SG list in the target code?

Local hack. So this bug can't be hit in current mainline code, but patch improves the code and removes a hidden booby-trap, so I think it makes sense to apply.
Re: Regression in 3.10.80 vs. 3.10.79
On Sat, Jun 13, 2015 at 9:56 AM, Roland Dreier wrote:
> Below is a more sophisticated, so to speak, version of it with a changelog and
> all. It works for me, but more testing would be much appreciated.

Yes, the patch works as expected:

Tested-by: Roland Dreier

It does change /proc/ioports hierarchy to

    0400-0403 : ACPI PM1a_EVT_BLK
    0404-0405 : ACPI PM1a_CNT_BLK
    0406-0407 : pnp 00:06
    0408-040b : ACPI PM_TMR
    040c-041f : pnp 00:06
      0410-0415 : ACPI CPU throttle
    0420-042f : ACPI GPE0_BLK
    0430-044f : pnp 00:06
      0430-0433 : iTCO_wdt
        0430-0433 : iTCO_wdt
    0450-0450 : ACPI PM2_CNT_BLK
    0451-047f : pnp 00:06
      0460-047f : iTCO_wdt
        0460-047f : iTCO_wdt

where the old kernel had

    0400-047f : pnp 00:06
      0400-0403 : ACPI PM1a_EVT_BLK
      0404-0405 : ACPI PM1a_CNT_BLK
      0408-040b : ACPI PM_TMR
      0410-0415 : ACPI CPU throttle
      0420-042f : ACPI GPE0_BLK
      0430-0433 : iTCO_wdt
        0430-0433 : iTCO_wdt
      0450-0450 : ACPI PM2_CNT_BLK
      0460-047f : iTCO_wdt
        0460-047f : iTCO_wdt

but I don't think that matters.

Thanks,
- R.
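The four "pnp 00:06" pieces in the new listing are what is left of 0400-047f once the regions ACPI had already claimed are taken out. A minimal sketch of that subtraction follows; it is not the code from the patch, and struct io_range, uncovered_pieces() and pnp_example() are hypothetical names. The busy list is assumed to be the five ACPI blocks that appear at the top level of the new listing; "ACPI CPU throttle" and iTCO_wdt are left out because they show up nested, which suggests they were claimed later.

```c
#include <stddef.h>
#include <stdint.h>

struct io_range {
	uint16_t start;
	uint16_t end;   /* inclusive */
};

/*
 * Write the parts of 'want' not covered by 'busy' into 'out'.
 * 'busy' must be sorted by start and non-overlapping.
 * Returns the number of pieces written (at most nbusy + 1).
 */
size_t uncovered_pieces(struct io_range want, const struct io_range *busy,
			size_t nbusy, struct io_range *out)
{
	size_t n = 0;
	uint32_t next = want.start;   /* first address not yet accounted for */

	for (size_t i = 0; i < nbusy; i++) {
		if (busy[i].end < next || busy[i].start > want.end)
			continue;     /* entirely outside what is left */
		if (busy[i].start > next) {
			out[n].start = (uint16_t)next;
			out[n].end = (uint16_t)(busy[i].start - 1);
			n++;
		}
		next = (uint32_t)busy[i].end + 1;
	}
	if (next <= want.end) {
		out[n].start = (uint16_t)next;
		out[n].end = want.end;
		n++;
	}
	return n;
}

/* The 0400-047f PnP range minus the top-level ACPI blocks listed above. */
size_t pnp_example(struct io_range *out)
{
	static const struct io_range acpi[] = {
		{ 0x0400, 0x0403 },   /* PM1a_EVT_BLK */
		{ 0x0404, 0x0405 },   /* PM1a_CNT_BLK */
		{ 0x0408, 0x040b },   /* PM_TMR */
		{ 0x0420, 0x042f },   /* GPE0_BLK */
		{ 0x0450, 0x0450 },   /* PM2_CNT_BLK */
	};
	struct io_range pnp = { 0x0400, 0x047f };

	return uncovered_pieces(pnp, acpi, sizeof(acpi) / sizeof(acpi[0]), out);
}
```

Calling pnp_example() with room for six entries produces four pieces, 0406-0407, 040c-041f, 0430-044f and 0451-047f, the same ranges that carry the "pnp 00:06" label in the new /proc/ioports output.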
Re: Regression in 3.10.80 vs. 3.10.79
On Fri, Jun 12, 2015 at 7:52 PM, Rafael J. Wysocki wrote:
> Below is a more sophisticated, so to speak, version of it with a changelog and
> all. It works for me, but more testing would be much appreciated.

Great, I'm convinced by your reasoning that this makes sense.

I'm building 3.10.80 patched with this (needed a tiny bit of context adjustment because acpi_dev_filter_resource_type() hadn't been added to 3.10 yet), and will confirm that it fixes the issue I saw.

Thanks!
Roland
Re: Regression in 3.10.80 vs. 3.10.79
On Thu, Jun 11, 2015 at 1:50 PM, Rafael J. Wysocki wrote:
> Changing the ordering between those two routines would work around that problem,
> but in my view that wouldn't be a proper fix. In fact, the role of reserve_range()
> is to reserve the resources so as to prevent them from being used going forward,
> so they need not be reserved each in one piece. Instead, we can just check if they
> overlap with the ones reserved by acpi_reserve_resources() and only request the
> non-overlapping parts of them to avoid conflicts.
>
> So I wonder if the patch below makes any difference?

I will give this a try and make sure it fixes my system, although I'm pretty sure it will.

However I'm not sure I agree that this is a better fix than just having pnp reserve ranges before acpi. It already creates a special relationship between pnp and acpi, and acpi_reserve_region is a bunch of extra code. Could we really have a system where the hierarchy of acpi being a subset of a pnp bus doesn't work?

I looked at a few other systems I have, and things like the following seem quite common:

supermicro:

    03e0-0cf7 : PCI Bus :00
      03f8-03ff : serial
      0400-0453 : pnp 00:0c
        0400-0403 : ACPI PM1a_EVT_BLK
        0404-0405 : ACPI PM1a_CNT_BLK
        0408-040b : ACPI PM_TMR
        0410-0415 : ACPI CPU throttle
        0420-042f : ACPI GPE0_BLK
        0430-0433 : iTCO_wdt
        0450-0450 : ACPI PM2_CNT_BLK

dell:

    03e0-0cf7 : PCI Bus :00
      03f8-03ff : serial
      0800-087f : pnp 00:06
        0800-0803 : ACPI PM1a_EVT_BLK
        0804-0805 : ACPI PM1a_CNT_BLK
        0808-080b : ACPI PM_TMR
        0810-0815 : ACPI CPU throttle
        0820-082f : ACPI GPE0_BLK
        0830-0833 : iTCO_wdt
          0830-0833 : iTCO_wdt
        0850-0850 : ACPI PM2_CNT_BLK
        0860-087f : iTCO_wdt
          0860-087f : iTCO_wdt

but I wasn't able to find anything that required more generality...
Re: Regression in 3.10.80 vs. 3.10.79
On Wed, Jun 10, 2015 at 4:23 PM, Rafael J. Wysocki wrote:
> Can you please file a bug at bugzilla.kernel.org to track this and attach
> the output of acpidump from the affected system in there?

Done: https://bugzilla.kernel.org/show_bug.cgi?id=99831

Thanks!
Re: Regression in 3.10.80 vs. 3.10.79
On Tue, Jun 9, 2015 at 4:43 PM, Roland Dreier wrote:
> I understand that the change here fixed another regression, but I'm
> wondering if there's a way to make everyone happy here? I can provide
> debugging info from my system as required...

Maybe sent my mail too quickly, as I have some thoughts after looking at the code.

From the link order, drivers/acpi init will be called before drivers/pnp init, right?

In my case, the acpi resources ("ACPI PM1a_EVT_BLK") etc are under a pnp bus. But if acpi requests the resources first, then pnp can't request the enclosing range.

Is the right fix to make sure the pnp init happens before acpi requests resources?
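The asymmetry described above (a small region can be claimed inside a larger one that already exists, but a larger region cannot be claimed around a small one) can be captured in a deliberately simplified model. This is an assumption-laden sketch, not the kernel's resource code: it looks at a single existing region, ignores busy flags and the rest of the resource tree, and all names in it are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

struct io_range {
	uint16_t start;
	uint16_t end;   /* inclusive */
};

/* True if the two ranges share at least one address. */
bool ranges_overlap(struct io_range a, struct io_range b)
{
	return a.start <= b.end && b.start <= a.end;
}

/* True if 'inner' lies entirely within 'outer'. */
bool range_contains(struct io_range outer, struct io_range inner)
{
	return outer.start <= inner.start && inner.end <= outer.end;
}

/*
 * Simplified rule: a new request succeeds if it is disjoint from the
 * existing region or nests entirely inside it; a request that encloses
 * or partially overlaps the existing region fails.
 */
bool can_request(struct io_range existing, struct io_range wanted)
{
	if (!ranges_overlap(existing, wanted))
		return true;
	return range_contains(existing, wanted);
}
```

With the pnp range 0400-047f and an ACPI block such as 0410-0415, the model allows the ACPI block to be requested after the pnp range but not the other way round, which is the ordering dependence the question is about.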
Regression in 3.10.80 vs. 3.10.79
Hi,

I recently updated from 3.10.79 to 3.10.80, and my system wouldn't boot any more. I tracked this down to commit 92c934b10ec3 ("ACPI / init: Fix the ordering of acpi_reserve_resources()"). With that commit reverted, my system is OK again.

What happens is that ahci fails to initialize because pcim_iomap_regions_request_all() fails with EBUSY, due to a resource conflict on the first IO region of the ahci device. Since my root device is on ahci, that's the end of that. I'm sure this is due to a BIOS / ACPI table bug on my particular platform, but that's scant comfort when the system won't boot :)

I patched 3.10.80 so that ahci continues to initialize after the EBUSY, and relevant parts of the kernel log seem to be:

    [3.836643,26] system 00:06: [io 0x0400-0x047f] could not be reserved
    ...
    [3.844112,26] pci :00:1f.2: BAR 0: assigned [io 0x0410-0x0417]
    ...
    [6.020040,00] ahci :00:1f.2: BAR 0: can't reserve [io 0x0410-0x0417]

and /proc/ioports shows

    0410-0415 : ACPI CPU throttle

So if I'm understanding properly, for some reason we discover but fail to reserve the region with the ACPI resources, then PCI decides to assign ahci IO ports into that range, then ACPI loads and reserves 0x0410-0x0415, and then ahci fails to load.

If I fully revert the patch, then I see

    [3.853857,08] system 00:06: [io 0x0400-0x047f] has been reserved
    ...
    [3.861806,08] pci :00:1f.2: BAR 0: assigned [io 0x0820-0x0827]

We're able to reserve the range, and then PCI assigns ahci into a non-conflicting range.

I understand that the change here fixed another regression, but I'm wondering if there's a way to make everyone happy here? I can provide debugging info from my system as required...

Thanks,
Roland
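The ordering problem in this report can be sketched with a toy model of the I/O port space. This is only an illustration under stated assumptions, not the kernel's resource code: `struct iomap`, `request_range()`, `alloc_range()` and `replay_boot()` are hypothetical names, the 0x0400-0x0403 range for the ACPI fixed block is assumed, and the allocation window start of 0x0410 is picked only so that the failing case reproduces the logged BAR address. In the passing case the model hands out the first free slot after the placeholder (0x0480), whereas the real system chose 0x0820, presumably because of other reservations not modeled here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's I/O port resource tree. */
struct region {
    unsigned start, end;  /* inclusive port range */
    int busy;             /* 1: exclusive owner, 0: placeholder that may hold children */
};

struct iomap {
    struct region r[16];
    size_t n;
};

static int overlaps(const struct region *a, unsigned start, unsigned end)
{
    return a->start <= end && start <= a->end;
}

/* Claim [start, end]; returns -EBUSY on a conflict with an existing claim. */
int request_range(struct iomap *m, unsigned start, unsigned end, int busy)
{
    for (size_t i = 0; i < m->n; i++) {
        const struct region *a = &m->r[i];

        if (!overlaps(a, start, end))
            continue;
        /* Nesting fully inside a non-busy placeholder is allowed. */
        if (!a->busy && a->start <= start && end <= a->end)
            continue;
        return -EBUSY;
    }
    if (m->n == 16)
        return -ENOMEM;
    m->r[m->n++] = (struct region){ start, end, busy };
    return 0;
}

/*
 * Lowest size-aligned base in [lo, hi] that touches no recorded range,
 * or 0 if there is none. size must be a power of two.
 */
unsigned alloc_range(const struct iomap *m, unsigned lo, unsigned hi, unsigned size)
{
    unsigned base = (lo + size - 1) & ~(size - 1);

    for (; base + size - 1 <= hi; base += size) {
        size_t i;

        for (i = 0; i < m->n; i++)
            if (overlaps(&m->r[i], base, base + size - 1))
                break;
        if (i == m->n)
            return base;
    }
    return 0;
}

/* Replays the boot sequence; returns what the ahci claim would return. */
int replay_boot(int acpi_first, unsigned *bar)
{
    struct iomap m = { .n = 0 };

    if (acpi_first)
        request_range(&m, 0x0400, 0x0403, 1);  /* ACPI fixed block (assumed range) */
    request_range(&m, 0x0400, 0x047f, 0);      /* PNP "system 00:06" placeholder */
    if (!acpi_first)
        request_range(&m, 0x0400, 0x0403, 1);

    *bar = alloc_range(&m, 0x0410, 0xffff, 8); /* PCI assigns ahci BAR 0 */
    request_range(&m, 0x0410, 0x0415, 1);      /* "ACPI CPU throttle", claimed later */
    return request_range(&m, *bar, *bar + 7, 1); /* ahci claims its BAR */
}
```

With `acpi_first` set, the placeholder for 0x0400-0x047f cannot be recorded, so the BAR lands at 0x0410 and the later ahci claim fails with -EBUSY. With the placeholder recorded first, the BAR is placed outside it and the claim succeeds.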
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

InfiniBand/RDMA updates for 4.1:

 - IPoIB fixes from Doug Ledford and Erez Shitrit
 - iSER updates from Sagi Grimberg
 - mlx4 GUID handling changes from Yishai Hadas
 - other misc fixes

Bart Van Assche (1):
      IB/srp: Use P_Key cache for P_Key lookups

Doug Ledford (11):
      IB/ipoib: factor out ah flushing
      IB/ipoib: change init sequence ordering
      IB/ipoib: Consolidate rtnl_lock tasks in workqueue
      IB/ipoib: Make the carrier_on_task race aware
      IB/ipoib: Use dedicated workqueues per interface
      IB/ipoib: No longer use flush as a parameter
      IB/ipoib: fix MCAST_FLAG_BUSY usage
      IB/ipoib: deserialize multicast joins
      IB/ipoib: drop mcast_mutex usage
      ib_srpt: convert printk's to pr_* functions
      Merge branches 'cve-fixup', 'ipoib', 'iser', 'misc-4.1', 'or-mlx4' and 'srp' into for-4.1

Erez Shitrit (6):
      IB/ipoib: Use one linear skb in RX flow
      IB/ipoib: Update broadcast record values after each successful join request
      IB/ipoib: Handle QP in SQE state
      IB/ipoib: Save only IPOIB_MAX_PATH_REC_QUEUE skb's
      IB/ipoib: Remove IPOIB_MCAST_RUN bit
      IB/mlx4: Fix WQE LSO segment calculation

Honggang LI (1):
      mlx5: wrong page mask if CONFIG_ARCH_DMA_ADDR_T_64BIT enabled for 32Bit architectures

Sagi Grimberg (18):
      IB/iser: Fix unload during ep_poll wrong dereference
      IB/iser: Handle fastreg/local_inv completion errors
      IB/iser: Fix wrong calculation of protection buffer length
      IB/iser: Remove redundant cmd_data_len calculation
      IB/iser: Remove a redundant struct iser_data_buf
      IB/iser: Don't pass ib_device to fall_to_bounce_buff routine
      IB/iser: Move memory reg/dereg routines to iser_memory.c
      IB/iser: Remove redundant assignments in iser_reg_page_vec
      IB/iser: Get rid of struct iser_rdma_regd
      IB/iser: Merge build page-vec into register page-vec
      IB/iser: Move fastreg descriptor pool get/put to helper functions
      IB/iser: Move PI context alloc/free to routines
      IB/iser: Make fastreg pool cache friendly
      IB/iser: Modify struct iser_mem_reg members
      IB/iser: Pass struct iser_mem_reg to iser_fast_reg_mr and iser_reg_sig_mr
      IB/iser: Remove code duplication for a single DMA entry
      IB/iser: Bump version to 1.6
      IB/iser: Rewrite bounce buffer code path

Sebastian Ott (1):
      infiniband/mlx4: check for mapping error

Selvin Xavier (1):
      MAINTAINERS: Adding list of maintainers for ocrdma

Stephen Hemminger (1):
      rdma: replace deprecated ifconfig in doc

Sébastien Dugué (1):
      ib_uverbs: Fix pages leak when using XRC SRQs

Yann Droneaud (2):
      IB/core: disallow registering 0-sized memory region
      IB/core: don't disallow registering region starting at 0x0

Yishai Hadas (9):
      IB/mlx4: Alias GUID adding persistency support
      net/mlx4_core: Manage alias GUID per VF
      net/mlx4_core: Set initial admin GUIDs for VFs
      IB/mlx4: Manage admin alias GUID upon admin request
      IB/mlx4: Change init flow to request alias GUIDs for active VFs
      IB/mlx4: Request alias GUID on demand
      net/mlx4_core: Raise slave shutdown event upon FLR
      net/mlx4_core: Return the admin alias GUID upon host view request
      IB/mlx4: Change alias guids default to be host assigned

 Documentation/filesystems/nfs/nfs-rdma.txt | 9 +-
 MAINTAINERS | 9 +
 drivers/infiniband/core/umem.c | 7 +-
 drivers/infiniband/core/uverbs_main.c | 22 +-
 drivers/infiniband/hw/mlx4/alias_GUID.c | 457 +-
 drivers/infiniband/hw/mlx4/mad.c | 9 +
 drivers/infiniband/hw/mlx4/main.c | 26 +-
 drivers/infiniband/hw/mlx4/mlx4_ib.h | 14 +-
 drivers/infiniband/hw/mlx4/qp.c | 7 +-
 drivers/infiniband/hw/mlx4/sysfs.c | 44 +-
 drivers/infiniband/ulp/ipoib/ipoib.h | 31 +-
 drivers/infiniband/ulp/ipoib/ipoib_cm.c | 18 +-
 drivers/infiniband/ulp/ipoib/ipoib_ib.c | 195
 drivers/infiniband/ulp/ipoib/ipoib_main.c | 73 ++-
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 520 ++--
 drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 44 +-
 drivers/infiniband/ulp/iser/iscsi_iser.h | 66 +--
 drivers/infiniband/ulp/iser/iser_initiator.c | 66 ++-
 drivers/infiniband/ulp/iser/iser_memory.c | 523 -
 drivers/infiniband/ulp/iser/iser_verbs.c | 220 +++--
 drivers/infiniband/ulp/srp/ib_srp.c | 9 +-
 drivers/infiniband/ulp/srpt/ib_srpt.c | 188
Re: [PATCH v3 07/28] IB/Verbs: Reform IB-ulp ipoib
On Thu, Apr 16, 2015 at 9:44 AM, Jason Gunthorpe wrote:
>> We can give client->add() callback a return value and make
>> ib_register_device() return -ENOMEM when it failed, just wondering
>> why we don't do this at first, any special reason?
>
> No idea, but having ib_register_device fail and unwind if a client
> fails to attach makes sense to me.

It seems a bit unfriendly to fail an entire device if one ULP has a problem. Let's say you have a system whose main network connection is IPoIB. Would you want that connection to come up even if, say, the NFS/RDMA server fails to find the memory registration type it likes?

 - R.
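The two policies being weighed here can be compared in a small standalone sketch. Everything in it is hypothetical and chosen for illustration: `struct client`, `register_strict()`, `register_tolerant()`, `add_ok()` and `add_fail()` are not the ib_core API, and the real `ib_client`/`ib_device` structures look different.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical client type; add() returns 0 or a negative errno. */
struct client {
    const char *name;
    int (*add)(void);
    int attached;
};

/* Strict policy: any client failure fails the whole device registration. */
int register_strict(struct client *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int ret = c[i].add();

        if (ret) {
            while (i--)            /* unwind the clients that did attach */
                c[i].attached = 0;
            return ret;
        }
        c[i].attached = 1;
    }
    return 0;
}

/* Tolerant policy: note each failure, keep the device and other clients up. */
int register_tolerant(struct client *c, size_t n)
{
    int attached = 0;

    for (size_t i = 0; i < n; i++) {
        c[i].attached = (c[i].add() == 0);
        attached += c[i].attached;
    }
    return attached;               /* number of clients that attached */
}

/* Stand-ins for a ULP that attaches and one that cannot. */
int add_ok(void)   { return 0; }
int add_fail(void) { return -ENOMEM; }
```

With an IPoIB-like client using `add_ok` and an NFS/RDMA-like client using `add_fail`, the strict variant returns -ENOMEM and leaves nothing attached, while the tolerant variant keeps the first client attached, which is the behavior argued for above.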
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

One 4.0 RDMA change:

 - Fix for exploitable integer overflow in uverbs interface.

Shachar Raindel (1):
      IB/uverbs: Prevent integer overflow in ib_umem_get address arithmetic

 drivers/infiniband/core/umem.c | 8
 1 file changed, 8 insertions(+)
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

InfiniBand/RDMA changes for 3.20 merge window:

 - Re-enable on-demand paging changes with stable ABI
 - Fairly large set of ocrdma HW driver fixes
 - Some qib HW driver fixes
 - Other miscellaneous changes

Andreea-Cristina Bernat (2):
      IB/qib: Replace rcu_assign_pointer() with RCU_INIT_POINTER() in qib_qp.c
      IB/qib: Replace rcu_assign_pointer() with RCU_INIT_POINTER() in qib_keys.c

Ariel Nahum (1):
      IB/iser: Release the iscsi endpoint if ep_disconnect wasn't called

Bart Van Assche (1):
      MAINTAINERS: Update SRP initiator entry

Dan Carpenter (2):
      IB/mlx5: Fix error code in get_port_caps()
      RDMA/ocrdma: Fix off by one in ocrdma_query_gid()

Devesh Sharma (4):
      RDMA/ocrdma: Report correct count of interrupt vectors while registering ocrdma device
      RDMA/ocrdma: Discontinue support of RDMA-READ-WITH-INVALIDATE
      RDMA/ocrdma: Honor return value of ocrdma_resolve_dmac
      RDMA/ocrdma: set vlan present bit for user AH

Eli Cohen (1):
      IB/core: Add support for extended query device caps

Haggai Eran (3):
      IB/core: Properly handle registration of on-demand paging MRs after dereg
      IB/core: Add on demand paging caps to ib_uverbs_ex_query_device
      IB/mlx5: Enable the ODP capability query verb

Hariprasad S (2):
      RDMA/cxgb4: Serialize CQ event upcalls with CQ destruction
      RDMA/cxgb4: Don't hang threads forever waiting on WR replies

Ilya Nelkenbaum (1):
      IB/core: When marshaling ucma path from user-space, clear unused fields

Jack Morgenstein (1):
      IB/mlx4: In mlx4_ib_demux_cm, print out GUID in host-endian order

Majd Dibbiny (3):
      IB/mlx4: Fix memory leak in __mlx4_ib_modify_qp
      IB/mlx4: Bug fixes in mlx4_ib_resize_cq
      IB/mlx5: Update the dev in reg_create

Mike Marciniszyn (3):
      IB/qib: Fix sizeof checkpatch warnings
      IB/qib: Fix checkpatch warnings
      IB/qib: Add blank line after declaration

Mitesh Ahuja (7):
      RDMA/ocrdma: Add support for IB stack compliant stats in sysfs.
      RDMA/ocrdma: Increase the GID table size.
      RDMA/ocrdma: Move PD resource management to driver.
      RDMA/ocrdma: Host crash on destroying device resources
      RDMA/ocrdma: Add support for interrupt moderation
      RDMA/ocrdma: remove reference of ocrdma_dev out of ocrdma_qp structure
      RDMA/ocrdma: Update the ocrdma module version string

Mitko Haralanov (1):
      IB/qib: Do not write EEPROM

Moshe Lazer (1):
      IB/core: Fix deadlock on uverbs modify_qp error flow

Or Gerlitz (1):
      IB/mlx4: Fix wrong usage of IPv4 protocol for multicast attach/detach

Padmanabh Ratnakar (1):
      RDMA/ocrdma: Report correct state in ibv_query_qp

Rasmus Villemoes (2):
      RDMA/ocrdma: Help gcc generate better code for ocrdma_srq_toggle_bit
      RDMA/ocrdma: Use unsigned for bit index

Rickard Strandqvist (1):
      IB/ipath: Remove unused function in ipath_wc_ppc64

Roi Dayan (1):
      IB/iser: Use correct dma direction when unmapping SGs

Roland Dreier (1):
      Merge branches 'core', 'cxgb4', 'iser', 'mlx4', 'mlx5', 'ocrdma', 'odp', 'qib' and 'srp' into for-next

Sagi Grimberg (1):
      IB/iser: Fix memory regions possible leak

Selvin Xavier (2):
      RDMA/ocrdma: Debugfs enhancments for ocrdma driver
      RDMA/ocrdma: Allow expansion of the SQ CQEs via buddy CQ expansion of the QP

Vinit Agnihotri (1):
      IB/qib: Add support for the new QMH7360 card

 MAINTAINERS | 2 +-
 drivers/infiniband/core/ucma.c | 3 +
 drivers/infiniband/core/umem_odp.c | 3 +-
 drivers/infiniband/core/uverbs.h | 1 +
 drivers/infiniband/core/uverbs_cmd.c | 158 +
 drivers/infiniband/core/uverbs_main.c | 1 +
 drivers/infiniband/hw/cxgb4/ev.c | 9 +-
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h | 29 ++-
 drivers/infiniband/hw/ipath/ipath_kernel.h | 3 -
 drivers/infiniband/hw/ipath/ipath_wc_ppc64.c | 13 --
 drivers/infiniband/hw/ipath/ipath_wc_x86_64.c | 15 --
 drivers/infiniband/hw/mlx4/cm.c | 2 +-
 drivers/infiniband/hw/mlx4/cq.c | 7 +-
 drivers/infiniband/hw/mlx4/main.c | 10 +-
 drivers/infiniband/hw/mlx4/qp.c | 6 +-
 drivers/infiniband/hw/mlx5/main.c | 4 +-
 drivers/infiniband/hw/mlx5/mr.c | 1 +
 drivers/infiniband/hw/ocrdma/ocrdma.h | 38 +++-
 drivers/infiniband/hw/ocrdma/ocrdma_ah.c | 38 +++-
 drivers/infiniband/hw/ocrdma/ocrdma_ah.h | 6 +
 drivers/infiniband/hw/ocrdma/ocrdma_hw.c | 312 ++
 drivers/infiniband/hw/ocrdma/ocrdma_hw.h | 2 +
 drivers/infiniband/hw/ocrdma/ocrdma_main.c | 12
Re: linux-next: build failure after merge of the infiniband tree
On Tue, Feb 17, 2015 at 6:32 PM, Stephen Rothwell wrote:
> After merging the livepatching tree, today's linux-next build (powerpc
> allyesconfig) failed like this:
>
> In file included from drivers/infiniband/hw/qib/qib_cq.c:41:0:
> drivers/infiniband/hw/qib/qib.h: In function 'qib_flush_wc':
> drivers/infiniband/hw/qib/qib.h:1470:1: error: expected ';' before '}' token
>  }
>  ^
>
> and it went badly down hill from there :-(

Weird, I could have sworn I fixed that before I pushed the tree out.

Anyway I'll try adding the missing ';' again and push it out again :(
Re: [PATCH 1/1] IB/mthca: remove deprecated use of pci api
On Wed, Feb 4, 2015 at 6:09 AM, Quentin Lambert wrote:
> - dev->eq_table.icm_dma = pci_map_page(dev->pdev, dev->eq_table.icm_page, 0,
> - PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
> - if (pci_dma_mapping_error(dev->pdev, dev->eq_table.icm_dma)) {
> + dev->eq_table.icm_dma = dma_map_page(&dev->pdev->dev,
> + dev->eq_table.icm_page, 0,
> + PAGE_SIZE,
> + (enum dma_data_direction)PCI_DMA_BIDIRECTIONAL);

Surely this can't be right? Shouldn't the direction just change to DMA_BIDIRECTIONAL?

Are we really sweeping through the kernel and getting rid of pci_map_ etc. calls? If so please respin your semantic patch so that it doesn't add crazy stuff like

    (enum dma_data_direction)PCI_DMA_BIDIRECTIONAL

and resend the change.

Thanks,
Roland
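To illustrate why the cast adds nothing, here is a small standalone sketch. The enum and macros below are local copies written for this example, mirroring the values as I understand them from the kernel headers of that era (the legacy PCI_DMA_* constants and enum dma_data_direction share the values 0 to 3); they are not the real headers. `dma_name_for_pci_dir()` is a hypothetical helper showing the name-for-name mapping a conversion script could emit instead of a cast.

```c
#include <stddef.h>

/* Local copy for illustration; the kernel defines this enum itself. */
enum dma_data_direction {
    DMA_BIDIRECTIONAL = 0,
    DMA_TO_DEVICE = 1,
    DMA_FROM_DEVICE = 2,
    DMA_NONE = 3,
};

/* Local copies of the legacy PCI constants (assumed values). */
#define PCI_DMA_BIDIRECTIONAL 0
#define PCI_DMA_TODEVICE      1
#define PCI_DMA_FROMDEVICE    2
#define PCI_DMA_NONE          3

/*
 * Hypothetical helper: the generic DMA API name that replaces each
 * legacy PCI direction constant, or NULL for an unknown value.
 */
const char *dma_name_for_pci_dir(int pci_dir)
{
    switch (pci_dir) {
    case PCI_DMA_BIDIRECTIONAL: return "DMA_BIDIRECTIONAL";
    case PCI_DMA_TODEVICE:      return "DMA_TO_DEVICE";
    case PCI_DMA_FROMDEVICE:    return "DMA_FROM_DEVICE";
    case PCI_DMA_NONE:          return "DMA_NONE";
    default:                    return NULL;
    }
}
```

Since the values line up one to one, the converted call can pass DMA_BIDIRECTIONAL directly, which is what the reply above asks for.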
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

One more last-second RDMA change for 3.19:

 - Yann realized that the previous revert of new userspace ABI did not go far enough, and we're still exposing a change that we don't want. Revert even closer to 3.18 interface to make sure we get things right in the long run.

Sorry for sending this at the very end of the release cycle, but we didn't realize the scope of the required fix until just now.

Yann Droneaud (1):
      Revert "IB/core: Add support for extended query device caps"

 drivers/infiniband/core/uverbs.h | 1 -
 drivers/infiniband/core/uverbs_cmd.c | 137 +++
 drivers/infiniband/hw/mlx5/main.c | 2 -
 include/rdma/ib_verbs.h | 5 +-
 include/uapi/rdma/ib_user_verbs.h | 27 ---
 5 files changed, 42 insertions(+), 130 deletions(-)
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

Last minute InfiniBand/RDMA changes for 3.19:

 - Revert IPoIB driver back to 3.18 state. We had a number of fixes go into 3.19, but they introduced regressions. We tried to get everything fixed up but ran out of time, so we'll try again for 3.20.

 - Similarly, turn off the new "extended query port" verb. Late in the cycle we realized the ABI is not quite right, and rather than freeze something in a rush and make a mistake, we'll take a bit more time and get it right in 3.20.

Haggai Eran (1):
      IB/core: Temporarily disable ex_query_device uverb

Roland Dreier (9):
      Revert "IPoIB: No longer use flush as a parameter"
      Revert "IPoIB: Make ipoib_mcast_stop_thread flush the workqueue"
      Revert "IPoIB: Use dedicated workqueues per interface"
      Revert "IPoIB: change init sequence ordering"
      Revert "IPoIB: fix mcast_dev_flush/mcast_restart_task race"
      Revert "IPoIB: fix MCAST_FLAG_BUSY usage"
      Revert "IPoIB: Make the carrier_on_task race aware"
      Revert "IPoIB: Consolidate rtnl_lock tasks in workqueue"
      Merge branches 'ipoib' and 'odp' into for-next

 drivers/infiniband/core/uverbs_main.c | 1 -
 drivers/infiniband/ulp/ipoib/ipoib.h | 19 +-
 drivers/infiniband/ulp/ipoib/ipoib_cm.c | 18 +-
 drivers/infiniband/ulp/ipoib/ipoib_ib.c | 27 +--
 drivers/infiniband/ulp/ipoib/ipoib_main.c | 49 ++---
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 239 +
 drivers/infiniband/ulp/ipoib/ipoib_verbs.c | 22 +--
 7 files changed, 134 insertions(+), 241 deletions(-)
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

Main batch of InfiniBand/RDMA changes for 3.19:

 - On-demand paging support in core midlayer and mlx5 driver.  This lets
   userspace create non-pinned memory regions and have the adapter HW
   trigger page faults.
 - iSER and IPoIB updates and fixes.
 - Low-level HW driver updates for cxgb4, mlx4 and ocrdma.
 - Other miscellaneous fixes.

Ariel Nahum (2):
      IB/iser: Collapse cleanup and disconnect handlers
      IB/iser: Fix possible NULL derefernce ib_conn->device in session_create

Devesh Sharma (1):
      RDMA/ocrdma: Always resolve destination mac from GRH for UD QPs

Doug Ledford (8):
      IPoIB: Consolidate rtnl_lock tasks in workqueue
      IPoIB: Make the carrier_on_task race aware
      IPoIB: fix MCAST_FLAG_BUSY usage
      IPoIB: fix mcast_dev_flush/mcast_restart_task race
      IPoIB: change init sequence ordering
      IPoIB: Use dedicated workqueues per interface
      IPoIB: Make ipoib_mcast_stop_thread flush the workqueue
      IPoIB: No longer use flush as a parameter

Eli Cohen (1):
      IB/core: Add support for extended query device caps

Haggai Eran (14):
      IB/mlx5: Remove per-MR pas and dma pointers
      IB/mlx5: Enhance UMR support to allow partial page table update
      IB/core: Replace ib_umem's offset field with a full address
      IB/core: Add umem function to read data from user-space
      IB/mlx5: Add function to read WQE from user-space
      IB/core: Implement support for MMU notifiers regarding on demand paging regions
      mlx5_core: Add support for page faults events and low level handling
      IB/mlx5: Implement the ODP capability query verb
      IB/mlx5: Changes in memory region creation to support on-demand paging
      IB/mlx5: Add mlx5_ib_update_mtt to update page tables after creation
      IB/mlx5: Page faults handling infrastructure
      IB/mlx5: Handle page faults
      IB/mlx5: Add support for RDMA read/write responder page faults
      IB/mlx5: Implement on demand paging by adding support for MMU notifiers

Hariprasad S (1):
      RDMA/cxgb4: Handle NET_XMIT return codes

Hariprasad Shenai (2):
      RDMA/cxgb4: Fix locking issue in process_mpa_request
      RDMA/cxgb4: Limit MRs to < 8GB for T4/T5 devices

Jack Morgenstein (2):
      IB/core: Fix mgid key handling in SA agent multicast data-base
      IB/mlx4: Fix an incorrectly shadowed variable in mlx4_ib_rereg_user_mr

Max Gurtovoy (1):
      IB/iser: Fix possible SQ overflow

Minh Tran (1):
      IB/iser: Re-adjust CQ and QP send ring sizes to HW limits

Mitesh Ahuja (1):
      RDMA/ocrdma: Fix ocrdma_query_qp() to report q_key value for UD QPs

Moni Shoua (1):
      IB/core: Do not resolve VLAN if already resolved

Or Gerlitz (1):
      IB/iser: Bump version to 1.5

Or Kehati (1):
      IB/addr: Improve address resolution callback scheduling

Pramod Kumar (2):
      RDMA/cxgb4: Increase epd buff size for debug interface
      RDMA/cxgb4: Configure 0B MRs to match HW implementation

Roland Dreier (2):
      mlx5_core: Re-add MLX5_DEV_CAP_FLAG_ON_DMND_PG flag
      Merge branches 'core', 'cxgb4', 'ipoib', 'iser', 'mlx4', 'ocrdma', 'odp' and 'srp' into for-next

Sagi Grimberg (13):
      IB/iser: Fix catastrophic error flow hang
      IB/iser: Decrement CQ's active QPs accounting when QP creation fails
      IB/iser: Fix sparse warnings
      IB/iser: Fix race between iser connection teardown and scsi TMFs
      IB/iser: Terminate connection before cleaning inflight tasks
      IB/iser: Centralize memory region invalidation to a function
      IB/iser: Remove redundant is_mr indicator
      IB/iser: Use more completion queues
      IB/iser: Micro-optimize iser logging
      IB/iser: Micro-optimize iser_handle_wc
      IB/iser: DIX update
      IB/core: Add flags for on demand paging support
      IB/srp: Allow newline separator for connection string

Shachar Raindel (1):
      IB/core: Add support for on demand paging regions

Steve Wise (1):
      RDMA/cxgb4: Wake up waiters after flushing the qp

Yuval Shaia (1):
      mlx4_core: Check for DPDP violation only when DPDP is not supported

 drivers/infiniband/Kconfig            |  11 +
 drivers/infiniband/core/Makefile      |   1 +
 drivers/infiniband/core/addr.c        |   4 +-
 drivers/infiniband/core/multicast.c   |  11 +-
 drivers/infiniband/core/umem.c        |  72 ++-
 drivers/infiniband/core/umem_odp.c    | 668 +
 drivers/infiniband/core/umem_rbtree.c |  94 +++
 drivers/infiniband/core/uverbs.h      |   1 +
 drivers/infiniband/core/uverbs_cmd.c  | 171 --
 drivers/infiniband/core/uverbs_main.c |   5 +-
 drivers/infiniband/core/verbs.c       |   3 +-
 d
Re: linux-next: build failure after merge of the infiniband tree
On Mon, Dec 15, 2014 at 5:56 PM, Roland Dreier wrote:
> I'll add a partial revert of that patch to my tree to get back the
> now-used enum values.

I rebased my tree on top of the merge-window merge of davem's tree, and
added the missing flag on top of the "remove this flag" commit.

 - R.
Re: linux-next: build failure after merge of the infiniband tree
On Mon, Dec 15, 2014 at 5:47 PM, Stephen Rothwell wrote:
> Hi all,
>
> After merging the infiniband tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
>
> drivers/infiniband/hw/mlx5/main.c: In function 'mlx5_ib_query_device':
> drivers/infiniband/hw/mlx5/main.c:248:34: error:
> 'MLX5_DEV_CAP_FLAG_ON_DMND_PG' undeclared (first use in this function)
>   if (dev->mdev->caps.gen.flags & MLX5_DEV_CAP_FLAG_ON_DMND_PG)
>                                   ^
> drivers/net/ethernet/mellanox/mlx5/core/fw.c: In function
> 'mlx5_query_odp_caps':
> drivers/net/ethernet/mellanox/mlx5/core/fw.c:79:30: error:
> 'MLX5_DEV_CAP_FLAG_ON_DMND_PG' undeclared (first use in this function)
>   if (!(dev->caps.gen.flags & MLX5_DEV_CAP_FLAG_ON_DMND_PG))
>                               ^
> drivers/net/ethernet/mellanox/mlx5/core/eq.c: In function 'mlx5_start_eqs':
> drivers/net/ethernet/mellanox/mlx5/core/eq.c:459:28: error:
> 'MLX5_DEV_CAP_FLAG_ON_DMND_PG' undeclared (first use in this function)
>   if (dev->caps.gen.flags & MLX5_DEV_CAP_FLAG_ON_DMND_PG)
>                             ^
>
> Really?  Code added half way though the merge window not even build
> tested?

It's not quite as bad as it seems.  The infiniband tree itself builds,
the problem is the merged tree.  The Mellanox guys merged the "cleanup"
commit

    commit 0c7aac854f52
    Author: Eli Cohen
    Date:   Tue Dec 2 02:26:14 2014

        net/mlx5_core: Remove unused dev cap enum fields

        These enumerations are not used so remove them.

        Signed-off-by: Eli Cohen
        Signed-off-by: David S. Miller

through davem's tree, and then went ahead and used at least
MLX5_DEV_CAP_FLAG_ON_DMND_PG (which that patch removes) in patches they
merged through my tree.

I'll add a partial revert of that patch to my tree to get back the
now-used enum values.

 - R.
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

Main set of InfiniBand/RDMA updates for 3.18 merge window:

 - Large set of iSER initiator improvements
 - Hardware driver fixes for cxgb4, mlx5 and ocrdma
 - Small fixes to core midlayer

Ariel Nahum (3):
      IB/iser: Unbind at conn_stop stage
      IB/iser: Use iser_warn instead of BUG_ON in iser_conn_release
      IB/iser: Change iscsi_conn_stop log level to info

Devesh Sharma (3):
      RDMA/ocrdma: Add default GID at index 0
      RDMA/ocrdma: Convert kernel VA to PA for mmap in user
      IB/core: Clear AH attr variable to prevent garbage data

Eli Cohen (5):
      IB/mlx5: Clear umr resources after ib_unregister_device
      IB/mlx5: Improve debug prints in mlx5_ib_reg_user_mr
      IB/core: Avoid leakage from kernel to user space
      IB/mlx5: Fix possible array overflow
      IB/mlx5: Remove duplicate code from mlx5_set_path

Hariprasad S (3):
      RDMA/cxgb4: Take IPv6 into account for best_mtu and set_emss
      RDMA/cxgb4: Add missing neigh_release in find_route
      RDMA/cxgb4: Fix ntuple calculation for ipv6 and remove duplicate line

Jack Morgenstein (1):
      IB/core: Fix XRC race condition in ib_uverbs_open_qp

Jes Sorensen (3):
      RDMA/ocrdma: Don't memset() buffers we just allocated with kzalloc()
      RDMA/ocrdma: The kernel has a perfectly good BIT() macro - use it
      RDMA/ocrdma: Save the bit environment, spare unncessary parenthesis

Li RongQing (1):
      RDMA/ocrdma: Remove a unused-label warning

Or Gerlitz (1):
      IB/iser: Bump version, add maintainer

Roi Dayan (1):
      IB/iser: Remove unused variables and dead code

Roland Dreier (1):
      Merge branches 'core', 'cxgb4', 'iser', 'mlx5' and 'ocrdma' into for-next

Sagi Grimberg (23):
      IB/iser: Rename ib_conn -> iser_conn
      IB/iser: Re-introduce ib_conn
      IB/iser: Extend iser_free_ib_conn_res()
      IB/iser: Fix DEVICE REMOVAL handling in the absence of iscsi daemon
      IB/iser: Don't bound release_work completions timeouts
      IB/iser: Protect tasks cleanup in case IB device was already released
      IB/iser: Signal iSCSI layer that transport is broken in error completions
      IB/iser: Centralize iser completion contexts
      IB/iser: Use internal polling budget to avoid possible live-lock
      IB/iser: Use single CQ for RX and TX
      IB/iser: Use beacon to indicate all completions were consumed
      IB/iser: Optimize completion polling
      IB/iser: Suppress scsi command send completions
      IB/iser: Nit - add space after __func__ in iser logging
      IB/iser: Add/Fix kernel doc style descriptions in iscsi_iser.h
      IB/iser: Fix/add kernel-doc style description in iscsi_iser.c
      IB/mlx5: Use enumerations for PI copy mask
      IB/iser: Remove redundant assignment
      IB/iser: Set IP_CSUM as default guard type
      IB/mlx5: Use extended internal signature layout
      IB/iser: Centralize ib_sig_domain settings
      Target/iser: Centralize ib_sig_domain setting
      IB/mlx5, iser, isert: Add Signature API additions

Selvin Xavier (1):
      RDMA/ocrdma: Get vlan tag from ib_qp_attrs

Steve Wise (1):
      RDMA/cxgb4: Make c4iw_wr_log_size_order static

Yishai Hadas (1):
      IB/mlx5: Modify to work with arbitrary page size

 MAINTAINERS                                  |   1 +
 drivers/infiniband/core/uverbs_cmd.c         |   2 +
 drivers/infiniband/core/uverbs_main.c        |   5 +
 drivers/infiniband/hw/cxgb4/cm.c             |  32 +-
 drivers/infiniband/hw/cxgb4/device.c         |   2 +-
 drivers/infiniband/hw/mlx5/main.c            |   8 +-
 drivers/infiniband/hw/mlx5/mem.c             |  18 +-
 drivers/infiniband/hw/mlx5/mr.c              |   6 +-
 drivers/infiniband/hw/mlx5/qp.c              | 149 +++---
 drivers/infiniband/hw/ocrdma/ocrdma_hw.c     |  25 +-
 drivers/infiniband/hw/ocrdma/ocrdma_main.c   |  12 +
 drivers/infiniband/hw/ocrdma/ocrdma_sli.h    | 238 +-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c  |  10 +-
 drivers/infiniband/ulp/iser/iscsi_iser.c     | 313 ++---
 drivers/infiniband/ulp/iser/iscsi_iser.h     | 408 +++-
 drivers/infiniband/ulp/iser/iser_initiator.c | 198
 drivers/infiniband/ulp/iser/iser_memory.c    |  99 ++--
 drivers/infiniband/ulp/iser/iser_verbs.c     | 667 +++
 drivers/infiniband/ulp/isert/ib_isert.c      |  65 ++-
 include/linux/mlx5/qp.h                      |  35 +-
 include/rdma/ib_verbs.h                      |  32 +-
 21 files changed, 1372 insertions(+), 953 deletions(-)
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

This is later and bigger than I would like, and the blame is all on me:
I got very busy with other stuff for a few weeks during the 3.17 cycle,
and didn't prepare this tree as soon as I should have.  However I don't
think there's anything risky here, and no one really cares if we break
InfiniBand in 3.17 anyway...

Last late set of InfiniBand/RDMA fixes for 3.17:

 - Fixes for the new memory region re-registration support
 - iSER initiator error path fixes
 - Grab bag of small fixes for the qib and ocrdma hardware drivers
 - Larger set of fixes for mlx4, especially in RoCE mode

Alex Estrin (1):
      IPoIB: Remove unnecessary port query

Devesh Sharma (2):
      RDMA/ocrdma: Report correct value of max_fast_reg_page_list_len
      RDMA/ocrdma: Do not skip setting deferred_arm

Jack Morgenstein (6):
      IB/mlx4: Fix lockdep splat for the iboe lock
      mlx4: Fix mlx4 reg/unreg mac to work properly with 0-mac addresses
      IB/mlx4: Avoid accessing netdevice when building RoCE qp1 header
      IB/mlx4: Don't update QP1 in native mode
      IB/mlx4: Do not allow APM under RoCE
      IB/mlx4: Fix VF mac handling in RoCE

Markus Stockhausen (1):
      IB/mlx4: Disable TSO for Connect-X rev. A0 HCAs

Matan Barak (2):
      mlx4: Correct error flows in rereg_mr
      IB/core: When marshaling uverbs path, clear unused fields

Mike Marciniszyn (3):
      IB/ipath: Change get_user_pages() usage to always NULL vmas
      IB/qib: Change get_user_pages() usage to always NULL vmas
      IB/qib: Correct reference counting in debugfs qp_stats

Moni Shoua (5):
      IB/mlx4: Avoid null pointer dereference in mlx4_ib_scan_netdevs()
      IB/mlx4: Don't duplicate the default RoCE GID
      IB/mlx4: Reorder steps in RoCE GID table initialization
      IB/mlx4: Get upper dev addresses as RoCE GIDs when port comes up
      IB/mlx4: Avoid executing gid task when device is being removed

Or Gerlitz (1):
      IB/iser: Bump version to 1.4.1

Roi Dayan (1):
      IB/iser: Fix RX/TX CQ resource leak on error flow

Roland Dreier (1):
      Merge branches 'core', 'ipoib', 'iser', 'mlx4', 'ocrdma' and 'qib' into for-next

Sagi Grimberg (1):
      IB/iser: Allow bind only when connection state is UP

Shawn Bohrer (1):
      IB: ib_umem_release() should decrement mm->pinned_vm from ib_umem_get

devesh.sha...@emulex.com (2):
      RDMA/ocrdma: Resolve L2 address when creating user AH
      RDMA/ocrdma: Use right macro in query AH

 drivers/infiniband/core/umem.c                 |  19 ++-
 drivers/infiniband/core/uverbs_marshall.c      |   4 +
 drivers/infiniband/hw/ipath/ipath_user_pages.c |   6 +-
 drivers/infiniband/hw/mlx4/main.c              | 169 +
 drivers/infiniband/hw/mlx4/mlx4_ib.h           |   1 +
 drivers/infiniband/hw/mlx4/mr.c                |   7 +-
 drivers/infiniband/hw/mlx4/qp.c                |  60 +
 drivers/infiniband/hw/ocrdma/ocrdma_ah.c       |  43 +--
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c    |   6 +-
 drivers/infiniband/hw/qib/qib_debugfs.c        |   3 +-
 drivers/infiniband/hw/qib/qib_qp.c             |   8 --
 drivers/infiniband/hw/qib/qib_user_pages.c     |   6 +-
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |  10 +-
 drivers/infiniband/ulp/iser/iscsi_iser.c       |  19 ++-
 drivers/infiniband/ulp/iser/iscsi_iser.h       |   2 +-
 drivers/infiniband/ulp/iser/iser_verbs.c       |  24 ++--
 drivers/net/ethernet/mellanox/mlx4/mr.c        |  33 +++--
 drivers/net/ethernet/mellanox/mlx4/port.c      |  11 +-
 include/rdma/ib_umem.h                         |   1 +
 19 files changed, 277 insertions(+), 155 deletions(-)
Re: [PATCH v1 for-next 00/16] On demand paging
> I would like to note that we at Los Alamos National Laboratory are very
> interested in this functionality and it would be great if it gets accepted.

Have you done any review or testing of these changes?  If so can you
share the results?

 - R.
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

Main set of InfiniBand/RDMA updates for 3.17 merge window:

 - MR reregistration support
 - MAD support for RMPP in userspace
 - iSER and SRP initiator updates
 - ocrdma hardware driver updates
 - other fixes...

Alex Estrin (1):
      IB/ipoib: Avoid multicast join attempts with invalid P_key

Ariel Nahum (3):
      IB/iser: Seperate iser_conn and iscsi_endpoint storage space
      IB/iser: Protect iser state machine with a mutex
      IB/iser: Replace connection waitqueue with completion object

Bart Van Assche (3):
      scsi_transport_srp: Fix fast_io_fail_tmo=dev_loss_tmo=off behavior
      IB/srp: Fix deadlock between host removal and multipathd
      IB/srp: Fix residual handling

Dan Carpenter (1):
      RDMA/amso1100: Check for integer overflow in c2_alloc_cq_buf()

Devesh Sharma (7):
      RDMA/ocrdma: Avoid posting DPP requests for RDMA READ
      be2net: Issue shutdown event to ocrdma driver
      RDMA/ocrdma: Handle shutdown event from be2net driver
      RDMA/ocrdma: Remove hardcoding of the max DPP QPs supported
      RDMA/ocrdma: Delete AH table if ocrdma_init_hw fails after AH table creation
      RDMA/ocrdma: Obtain SL from device structure
      RDMA/ocrdma: Update sli data structure for endianness

Doug Ledford (2):
      IB/srpt: Handle GID change events
      RDMA/uapi: Include socket.h in rdma_user_cm.h

Erez Shitrit (2):
      IB/ipoib: Use P_Key change event instead of P_Key polling mechanism
      IB/ipoib: Avoid flushing the workqueue from worker context

Fabian Frederick (3):
      IPoIB: Remove unnecessary test for NULL before debugfs_remove()
      IB/mlx4: Use ARRAY_SIZE instead of sizeof/sizeof[0]
      IB/mlx5: Use ARRAY_SIZE instead of sizeof/sizeof[0]

Ira Weiny (5):
      IB/umad: Update module to [pr|dev]_* style print messages
      IB/mad: Update module to [pr|dev]_* style print messages
      IB/mad: Add dev_notice messages for various umad/mad registration failures
      IB/mad: add new ioctl to ABI to support new registration options
      IB/mad: Add user space RMPP support

Jack Morgenstein (1):
      mlx4_core: Add support for secure-host and SMP firewall

Matan Barak (3):
      IB/core: Add user MR re-registration support
      mlx4_core: Add helper functions to support MR re-registration
      IB/mlx4_ib: Add support for user MR re-registration

Mitesh Ahuja (4):
      RDMA/ocrdma: Allow only SEND opcode in case of UD QPs
      RDMA/ocrdma: Do proper cleanup even if FW is in error state
      RDMA/ocrdma: Return proper value for max_mr_size
      RDMA/ocrdma: report asic-id in query device

Or Gerlitz (1):
      IB/ipath: Add P_Key change event support

Roi Dayan (3):
      IB/iser: Support IPv6 address family
      IB/iser: Add TIMEWAIT_EXIT event handling
      IB/iser: Clarify a duplicate counters check

Roland Dreier (1):
      Merge branches 'core', 'cxgb4', 'ipoib', 'iser', 'iwcm', 'mad', 'misc', 'mlx4', 'mlx5', 'ocrdma' and 'srp' into for-next

Sagi Grimberg (2):
      IB/iser: Fix responder resources advertisement
      IB/iser: Remove redundant return code in iser_free_ib_conn_res()

Selvin Xavier (8):
      RDMA/ocrdma: Query and initalize the PFC SL
      RDMA/ocrdma: Add hca_type and fixing fw_version string in device atrributes
      RDMA/ocrdma: Avoid reporting wrong completions in case of error CQEs
      RDMA/ocrdma: Add missing adapter mailbox opcodes
      RDMA/ocrdma: Increase the size of STAG array in dev structure to 16K
      RDMA/ocrdma: Initialize the GID table while registering the device
      RDMA/ocrdma: Fix a sparse warning
      RDMA/ocrdma: Update the ocrdma module version string

Steve Wise (2):
      RDMA/cxgb4: Only call CQ completion handler if it is armed
      RDMA/iwcm: Use a default listen backlog if needed

Wei Yongjun (1):
      IB/srp: Fix return value check in srp_init_module()

 Documentation/infiniband/user_mad.txt  | 13 +-
 drivers/infiniband/core/agent.c        | 16 +-
 drivers/infiniband/core/cm.c           | 5 +-
 drivers/infiniband/core/iwcm.c         | 27 ++
 drivers/infiniband/core/mad.c          | 283 +---
 drivers/infiniband/core/mad_priv.h     | 3 -
 drivers/infiniband/core/sa_query.c     | 2 +-
 drivers/infiniband/core/user_mad.c     | 188 +++--
 drivers/infiniband/core/uverbs.h       | 1 +
 drivers/infiniband/core/uverbs_cmd.c   | 93 +++
 drivers/infiniband/core/uverbs_main.c  | 1 +
 drivers/infiniband/hw/amso1100/c2_cq.c | 7 +-
 drivers/infiniband/hw/cxgb4/ev.c       | 1 +
 drivers/infiniband/hw/cxgb4/qp.c       | 37 ++-
 drivers/infiniband
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

InfiniBand/RDMA fixes for 3.16

 - cxgb4 hardware driver regression fixes
 - mlx5 hardware driver regression fixes

Hariprasad S (2):
      RDMA/cxgb4: Fix skb_leak in reject_cr()
      RDMA/cxgb4: Clean up connection on ARP error

Or Gerlitz (1):
      IB/mlx5: Enable "block multicast loopback" for kernel consumers

Roland Dreier (1):
      Merge branches 'cxgb4' and 'mlx5' into for-next

Sagi Grimberg (1):
      mlx5_core: Fix possible race between mr tree insert/delete

Steve Wise (2):
      RDMA/cxgb4: Initialize the device status page
      RDMA/cxgb4: Call iwpm_init() only once

 drivers/infiniband/hw/cxgb4/cm.c             | 14 +++---
 drivers/infiniband/hw/cxgb4/device.c         | 18 +++---
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h       | 2 +-
 drivers/infiniband/hw/mlx5/qp.c              | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/mr.c | 19 +++
 5 files changed, 39 insertions(+), 16 deletions(-)
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

Main batch of InfiniBand/RDMA changes for 3.16:

 - Add iWARP port mapper to avoid conflicts between RDMA and normal
   stack TCP connections.
 - Fixes for i386 / x86-64 structure padding differences (ABI
   compatibility for 32-on-64) from Yann Droneaud.
 - A pile of SRP initiator fixes from Bart Van Assche.
 - Fixes for a writeback / memory allocation deadlock with NFS over
   IPoIB connected mode from Jiri Kosina.
 - The usual fixes and cleanups to mlx4, mlx5, cxgb4 and other
   low-level drivers.

Ariel Nahum (2):
      IB/iser: Simplify connection management
      IB/iser: Fix a possible race in iser connection states transition

Bart Van Assche (11):
      IB/srp: Fix a sporadic crash triggered by cable pulling
      IB/srp: Fix kernel-doc warnings
      IB/srp: Introduce an additional local variable
      IB/srp: Introduce srp_map_fmr()
      IB/srp: Introduce srp_finish_mapping()
      IB/srp: Introduce the 'register_always' kernel module parameter
      IB/srp: One FMR pool per SRP connection
      IB/srp: Rename FMR-related variables
      IB/srp: Add fast registration support
      IB/umad: Fix error handling
      IB/umad: Fix use-after-free on close

Christoph Jaeger (1):
      RDMA/cxgb4: Fix memory leaks in c4iw_alloc() error paths

Colin Ian King (1):
      IB/mlx4: fix unitialised variable is_mcast

Dan Carpenter (2):
      RDMA/cxgb3: Fix information leak in send_abort()
      RDMA/cxgb3: Remove a couple unneeded conditions

Dennis Dalessandro (1):
      IB/ipath: Translate legacy diagpkt into newer extended diagpkt

Dotan Barak (1):
      mlx4_core: Fix memory leaks in SR-IOV error paths

Duan Jiong (1):
      RDMA/ocrdma: Convert to use simple_open()

Haggai Eran (7):
      IB/mlx5: Fix error handling in reg_umr
      IB/mlx5: Add MR to radix tree in reg_mr_callback
      mlx5_core: Store MR attributes in mlx5_mr_core during creation and after UMR
      IB/mlx5: Set QP offsets and parameters for user QPs and not just for kernel QPs
      IB/core: Remove unneeded kobject_get/put calls
      IB/core: Fix port kobject deletion during error flow
      IB/core: Fix kobject leak on device register error flow

Jack Morgenstein (5):
      mlx4_core: Fix incorrect FLAGS1 bitmap test in mlx4_QUERY_FUNC_CAP
      IB/mlx4: SET_PORT called by mlx4_ib_modify_port should be wrapped
      IB/mlx4: Preparation for VFs to issue/receive SMI (QP0) requests/responses
      mlx4: Add infrastructure for selecting VFs to enable QP0 via MLX proxy QPs
      IB/mlx4: Add interface for selecting VFs to enable QP0 via MLX proxy QPs

Jiri Kosina (2):
      IB/mlx4: Implement IB_QP_CREATE_USE_GFP_NOIO
      IB/mlx4: Fix gfp passing in create_qp_common()

Joe Perches (1):
      IB/srp: Avoid problems if a header uses pr_fmt

Manuel Schölling (1):
      IB/ipath: Use time_before()/_after()

Mike Marciniszyn (1):
      IB/qib: Fix port in pkey change event

Or Gerlitz (3):
      IB/iser: Bump version to 1.4
      IB: Return error for unsupported QP creation flags
      IB: Add a QP creation flag to use GFP_NOIO allocations

Roi Dayan (1):
      IB/iser: Add missing newlines to logging messages

Roland Dreier (6):
      IB/mlx5: Fix warning about cast of wr_id back to pointer on 32 bits
      mlx4_core: Move handling of MLX4_QP_ST_MLX to proper switch statement
      IB/mad: Fix sparse warning about gfp_t use
      IB/core: Fix sparse warnings about redeclared functions
      mlx4_core: Fix GFP flags parameters to be gfp_t
      Merge branches 'core', 'cxgb3', 'cxgb4', 'iser', 'iwpm', 'misc', 'mlx4', 'mlx5', 'noio', 'ocrdma', 'qib', 'srp' and 'usnic' into for-next

Sagi Grimberg (3):
      mlx5_core: Fix signature handover operation for interleaved buffers
      mlx5_core: Simplify signature handover wqe for interleaved buffers
      mlx5_core: Copy DIF fields only when input and output space values match

Shachar Raindel (1):
      IB/mlx5: Refactor UMR to have its own context struct

Steve Wise (2):
      RDMA/cxgb4: Fix vlan support
      RDMA/cxgb4: Add support for iWARP Port Mapper user space service

Tatyana Nikolova (2):
      RDMA/core: Add support for iWARP Port Mapper user space service
      RDMA/nes: Add support for iWARP Port Mapper user space service

Upinder Malhi (1):
      IB/usnic: Fix source file missing copyright and license

Vinit Agnihotri (1):
      IB/qib: Additional Intel branding changes

Yann Droneaud (5):
      IB/mlx5: add missing padding at end of struct mlx5_ib_create_cq
      IB/mlx5: add missing padding at end of struct mlx5_ib_create_srq
      RDMA/cxgb4: Add missing padding at end of struct c4iw_create_cq_resp
      IB: Allow build of hw/ and ulp/ subdirectories independently
      RDMA/cxgb4: add missing padding at end of struct
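
For anyone wondering what the "structure padding differences" item in the pull request above refers to, here is a minimal sketch of the general problem. It is not taken from any of the listed patches: the struct names and fields are hypothetical. The assumption is the usual one for these ABIs, namely that a 64-bit integer member is 4-byte aligned on i386 and 8-byte aligned on x86-64, so a struct ending in a lone 32-bit field can have a different sizeof() on each.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user/kernel ABI struct; not taken from any driver. */
struct resp_implicit {
	uint64_t handle;
	uint32_t size;
	/* 8-byte-aligned ABI: 4 bytes of hidden tail padding; i386: none. */
};

/* Same fields plus an explicit reserved word that fills the tail. */
struct resp_explicit {
	uint64_t handle;
	uint32_t size;
	uint32_t reserved;
};

/* Bytes the compiler appended after the last named member. */
size_t implicit_tail_padding(void)
{
	return sizeof(struct resp_implicit) -
	       (offsetof(struct resp_implicit, size) + sizeof(uint32_t));
}

size_t explicit_tail_padding(void)
{
	return sizeof(struct resp_explicit) -
	       (offsetof(struct resp_explicit, reserved) + sizeof(uint32_t));
}

/* With explicit padding the size no longer depends on uint64_t alignment. */
static_assert(sizeof(struct resp_explicit) == 16,
	      "explicitly padded struct should be 16 bytes on both ABIs");
```

The design point is that the explicitly padded struct has the same size and layout whether the caller is a 32-bit or a 64-bit process, so a 32-bit userspace on a 64-bit kernel copies exactly the bytes the kernel expects.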
Re: [PATCH v1 for-next 0/3] IB: Use GFP_NOIO calls in IPoIB connected mode TX path
On Sat, May 17, 2014 at 1:52 PM, Or Gerlitz wrote:
> Roland, we're soon on -rc6 and there's no reason for this to miss
> 3.16, could you please comment whether you want it to go through your
> tree or net-next?

I will pick it up.
[tip:x86/mm] x86, ioremap: Speed up check for RAM pages
Commit-ID:  c81c8a1eeede61e92a15103748c23d100880cc8a
Gitweb:     http://git.kernel.org/tip/c81c8a1eeede61e92a15103748c23d100880cc8a
Author:     Roland Dreier
AuthorDate: Fri, 2 May 2014 11:18:41 -0700
Committer:  H. Peter Anvin
CommitDate: Fri, 2 May 2014 11:52:26 -0700

x86, ioremap: Speed up check for RAM pages

In __ioremap_caller() (the guts of ioremap), we loop over the range of
pfns being remapped and checks each one individually with page_is_ram().
For large ioremaps, this can be very slow. For example, we have a
device with a 256 GiB PCI BAR, and ioremapping this BAR can take 20+
seconds -- sometimes long enough to trigger the soft lockup detector!

Internally, page_is_ram() calls walk_system_ram_range() on a single
page. Instead, we can make a single call to walk_system_ram_range()
from __ioremap_caller(), and do our further checks only for any RAM
pages that we find. For the common case of MMIO, this saves an enormous
amount of work, since the range being ioremapped doesn't intersect
system RAM at all.

With this change, ioremap on our 256 GiB BAR takes less than 1 second.

Signed-off-by: Roland Dreier
Link: http://lkml.kernel.org/r/1399054721-1331-1-git-send-email-rol...@kernel.org
Signed-off-by: H. Peter Anvin
---
 arch/x86/mm/ioremap.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 597ac15..bc7527e 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -50,6 +50,21 @@ int ioremap_change_attr(unsigned long vaddr, unsigned long size,
 	return err;
 }
 
+static int __ioremap_check_ram(unsigned long start_pfn, unsigned long nr_pages,
+			       void *arg)
+{
+	unsigned long i;
+
+	for (i = 0; i < nr_pages; ++i)
+		if (pfn_valid(start_pfn + i) &&
+		    !PageReserved(pfn_to_page(start_pfn + i)))
+			return 1;
+
+	WARN_ONCE(1, "ioremap on RAM pfn 0x%lx\n", start_pfn);
+
+	return 0;
+}
+
 /*
  * Remap an arbitrary physical address space into the kernel virtual
  * address space. Needed when the kernel wants to access high addresses
@@ -93,14 +108,11 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 	/*
 	 * Don't allow anybody to remap normal RAM that we're using..
 	 */
+	pfn = phys_addr >> PAGE_SHIFT;
 	last_pfn = last_addr >> PAGE_SHIFT;
-	for (pfn = phys_addr >> PAGE_SHIFT; pfn <= last_pfn; pfn++) {
-		int is_ram = page_is_ram(pfn);
-
-		if (is_ram && pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn)))
-			return NULL;
-		WARN_ON_ONCE(is_ram);
-	}
+	if (walk_system_ram_range(pfn, last_pfn - pfn + 1, NULL,
+				  __ioremap_check_ram) == 1)
+		return NULL;
 
 	/*
 	 * Mappings have to be page-aligned
[PATCH] x86, ioremap: Speed up check for RAM pages
From: Roland Dreier

In __ioremap_caller() (the guts of ioremap), we loop over the range of
pfns being remapped and checks each one individually with page_is_ram().
For large ioremaps, this can be very slow. For example, we have a
device with a 256 GiB PCI BAR, and ioremapping this BAR can take 20+
seconds -- sometimes long enough to trigger the soft lockup detector!

Internally, page_is_ram() calls walk_system_ram_range() on a single
page. Instead, we can make a single call to walk_system_ram_range()
from __ioremap_caller(), and do our further checks only for any RAM
pages that we find. For the common case of MMIO, this saves an enormous
amount of work, since the range being ioremapped doesn't intersect
system RAM at all.

With this change, ioremap on our 256 GiB BAR takes less than 1 second.

Signed-off-by: Roland Dreier
---
 arch/x86/mm/ioremap.c | 26 +++---
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 597ac155c91c..bc7527e109c8 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -50,6 +50,21 @@ int ioremap_change_attr(unsigned long vaddr, unsigned long size,
 	return err;
 }
 
+static int __ioremap_check_ram(unsigned long start_pfn, unsigned long nr_pages,
+			       void *arg)
+{
+	unsigned long i;
+
+	for (i = 0; i < nr_pages; ++i)
+		if (pfn_valid(start_pfn + i) &&
+		    !PageReserved(pfn_to_page(start_pfn + i)))
+			return 1;
+
+	WARN_ONCE(1, "ioremap on RAM pfn 0x%lx\n", start_pfn);
+
+	return 0;
+}
+
 /*
  * Remap an arbitrary physical address space into the kernel virtual
  * address space. Needed when the kernel wants to access high addresses
@@ -93,14 +108,11 @@ static void __iomem *__ioremap_caller(resource_size_t phys_addr,
 	/*
 	 * Don't allow anybody to remap normal RAM that we're using..
 	 */
+	pfn = phys_addr >> PAGE_SHIFT;
 	last_pfn = last_addr >> PAGE_SHIFT;
-	for (pfn = phys_addr >> PAGE_SHIFT; pfn <= last_pfn; pfn++) {
-		int is_ram = page_is_ram(pfn);
-
-		if (is_ram && pfn_valid(pfn) && !PageReserved(pfn_to_page(pfn)))
-			return NULL;
-		WARN_ON_ONCE(is_ram);
-	}
+	if (walk_system_ram_range(pfn, last_pfn - pfn + 1, NULL,
+				  __ioremap_check_ram) == 1)
+		return NULL;
 
 	/*
 	 * Mappings have to be page-aligned
-- 
1.9.1
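
To illustrate why one range walk is so much cheaper than a per-pfn check, here is a rough userspace model of the idea in the patch above. It is only a sketch and not kernel code: ram_map, model_walk_ram(), found_ram() and window_overlaps_ram() are made-up names, and the RAM layout is invented. The -1 return for "no RAM in the window" mirrors how I understand walk_system_ram_range() to behave, which would explain why the patch compares the result against 1 rather than testing for nonzero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical RAM map standing in for the kernel's "System RAM" resources. */
struct ram_range {
	unsigned long start_pfn;
	unsigned long end_pfn;		/* inclusive */
};

static const struct ram_range ram_map[] = {
	{ 0x00000UL, 0x0009fUL },
	{ 0x00100UL, 0xbffffUL },
};

typedef int (*range_fn)(unsigned long start_pfn, unsigned long nr_pages,
			void *arg);

/*
 * Toy range walker: invoke fn once for every RAM range that overlaps
 * [start_pfn, start_pfn + nr_pages), stopping at the first nonzero
 * return.  Returns -1 when the window contains no RAM at all.
 */
int model_walk_ram(unsigned long start_pfn, unsigned long nr_pages,
		   void *arg, range_fn fn)
{
	unsigned long end_pfn;
	int ret = -1;
	size_t i;

	assert(nr_pages > 0);
	end_pfn = start_pfn + nr_pages - 1;

	for (i = 0; i < sizeof(ram_map) / sizeof(ram_map[0]); i++) {
		unsigned long lo = ram_map[i].start_pfn;
		unsigned long hi = ram_map[i].end_pfn;

		/* Clamp the RAM range to the requested window. */
		if (lo < start_pfn)
			lo = start_pfn;
		if (hi > end_pfn)
			hi = end_pfn;
		if (lo > hi)
			continue;	/* no overlap with this RAM range */

		ret = fn(lo, hi - lo + 1, arg);
		if (ret)
			break;
	}
	return ret;
}

/* Callback: count invocations and report "RAM found" to stop the walk. */
int found_ram(unsigned long start_pfn, unsigned long nr_pages, void *arg)
{
	unsigned long *calls = arg;

	(void)start_pfn;
	(void)nr_pages;
	(*calls)++;
	return 1;
}

/* Returns 1 if the window must be rejected because it overlaps RAM. */
int window_overlaps_ram(unsigned long start_pfn, unsigned long nr_pages,
			unsigned long *calls)
{
	*calls = 0;
	return model_walk_ram(start_pfn, nr_pages, calls, found_ram) == 1;
}
```

Assuming 4 KiB pages, a 256 GiB BAR covers 2^26 pfns. A per-pfn loop performs that many lookups, whereas window_overlaps_ram(1UL << 28, 1UL << 26, &calls) in this model only compares the window against each RAM range once and never invokes the callback, because the window lies entirely outside RAM.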
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

InfiniBand/RDMA updates for 3.15-rc4:

 - cxgb4 hardware driver fixes

Hariprasad S (1):
      RDMA/cxgb4: Update Kconfig to include Chelsio T5 adapter

Steve Wise (3):
      RDMA/cxgb4: Fix endpoint mutex deadlocks
      RDMA/cxgb4: Force T5 connections to use TAHOE congestion control
      RDMA/cxgb4: Only allow kernel db ringing for T4 devs

 drivers/infiniband/hw/cxgb4/Kconfig       |  6 ++---
 drivers/infiniband/hw/cxgb4/cm.c          | 39 ++-
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h    |  1 +
 drivers/infiniband/hw/cxgb4/qp.c          | 13 +++
 drivers/infiniband/hw/cxgb4/t4fw_ri_api.h | 14 +++
 5 files changed, 55 insertions(+), 18 deletions(-)
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

InfiniBand/RDMA updates for 3.15-rc2:

 - Mostly cxgb4 fixes unblocked by the merge of some prerequisites via
   the net tree.
 - Drop deprecated MSI-X API use.
 - A couple other miscellaneous things.

Alexander Gordeev (2):
      IB/qib: Use pci_enable_msix_range() instead of pci_enable_msix()
      IB/mthca: Use pci_enable_msix_exact() instead of pci_enable_msix()

Eli Cohen (1):
      IB/mlx5: Add block multicast loopback support

Hariprasad Shenai (1):
      RDMA/cxgb4: Use pr_warn_ratelimited

Roland Dreier (1):
      Merge branches 'cxgb4', 'misc', 'mlx5' and 'qib' into for-next

Steve Wise (9):
      RDMA/cxgb4: Use the BAR2/WC path for kernel QPs and T5 devices
      RDMA/cxgb4: Endpoint timeout fixes
      RDMA/cxgb4: rmb() after reading valid gen bit
      RDMA/cxgb4: SQ flush fix
      RDMA/cxgb4: Max fastreg depth depends on DSGL support
      RDMA/cxgb4: Initialize reserved fields in a FW work request
      RDMA/cxgb4: Add missing debug stats
      RDMA/cxgb4: Use uninitialized_var()
      RDMA/cxgb4: Fix over-dereference when terminating

 drivers/infiniband/hw/cxgb4/cm.c         | 89
 drivers/infiniband/hw/cxgb4/cq.c         | 24 -
 drivers/infiniband/hw/cxgb4/device.c     | 41 ---
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h   |  2 +
 drivers/infiniband/hw/cxgb4/mem.c        |  6 ++-
 drivers/infiniband/hw/cxgb4/provider.c   |  2 +-
 drivers/infiniband/hw/cxgb4/qp.c         | 70 +++--
 drivers/infiniband/hw/cxgb4/resource.c   | 10 ++--
 drivers/infiniband/hw/cxgb4/t4.h         | 72 --
 drivers/infiniband/hw/mlx5/main.c        |  2 +
 drivers/infiniband/hw/mlx5/qp.c          | 12 +
 drivers/infiniband/hw/mthca/mthca_main.c |  8 +--
 drivers/infiniband/hw/qib/qib_pcie.c     | 55 ++--
 include/linux/mlx5/device.h              |  1 +
 include/linux/mlx5/qp.h                  |  1 +
 15 files changed, 270 insertions(+), 125 deletions(-)
[GIT PULL] please pull infiniband.git
Hi Linus,

Please pull from

    git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband.git tags/rdma-for-linus

Main batch of InfiniBand/RDMA changes for 3.15:

 - The biggest change is core API extensions and mlx5 low-level driver
   support for handling DIF/DIX-style protection information, and the
   addition of PI support to the iSER initiator. Target support will be
   arriving shortly through the SCSI target tree.
 - A nice simplification to the "umem" memory pinning library now that
   we have chained sg lists. Kudos to Yishai Hadas for realizing our
   code didn't have to be so crazy.
 - Another nice simplification to the sg wrappers used by qib, ipath
   and ehca to handle their mapping of memory to adapter.
 - The usual batch of fixes to bugs found by static checkers etc. from
   intrepid people like Dan Carpenter and Yann Droneaud.
 - A large batch of cxgb4, ocrdma, qib driver updates.

Alex Tabachnik (2):
      IB/iser: Introduce pi_enable, pi_guard module parameters
      IB/iser: Initialize T10-PI resources

Ariel Nahum (1):
      IB/iser: Remove struct iscsi_iser_conn

Bart Van Assche (7):
      IB/mlx4: Fix a sparse endianness warning
      scsi_transport_srp: Fix two kernel-doc warnings
      IB/srp: Add more logging
      IB/srp: Avoid duplicate connections
      IB/srp: Make writing into the "add_target" sysfs attribute interruptible
      IB/srp: Avoid that writing into "add_target" hangs due to a cable pull
      IB/srp: Fix a race condition between failing I/O and I/O completion

CQ Tang (1):
      IB/qib: Change SDMA progression mode depending on single- or multi-rail

Dan Carpenter (7):
      IB/qib: Remove duplicate check in get_a_ctxt()
      RDMA/nes: Clean up a condition
      RDMA/cxgb4: Fix underflows in c4iw_create_qp()
      RDMA/cxgb4: Fix four byte info leak in c4iw_create_cq()
      IB/qib: Cleanup qib_register_observer()
      mlx4_core: Fix some indenting in mlx4_ib_add()
      mlx4_core: Make buffer larger to avoid overflow warning

Dennis Dalessandro (3):
      IB/qib: Fix potential buffer overrun in sending diag packet routine
      IB/ipath: Fix potential buffer overrun in sending diag packet routine
      IB/qib: Fix memory leak of recv context when driver fails to initialize.

Devesh Sharma (9):
      RDMA/ocrdma: EQ full catastrophe avoidance
      RDMA/ocrdma: SQ and RQ doorbell offset clean up
      RDMA/ocrdma: Read ASIC_ID register to select asic_gen
      RDMA/ocrdma: Allow DPP QP creation
      RDMA/ocrdma: ABI versioning between ocrdma and be2net
      be2net: Add abi version between be2net and ocrdma
      RDMA/ocrdma: Update version string
      RDMA/ocrdma: Increment abi version count
      RDMA/ocrdma: Code clean-up

Fabio Estevam (1):
      IB/usnic: Remove '0x' when using %pa format

Mike Marciniszyn (7):
      IB/qib: Fix debugfs ordering issue with multiple HCAs
      IB/qib: Add percpu counter replacing qib_devdata int_counter
      IB/qib: Modify software pma counters to use percpu variables
      IB/qib: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads
      IB/ipath: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads
      IB/ehca: Remove ib_sg_dma_address() and ib_sg_dma_len() overloads
      IB/core: Remove overload in ib_sg_dma*

Moni Shoua (1):
      IB/core: Don't resolve passive side RoCE L2 address in CMA REQ handler

Or Gerlitz (3):
      IB/iser: Print QP information once connection is established
      IB/iser: Update Mellanox copyright note
      IB/iser: Bump driver version to 1.3

Prarit Bhargava (1):
      RDMA/ocrdma: Fix compiler warning

Randy Dunlap (1):
      IB/iser: Fix sector_t format warning

Roi Dayan (1):
      IB/iser: Drain the tx cq once before looping on the rx cq

Roland Dreier (2):
      RDMA/ocrdma: Fix warnings about pointer <-> integer casts
      Merge branches 'core', 'cxgb4', 'ip-roce', 'iser', 'misc', 'mlx4', 'nes', 'ocrdma', 'qib', 'sgwrapper', 'srp' and 'usnic' into for-next

Sagi Grimberg (23):
      IB/core: Introduce protected memory regions
      IB/core: Introduce signature verbs API
      mlx5: Implement create_mr and destroy_mr
      IB/mlx5: Initialize mlx5_ib_qp signature-related members
      IB/mlx5: Break up wqe handling into begin & finish routines
      IB/mlx5: Remove MTT access mode from umr flags helper function
      IB/mlx5: Keep mlx5 MRs in a radix tree under device
      IB/mlx5: Support IB_WR_REG_SIG_MR
      IB/mlx5: Collect signature error completion
      IB/mlx5: Expose support for signature MR feature
      IB/iser: Suppress completions for fast registration work requests
      IB/iser: Avoid FRWR notation, use fastreg instead
      IB/iser: Push the decision what memory key to use into fast_reg_mr routine
      IB/iser: Move fast_reg_descriptor initialization to a function
      IB/iser: Keep IB device attributes under
Re: linux rdma 3.14 merge plans
Sure, no problem.

Do you have a git tree with the latest versions of all the changes you
want for 3.15 in a branch? That would be helpful as I catch up on
applying things, so that I don't miss anything. If you don't have one,
taking a little time to set one up on github or wherever would be nice.
You can base your set of changes on Linus's latest tree.

Thanks!
  Roland

On Thu, Mar 6, 2014 at 9:07 PM, Devesh Sharma wrote:
> Hi Roland,
>
> Is it okay to send the next series of patches even if the previous series
> is not accepted yet in your tree? Of course I will cut patches on top of
> the previous series of patches.
>
> -Regards
> Devesh
>
> -Original Message-
> From: linux-rdma-ow...@vger.kernel.org
> [mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Nicholas A. Bellinger
> Sent: Thursday, March 06, 2014 12:34 AM
> To: Roland Dreier
> Cc: Or Gerlitz; Hefty Sean; linux-rdma; Martin K. Petersen; target-devel;
> Sagi Grimberg; linux-kernel
> Subject: Re: linux rdma 3.14 merge plans
>
> On Wed, 2014-03-05 at 07:18 -0800, Roland Dreier wrote:
>> On Wed, Mar 5, 2014 at 1:54 AM, Nicholas A. Bellinger
>> wrote:
>> > That all said, do you have an objection wrt taking these bits through
>> > target-pending..? Given the dependencies involved, that would seem
>> > the most logical path to take.
>>
>> Perhaps not surprisingly, I would prefer to get a chance to review a
>> major change to the core RDMA midlayer rather than having you merge it
>> through your tree. So yes I do object. Please give me a chance to
>> review and merge it. I am working on that this week.
>>
>
> Great. We'll be looking for a response by the end of the week.
>
> Otherwise if you end up not having time, we'd still like to move forward for
> v3.15 given the amount of review the series has already gotten on the list.
>
> Thank you,
>
> --nab