Re: [PATCH v1 1/3] IB/srp: Fix crash when unmapping data loop

2014-02-24 Thread Sebastian Riemer
On 24.02.2014 15:30, Sagi Grimberg wrote: When unmapping request data, it is unsafe to automatically decrement req->nfmr regardless of its value. This may happen since IO and reconnect flow may run concurrently, resulting in req->nfmr = -1 and falsely calling ib_fmr_pool_unmap. Something is still

Re: [PATCH 1/6] scsi_transport_srp: Fix two kernel-doc warnings

2014-02-20 Thread Sebastian Riemer
' description in 'srp_rport' Signed-off-by: Bart Van Assche bvanass...@acm.org Reported-by: Masanari Iida standby2...@gmail.com Cc: Sagi Grimberg sa...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com Cc: James Bottomley jbottom...@parallels.com Cc: Roland Dreier rol

IB/srp: merge fixes from MLNX_OFED

2014-02-18 Thread Sebastian Riemer
Hi Sagi, is that /mswg/git/mlnx_ofed/mlnx-ofed-2.x-kernel.git tree from MLNX_OFED public by any chance? It includes fixes that are relevant for mainline. It would be strange if I sent the patches, as somebody at Mellanox discovered and fixed the issues. I've hit a kernel panic today

Re: SRP initiator driver maintainership

2014-01-21 Thread Sebastian Riemer
On 21.01.2014 11:03, Sagi Grimberg wrote: On 1/20/2014 7:37 PM, Bart Van Assche wrote: On 01/03/14 22:16, David Dillow wrote: Today was my last day at ORNL, and my future endeavors will leave even less time to maintain the SRP initiator. My thanks especially go to Bart, for keeping the

OpenSM 3.3.16 at 100% CPU load, console off

2013-10-09 Thread Sebastian Riemer
Hi Hal, we've encountered an issue with OpenSM 3.3.16 and the config option console off. OpenSM processes are at 100% CPU load. From strace: poll([{fd=0, events=POLLIN}], 1, 1000) = 1 ([{fd=0, revents=POLLIN}]) read(0, "", 4096) = 0 poll([{fd=0, events=POLLIN}], 1, 1000) =
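A minimal sketch of the pattern visible in that strace, written in Python rather than OpenSM's C. It assumes fd 0 is at end-of-file; `/dev/null` is used as a stand-in, since the thread does not say what stdin points at. The names `spin_on_eof` and `demo` are hypothetical and not taken from OpenSM.

```python
import os
import select


def spin_on_eof(fd, iterations=3, timeout_ms=1000):
    """Poll fd for input and read from it, recording each round.

    Returns a list of (ready, nbytes) tuples, one per iteration.
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    rounds = []
    for _ in range(iterations):
        # poll() returns at once when the descriptor counts as readable,
        # so the timeout never throttles the loop.
        events = poller.poll(timeout_ms)
        ready = any(ev & select.POLLIN for _, ev in events)
        data = os.read(fd, 4096) if ready else b""
        rounds.append((ready, len(data)))
        # A loop that ignores the zero-length read comes straight back here.
    return rounds


def demo():
    # Assumption: /dev/null stands in for a stdin that is at EOF.
    fd = os.open(os.devnull, os.O_RDONLY)
    try:
        return spin_on_eof(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(demo())
```

Every round reads zero bytes, which is the read(0, "", 4096) = 0 line in the trace. A loop of this shape usually has to treat a zero-length read as EOF and stop polling that descriptor; whether OpenSM was changed that way is not stated in this thread.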

Re: OpenSM 3.3.16 at 100% CPU load, console off

2013-10-09 Thread Sebastian Riemer
On 09.10.2013 15:30, David Dillow wrote: On Wed, 2013-10-09 at 09:28 -0400, Hal Rosenstock wrote: From strace: poll([{fd=0, events=POLLIN}], 1, 1000) = 1 ([{fd=0, revents=POLLIN}]) read(0, "", 4096) = 0 poll([{fd=0, events=POLLIN}], 1, 1000) = 1 ([{fd=0,

Re: OpenSM 3.3.16 at 100% CPU load, console off

2013-10-09 Thread Sebastian Riemer
On 09.10.2013 16:00, Hal Rosenstock wrote: Do you recall the sequence to get to this ? Was console option changed to off and then OpenSM SIGHUP'd ? Something else ? Is this reproducible ? Yes, now I can reproduce it. The opensm has been initially started with console off and I activate

Re: OpenSM 3.3.16 at 100% CPU load, console off

2013-10-09 Thread Sebastian Riemer
On 09.10.2013 17:15, Hal Rosenstock wrote: What does service restart do in terms of OpenSM ? Note that the console parameter is _not_ changeable on the fly right now so if OpenSM is being SIGHUP'd by service restart then this is a current limitation (and is clearly not detected/protected

Re: [PATCH] IB/srp: Let srp_abort() return FAST_IO_FAIL if TL offline

2013-07-10 Thread Sebastian Riemer
instead of SUCCESS. Signed-off-by: Bart Van Assche bvanass...@acm.org Reported-by: Sebastian Riemer sebastian.rie...@profitbricks.com Cc: David Dillow dillo...@ornl.gov Cc: Roland Dreier rol...@purestorage.com Cc: Vu Pham v...@mellanox.com --- drivers/infiniband/ulp/srp/ib_srp.c |3

Re: [PATCH v2 04/15] IB/srp: Fail I/O fast if target offline

2013-07-02 Thread Sebastian Riemer
On 28.06.2013 14:49, Bart Van Assche wrote: If reconnecting failed we know that no command completion will be received anymore. Hence let the SCSI error handler fail such commands immediately. Acked-by: Sebastian Riemer sebastian.rie...@profitbricks.com

Re: [PATCH v2 04/15] IB/srp: Fail I/O fast if target offline

2013-07-01 Thread Sebastian Riemer
: David Dillow dillo...@ornl.gov Cc: Sebastian Riemer sebastian.rie...@profitbricks.com Cc: Vu Pham v...@mellanox.com --- drivers/infiniband/ulp/srp/ib_srp.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c

Re: [PATCH v2 04/15] IB/srp: Fail I/O fast if target offline

2013-07-01 Thread Sebastian Riemer
: David Dillow dillo...@ornl.gov Cc: Sebastian Riemer sebastian.rie...@profitbricks.com Cc: Vu Pham v...@mellanox.com --- drivers/infiniband/ulp/srp/ib_srp.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c

Re: [PATCH v2 04/15] IB/srp: Fail I/O fast if target offline

2013-07-01 Thread Sebastian Riemer
On 01.07.2013 13:33, Bart Van Assche wrote: --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -1755,6 +1755,8 @@ static int srp_abort(struct scsi_cmnd *scmnd) if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,

Re: [PATCH v2 04/15] IB/srp: Fail I/O fast if target offline

2013-07-01 Thread Sebastian Riemer
On 01.07.2013 13:38, Bart Van Assche wrote: --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -1755,6 +1755,8 @@ static int srp_abort(struct scsi_cmnd *scmnd) if (srp_send_tsk_mgmt(target, req->index, scmnd->device->lun,

Re: [PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion

2013-06-28 Thread Sebastian Riemer
On 28.06.2013 01:45, Roland Dreier wrote: On Thu, Jun 27, 2013 at 2:01 PM, David Dillow dillo...@ornl.gov wrote: On Wed, 2013-06-12 at 15:20 +0200, Bart Van Assche wrote: If the add_one callback fails during driver load no resources are allocated so there isn't a need to release any resources.

Re: [PATCH v2 02/15] IB/srp: Fix race between srp_queuecommand() and srp_claim_req()

2013-06-28 Thread Sebastian Riemer
On 28.06.2013 14:48, Bart Van Assche wrote: Avoid that srp_claim_command() can claim a command while srp_queuecommand() is still busy queueing the same command. Found this via source reading. Nice, that's much less re-acquiring of the target lock in error case in srp_queuecommand(). But if we

Re: [PATCH v2 02/15] IB/srp: Fix race between srp_queuecommand() and srp_claim_req()

2013-06-28 Thread Sebastian Riemer
On 28.06.2013 16:51, Bart Van Assche wrote: Nice, that's much less re-acquiring of the target lock in error case in srp_queuecommand(). But if we have to change that many locations for srp_put_tx_iu() anyway, wouldn't it make sense to rename it into __srp_put_tx_iu() as well? Then we can

Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus

2013-06-17 Thread Sebastian Riemer
On 14.06.2013 19:07, Vu Pham wrote: [...] For what do you need the same target with multiple pkeys on the same local SRP port? There is no need, it's just a gray area that you can choose to have multiple connections to same target using different pkeys (same as dgid) Which other SRP

Re: [PATCH 07/14] scsi_transport_srp: Add transport layer error handling

2013-06-17 Thread Sebastian Riemer
On 17.06.2013 09:29, Bart Van Assche wrote: On 06/17/13 09:14, Hannes Reinecke wrote: On 06/17/2013 09:04 AM, Bart Van Assche wrote: I agree that the value of fast_io_fail_tmo should be kept small. Although as you explained changing the SCSI device state into SDEV_BLOCK doesn't help for I/O

Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus

2013-06-14 Thread Sebastian Riemer
On 14.06.2013 01:27, Vu Pham wrote: Bart Van Assche wrote: On 06/13/13 19:50, Vu Pham wrote: Hello Bart, +/** + * srp_conn_unique() - check whether the connection to a target is unique + */ +static bool srp_conn_unique(struct srp_host *host, +

Re: [PATCH 03/14] IB/srp: Avoid that srp_reset_host() is skipped after a TL error

2013-06-13 Thread Sebastian Riemer
error handler skips the srp_reset_host() call after a transport layer error. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com

Re: [PATCH 04/14] IB/srp: Skip host settle delay

2013-06-13 Thread Sebastian Riemer
Cc: Roland Dreier rol...@purestorage.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c |1 + 1 file changed, 1 insertion(+) diff --git a/drivers/infiniband/ulp/srp

Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus

2013-06-13 Thread Sebastian Riemer
Cc: Roland Dreier rol...@kernel.org Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c | 38 +++ 1 file changed, 38 insertions(+) diff --git

Re: [PATCH] IB/srp: Maintain a single connection per I_T nexus

2013-06-13 Thread Sebastian Riemer
Bart's version also has the printing of the connection string if the double login fails. So forget about this version here. On 12.06.2013 13:51, Sebastian Riemer wrote: Hi all, as proposed by Or, let's discuss this on the mailing list. This is a fundamental change required for everything

Re: [PATCH 05/14] IB/srp: Maintain a single connection per I_T nexus

2013-06-13 Thread Sebastian Riemer
On 13.06.2013 17:07, Bart Van Assche wrote: [...] The %.*s should only copy the data provided by the user, even if it is not '\0' terminated. Stripping the trailing newline is probably possible with something like the (untested) code below (will only work if there is only one newline in the

Re: [PATCH] IB/srp: Maintain a single connection per I_T nexus

2013-06-12 Thread Sebastian Riemer
the srp-tools. Please compare with Bart's version and let's discuss this here. https://github.com/bvanassche/ib_srp-backport/commit/7d8774ff58d489858b1c046b2bf01b4e84e8dd9b Cheers, Sebastian On 12.06.2013 13:29, Sebastian Riemer wrote: The sysfs attribute 'add_target' may not be used for multiple

Re: [PATCH 01/14] IB/srp: Fix remove_one crash due to resource exhaustion

2013-06-12 Thread Sebastian Riemer
...@dev.mellanox.co.il Reviewed-by: Eli Cohen e...@mellanox.co.il Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers

Re: [PATCH 02/14] IB/srp: Fix race between srp_queuecommand() and srp_claim_req()

2013-06-12 Thread Sebastian Riemer
...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com --- drivers/infiniband/ulp/srp/ib_srp.c |4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index 368d160..9c638dd 100644 --- a/drivers

Re: How to do replication right with SRP or remote storage?

2013-06-10 Thread Sebastian Riemer
On 08.06.2013 04:31, Bruce McKenzie wrote: Hi Bart. any advice on using this fix with MD raid 1? a guide or site you know of? ive compiled ubuntu 13.04 to kernel 3.6.11 with OFED 2 from Mellanox, and it works ok, performance is a little better with SRP. Some packages dont seem to work,

Re: How to do replication right with SRP or remote storage?

2013-06-10 Thread Sebastian Riemer
On 10.06.2013 14:44, Bart Van Assche wrote: On 06/10/13 14:05, Sebastian Riemer wrote: Perhaps, I should collect all guys who require MD RAID-1 for remote storage replication in order to put some pressure on Neil. If I remember correctly one of the things Neil is trying to explain to md

Re: BUG: unable to handle kernel paging request at 0000000000070a78 IPoIB

2013-05-21 Thread Sebastian Riemer
On 17.05.2013 16:16, Jack Wang wrote: unable to handle kernel paging request Hi Jack, this should be related to the list corruption in IPoIB as list_del() sets the LIST_POISON1 and LIST_POISON2 pointers. Referencing these results in page faults according to the documentation in the code.

Re: [ANNOUNCE] SRP: ProfitBricks publishes its SRP Initiator patches

2013-05-15 Thread Sebastian Riemer
On 15.05.2013 07:12, Vasiliy Tolstov wrote: 2013/5/14 Bart Van Assche bvanass...@acm.org: The ability to close a session from the initiator side went upstream in kernel 3.8 (/sys/class/srp_remote_ports/port-h:n/delete). Regarding faster reconnects: please keep in mind that after a cable pull

Re: tune ib stack

2013-05-14 Thread Sebastian Riemer
On 14.05.2013 12:02, Vasiliy Tolstov wrote: Sorry for bumping an old thread; I solved my problems with new firmware. I have Supermicro servers that rebrand Mellanox firmware (recompile and change some bits). Now all works fine; I have 40 Gb/s QDR instead of 10 Gb/s. Thanks, sharing lesson learned

Re: [ANNOUNCE] SRP: ProfitBricks publishes its SRP Initiator patches

2013-05-14 Thread Sebastian Riemer
reconnects and ability to close a session from the initiator side under QLogic hardware, is that possible? Or do these patches only cover Mellanox cards? 2013/5/8 Sebastian Riemer sebastian.rie...@profitbricks.com: FYI: I've released version 0.6 of my SRP patches today. The automatic reconnect is included now

Re: Infiniband HA

2013-05-08 Thread Sebastian Riemer
Hi Gandalf, just build up two separate fabrics. This means that you don't interconnect both switches. Otherwise, issues on one port also affect the other port. What do you use for storage? SRP? This requires dm-multipath and fast IO failing + automatic reconnect patches from Bart or from me.

Re: [ANNOUNCE] SRP: ProfitBricks publishes its SRP Initiator patches

2013-05-08 Thread Sebastian Riemer
FYI: I've released version 0.6 of my SRP patches today. The automatic reconnect is included now. The tests for that will follow in the next version. But we already did quite intensive testing for that. Hard reboot and also soft reboot of the target are possible with that reconnect. It just

[ANNOUNCE] SRP: ProfitBricks publishes its SRP Initiator patches

2013-04-12 Thread Sebastian Riemer
a technical talk there about SRP: http://www.linuxtag.org/2013/en/program/thursday-may-23-2013.html?eventid=208 Cheers, Sebastian -- Sebastian Riemer Linux Kernel Developer - Storage ProfitBricks GmbH • Greifswalder Str. 207 • 10405 Berlin, Germany www.profitbricks.com • sebastian.rie

Re: tune ib stack

2013-04-09 Thread Sebastian Riemer
On 09.04.2013 13:51, Vasiliy Tolstov wrote: Something like this: echo 4096 > /sys/class/infiniband/mlx4_0/device/mlx4_port1_mtu After doing this all srp connections down and port is down. I need to restart openibd Sorry for that! It's much easier to set the IP MTU. Managed switches support

Re: tune ib stack

2013-04-09 Thread Sebastian Riemer
On 09.04.2013 14:49, Hal Rosenstock wrote: On 4/9/2013 7:12 AM, Vasiliy Tolstov wrote: Hello. I have some servers, with mellanox ConnectX-3 and have some questions: Why max_mtu differs with active_mtu? What does peer port say for max MTU ? How can i set active mtu? SM sets active MTU
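As background for the max_mtu versus active_mtu question: InfiniBand tools report MTU as a small enum code (1 to 5) rather than in bytes. A minimal sketch of decoding those codes follows. The helper names are hypothetical, and `highest_common_mtu` rests on the general assumption that an active MTU cannot exceed what either end of the link supports; that rule is not spelled out in this thread.

```python
# IB MTU enum codes and their payload sizes in bytes.
IB_MTU_BYTES = {1: 256, 2: 512, 3: 1024, 4: 2048, 5: 4096}


def mtu_bytes(code):
    """Translate an IB MTU enum code into bytes."""
    try:
        return IB_MTU_BYTES[code]
    except KeyError:
        raise ValueError(f"unknown IB MTU code: {code!r}") from None


def highest_common_mtu(local_cap, peer_cap):
    """Return the largest MTU code both link ends could support.

    Assumption: the codes are ordered by size, so the smaller code wins.
    """
    return min(local_cap, peer_cap)


if __name__ == "__main__":
    # Example: a local port capable of 4096 bytes facing a 2048-byte peer.
    print(mtu_bytes(highest_common_mtu(5, 4)))
```

This only illustrates why a port's reported maximum and its active value can differ; the value actually in use is the one the SM programs.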

Re: tune ib stack

2013-04-09 Thread Sebastian Riemer
On 09.04.2013 15:34, Hal Rosenstock wrote: On 4/9/2013 9:16 AM, Sebastian Riemer wrote: On 09.04.2013 14:49, Hal Rosenstock wrote: On 4/9/2013 7:12 AM, Vasiliy Tolstov wrote: Hello. I have some servers, with mellanox ConnectX-3 and have some questions: Why max_mtu differs with active_mtu

Re: tune ib stack

2013-04-09 Thread Sebastian Riemer
On 09.04.2013 16:23, Hal Rosenstock wrote: So these values are exactly the same as in ibv_devinfo and can be set in /sys/class/infiniband/mlx4_0/device/mlx4_port1_mtu. I've found the PortInfo with the command smpquery portinfo -C mlx4_0 3 1 where I'm using the first HCA to contact the SM. I

[RFC ib_srp-backport] ib_srp: bind fast IO failing to QP timeout

2013-03-19 Thread Sebastian Riemer
, Sebastian Btw.: Before, I've hacked MD RAID-1 for high-performance replication as DRBD is crap for our purposes. But that's worthless without a reliably working transport. From c101d00fe529d845192dd6d5930a1b9c16c99b81 Mon Sep 17 00:00:00 2001 From: Sebastian Riemer sebastian.rie...@profitbricks.com

Re: [RFC ib_srp-backport] ib_srp: bind fast IO failing to QP timeout

2013-03-19 Thread Sebastian Riemer
On 19.03.2013 12:22, Or Gerlitz wrote: On 19/03/2013 12:16, Sebastian Riemer wrote: Hi Bart, now I've got my priority on SRP again. Hi Sebastian, Are these patches targeted to upstream or backports to some OS/kernel? if the former, can you please send them inline so we can have proper

Re: [RFC ib_srp-backport] ib_srp: bind fast IO failing to QP timeout

2013-03-19 Thread Sebastian Riemer
On 19.03.2013 12:45, Bart Van Assche wrote: On 03/19/13 11:16, Sebastian Riemer wrote: What are your thought regarding this? Attached patches: ib_srp: register srp_fail_rport_io as terminate_rport_io ib_srp: be quiet when failing SCSI commands scsi_transport_srp: disable

Re: [PATCH/RFC] IPoIB: Free ipoib neigh on path record failure so path rec queries are retried

2013-02-27 Thread Sebastian Riemer
On 26.02.2013 17:55, Roland Dreier wrote: [...] In fact I bet this is why the bug has been there as long as it has been: almost no one is using IPv6 on IPoIB seriously, and IPv4 should work OK as you point out. Thanks a lot, Unfortunately, we are using IPoIB with IPv6 in production for the

Re: [LSF/MM TOPIC] Reducing the SRP initiator failover time

2013-02-08 Thread Sebastian Riemer
On 08.02.2013 10:24, Sagi Grimberg wrote: On 2/8/2013 12:42 AM, Vu Pham wrote: Hello Bart, Thank you for taking the initiative. Mellanox think that this should be discussed. We'd be happy to attend. We also would like to discuss: * How and how fast does SRP detect a path failure besides RC

Re: Virtual ibnetdiscover command fails

2013-02-06 Thread Sebastian Riemer
On 06.02.2013 10:22, Or Gerlitz wrote: On 06/02/2013 11:17, Mathis GAVILLON wrote: Ok. But what is it possible to do with Infiniband VFs if QP0 is not available ? EVERYTHING, e.g run IPoIB, iSER, RDS, MPI, etc, etc - except for what requires QP0, such as running SM or issuing SMPs for

Re: Virtual ibnetdiscover command fails

2013-02-06 Thread Sebastian Riemer
On 06.02.2013 11:20, Or Gerlitz wrote: On 06/02/2013 12:04, Mathis GAVILLON wrote: Just a last question : is that possible VFs lid to be different from PF one ? NO, we've implemented a shared port model, so all functions on the same IB port use the same lid, each function has its own

Re: [LSF/MM TOPIC] Reducing the SRP initiator failover time

2013-02-04 Thread Sebastian Riemer
Hi Bart, thanks for approaching this! We're not the best mainline developers so I guess we won't be there. But we have the big SRP setups and our sysadmins really don't like reconnecting SRP hosts manually and putting their devices complicated to the related dm-multipath devices again. Think

Re: [ANNOUNCE] OFED-3.5-rc2 is available

2012-10-04 Thread Sebastian Riemer
Hi Vladimir, why do you put OFED together for a kernel nobody uses? Perhaps SLES and Red Hat do it like this but nobody else. Have a look at http://en.wikipedia.org/wiki/Linux_kernel - 3.0, 3.2 and 3.4 are the long-term stable releases. This approach is worse than the approach before IMHO.

Re: [PATCH 11/20] ib_srp: Make srp_disconnect_target() wait for IB completions

2012-08-23 Thread Sebastian Riemer
Hi Bart, we've triggered the WARN_ON() in srp_wait_last_send_wqe() by connecting to a disabled SCST SRP target. I would remove that one. Cheers, Sebastian On 09.08.2012 17:53, Bart Van Assche wrote: Modify srp_disconnect_target() such that it waits until it is sure that no new IB

Basics of congestion control?

2012-07-31 Thread Sebastian Riemer
. ;-) Cheers, Sebastian -- Sebastian Riemer Linux Kernel Developer ProfitBricks GmbH • Greifswalder Str. 207 • 10405 Berlin, Germany www.profitbricks.com • sebastian.rie...@profitbricks.com Tel.: +49 - 30 - 60 98 56 991 - 915 Sitz der Gesellschaft: Berlin Registergericht: Amtsgericht

Re: Basics of congestion control?

2012-07-31 Thread Sebastian Riemer
On 31.07.2012 13:08, Alex Netes wrote: Congestion control isn't a credit based mechanism. While InfiniBand flow control is defined between two ports of the same link, congestion control is working across the fabric between a congestion point (a switch) and a reaction point (source node).

Re: mlx4_ib_create_qp failed - OOM with call trace

2012-07-20 Thread Sebastian Riemer
On 19.07.2012 22:31, Roland Dreier wrote: I have to think about the best way to fix this. We could just convert to vmalloc() here but I'm not thrilled about consuming vmalloc() space (on modern 64-bit architectures it's a non-issue but it's going to cause issues for people on smaller

mlx4_ib_create_qp failed - OOM with call trace

2012-07-18 Thread Sebastian Riemer
Cheers, Sebastian -- Sebastian Riemer Linux Kernel Developer ProfitBricks GmbH • Greifswalder Str. 207 • 10405 Berlin, Germany www.profitbricks.com • sebastian.rie...@profitbricks.com Sitz der Gesellschaft: Berlin Registergericht: Amtsgericht Charlottenburg, HRB 125506 B Geschäftsführer: Andreas

Re: OFED 1.5.4.1 on Ubuntu 10.04 with Mellanox cards?

2012-06-25 Thread Sebastian Riemer
Hi Chet, On 22/06/12 21:02, Chet Murthy wrote: Sebastian, Thank you for taking the time to explain these things! It's a little confusing Here a simple list of matching code: OFED-1.5.4 --- kernel 3.2.x OFED-1.5.4.1 --- kernel 3.3.x (1) Is there a more-exhaustive list of the

Re: OFED 1.5.4.1 on Ubuntu 10.04 with Mellanox cards?

2012-06-22 Thread Sebastian Riemer
Hi Chet, the trick is to check out the latest pkg-ofed source from debian SVN (svn://svn.debian.org/svn/pkg-ofed/) and to update the upstream source by merging the stuff by extracting the source RPMs or even better by importing the source directly from the git repos of the OFED user space. In the

Re: IB/iSER problems with Linux 3.0

2012-01-19 Thread Sebastian Riemer
On 17/01/12 15:56, Or Gerlitz wrote: could you try and patch your 3.0.15 kernel with commit 52439540ea30396982b69662dd21aede6b336288 IB/iser: DMA unmap TX bufs used for iSCSI/iSER headers from upstream, this could help here. Hi Or, unfortunately, just cherry-picking that commit didn't do the

Solved: IB/iSER problems with Linux 3.0

2012-01-19 Thread Sebastian Riemer
On 19/01/12 13:18, Or Gerlitz wrote: [...] Or Gerlitz (4): IB/iser: Fix wrong mask when sizeof (dma_addr_t) > sizeof (unsigned long) IB/iser: Support iSCSI PDU padding IB/iser: Use separate buffers for the login request/response IB/iser: DMA unmap TX bufs used for

Re: IB/iSER problems with Linux 3.0

2012-01-17 Thread Sebastian Riemer
On 16/01/12 22:16, Or Gerlitz wrote: Sebastian, I asked for the **iser** (ib_iser) and not mlx4_core debug_level=2 Yes, I did! I've enabled that additionally. And I've checked these settings in /sys/module/*/parameters. They were set. The libiscsi from OFED had only the option debug_libiscsi

Re: IB/iSER major problems with Linux 3.0 and Solaris targets

2012-01-16 Thread Sebastian Riemer
On 12/01/12 17:14, Or Gerlitz wrote: you didn't send the kernel logs from the failure after opening the iser (debug_level=2) and libiscsi (debug_libiscsi_session=1 debug_libiscsi_conn=1) debug prints OK, I've also set mlx4_core debug_level=2 and have verified in /sys/module that the

Re: IB/iSER major problems with Linux 3.0 and Solaris targets

2012-01-12 Thread Sebastian Riemer
On 12/01/12 10:29, Or Gerlitz wrote: If you have build the kernel IB user space support (uverbs) and the IB libs, do ibv_devinfo if not, just ossi cat /sys/class/infiniband/mlx4_0/* and send the output. To be clear, iser does work for you on the productive servers but not on this server?

Re: IB/iSER major problems with Linux 3.0 and Solaris targets

2012-01-12 Thread Sebastian Riemer
On 12/01/12 11:16, Sebastian Riemer wrote: On 12/01/12 10:29, Or Gerlitz wrote: If you have build the kernel IB user space support (uverbs) and the IB libs, do ibv_devinfo if not, just ossi cat /sys/class/infiniband/mlx4_0/* and send the output. To be clear, iser does work for you

IB/iSER major problems with Linux 3.0 and Solaris targets

2012-01-11 Thread Sebastian Riemer
88402391f898 status 4 vend_err 57 Or, could you please investigate/explain? It is a pain that we need both: working iSER and IPoIB traffic with good performance. Cheers, Sebastian On 19/12/11 10:14, Sebastian Riemer wrote: Hi list, I've already sent this to the open-iscsi mailing list

Re: IB/iSER with Linux 3.0 and Debian: Lesson learned

2011-12-21 Thread Sebastian Riemer
you wrote long emails, I'm asking for one concrete example for that enum crunching of adding entries not at the end, can you, please? I've meant e.g. the iscsi tasks in libiscsi.h between 2.6.30 and 2.6.32. But I've meant this for OFED and not the mainline kernel. 2.6.30: enum {

Re: IB/iSER with Linux 3.0 and Debian: Lesson learned

2011-12-21 Thread Sebastian Riemer
2011/12/21 Or Gerlitz ogerl...@mellanox.com: I tested the upstream kernel iser against the upstream iscsi tools from git://github.com/mikechristie/open-iscsi (commit 4323e342d2c9fb8ed7233ce855001c189ec55b23), it works To bring this to an end: I believe you. Most likely I had that much

Re: IB/iSER with Linux 3.0 and Debian: Lesson learned

2011-12-20 Thread Sebastian Riemer
2011/12/20 Or Gerlitz ogerl...@mellanox.com: Beep, I'd like to better/understand the problem before looking on your struggle for solution... I understand that your Debian system runs kernel 3.0 - however, you didn't say what version of the iscsi initiator utils is provided with that distro

Re: IB/iSER with Linux 3.0 and Debian: Lesson learned

2011-12-20 Thread Sebastian Riemer
2011/12/20 Or Gerlitz ogerl...@mellanox.com: Beep(2), so your system has distro which is based on kernel 2.6.32 and iscsi initiator tools version 2.0.871 and per your needs, you've booted it with kernel 3.0 . At this point should you have stop and make sure that this combo works, iscsi wise

Re: IB/iSER with Linux 3.0 and Debian: Lesson learned

2011-12-20 Thread Sebastian Riemer
Would it help, if we provide our patches for open-iscsi and IB/iSER 2.6.32 to bring that into mainline OFED? As Or notes, OFED is providing the kernel modules more than the iscsi code drop.  Would be better for all (cough cough) to push changes back to the iscsi initiator maintainer (Mike

Re: IB/iSER with Linux 3.0 and Debian: Lesson learned

2011-12-20 Thread Sebastian Riemer
2011/12/20 Or Gerlitz or.gerl...@gmail.com: horses, please, stay at home, or at least run a little bit slower, just for you - from 2 minutes ago - iser works well with 3.2.0-rc5 (its say -dirty b/c its a development system and the kernel has some patches, but not iser ones) and

IB/iSER with Linux 3.0 and Debian: Lesson learned

2011-12-19 Thread Sebastian Riemer
be found. After fixing that, it worked for me. Cheers, Sebastian -- Sebastian Riemer Linux Kernel Developer ProfitBricks GmbH Greifswalder Str. 207 10405 Berlin, Germany Tel.:  +49 - 30 - 51 64 09 20 Fax:   +49 - 30 - 51 64 09 22 Email: sebastian.rie...@profitbricks.com Web:   http