On Wednesday, February 6, 2019 at 10:41:00 PM UTC+5:30, The Lee-Man wrote:
>
> On Wednesday, January 23, 2019 at 1:48:19 PM UTC-8, [email protected] 
> wrote:
>>
>> We have a LIO target on RHEL 7.5 with the lun created using fileio 
>> through targetcli. We exported it
>> to RHEL initiator on the same box (Tried with other box as well). 
>> On the lun, when we do mkfs for ext3/ext4, it fails with following 
>> message and can not be mounted.
>>
>>
>> -------------------------------------------------------------------------------------------------
>> [root@linux_machine /]# mkfs -t ext4 /dev/sdh
>> mke2fs 1.42.9 (28-Dec-2013)
>> /dev/sdh is entire device, not just one partition!
>> Proceed anyway? (y,n) y
>> Filesystem label=
>> OS type: Linux
>> Block size=4096 (log=2)
>> Fragment size=4096 (log=2)
>> Stride=0 blocks, Stripe width=1024 blocks
>> 2621440 inodes, 10485760 blocks
>> 524288 blocks (5.00%) reserved for the super user
>> First data block=0
>> Maximum filesystem blocks=2157969408
>> 320 block groups
>> 32768 blocks per group, 32768 fragments per group
>> 8192 inodes per group
>> Superblock backups stored on blocks:
>>         32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 
>> 2654208,
>>         4096000, 7962624
>>
>> Allocating group tables: done
>> Writing inode tables: done
>> Creating journal (32768 blocks): done
>> Writing superblocks and filesystem accounting information:
>> Warning, had trouble writing out superblocks.
>>
>> -----------------------------------------------------------------------------------------------------
>> While the above task fails, /var/log/messages on the initiator shows the 
>> following errors.
>>
>>
>> -------------------------------------------------------------------------------------------------------------
>> kernel: connection1:0: detected conn error (1020)
>> Kernel reported iSCSI connection 1:0 error (1020 - 
>> ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
>> connection1:0 is operational after recovery (1 attempts)
>> connection1:0: detected conn error (1020)
>> Kernel reported iSCSI connection 1:0 error (1020 - 
>> ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
>> connection1:0 is operational after recovery (1 attempts)
>> connection1:0: detected conn error (1020)
>> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED 
>> driverbyte=DRIVER_OK
>> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 54 00 10 00 10 00 00
>> kernel: blk_update_request: I/O error, dev sdf, sector 5505040
>> kernel: Buffer I/O error on dev sdf, logical block 688130, lost async 
>> page write
>> kernel: Buffer I/O error on dev sdf, logical block 688131, lost async 
>> page write
>> kernel: Buffer I/O error on dev sdf, logical block 688132, lost async 
>> page write
>> kernel: Buffer I/O error on dev sdf, logical block 688133, lost async 
>> page write
>> kernel: Buffer I/O error on dev sdf, logical block 688134, lost async 
>> page write
>> kernel: Buffer I/O error on dev sdf, logical block 688135, lost async 
>> page write
>> kernel: Buffer I/O error on dev sdf, logical block 688136, lost async 
>> page write
>> kernel: Buffer I/O error on dev sdf, logical block 688137, lost async 
>> page write
>> kernel: Buffer I/O error on dev sdf, logical block 688138, lost async 
>> page write
>> kernel: Buffer I/O error on dev sdf, logical block 688139, lost async 
>> page write
>> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED 
>> driverbyte=DRIVER_OK
>> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 50 00 10 00 10 00 00
>> kernel: blk_update_request: I/O error, dev sdf, sector 5242896
>> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED 
>> driverbyte=DRIVER_OK
>> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 4c 00 10 00 10 00 00
>> kernel: blk_update_request: I/O error, dev sdf, sector 4980752
>> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED 
>> driverbyte=DRIVER_OK
>> sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 48 00 10 00 10 00 00
>> blk_update_request: I/O error, dev sdf, sector 4718608
>> sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED 
>> driverbyte=DRIVER_OK
>> sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 44 00 10 00 10 00 00
>> blk_update_request: I/O error, dev sdf, sector 4456464
>> sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED 
>> driverbyte=DRIVER_OK
>> sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 40 00 10 00 10 00 00
>> blk_update_request: I/O error, dev sdf, sector 4194320
>> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED 
>> driverbyte=DRIVER_OK
>> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 3c 00 10 00 10 00 00
>> kernel: blk_update_request: I/O error, dev sdf, sector 3932176
>> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED 
>> driverbyte=DRIVER_OK
>> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 38 00 10 00 10 00 00
>> kernel: blk_update_request: I/O error, dev sdf, sector 3670032
>> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED 
>> driverbyte=DRIVER_OK
>> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 34 00 10 00 10 00 00
>> kernel: blk_update_request: I/O error, dev sdf, sector 3407888
>> kernel: sd 7:0:0:1: [sdf] FAILED Result: hostbyte=DID_TRANSPORT_DISRUPTED 
>> driverbyte=DRIVER_OK
>> kernel: sd 7:0:0:1: [sdf] CDB: Write(10) 2a 00 00 30 00 10 00 10 00 00
>> kernel: blk_update_request: I/O error, dev sdf, sector 3145744
>> iscsid: Kernel reported iSCSI connection 1:0 error (1020 - 
>> ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
>> iscsid: connection1:0 is operational after recovery (1 attempts)
>>
>> ------------------------------------------------------------------------------------------------------------------------------------
>>
>> Upon further debugging, we found that the target's TCP receive window 
>> becomes full under the write load that mkfs generates on the initiator.
>> We then tried dd on the initiator with oflag=direct to perform 
>> synchronous writes, and this time we did not face any issue.
>> If we run dd without oflag=direct, we see the same error messages in 
>> /var/log/messages as with mkfs.
>>
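Interleaving a repro sketch of the comparison here. On the real setup the 
writes went to the iSCSI LUN (e.g. /dev/sdh); the file name and sizes below 
are illustrative, and a regular file stands in for the LUN so the sketch is 
safe to run outside the test setup.

```shell
IMG=./fileio_test.img   # stand-in for the iSCSI LUN; name is illustrative

# Buffered writes go through the page cache and are flushed asynchronously
# in large bursts -- the pattern that filled the target's TCP window:
dd if=/dev/zero of="$IMG" bs=1M count=16 conv=notrunc 2>/dev/null

# oflag=direct bypasses the page cache (O_DIRECT), pacing the writes --
# this variant completed without connection errors in our runs:
dd if=/dev/zero of="$IMG" bs=1M count=16 oflag=direct conv=notrunc 2>/dev/null

ls -l "$IMG"
```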
>
> So that shows this has nothing to do with ext3/ext4, but instead has to do 
> with your network.
>

As we face this issue even with localhost as both target and initiator, 
could you suggest some tunables for the TCP window and for network 
congestion between the initiator and the target?
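For reference, the window limits we have been looking at so far are the 
stock Linux sysctls below; listing them only as a starting point for the 
discussion, not as a confirmed fix:

```shell
# TCP autotuning limits (min/default/max, in bytes) and the hard socket
# buffer caps. These are standard Linux sysctls, shown for inspection only.
cat /proc/sys/net/ipv4/tcp_rmem    # receive-buffer autotuning limits
cat /proc/sys/net/ipv4/tcp_wmem    # send-buffer autotuning limits
cat /proc/sys/net/core/rmem_max    # upper bound for SO_RCVBUF
cat /proc/sys/net/core/wmem_max    # upper bound for SO_SNDBUF

# To experiment with larger maxima (illustrative values, requires root):
#   sysctl -w net.ipv4.tcp_rmem='4096 87380 16777216'
#   sysctl -w net.core.rmem_max=16777216
```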
 

>
>>
>> Following are the things we tried:
>> 1) We tried increasing the TCP receive window on the target beyond its 
>> existing size, but it did not help.
>> 2) We tried increasing MaxRecvDataSegmentLength, MaxBurstLength, and 
>> FirstBurstLength on the target side. This helped in the sense that it 
>> delayed the occurrence of the errors, but they were still seen.
>> 3) We also changed node.session.timeo.replacement_timeout,
>> node.conn[0].timeo.noop_out_interval, 
>> node.conn[0].timeo.noop_out_timeout, and 
>> node.session.err_timeo.abort_timeout on the initiator side.
>> They did not solve the problem.
>>
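For reference, the timeouts from item 3 live in /etc/iscsi/iscsid.conf on 
the initiator (they can also be set per node with iscsiadm). The values 
below are the usual RHEL defaults, not the ones from our runs:

```ini
# /etc/iscsi/iscsid.conf (initiator side) -- stock default values shown;
# our experiments varied these, so treat the numbers as placeholders.
node.session.timeo.replacement_timeout = 120
node.conn[0].timeo.noop_out_interval = 5
node.conn[0].timeo.noop_out_timeout = 5
node.session.err_timeo.abort_timeout = 15
```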
>> Following are our questions:
>> 1) What could be the cause of this issue? Why is the target daemon so 
>> slow?
>> 2) What other tunables could we try to solve the problem?
>>
>> Environment Details:
>> OS: Red Hat Enterprise Linux Server release 7.5 (Maipo)
>> Kernel Version: 3.10.0-862.el7.x86_64
>>
>> PFA image:
>> In the Wireshark image, 10.182.110.221 is the target and 10.182.111.167 
>> is the
>> initiator
>>
>> [image: tcp_reset (002).jpg]
>>
>
> And you say you get the same TCP congestion when the initiator and target 
> are on the same system? If so, can you try using 127.0.0.1?
>
> Your distro packages look quite old. Are they all up to date with current 
> patches/fixes? What version of targetcli-fb do you have?
>
> I'm afraid I know little about networking issues, but if the issue 
> persists using loopback that would seem to eliminate any issues with your 
> switches.
>

We checked with 127.0.0.1 as well and the same issue persists, which rules 
out the network too. What do you mean by distro packages? Could you please 
elaborate? We are using 'Red Hat Enterprise Linux Server release 7.5', and 
the targetcli rpm is 'targetcli-2.1.fb46-7.el7.noarch'.
-- 
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/open-iscsi.
For more options, visit https://groups.google.com/d/optout.
