Re: [PATCH][V2][target-devel-next] tcmu: make array tcmu_attrib_attrs static const

2017-06-25 Thread Nicholas A. Bellinger
On Tue, 2017-06-13 at 14:29 +0100, Colin King wrote:
> From: Colin Ian King 
> 
> The array tcmu_attrib_attrs does not need to be in global scope, so make
> it static.
> 
> Cleans up sparse warning:
> "symbol 'tcmu_attrib_attrs' was not declared. Should it be static?"
> 
> Signed-off-by: Colin Ian King 
> ---
>  drivers/target/target_core_user.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/target/target_core_user.c 
> b/drivers/target/target_core_user.c
> index afc1fd6bacaf..04fb3f720895 100644
> --- a/drivers/target/target_core_user.c
> +++ b/drivers/target/target_core_user.c
> @@ -1672,7 +1672,7 @@ static ssize_t tcmu_emulate_write_cache_store(struct 
> config_item *item,
>  }
>  CONFIGFS_ATTR(tcmu_, emulate_write_cache);
>  
> -struct configfs_attribute *tcmu_attrib_attrs[] = {
> +static struct configfs_attribute *tcmu_attrib_attrs[] = {
>   _attr_cmd_time_out,
>   _attr_dev_path,
>   _attr_dev_size,

Applied.

Thanks Colin.



Re: [PATCH] tcmu: Fix module removal due to stuck unmap_thread thread again

2017-06-25 Thread Nicholas A. Bellinger
On Thu, 2017-06-15 at 15:05 +0800, lixi...@cmss.chinamobile.com wrote:
> From: Xiubo Li 
> 
> Because the unmap code just after the schdule() returned may take
> a long time and if the kthread_stop() is fired just when in this
> routine, the module removal maybe stuck too.
> 
> Signed-off-by: Xiubo Li 
> ---
>  drivers/target/target_core_user.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/target/target_core_user.c 
> b/drivers/target/target_core_user.c
> index beb5f09..203bff1 100644
> --- a/drivers/target/target_core_user.c
> +++ b/drivers/target/target_core_user.c
> @@ -1573,7 +1573,7 @@ static int unmap_thread_fn(void *data)
>   struct page *page;
>   int i;
>  
> - while (1) {
> + while (!kthread_should_stop()) {
>   DEFINE_WAIT(__wait);
>  
>   prepare_to_wait(_wait, &__wait, TASK_INTERRUPTIBLE);

Applied to target-pending/for-next.

Thanks Xiubo + MNC.



[Bug 196191] New: NULL pointer dereference at beiscsi_process_cq+0x6f6

2017-06-25 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=196191

Bug ID: 196191
   Summary: NULL pointer dereference at beiscsi_process_cq+0x6f6
   Product: SCSI Drivers
   Version: 2.5
Kernel Version: 4.9.30-2~bpo8+1 (Debian)
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: Other
  Assignee: scsi_drivers-ot...@kernel-bugs.osdl.org
  Reporter: wf...@niif.hu
Regression: No

An iSCSI-booted server occasionally halts with following console log during
bootup:

[  OK  ] Reached target Network is Online.
 Starting LSB: Starts and stops the iSCSI initiator s...ault
targets...
[  214.044354] iscsi: registered transport (tcp)
[  214.316086] iscsi: registered transport (iser)
[  217.293303] hb: link status definitely up for interface enp5s0f5, 200
Mbps full duplex
[  217.709239] cloud: link status definitely up for interface enp5s0f7,
1 Mbps full duplex
[  218.438152] BUG: unable to handle kernel NULL pointer dereference at
  (null)
[  218.463932] IP: [] beiscsi_process_cq+0x6f6/0xa00
[be2iscsi]
[  218.487703] PGD 0 [  218.493722] 
[  218.498607] Oops:  [#1] SMP
[  218.508917] Modules linked in: ib_iser rdma_cm iw_cm ib_cm ib_core
iscsi_tcp libiscsi_tcp bonding ext4 crc16 jbd2 crc32c_generic fscrypto ecb
mbcache intel_rapl sb_edac edac_core x86_pkg_temp_thermal intel_powerclamp
coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel intel_cstate joydev iTCO_wdt iTCO_vendor_support evdev
intel_uncore mgag200 ttm drm_kms_helper pcspkr mei_me intel_rapl_perf drm
lpc_ich i2c_algo_bit mei mfd_core shpchp wmi ac button acpi_pad
acpi_power_meter ip6t_REJECT nf_reject_ipv6 nf_log_ipv6 nf_conntrack_ipv6
nf_defrag_ipv6 ip6table_filter ipmi_watchdog ip6_tables ipt_REJECT
nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp nf_log_ipv4
nf_log_common xt_LOG xt_limit xt_recent xt_multiport xt_conntrack ipmi_si
nf_conntrack ipmi_poweroff iptable_filter ipmi_devintf ip_tables
ipmi_msghandler x_tables configfs autofs4 xfs libcrc32c dm_service_time
dm_multipath dm_mod sg sd_mod hid_generic usbhid hid be2iscsi crc32c_intel
libiscsi ehci_pci aesni_intel aes_x86_64 scsi_transport_iscsi glue_helper
ehci_hcd iscsi_boot_sysfs lrw gf128mul ablk_helper i2c_i801 cryptd usbcore
i2c_smbus usb_common scsi_mod be2net
[  218.847642] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
4.9.0-0.bpo.3-amd64 #1 Debian 4.9.30-2~bpo8+1
[  218.877399] Hardware name: FUJITSU PRIMERGY BX924 S4/D3143-B1, BIOS
V4.6.5.4 R1.13.0 for D3143-B1x 07/16/2015
[  218.910016] task: b640e540 task.stack: b640
[  218.929476] RIP: 0010:[]  []
beiscsi_process_cq+0x6f6/0xa00 [be2iscsi]
[  218.961249] RSP: :95bfbfa03e48  EFLAGS: 00010297
[  218.978707] RAX: 95bfb5100e00 RBX: 0001 RCX:
0005
[  219.002171] RDX: 0001 RSI:  RDI:
95bfb532d504
[  219.025636] RBP: 0001 R08: 03fb0001 R09:
00011bcd
[  219.049097] R10: 594e5a87 R11: 95bfb551fd18 R12:
95bfb2822208
[  219.072562] R13: 0005 R14: 95bfb2818810 R15:
95bfb2557d40
[  219.096026] FS:  () GS:95bfbfa0()
knlGS:
[  219.122634] CS:  0010 DS:  ES:  CR0: 80050033
[  219.141521] CR2:  CR3: 0017c5a07000 CR4:
001406f0
[  219.164986] Stack:
[  219.171579]  80202f00 000ab86fc200 95bfb2818d70
95bfb87000e2
[  219.195943]  95bfb532d504 95bfb5100e00 95bfb551fd18
01ff95bf00011bcd
[  219.220310]  00fe  95bf8100
95bfb86fa018
[  219.244675] Call Trace:
[  219.252699]   [  219.259008]  [] ?
be_iopoll+0xc8/0x170 [be2iscsi]
[  219.279627]  [] ? irq_poll_softirq+0xa4/0xd0
[  219.298813]  [] ? __do_softirq+0x106/0x292
[  219.317430]  [] ? irq_exit+0x98/0xa0
[  219.334317]  [] ? do_IRQ+0x4f/0xd0
[  219.334317]  [] ? do_IRQ+0x4f/0xd0
[  219.350635]  [] ? common_interrupt+0x82/0x82
[  219.369810]   [  219.376123]  [] ?
cpuidle_enter_state+0x113/0x260
[  219.396741]  [] ? cpu_startup_entry+0x17e/0x260
[  219.416785]  [] ? start_kernel+0x46d/0x48d
[  219.435392]  [] ? early_idt_handler_array+0x120/0x120
[  219.457141]  [] ? x86_64_start_kernel+0x152/0x176
[  219.477745] Code: 00 0f b6 44 24 50 c6 06 22 88 56 01 88 46 02 44 89 c8
0f c8 89 46 1c 0f b6 44 24 40 41 8d 44 01 ff 0f c8 89 46 20 eb ac 48 8b 30 
06 3f 0f 85 bb 00 00 00 49 8b 3b e9 74 ff ff ff 41 f6 86 10 
[  219.540521] RIP  [] beiscsi_process_cq+0x6f6/0xa00
[be2iscsi]
[  219.564571]  RSP 
[  219.576023] CR2: 
[  219.586917] ---[ end trace f9ec7ddab1cda260 ]---
[  

Re: [PATCH v2 0/5] g_NCR5380: PDMA fixes and cleanup

2017-06-25 Thread Ondrej Zary
On Saturday 24 June 2017 08:37:36 Finn Thain wrote:
> Ondrej, would you please test this new series?
>
> Changed since v1:
> - PDMA transfer residual is calculated earlier.
> - End of DMA flag check is now polled (if there is any residual).
>
>
> Finn Thain (2):
>   g_NCR5380: Limit sg_tablesize to avoid PDMA read overruns on DTC436
>   g_NCR5380: Cleanup comments and whitespace
>
> Ondrej Zary (3):
>   g_NCR5380: Fix PDMA transfer size
>   g_NCR5380: End PDMA transfer correctly on target disconnection
>   g_NCR5380: Re-work PDMA loops
>
>  drivers/scsi/g_NCR5380.c | 231
> ++- 1 file changed, 107
> insertions(+), 124 deletions(-)

It mostly works, but there are some problems:

It's not reliable - we continue the data transfer after poll_politely2
returns zero but we don't know if it returned because of host buffer
being ready of because of an IRQ. So if a device disconnects during write,
we continue to fill the buffer and only then find out that wait for
53c80 registers timed out. Then PDMA gets disabled:
[ 3458.774251] sd 2:0:1:0: [sdb] tag#0 53c80 registers not accessible, device 
will be reset
[ 3458.774251] sd 2:0:1:0: [sdb] tag#0 switching to slow handshake

We can just reset and continue with a new PDMA transfer. Found no problems
with reads. But when this happens during a write, we might have lost some
data buffers that we need to transfer again. The chip's PDMA block counter
does not seem to be very helpful here - testing shows that either one buffer
is missing in the file or is duplicated.

That's why my code had separate host buffer ready and IRQ checks. Host
buffer first - if it's ready, transfer the data. If not, check for IRQ - 
if it was an error, rollback 2 buffers (the same if the host buffer is not
ready in time).



There's also a performance regression on DTC436 - the sg_tablesize limit
affects performance badly.
Before:
# hdparm -t --direct /dev/sdb

/dev/sdb:
 Timing O_DIRECT disk reads:   4 MB in  3.21 seconds =   1.25 MB/sec

Now:
# hdparm -t --direct /dev/sdb

/dev/sdb:
 Timing O_DIRECT disk reads:   4 MB in  4.89 seconds = 837.69 kB/sec

We should limit the transfer size instead:
--- a/drivers/scsi/g_NCR5380.c
+++ b/drivers/scsi/g_NCR5380.c
@@ -45,7 +45,8 @@
int c400_blk_cnt; \
int c400_host_buf; \
int io_width; \
-   int pdma_residual
+   int pdma_residual; \
+   int board;

 #define NCR5380_dma_xfer_lengeneric_NCR5380_dma_xfer_len
 #define NCR5380_dma_recv_setup  generic_NCR5380_pread
@@ -247,7 +248,6 @@ static int generic_NCR5380_init_one(struct 
scsi_host_template *tpnt,
case BOARD_DTC3181E:
ports = dtc_3181e_ports;
magic = ncr_53c400a_magic;
-   tpnt->sg_tablesize = 1;
break;
}

@@ -317,6 +317,7 @@ static int generic_NCR5380_init_one(struct 
scsi_host_template *tpnt,
}
hostdata = shost_priv(instance);

+   hostdata->board = board;
hostdata->io = iomem;
hostdata->region_size = region_size;

@@ -625,6 +626,9 @@ static int generic_NCR5380_dma_xfer_len(struct 
NCR5380_hostdata *hostdata,
/* 53C400 datasheet: non-modulo-128-byte transfers should use PIO */
if (transfersize % 128)
transfersize = 0;
+   /* Limit transfers to 512B to prevent random write corruption on DTC */
+   if (hostdata->board == BOARD_DTC3181E && transfersize > 512)
+   transfersize = 512;

return min(transfersize, DMA_MAX_SIZE);
 }


No data corruption and no performance regression:
# hdparm -t --direct /dev/sdb

/dev/sdb:
 Timing O_DIRECT disk reads:   4 MB in  3.25 seconds =   1.23 MB/sec

As the data corruption affects only writes, we could keep transfersize
unlimited for reads:
+   /* Limit write transfers to 512B to prevent random corruption on DTC */
+   if (hostdata->board == BOARD_DTC3181E &&
+   cmd->sc_data_direction == DMA_TO_DEVICE && transfersize > 512)
+   transfersize = 512;

So we can get some performance gain at least for reads:
# hdparm -t --direct /dev/sdb

/dev/sdb:
 Timing O_DIRECT disk reads:   6 MB in  4.17 seconds =   1.44 MB/sec


-- 
Ondrej Zary


Re: [PATCH 17/35] NCR5380: Move bus reset to host reset

2017-06-25 Thread Hannes Reinecke
On 06/24/2017 09:24 AM, Finn Thain wrote:
> On Fri, 23 Jun 2017, Hannes Reinecke wrote:
> 
>> The bus reset handler really is a host reset, so move it to
>> eh_bus_reset_handler.
>>
>> Signed-off-by: Hannes Reinecke 
>> ---
>>  drivers/scsi/NCR5380.c  | 13 -
>>  drivers/scsi/arm/cumana_1.c |  2 +-
>>  drivers/scsi/arm/oak.c  |  2 +-
>>  drivers/scsi/atari_scsi.c   |  6 +++---
>>  drivers/scsi/dmx3191d.c |  2 +-
>>  drivers/scsi/g_NCR5380.c|  4 ++--
>>  drivers/scsi/mac_scsi.c |  4 ++--
>>  drivers/scsi/sun3_scsi.c|  4 ++--
>>  8 files changed, 20 insertions(+), 17 deletions(-)
>>
>> diff --git a/drivers/scsi/NCR5380.c b/drivers/scsi/NCR5380.c
>> index acc3344..e877fb9 100644
>> --- a/drivers/scsi/NCR5380.c
>> +++ b/drivers/scsi/NCR5380.c
>> @@ -2296,24 +2296,24 @@ static int NCR5380_abort(struct scsi_cmnd *cmd)
>>  
>>  
>>  /**
>> - * NCR5380_bus_reset - reset the SCSI bus
>> + * NCR5380_host_reset - reset the SCSI host
>>   * @cmd: SCSI command undergoing EH
>>   *
>>   * Returns SUCCESS
>>   */
>>  
>> -static int NCR5380_bus_reset(struct scsi_cmnd *cmd)
>> +static int NCR5380_host_reset(struct scsi_cmnd *cmd)
>>  {
>>  struct Scsi_Host *instance = cmd->device->host;
>>  struct NCR5380_hostdata *hostdata = shost_priv(instance);
>>  int i;
>>  unsigned long flags;
>> -struct NCR5380_cmd *ncmd;
>> +struct NCR5380_cmd *ncmd, *tmp;
>>  
> 
> Do you need to introduce another temporary command pointer for this?
> 
>>  spin_lock_irqsave(>lock, flags);
>>  
>>  #if (NDEBUG & NDEBUG_ANY)
>> -scmd_printk(KERN_INFO, cmd, __func__);
>> +shost_printk(KERN_INFO, instance, __func__);
>>  #endif
>>  NCR5380_dprint(NDEBUG_ANY, instance);
>>  NCR5380_dprint_phase(NDEBUG_ANY, instance);
>> @@ -2331,7 +2331,10 @@ static int NCR5380_bus_reset(struct scsi_cmnd *cmd)
>>   * commands!
>>   */
>>  
>> -if (list_del_cmd(>unissued, cmd)) {
>> +list_for_each_entry_safe(ncmd, tmp, >unissued, list) {
>> +struct scsi_cmnd *cmd = NCR5380_to_scmd(ncmd);
>> +
>> +list_del_init(>list);
>>  cmd->result = DID_RESET << 16;
>>  cmd->scsi_done(cmd);
>>  }
> 
> For the sake of consistency, why didn't you use the same style that is 
> used later in this routine? I.e.
> 
>   list_for_each_entry(ncmd, >unissued, list) {
>   struct scsi_cmnd *cmd = NCR5380_to_scmd(ncmd);
> 
>   cmd->result = DID_RESET << 16;
>   cmd->scsi_done(cmd);
>   }
>   INIT_LIST_HEAD(>unissued);
> 
> Either way,
> 
> Acked-by: Finn Thain 
> 
> Thanks.
> 
As the driver was switch to using embedded private data area we need to
ensure that we're not touching the data area anymore once we call
->done(); the command might be reused after that.
Once the command is reused the 'list' structure in the embedded data
area will be reset, resulting in a list corruption for the 'unissued' list.

But as we're running under EH here where we don't have asynchronous I/O
submissions I guess we should be fine with your suggestion.
Will be updating the patch.

Cheers,

Hannes
-- 
Dr. Hannes ReineckeTeamlead Storage & Networking
h...@suse.de   +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)