Re: [PATCH V2 11/12] scsi: sd: Introduce scsi_disk_from_queue()

2017-09-09 Thread Ming Lei
On Fri, Sep 08, 2017 at 01:16:39AM +0900, Damien Le Moal wrote:
> Using scsi_device_from_queue(), return the scsi_disk structure
> associated with a request queue if the device is a disk.
> Export this function to make it available to modules.
> 

This approach may be a little over-kill, actually gendisk is the
parent of request queue in kobject tree(see blk_register_queue()),
that might be an easy way to retrieve disk attached to the queue.

-- 
Ming


Re: [PATCH V2 10/12] scsi: sd_zbc: Disable zone write locking with scsi-mq

2017-09-09 Thread Ming Lei
On Fri, Sep 08, 2017 at 09:53:53AM -0700, Damien Le Moal wrote:
> Ming,
> 
> On 9/8/17 05:43, Ming Lei wrote:
> > Hi Damien,
> > 
> > On Fri, Sep 08, 2017 at 01:16:38AM +0900, Damien Le Moal wrote:
> >> In the case of a ZBC disk used with scsi-mq, zone write locking does
> >> not prevent write reordering in sequential zones. Unlike the legacy
> >> case, zone locking can only be done after the command request is
> >> removed from the scheduler dispatch queue. That is, at the time of
> >> zone locking, the write command may already be out of order.
> > 
> > Per my understanding, for legacy case, it can be quite tricky to let
> > the existed I/O scheduler guarantee the write order for ZBC disk.
> > I guess requeue still might cause write reorder even in legacy path,
> > since requeue can happen in both scsi_request_fn() and scsi_io_completion()
> > with q->queue_lock released, meantime new rq belonging to the same
> > zone can come and be inserted to queue.
> 
> Yes, the write ordering will always depend on the scheduler doing the
> right thing. But both cfq, deadline and even noop do the right thing
> there, even considering the aging case. The next write for a zone will
> always be the oldest in the queue for that zone, if it is not, it means
> that the application did not write sequentially. Extensive testing in
> the legacy case never showed a problem due to the scheduler itself.

OK, I suggest to document this guarantee of no write reorder for ZBC
somewhere, so that people will keep it in mind when trying to change
the current code.

> 
> scsi_requeue_command() does the unprep (zone unlock) and requeue while
> holding the queue lock. So this is atomic with new write command
> insertion. Requeued commands are added to the dispatch queue head, and
> since a zone will only have a single write in-flight, there is no
> reordering possible. The next write command for a zone to go again is
> the last requeued one or the next in lba order. It works.

One special case is write with FLUSH/FUA, which may be added to
front of q->queue_head directly. Suppose one write with FUA is
just comes between requeue and run queue, write reorder may be
triggered.

If this assumption is true, there might be such issue on blk-mq
too.

-- 
Ming


Re: [PATCH 2/2] scsi: Align queue to ARCH_DMA_MINALIGN in non-coherent DMA mode

2017-09-09 Thread kbuild test robot
Hi Huacai,

[auto build test ERROR on mmotm/master]
[also build test ERROR on v4.13 next-20170908]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Huacai-Chen/mm-dmapool-Align-to-ARCH_DMA_MINALIGN-in-non-coherent-DMA-mode/20170909-230504
base:   git://git.cmpxchg.org/linux-mmotm.git master
config: m68k-sun3_defconfig (attached as .config)
compiler: m68k-linux-gcc (GCC) 4.9.0
reproduce:
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=m68k 

All errors (new ones prefixed by >>):

   drivers/scsi/scsi_lib.c: In function '__scsi_init_queue':
   drivers/scsi/scsi_lib.c:2139:2: error: implicit declaration of function 
'plat_device_is_coherent' [-Werror=implicit-function-declaration]
 if (plat_device_is_coherent(dev))
 ^
>> drivers/scsi/scsi_lib.c:2142:3: error: implicit declaration of function 
>> 'dma_get_cache_alignment' [-Werror=implicit-function-declaration]
  blk_queue_dma_alignment(q, dma_get_cache_alignment() - 1);
  ^
   cc1: some warnings being treated as errors

vim +/dma_get_cache_alignment +2142 drivers/scsi/scsi_lib.c

  2103  
  2104  void __scsi_init_queue(struct Scsi_Host *shost, struct request_queue *q)
  2105  {
  2106  struct device *dev = shost->dma_dev;
  2107  
  2108  queue_flag_set_unlocked(QUEUE_FLAG_SCSI_PASSTHROUGH, q);
  2109  
  2110  /*
  2111   * this limit is imposed by hardware restrictions
  2112   */
  2113  blk_queue_max_segments(q, min_t(unsigned short, 
shost->sg_tablesize,
  2114  SG_MAX_SEGMENTS));
  2115  
  2116  if (scsi_host_prot_dma(shost)) {
  2117  shost->sg_prot_tablesize =
  2118  min_not_zero(shost->sg_prot_tablesize,
  2119   (unsigned 
short)SCSI_MAX_PROT_SG_SEGMENTS);
  2120  BUG_ON(shost->sg_prot_tablesize < shost->sg_tablesize);
  2121  blk_queue_max_integrity_segments(q, 
shost->sg_prot_tablesize);
  2122  }
  2123  
  2124  blk_queue_max_hw_sectors(q, shost->max_sectors);
  2125  blk_queue_bounce_limit(q, scsi_calculate_bounce_limit(shost));
  2126  blk_queue_segment_boundary(q, shost->dma_boundary);
  2127  dma_set_seg_boundary(dev, shost->dma_boundary);
  2128  
  2129  blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
  2130  
  2131  if (!shost->use_clustering)
  2132  q->limits.cluster = 0;
  2133  
  2134  /*
  2135   * set a reasonable default alignment on word/cacheline 
boundaries:
  2136   * the host and device may alter it using
  2137   * blk_queue_update_dma_alignment() later.
  2138   */
> 2139  if (plat_device_is_coherent(dev))
  2140  blk_queue_dma_alignment(q, 0x04 - 1);
  2141  else
> 2142  blk_queue_dma_alignment(q, dma_get_cache_alignment() - 
> 1);
  2143  }
  2144  EXPORT_SYMBOL_GPL(__scsi_init_queue);
  2145  

---
0-DAY kernel test infrastructureOpen Source Technology Center
https://lists.01.org/pipermail/kbuild-all   Intel Corporation


.config.gz
Description: application/gzip


Re: [PATCH 2/2] scsi: Align queue to ARCH_DMA_MINALIGN in non-coherent DMA mode

2017-09-09 Thread kbuild test robot
Hi Huacai,

[auto build test ERROR on mmotm/master]
[also build test ERROR on v4.13 next-20170908]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Huacai-Chen/mm-dmapool-Align-to-ARCH_DMA_MINALIGN-in-non-coherent-DMA-mode/20170909-230504
base:   git://git.cmpxchg.org/linux-mmotm.git master
config: i386-randconfig-x000-201736 (attached as .config)
compiler: gcc-6 (Debian 6.2.0-3) 6.2.0 20160901
reproduce:
# save the attached .config to linux build tree
make ARCH=i386 

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/linkage.h:4:0,
from include/linux/fs.h:4,
from include/linux/highmem.h:4,
from include/linux/bio.h:21,
from drivers/scsi/scsi_lib.c:11:
   drivers/scsi/scsi_lib.c: In function '__scsi_init_queue':
>> drivers/scsi/scsi_lib.c:2139:6: error: implicit declaration of function 
>> 'plat_device_is_coherent' [-Werror=implicit-function-declaration]
 if (plat_device_is_coherent(dev))
 ^
   include/linux/compiler.h:156:30: note: in definition of macro '__trace_if'
 if (__builtin_constant_p(!!(cond)) ? !!(cond) :   \
 ^~~~
>> drivers/scsi/scsi_lib.c:2139:2: note: in expansion of macro 'if'
 if (plat_device_is_coherent(dev))
 ^~
   drivers/scsi/scsi_lib.c: At top level:
   include/linux/compiler.h:162:4: warning: '__f' is static but declared in 
inline function 'memcpy_and_pad' which is not static
   __f = { \
   ^
   include/linux/compiler.h:154:23: note: in expansion of macro '__trace_if'
#define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
  ^~
   include/linux/string.h:451:2: note: in expansion of macro 'if'
 if (dest_len > count) {
 ^~
   include/linux/compiler.h:162:4: warning: '__f' is static but declared in 
inline function 'memcpy_and_pad' which is not static
   __f = { \
   ^
   include/linux/compiler.h:154:23: note: in expansion of macro '__trace_if'
#define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
  ^~
   include/linux/string.h:449:2: note: in expansion of macro 'if'
 if (dest_size < dest_len)
 ^~
   include/linux/compiler.h:162:4: warning: '__f' is static but declared in 
inline function 'memcpy_and_pad' which is not static
   __f = { \
   ^
   include/linux/compiler.h:154:23: note: in expansion of macro '__trace_if'
#define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
  ^~
   include/linux/string.h:446:8: note: in expansion of macro 'if'
  else if (src_size < dest_len && src_size < count)
   ^~
   include/linux/compiler.h:162:4: warning: '__f' is static but declared in 
inline function 'memcpy_and_pad' which is not static
   __f = { \
   ^
   include/linux/compiler.h:154:23: note: in expansion of macro '__trace_if'
#define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
  ^~
   include/linux/string.h:444:3: note: in expansion of macro 'if'
  if (dest_size < dest_len && dest_size < count)
  ^~
   include/linux/compiler.h:162:4: warning: '__f' is static but declared in 
inline function 'memcpy_and_pad' which is not static
   __f = { \
   ^
   include/linux/compiler.h:154:23: note: in expansion of macro '__trace_if'
#define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
  ^~
   include/linux/string.h:443:2: note: in expansion of macro 'if'
 if (__builtin_constant_p(dest_len) && __builtin_constant_p(count)) {
 ^~
   include/linux/compiler.h:162:4: warning: '__f' is static but declared in 
inline function 'strcpy' which is not static
   __f = { \
   ^
   include/linux/compiler.h:154:23: note: in expansion of macro '__trace_if'
#define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )
  ^~
   include/linux/string.h:421:2: note: in expansion of macro 'if'
 if (p_size == (size_t)-1 && q_size == (size_t)-1)
 ^~
   include/linux/compiler.h:162:4: warning: '__f' is static but declared in 
inline function 'kmemdup' which is not static
   __f = { \
   ^
   include/linux/compiler.h:154:23: note: in expansion of macro '__trace_if'
#define if(cond, ...) __trace_if( (cond , ## __VA_ARGS__) )

Re: [PATCH] sd: preserve sysfs updates to max_sectors_kb

2017-09-09 Thread Martin Wilck
Hello Martin,

On Tue, 2017-08-29 at 21:24 -0400, Martin K. Petersen wrote:

> I looked at this for a bit last week to see if I could come up with
> an
> elegant way to accommodate values overridden in sysfs and at the same
> time honor hardware limits changing. However, it quickly gets messy
> since some parameters are under the request_queue and some are
> scsi_disk
> specific. So that involves having override flags several places. Plus
> there also the whole re-stacking debacle.
> 
> So I think I prefer udev for this.

Could you please explain why you think Don's patch is wrong? User
settings being discarded because of a BLKRRPART ioctl violates the
principle of least surprise. With Don's patch, that won't happen any
more. If hardware limits change, whether they increase or decrease, the
patch will also do the Right Thing, AFAICS. Increasing hw limits will
not automatically increase the sw limit, but IMO that's actually
expected.

Best Regards,
Martin

-- 
Dr. Martin Wilck , Tel. +49 (0)911 74053 2107
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)



Re: [PATCH V3 7/8] block: allow to allocate req with REQF_PREEMPT when queue is preempt frozen

2017-09-09 Thread Ming Lei
On Fri, Sep 08, 2017 at 05:28:16PM +, Bart Van Assche wrote:
> On Fri, 2017-09-08 at 11:08 +0800, Ming Lei wrote:
> > Looks I replied or clarified all your questions/comments on this
> > patchset.
> 
> No, you have not addressed all my comments, you know that you have not
> addressed all my comments so you should not have written that you have

I do not know.

> addressed all my comments. This patch series not only introduces ugly
> changes in the request queue freeze mechanism but it also introduces an
> unacceptable race condition between blk_get_request() and request queue
> cleanup.

So what is the race? Could you reply in original thread?

> 
> BTW, you don't have to spend more time on this patch series. I have
> implemented an alternative and much cleaner approach to fix SCSI device
> suspend. I'm currently testing that patch series.

You do not understand the issue at all, it is not only related with
suspend. Please take a look at scsi_execute(), each request run
via scsi_execute() need to be dispatched successfully even when
SCSI device is quiesced.

-- 
Ming