[PATCH] x86/cpu/amd: Remove duplicate definition of MSRC001_1029

2021-03-23 Thread Venkatesh Srinivas
MSRC001_1029 (DE_CFG) had duplicate definitions: MSR_F10H_DECFG
in msr-index.h and a local MSR_AMD64_DE_CFG in amd.c. Remove the
local definition in favor of the more widely used msr-index.h
version.

Signed-off-by: Venkatesh Srinivas 
---
 arch/x86/kernel/cpu/amd.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 347a956f71ca..31c592e5096e 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -795,15 +795,13 @@ static void init_amd_gh(struct cpuinfo_x86 *c)
set_cpu_bug(c, X86_BUG_AMD_TLB_MMATCH);
 }
 
-#define MSR_AMD64_DE_CFG   0xC0011029
-
 static void init_amd_ln(struct cpuinfo_x86 *c)
 {
/*
 * Apply erratum 665 fix unconditionally so machines without a BIOS
 * fix work.
 */
-   msr_set_bit(MSR_AMD64_DE_CFG, 31);
+   msr_set_bit(MSR_F10H_DECFG, 31);
 }
 
 static bool rdrand_force;
-- 
2.31.0.291.g576ba9dcdaf-goog



Re: [PATCH v2] vhost: replace % with & on data path

2015-12-04 Thread Venkatesh Srinivas
On Mon, Nov 30, 2015 at 11:15:23AM +0200, Michael S. Tsirkin wrote:
> We know vring num is a power of 2, so use &
> to mask the high bits.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---

The generated code switches from DIV to masking, and the source is clearer as well.

Tested-by: Venkatesh Srinivas 

-- vs;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] tools/virtio: fix byteswap logic

2015-12-04 Thread Venkatesh Srinivas
On Mon, Nov 30, 2015 at 10:33:34AM +0200, Michael S. Tsirkin wrote:
> commit cf561f0d2eb74574ad9985a2feab134267a9d298 ("virtio: introduce
> virtio_is_little_endian() helper") changed byteswap logic to
> skip feature bit checks for LE platforms, but didn't
> update tools/virtio, so vring_bench started failing.
> 
> Update the copy under tools/virtio/ (TODO: find a way to avoid this code
> duplication).
> 
> Cc: Greg Kurz 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  tools/virtio/linux/virtio_config.h | 20 +---
>  1 file changed, 13 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/virtio/linux/virtio_config.h 
> b/tools/virtio/linux/virtio_config.h
> index 806d683..57a6964 100644
> --- a/tools/virtio/linux/virtio_config.h
> +++ b/tools/virtio/linux/virtio_config.h
> @@ -40,33 +40,39 @@ static inline void __virtio_clear_bit(struct 
> virtio_device *vdev,
>  #define virtio_has_feature(dev, feature) \
>   (__virtio_test_bit((dev), feature))
>  
> +static inline bool virtio_is_little_endian(struct virtio_device *vdev)
> +{
> + return virtio_has_feature(vdev, VIRTIO_F_VERSION_1) ||
> + virtio_legacy_is_little_endian();
> +}
> +
> +/* Memory accessors */
>  static inline u16 virtio16_to_cpu(struct virtio_device *vdev, __virtio16 val)
>  {
> - return __virtio16_to_cpu(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), 
> val);
> + return __virtio16_to_cpu(virtio_is_little_endian(vdev), val);
>  }
>  
>  static inline __virtio16 cpu_to_virtio16(struct virtio_device *vdev, u16 val)
>  {
> - return __cpu_to_virtio16(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), 
> val);
> + return __cpu_to_virtio16(virtio_is_little_endian(vdev), val);
>  }
>  
>  static inline u32 virtio32_to_cpu(struct virtio_device *vdev, __virtio32 val)
>  {
> - return __virtio32_to_cpu(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), 
> val);
> + return __virtio32_to_cpu(virtio_is_little_endian(vdev), val);
>  }
>  
>  static inline __virtio32 cpu_to_virtio32(struct virtio_device *vdev, u32 val)
>  {
> - return __cpu_to_virtio32(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), 
> val);
> + return __cpu_to_virtio32(virtio_is_little_endian(vdev), val);
>  }
>  
>  static inline u64 virtio64_to_cpu(struct virtio_device *vdev, __virtio64 val)
>  {
> - return __virtio64_to_cpu(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), 
> val);
> + return __virtio64_to_cpu(virtio_is_little_endian(vdev), val);
>  }
>  
>  static inline __virtio64 cpu_to_virtio64(struct virtio_device *vdev, u64 val)
>  {
> - return __cpu_to_virtio64(virtio_has_feature(vdev, VIRTIO_F_VERSION_1), 
> val);
> + return __cpu_to_virtio64(virtio_is_little_endian(vdev), val);
>  }

Tested patch with vring-bench. (I did something similar when reviving it).

Tested-by: Venkatesh Srinivas 

-- vs;






Re: [PATCH 1/3] KVM: x86: work around infinite loop in microcode when #AC is delivered

2015-11-10 Thread Venkatesh Srinivas
On Tue, Nov 10, 2015 at 01:22:52PM +0100, Paolo Bonzini wrote:
> From: Eric Northup 
> 
> It was found that a guest can DoS a host by triggering an infinite
> stream of "alignment check" (#AC) exceptions.  This causes the
> microcode to enter an infinite loop where the core never receives
> another interrupt.  The host kernel panics pretty quickly due to the
> effects (CVE-2015-5307).
> 
> Signed-off-by: Eric Northup 
> Cc: sta...@vger.kernel.org
> Signed-off-by: Paolo Bonzini 

Tested-by: Venkatesh Srinivas 

-- vs;




Re: [PATCH RESEND 2/7] scsi: storvsc: Size the queue depth based on the ringbuffer size

2015-03-23 Thread Venkatesh Srinivas
On Mon, Mar 23, 2015 at 2:06 PM, K. Y. Srinivasan  wrote:
> Size the queue depth based on the ringbuffer size. Also accommodate the
> fact that we could have multiple channels (ringbuffers) per adaptor.
>
> Signed-off-by: K. Y. Srinivasan 
> Reviewed-by: Long Li 
> ---
>  drivers/scsi/storvsc_drv.c |   27 ---
>  1 files changed, 16 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
> index 27fe850..5a12897 100644
> --- a/drivers/scsi/storvsc_drv.c
> +++ b/drivers/scsi/storvsc_drv.c
> @@ -309,10 +309,15 @@ enum storvsc_request_type {
>   */
>
>  static int storvsc_ringbuffer_size = (256 * PAGE_SIZE);
> +static u32 max_outstanding_req_per_channel;
> +
> +static int storvsc_vcpus_per_sub_channel = 4;
>
>  module_param(storvsc_ringbuffer_size, int, S_IRUGO);
>  MODULE_PARM_DESC(storvsc_ringbuffer_size, "Ring buffer size (bytes)");
>
> +module_param(storvsc_vcpus_per_sub_channel, int, S_IRUGO);
> +MODULE_PARM_DESC(vcpus_per_sub_channel, "Ratio of VCPUs to subchannels");
>  /*
>   * Timeout in seconds for all devices managed by this driver.
>   */
> @@ -320,7 +325,6 @@ static int storvsc_timeout = 180;
>
>  static int msft_blist_flags = BLIST_TRY_VPD_PAGES;
>
> -#define STORVSC_MAX_IO_REQUESTS200
>
>  static void storvsc_on_channel_callback(void *context);
>
> @@ -1376,7 +1380,6 @@ static int storvsc_do_io(struct hv_device *device,
>
>  static int storvsc_device_configure(struct scsi_device *sdevice)
>  {
> -   scsi_change_queue_depth(sdevice, STORVSC_MAX_IO_REQUESTS);
>
> blk_queue_max_segment_size(sdevice->request_queue, PAGE_SIZE);
>
> @@ -1646,7 +1649,6 @@ static struct scsi_host_template scsi_driver = {
> .eh_timed_out = storvsc_eh_timed_out,
> .slave_configure =  storvsc_device_configure,
> .cmd_per_lun =  255,
> -   .can_queue =STORVSC_MAX_IO_REQUESTS*STORVSC_MAX_TARGETS,
> .this_id =  -1,
> /* no use setting to 0 since ll_blk_rw reset it to 1 */
> /* currently 32 */
> @@ -1686,6 +1688,7 @@ static int storvsc_probe(struct hv_device *device,
> const struct hv_vmbus_device_id *dev_id)
>  {
> int ret;
> +   int num_cpus = num_online_cpus();
> struct Scsi_Host *host;
> struct hv_host_device *host_dev;
> bool dev_is_ide = ((dev_id->driver_data == IDE_GUID) ? true : false);
> @@ -1694,6 +1697,7 @@ static int storvsc_probe(struct hv_device *device,
> int max_luns_per_target;
> int max_targets;
> int max_channels;
> +   int max_sub_channels = 0;
>
> /*
>  * Based on the windows host we are running on,
> @@ -1719,12 +1723,18 @@ static int storvsc_probe(struct hv_device *device,
> max_luns_per_target = STORVSC_MAX_LUNS_PER_TARGET;
> max_targets = STORVSC_MAX_TARGETS;
> max_channels = STORVSC_MAX_CHANNELS;
> +   /*
> +* On Windows8 and above, we support sub-channels for storage.
> +* The number of sub-channels offerred is based on the number 
> of
> +* VCPUs in the guest.
> +*/
> +   max_sub_channels = (num_cpus / storvsc_vcpus_per_sub_channel);
> break;
> }
>
> -   if (dev_id->driver_data == SFC_GUID)
> -   scsi_driver.can_queue = (STORVSC_MAX_IO_REQUESTS *
> -STORVSC_FC_MAX_TARGETS);
> +   scsi_driver.can_queue = (max_outstanding_req_per_channel *
> +max_sub_channels + 1);
> +

If num_online_cpus() returns 1 - 3, max_sub_channels will be 0, so
can_queue will be set to 1, I think. Is that desired?

> host = scsi_host_alloc(_driver,
>sizeof(struct hv_host_device));
> if (!host)
> @@ -1837,7 +1847,6 @@ static struct hv_driver storvsc_drv = {
>
>  static int __init storvsc_drv_init(void)
>  {
> -   u32 max_outstanding_req_per_channel;
>
> /*
>  * Divide the ring buffer data size (which is 1 page less
> @@ -1852,10 +1861,6 @@ static int __init storvsc_drv_init(void)
> vmscsi_size_delta,
> sizeof(u64)));
>
> -   if (max_outstanding_req_per_channel <
> -   STORVSC_MAX_IO_REQUESTS)
> -   return -EINVAL;
> -
> return vmbus_driver_register(_drv);
>  }

-- vs;




Re: [PATCH v4] virtio-scsi: Fix the race condition in virtscsi_handle_event

2015-01-08 Thread Venkatesh Srinivas
On Wed, Jan 7, 2015 at 6:57 PM, Fam Zheng  wrote:
>
> There is a race condition in virtscsi_handle_event, when many device
> hotplug/unplug events flush in quickly.
>
> The scsi_remove_device in virtscsi_handle_transport_reset may trigger
> the BUG_ON in scsi_target_reap, because the state is altered behind it,
> probably by scsi_scan_host of another event. I'm able to reproduce it by
> repeatedly plugging and unplugging a scsi disk with the same lun number.
>
> To fix this, a single thread workqueue (local to the module) is added,
> which makes the scan work serialized. With this change, the panic goes
> away.
>
> Signed-off-by: Fam Zheng 

Reviewed-by: Venkatesh Srinivas 

> ---
>
> v4: Addressing MST's comments:
> Use ordered workqueue, with WQ_FREEZABLE and WQ_MEM_RECLAIM flags.
> Coding style fixes.
>
> v3: Fix spacing and destroy order. (MST)
> ---
>  drivers/scsi/virtio_scsi.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index c52bb5d..0db63b5 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -120,6 +120,7 @@ struct virtio_scsi {
>
>  static struct kmem_cache *virtscsi_cmd_cache;
>  static mempool_t *virtscsi_cmd_pool;
> +static struct workqueue_struct *virtscsi_scan_wq;
>
>  static inline struct Scsi_Host *virtio_scsi_host(struct virtio_device *vdev)
>  {
> @@ -404,7 +405,7 @@ static void virtscsi_complete_event(struct virtio_scsi 
> *vscsi, void *buf)
> struct virtio_scsi_event_node *event_node = buf;
>
> if (!vscsi->stop_events)
> -   queue_work(system_freezable_wq, _node->work);
> +   queue_work(virtscsi_scan_wq, _node->work);
>  }
>
>  static void virtscsi_event_done(struct virtqueue *vq)
> @@ -1119,6 +1120,14 @@ static int __init init(void)
> pr_err("mempool_create() for virtscsi_cmd_pool failed\n");
> goto error;
> }
> +
> +   virtscsi_scan_wq =
> +   alloc_ordered_workqueue("virtscsi-scan", WQ_FREEZABLE | 
> WQ_MEM_RECLAIM);
> +   if (!virtscsi_scan_wq) {
> +   pr_err("create_singlethread_workqueue() for virtscsi_scan_wq 
> failed\n");
> +   goto error;
> +   }
> +
> ret = register_virtio_driver(_scsi_driver);
> if (ret < 0)
> goto error;
> @@ -1126,6 +1135,8 @@ static int __init init(void)
> return 0;
>
>  error:
> +   if (virtscsi_scan_wq)
> +   destroy_workqueue(virtscsi_scan_wq);
> if (virtscsi_cmd_pool) {
> mempool_destroy(virtscsi_cmd_pool);
> virtscsi_cmd_pool = NULL;
> @@ -1140,6 +1151,7 @@ error:
>  static void __exit fini(void)
>  {
> unregister_virtio_driver(_scsi_driver);
> +   destroy_workqueue(virtscsi_scan_wq);
> mempool_destroy(virtscsi_cmd_pool);
> kmem_cache_destroy(virtscsi_cmd_cache);
>  }
> --
> 1.9.3
>




Re: [PATCH v2] virtio-scsi: Fix the race condition in virtscsi_handle_event

2015-01-06 Thread Venkatesh Srinivas
On Mon, Jan 5, 2015 at 11:33 PM, Fam Zheng  wrote:
> There is a race condition in virtscsi_handle_event, when many device
> hotplug/unplug events flush in quickly.
>
> The scsi_remove_device in virtscsi_handle_transport_reset may trigger
> the BUG_ON in scsi_target_reap, because the state is altered behind it,
> probably by scsi_scan_host of another event. I'm able to reproduce it by
> repeatedly plugging and unplugging a scsi disk with the same lun number.
>
> To make it safe, a single-threaded workqueue local to the module is added
> which runs the scan work. With this change, the panic goes away.
>
> Signed-off-by: Fam Zheng 
>
> ---
>
> v2: Use a single threaded workqueue instead of mutex to serialize work
> (Venkatesh)
> ---
>  drivers/scsi/virtio_scsi.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index c52bb5d..71b0091 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -120,6 +120,8 @@ struct virtio_scsi {
>
>  static struct kmem_cache *virtscsi_cmd_cache;
>  static mempool_t *virtscsi_cmd_pool;
> +static struct workqueue_struct *virtscsi_scan_wq;
> +
>
>  static inline struct Scsi_Host *virtio_scsi_host(struct virtio_device *vdev)
>  {
> @@ -404,7 +406,7 @@ static void virtscsi_complete_event(struct virtio_scsi 
> *vscsi, void *buf)
> struct virtio_scsi_event_node *event_node = buf;
>
> if (!vscsi->stop_events)
> -   queue_work(system_freezable_wq, _node->work);
> +   queue_work(virtscsi_scan_wq, _node->work);
>  }
>
>  static void virtscsi_event_done(struct virtqueue *vq)
> @@ -1119,6 +1121,12 @@ static int __init init(void)
> pr_err("mempool_create() for virtscsi_cmd_pool failed\n");
> goto error;
> }
> +   virtscsi_scan_wq = create_singlethread_workqueue("virtscsi-scan");
> +   if (!virtscsi_scan_wq) {
> +   pr_err("create_singlethread_workqueue() for virtscsi_scan_wq 
> failed\n");
> +   goto error;
> +   }
> +
> ret = register_virtio_driver(_scsi_driver);
> if (ret < 0)
> goto error;
> @@ -1126,6 +1134,9 @@ static int __init init(void)
> return 0;
>
>  error:
> +   if (virtscsi_scan_wq) {
> +   destroy_workqueue(virtscsi_scan_wq);
> +   }
> if (virtscsi_cmd_pool) {
> mempool_destroy(virtscsi_cmd_pool);
> virtscsi_cmd_pool = NULL;
> @@ -1142,6 +1153,7 @@ static void __exit fini(void)
> unregister_virtio_driver(_scsi_driver);
> mempool_destroy(virtscsi_cmd_pool);
> kmem_cache_destroy(virtscsi_cmd_cache);
> +   destroy_workqueue(virtscsi_scan_wq);
>  }
>  module_init(init);
>  module_exit(fini);
> --
> 1.9.3

Reviewed-by: Venkatesh Srinivas 

-- vs;




Re: [PATCH] virtio-scsi: Fix the race condition in virtscsi_handle_event

2015-01-05 Thread Venkatesh Srinivas
On Sun, Jan 4, 2015 at 10:04 PM, Fam Zheng  wrote:
>
> There is a race condition in virtscsi_handle_event, when many device
> hotplug/unplug events flush in quickly.
>
> The scsi_remove_device in virtscsi_handle_transport_reset may trigger
> the BUG_ON in scsi_target_reap, because the state is altered behind it,
> probably by scsi_scan_host of another event. I'm able to reproduce it by
> repeatedly plugging and unplugging a scsi disk with the same lun number.
>
> To make it safe, the mutex added in struct virtio_scsi is held in
> virtscsi_handle_event, so that all the events are processed in a
> synchronized way. With this lock, the panic goes away.
>
> Signed-off-by: Fam Zheng 
> ---
>  drivers/scsi/virtio_scsi.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index c52bb5d..7f194d4 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -110,6 +110,9 @@ struct virtio_scsi {
> /* CPU hotplug notifier */
> struct notifier_block nb;
>
> +   /* Protect the hotplug/unplug event handling */
> +   struct mutex scan_lock;
> +
> /* Protected by event_vq lock */
> bool stop_events;
>
> @@ -377,6 +380,7 @@ static void virtscsi_handle_event(struct work_struct 
> *work)
> struct virtio_scsi *vscsi = event_node->vscsi;
> struct virtio_scsi_event *event = &event_node->event;
>
> +   mutex_lock(&vscsi->scan_lock);
> if (event->event &
> cpu_to_virtio32(vscsi->vdev, VIRTIO_SCSI_T_EVENTS_MISSED)) {
> event->event &= ~cpu_to_virtio32(vscsi->vdev,
> @@ -397,6 +401,7 @@ static void virtscsi_handle_event(struct work_struct 
> *work)
> pr_err("Unsupport virtio scsi event %x\n", event->event);
> }
> virtscsi_kick_event(vscsi, event_node);
> +   mutex_unlock(&vscsi->scan_lock);
>  }
>
>  static void virtscsi_complete_event(struct virtio_scsi *vscsi, void *buf)
> @@ -894,6 +899,7 @@ static int virtscsi_init(struct virtio_device *vdev,
> const char **names;
> struct virtqueue **vqs;
>
> +   mutex_init(&vscsi->scan_lock);
> num_vqs = vscsi->num_queues + VIRTIO_SCSI_VQ_BASE;
> vqs = kmalloc(num_vqs * sizeof(struct virtqueue *), GFP_KERNEL);
> callbacks = kmalloc(num_vqs * sizeof(vq_callback_t *), GFP_KERNEL);
> --
> 1.9.3


Nice find.

This fix does have the effect of serializing all event handling via
scan_lock; perhaps you want to instead create a singlethreaded
workqueue in virtio_scsi and queue handle_event there, rather than
waiting on scan_lock on the system workqueue?

Reviewed-by: Venkatesh Srinivas <venkate...@google.com>

-- vs;


Re: [PATCH] scsi: Fix qemu boot hang problem

2014-08-11 Thread Venkatesh Srinivas
Should we wrap the sdev->device_busy read and test in a new
scsi_device_busy(), so as to preserve the old polarity?

Reviewed-by: Venkatesh Srinivas <venkate...@google.com>

On 8/11/14, Jens Axboe <ax...@kernel.dk> wrote:
> On 2014-08-10 06:54, Guenter Roeck wrote:
>> The latest kernel fails to boot qemu arm images when using scsi
>> for disk access. Boot gets stuck after the following messages.
>>
>> brd: module loaded
>> sym53c8xx :00:0c.0: enabling device (0100 -> 0103)
>> sym0: <895a> rev 0x0 at pci :00:0c.0 irq 93
>> sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking
>> sym0: SCSI BUS has been reset.
>> scsi host0: sym-2.2.3
>>
>> Bisect points to commit 71e75c97f97a ("scsi: convert device_busy to
>> atomic_t"). Code inspection shows the following suspicious change
>> in scsi_request_fn.
>>
>> out_delay:
>> -   if (sdev->device_busy == 0 && !scsi_device_blocked(sdev))
>> +   if (atomic_read(&sdev->device_busy) &&
>> !scsi_device_blocked(sdev))
>>  blk_delay_queue(q, SCSI_QUEUE_DELAY);
>>  }
>>
>> 'sdev->device_busy == 0' was replaced with
>> 'atomic_read(&sdev->device_busy)',
>> meaning the logic was reversed. Changing this expression to
>> '!atomic_read(&sdev->device_busy)' fixes the problem.
>>
>> Cc: Christoph Hellwig <h...@lst.de>
>> Cc: Hannes Reinecke <h...@suse.de>
>> Cc: Webb Scales <web...@hp.com>
>> Cc: Jens Axboe <ax...@kernel.dk>
>> Signed-off-by: Guenter Roeck <li...@roeck-us.net>
>> ---
>>   drivers/scsi/scsi_lib.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index 9c44392..ce62e87 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -1774,7 +1774,7 @@ static void scsi_request_fn(struct request_queue
>> *q)
>>  blk_requeue_request(q, req);
>>  atomic_dec(&sdev->device_busy);
>>   out_delay:
>> -if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev))
>> +if (!atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev))
>>  blk_delay_queue(q, SCSI_QUEUE_DELAY);
>>   }
>>
>>
>
> That's a no-brainer obvious fix!
>
> Acked-by: Jens Axboe <ax...@fb.com>
>
>
> --
> Jens Axboe
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>


Re: [PATCH v3 5/6] virtio-scsi: fix various bad behavior on aborted requests

2014-06-04 Thread Venkatesh Srinivas
On 6/4/14, Paolo Bonzini <pbonz...@redhat.com> wrote:
> Il 04/06/2014 19:29, Venkatesh Srinivas ha scritto:
>> Do you really want to poll the request VQs for completions if the TMF
>> was rejected?
>
> I wasn't sure, but bugs in this path are hard enough that I preferred
> the safer patch.

Ok. As long as it was deliberate. I'd slightly prefer only doing so in
the success case, but simplicity is a compelling argument :)

Reviewed-by: Venkatesh Srinivas <venkate...@google.com>

-- vs;


Re: [PATCH v3 5/6] virtio-scsi: fix various bad behavior on aborted requests

2014-06-04 Thread Venkatesh Srinivas
On 6/4/14, Paolo Bonzini <pbonz...@redhat.com> wrote:
> Even though the virtio-scsi spec guarantees that all requests related
> to the TMF will have been completed by the time the TMF itself completes,
> the request queue's callback might not have run yet.  This causes requests
> to be completed more than once, and as a result triggers a variety of
> BUGs or oopses.
>
> Cc: sta...@vger.kernel.org
> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> ---
>  drivers/scsi/virtio_scsi.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index d66c4ee2c774..fda9fb35 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -235,6 +235,16 @@ static void virtscsi_req_done(struct virtqueue *vq)
>   virtscsi_vq_done(vscsi, req_vq, virtscsi_complete_cmd);
>  };
>
> +static void virtscsi_poll_requests(struct virtio_scsi *vscsi)
> +{
> + int i, num_vqs;
> +
> + num_vqs = vscsi->num_queues;
> + for (i = 0; i < num_vqs; i++)
> + virtscsi_vq_done(vscsi, &vscsi->req_vqs[i],
> +  virtscsi_complete_cmd);
> +}
> +
>  static void virtscsi_complete_free(struct virtio_scsi *vscsi, void *buf)
>  {
>   struct virtio_scsi_cmd *cmd = buf;
> @@ -579,6 +589,18 @@ static int virtscsi_tmf(struct virtio_scsi *vscsi,
> struct virtio_scsi_cmd *cmd)
>   cmd->resp.tmf.response == VIRTIO_SCSI_S_FUNCTION_SUCCEEDED)
>   ret = SUCCESS;
>
> + /*
> +  * The spec guarantees that all requests related to the TMF have
> +  * been completed, but the callback might not have run yet if
> +  * we're using independent interrupts (e.g. MSI).  Poll the
> +  * virtqueues once.
> +  *
> +  * In the abort case, sc->scsi_done will do nothing, because
> +  * the block layer must have detected a timeout and as a result
> +  * REQ_ATOM_COMPLETE has been set.
> +  */
> + virtscsi_poll_requests(vscsi);

Do you really want to poll the request VQs for completions if the TMF
was rejected?

TMF ABORT may return FUNCTION REJECTED if the command to abort
completed before the device saw the TMF ABORT message, for example. In
such cases, this would
unnecessarily lengthen the EH path.

> +
>  out:
>   mempool_free(cmd, virtscsi_cmd_pool);
>   return ret;
> --
> 1.8.3.1

Thanks for looking into this,
-- vs;


Re: [PATCH v2 2/6] virtio-scsi: replace target spinlock with seqcount

2014-06-04 Thread Venkatesh Srinivas
On 6/4/14, Paolo Bonzini <pbonz...@redhat.com> wrote:
> From: Ming Lei <ming....@canonical.com>
>
> The spinlock of tgt_lock is only for serializing read and write
> req_vq, one lockless seqcount is enough for the purpose.
>
> On one 16core VM with vhost-scsi backend, the patch can improve
> IOPS with 3% on random read test.
>
> Signed-off-by: Ming Lei <ming....@canonical.com>
> [Add initialization in virtscsi_target_alloc. - Paolo]
> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>

Reviewed-by: Venkatesh Srinivas <venkate...@google.com>

-- vs;


Re: [PATCH] perf/x86: fix RAPL rdmsrl_safe() usage

2014-04-23 Thread Venkatesh Srinivas
On Wed, Apr 23, 2014 at 12:00 PM, Borislav Petkov <b...@alien8.de> wrote:
> On Wed, Apr 23, 2014 at 07:04:19PM +0200, Stephane Eranian wrote:
>>
>> This patch fixes a bug introduced by commit 24223657.
>
> Maybe add
>
> Fixes: <commit-id>
>
> tag?
>
>> The rdmsrl_safe() function returns 0 on success.
>> The current code was failing to detect the RAPL PMU
>> on real hardware  (missing /sys/devices/power) because
>> the return value of rdmsrl_safe() was misinterpreted.
>>
>> Signed-off-by: Stephane Eranian <eran...@google.com>
>
> Acked-by: Borislav Petkov <b...@suse.de>

Brown-Paper-Bag-Worn-By: Venkatesh Srinivas <venkate...@google.com>

-- vs;


[tip:perf/core] perf/x86/intel: Use rdmsrl_safe() when initializing RAPL PMU

2014-04-18 Thread tip-bot for Venkatesh Srinivas
Commit-ID:  24223657806a0ebd0ae5c9caaf7b021091889cf2
Gitweb: http://git.kernel.org/tip/24223657806a0ebd0ae5c9caaf7b021091889cf2
Author: Venkatesh Srinivas <venkate...@google.com>
AuthorDate: Thu, 13 Mar 2014 12:36:26 -0700
Committer:  Ingo Molnar <mi...@kernel.org>
CommitDate: Fri, 18 Apr 2014 12:14:26 +0200

perf/x86/intel: Use rdmsrl_safe() when initializing RAPL PMU

CPUs which should support the RAPL counters according to
Family/Model/Stepping may still issue #GP when attempting to access
the RAPL MSRs. This may happen when Linux is running under KVM and
we are passing-through host F/M/S data, for example. Use rdmsrl_safe
to first access the RAPL_POWER_UNIT MSR; if this fails, do not
attempt to use this PMU.

Signed-off-by: Venkatesh Srinivas <venkate...@google.com>
Signed-off-by: Peter Zijlstra <pet...@infradead.org>
Link: 
http://lkml.kernel.org/r/1394739386-22260-1-git-send-email-venkate...@google.com
Cc: zheng.z@intel.com
Cc: eran...@google.com
Cc: a...@linux.intel.com
Cc: linux-kernel@vger.kernel.org
[ The patch also silently fixes another bug: rapl_pmu_init() didn't handle the 
memory alloc failure case previously. ]
Signed-off-by: Ingo Molnar <mi...@kernel.org>
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c 
b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
index 4b9a9e9..7c87424 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -535,6 +535,7 @@ static int rapl_cpu_prepare(int cpu)
struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
int phys_id = topology_physical_package_id(cpu);
u64 ms;
+   u64 msr_rapl_power_unit_bits;
 
if (pmu)
return 0;
@@ -542,6 +543,9 @@ static int rapl_cpu_prepare(int cpu)
if (phys_id < 0)
return -1;
 
+   if (!rdmsrl_safe(MSR_RAPL_POWER_UNIT, &msr_rapl_power_unit_bits))
+   return -1;
+
pmu = kzalloc_node(sizeof(*pmu), GFP_KERNEL, cpu_to_node(cpu));
if (!pmu)
return -1;
@@ -555,8 +559,7 @@ static int rapl_cpu_prepare(int cpu)
 *
 * we cache in local PMU instance
 */
-   rdmsrl(MSR_RAPL_POWER_UNIT, pmu->hw_unit);
-   pmu->hw_unit = (pmu->hw_unit >> 8) & 0x1FULL;
+   pmu->hw_unit = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
pmu->pmu = &rapl_pmu_class;
 
/*
@@ -677,7 +680,9 @@ static int __init rapl_pmu_init(void)
cpu_notifier_register_begin();
 
for_each_online_cpu(cpu) {
-   rapl_cpu_prepare(cpu);
+   ret = rapl_cpu_prepare(cpu);
+   if (ret)
+   goto out;
rapl_cpu_init(cpu);
}
 
@@ -700,6 +705,7 @@ static int __init rapl_pmu_init(void)
hweight32(rapl_cntr_mask),
ktime_to_ms(pmu->timer_interval));
 
+out:
cpu_notifier_register_done();
 
return 0;


Re: [PATCH] virtio-blk: make the queue depth the max supportable by the hypervisor

2014-03-25 Thread Venkatesh Srinivas
On Wed, Mar 19, 2014 at 10:48 AM, Venkatesh Srinivas
<venkate...@google.com> wrote:
>> And I rewrote it substantially, mainly to take
>> VIRTIO_RING_F_INDIRECT_DESC into account.
>>
>> As QEMU sets the vq size for PCI to 128, Venkatesh's patch wouldn't
>> have made a change.  This version does (since QEMU also offers
>> VIRTIO_RING_F_INDIRECT_DESC).
>
> That divide-by-2 produced the same queue depth as the prior
> computation in QEMU was deliberate -- but raising it to 128 seems
> pretty reasonable.
>
> Signed-off-by: Venkatesh Srinivas <venkate...@google.com>

Soft ping about this patch.

Thanks,
-- vs;


Re: GPF in intel_pmu_lbr_reset() with qemu -cpu host

2014-03-21 Thread Venkatesh Srinivas
On Fri, Mar 21, 2014 at 10:46 AM, Peter Wu <pe...@lekensteyn.nl> wrote:
> cc'ing kvm people and list.
>
> On Friday 21 March 2014 18:42:40 Peter Wu wrote:
>> Hi,
>>
>> While trying to run QEMU with `-enable-kvm -host cpu`, I get a GPF in
>> intel_pmu_lbr_reset():
>>
>> [0.024000] general protection fault:  [#1]
>> [0.024000] CPU: 0 PID: 1 Comm: swapper Not tainted 
>> 3.14.0-rc7-qemu-00059-g08edb33 #14
>> [0.024000] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
>> [0.024000] task: 88003e05 ti: 88003e054000 task.ti: 
>> 88003e054000
>> [0.024000] RIP: 0010:[<ffffffff8101148a>]  [<ffffffff8101148a>] 
>> intel_pmu_lbr_reset+0x2a/0x80
>> [0.024000] RSP: :88003e055e78  EFLAGS: 0002
>> [0.024000] RAX:  RBX: 0286 RCX: 
>> 0680
>> [0.024000] RDX:  RSI:  RDI: 
>> 
>> [0.024000] RBP: 81622120 R08: 88003ffee0e0 R09: 
>> 88003e00bf00
>> [0.024000] R10:  R11: 0004 R12: 
>> 
>> [0.024000] R13:  R14:  R15: 
>> 
>> [0.024000] FS:  () GS:8161e000() 
>> knlGS:
>> [0.024000] CS:  0010 DS:  ES:  CR0: 80050033
>> [0.024000] CR2: 8800019bb000 CR3: 01611000 CR4: 
>> 001407b0
>> [0.024000] Stack:
>> [0.024000]  8101308a 8100e3da 8165ba62 
>> 
>> [0.024000]  8165b5bd   
>> 
>> [0.024000]  81655dcd   
>> 
>> [0.024000] Call Trace:
>> [0.024000]  [<ffffffff8101308a>] ? intel_pmu_cpu_starting+0xa/0x80
>> [0.024000]  [<ffffffff8100e3da>] ? x86_pmu_notifier+0x5a/0xc0
>> [0.024000]  [<ffffffff8165ba62>] ? init_hw_perf_events+0x4a5/0x4dd
>> [0.024000]  [<ffffffff8165b5bd>] ? check_bugs+0x42/0x42
>> [0.024000]  [<ffffffff81655dcd>] ? do_one_initcall+0x76/0xf9
>> [0.024000]  [<ffffffff81276b70>] ? rest_init+0x70/0x70
>> [0.024000]  [<ffffffff81655ea7>] ? kernel_init_freeable+0x57/0x177
>> [0.024000]  [<ffffffff81276b70>] ? rest_init+0x70/0x70
>> [0.024000]  [<ffffffff81276b75>] ? kernel_init+0x5/0xe0
>> [0.024000]  [<ffffffff8128067a>] ? ret_from_fork+0x7a/0xb0
>> [0.024000]  [<ffffffff81276b70>] ? rest_init+0x70/0x70
>> [0.024000] Code: 00 8b 15 02 c4 63 00 85 d2 74 69 f6 05 af c3 63 00 3f 
>> 75 2d 85 d2 7e 5c 31 f6 31 c0 0f 1f 44 00 00 8b 0d d2 c3 63 00 89 c2 01 f1 
>> <0f> 30 83 c6 01 3b 35 d3 c3 63 00 7c e9 f3 c3 0f 1f 80 00 00 00
>> [0.024000] RIP  [<ffffffff8101148a>] intel_pmu_lbr_reset+0x2a/0x80
>> [0.024000]  RSP <ffff88003e055e78>
>> [0.024000] ---[ end trace ecbd794f78441b2c ]---
>> [0.024002] Kernel panic - not syncing: Attempted to kill init! 
>> exitcode=0x000b
>>
>>
>> It possibly has something to do with the msr write. Reproducable with:
>>
>> qemu-system-x86_64 -enable-kvm -cpu host -kernel bzImage -m 1G -serial 
>> file:ser.txt
>>
>> In the host dmesg, the following is visible when qemu:
>>
>> kvm [4939]: vcpu0 unhandled wrmsr: 0x680 data 0
>>
>> The full guest dmesg is shown below. The issue occurs also with
>> v3.13.6, v3.12.14, v3.10.33 (other versions are not tested).
>>
>> QEMU: 1.7.0
>> Host kernel: v3.14-rc5
>> Guest kernel: v3.14-rc7-59-g08edb33 (.config on the bottom)
>>
>> Kind regards,
>> Peter
>>
>> ### dmesg
>> [0.00] Linux version 3.14.0-rc7-qemu-00059-g08edb33 (pc@antartica) 
>> (gcc version 4.8.2 (Ubuntu 4.8.2-16ubuntu6) ) #14 Fri Mar 21 17:30:49 CET 
>> 2014
>> [0.00] Command line: console=ttyS0 loglevel=8
>> [0.00] KERNEL supported cpus:
>> [0.00]   Intel GenuineIntel
>> [0.00] e820: BIOS-provided physical RAM map:
>> [0.00] BIOS-e820: [mem 0x-0x0009fbff] usable
>> [0.00] BIOS-e820: [mem 0x0009fc00-0x0009] 
>> reserved
>> [0.00] BIOS-e820: [mem 0x000f-0x000f] 
>> reserved
>> [0.00] BIOS-e820: [mem 0x0010-0x3fffdfff] usable
>> [0.00] BIOS-e820: [mem 0x3fffe000-0x3fff] 
>> reserved
>> [0.00] BIOS-e820: [mem 0xfeffc000-0xfeff] 
>> reserved
>> [0.00] BIOS-e820: [mem 0xfffc-0x] 
>> reserved
>> [0.00] NX (Execute Disable) protection: active
>> [0.00] SMBIOS 2.4 present.
>> [0.00] DMI: Bochs Bochs, BIOS Bochs 01/01/2011
>> [0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
>> [0.00] e820: remove [mem 0x000a-0x000f] usable
>> [0.00] e820: last_pfn = 0x3fffe max_arch_pfn = 0x4
>> [0.00] MTRR default type: write-back
>> [0.00] MTRR fixed ranges enabled:
>> [0.00]   0-9 write-back
>> [0.00]   A-B uncachable
>> [0.00]   C-F write-protect
>> [0.00] MTRR variable ranges enabled:
>> [0.00]   0 base 008000 mask FF8000 uncachable
>> [0.00]   1 disabled
>> [0.00]   2 

Re: GPF in intel_pmu_lbr_reset() with qemu -cpu host

2014-03-21 Thread Venkatesh Srinivas
On Fri, Mar 21, 2014 at 10:46 AM, Peter Wu pe...@lekensteyn.nl wrote:
 cc'ing kvm people and list.

 On Friday 21 March 2014 18:42:40 Peter Wu wrote:
 Hi,

 While trying to run QEMU with `-enable-kvm -host cpu`, I get a GPF in
 intel_pmu_lbr_reset():

 [0.024000] general protection fault:  [#1]
 [0.024000] CPU: 0 PID: 1 Comm: swapper Not tainted 
 3.14.0-rc7-qemu-00059-g08edb33 #14
 [0.024000] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
 [0.024000] task: 88003e05 ti: 88003e054000 task.ti: 
 88003e054000
 [0.024000] RIP: 0010:[8101148a]  [8101148a] 
 intel_pmu_lbr_reset+0x2a/0x80
 [0.024000] RSP: :88003e055e78  EFLAGS: 0002
 [0.024000] RAX:  RBX: 0286 RCX: 
 0680
 [0.024000] RDX:  RSI:  RDI: 
 
 [0.024000] RBP: 81622120 R08: 88003ffee0e0 R09: 
 88003e00bf00
 [0.024000] R10:  R11: 0004 R12: 
 
 [0.024000] R13:  R14:  R15: 
 
 [0.024000] FS:  () GS:8161e000() 
 knlGS:
 [0.024000] CS:  0010 DS:  ES:  CR0: 80050033
 [0.024000] CR2: 8800019bb000 CR3: 01611000 CR4: 
 001407b0
 [0.024000] Stack:
 [0.024000]  8101308a 8100e3da 8165ba62 
 
 [0.024000]  8165b5bd   
 
 [0.024000]  81655dcd   
 
 [0.024000] Call Trace:
 [0.024000]  [8101308a] ? intel_pmu_cpu_starting+0xa/0x80
 [0.024000]  [8100e3da] ? x86_pmu_notifier+0x5a/0xc0
 [0.024000]  [8165ba62] ? init_hw_perf_events+0x4a5/0x4dd
 [0.024000]  [8165b5bd] ? check_bugs+0x42/0x42
 [0.024000]  [81655dcd] ? do_one_initcall+0x76/0xf9
 [0.024000]  [81276b70] ? rest_init+0x70/0x70
 [0.024000]  [81655ea7] ? kernel_init_freeable+0x57/0x177
 [0.024000]  [81276b70] ? rest_init+0x70/0x70
 [0.024000]  [81276b75] ? kernel_init+0x5/0xe0
 [0.024000]  [8128067a] ? ret_from_fork+0x7a/0xb0
 [0.024000]  [81276b70] ? rest_init+0x70/0x70
 [0.024000] Code: 00 8b 15 02 c4 63 00 85 d2 74 69 f6 05 af c3 63 00 3f 
 75 2d 85 d2 7e 5c 31 f6 31 c0 0f 1f 44 00 00 8b 0d d2 c3 63 00 89 c2 01 f1 
 0f 30 83 c6 01 3b 35 d3 c3 63 00 7c e9 f3 c3 0f 1f 80 00 00 00
 [0.024000] RIP  [8101148a] intel_pmu_lbr_reset+0x2a/0x80
 [0.024000]  RSP 88003e055e78
 [0.024000] ---[ end trace ecbd794f78441b2c ]---
 [0.024002] Kernel panic - not syncing: Attempted to kill init! 
 exitcode=0x000b



Re: [PATCH] virtio-blk: make the queue depth the max supportable by the hypervisor

2014-03-19 Thread Venkatesh Srinivas
> And I rewrote it substantially, mainly to take
> VIRTIO_RING_F_INDIRECT_DESC into account.
>
> As QEMU sets the vq size for PCI to 128, Venkatesh's patch wouldn't
> have made a change.  This version does (since QEMU also offers
> VIRTIO_RING_F_INDIRECT_DESC).

That the divide-by-2 produced the same queue depth as the prior
computation in QEMU was deliberate -- but raising it to 128 seems
pretty reasonable.

Signed-off-by: Venkatesh Srinivas 

-- vs;
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/



Re: [PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

2014-03-14 Thread Venkatesh Srinivas

On Fri, Mar 14, 2014 at 10:57:58AM -0600, David Ahern wrote:

On 3/14/14, 10:17 AM, Andi Kleen wrote:

The Intel ISR section for RDMSR seems to say: "Specifying a reserved
or unimplemented
MSR address in ECX will also cause a general protection exception".

From a guest's perspective, MSR_RAPL_POWER_UNIT is unimplemented; kvm matches
this behavior.


MSRs are model specific and defined per model number. If you report a model
number you're expected to implement the MSRs defined for that model number.

AFAIK Xen just reports 0 for unknown MSRs (and I'm surprised KVM doesn't too)

I would suggest to fix KVM.


I believe ignore_msrs parameter to kvm handles that.

David


Hi,

cc-ing the virtualization mailing list for more detail on the kvm
default for ignore_msrs (it defaults off).

1) Just returning 0 for unsupported MSRs is not workable -- 0 may be a
   meaningful value for an MSR. RDMSR/WRMSR already have a mechanism
   for out-of-band errors, #GP.

2) #GP has been KVM's default behavior for quite some time. Even if we
   believe changing KVM's default is appropriate, Linux w/ the RAPL PMU
   code enabled will fail to boot on existing KVM versions. W/ this
   change, Linux will boot on prior KVM versions.

Thanks,
-- vs;


Re: [PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

2014-03-14 Thread Venkatesh Srinivas
On Fri, Mar 14, 2014 at 6:56 AM, Andi Kleen  wrote:
> On Fri, Mar 14, 2014 at 09:44:29AM +0100, Peter Zijlstra wrote:
>> On Thu, Mar 13, 2014 at 12:36:26PM -0700, Venkatesh Srinivas wrote:
>> > CPUs which should support the RAPL counters according to
>> > Family/Model/Stepping may still issue #GP when attempting to access
>> > the RAPL MSRs. This may happen when Linux is running under KVM and
>> > we are passing-through host F/M/S data, for example. Use rdmsrl_safe
>> > to first access the RAPL_POWER_UNIT MSR; if this fails, do not
>> > attempt to use this PMU.
>>
>> This kvm thing is getting tedious :-(
>
> It sounds bogus to me too, VMs are supposed to return 0
> on reading unknown MSRs, no #GP.

The Intel ISR section for RDMSR seems to say: "Specifying a reserved
or unimplemented
MSR address in ECX will also cause a general protection exception".

From a guest's perspective, MSR_RAPL_POWER_UNIT is unimplemented; kvm matches
this behavior.

Thanks,
-- vs;




[PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

2014-03-13 Thread Venkatesh Srinivas
CPUs which should support the RAPL counters according to
Family/Model/Stepping may still issue #GP when attempting to access
the RAPL MSRs. This may happen when Linux is running under KVM and
we are passing-through host F/M/S data, for example. Use rdmsrl_safe
to first access the RAPL_POWER_UNIT MSR; if this fails, do not
attempt to use this PMU.

Signed-off-by: Venkatesh Srinivas 
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c 
b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
index 5ad35ad..95700e5 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -511,6 +511,7 @@ static int rapl_cpu_prepare(int cpu)
struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
int phys_id = topology_physical_package_id(cpu);
u64 ms;
+   u64 msr_rapl_power_unit_bits;
 
if (pmu)
return 0;
@@ -518,6 +519,9 @@ static int rapl_cpu_prepare(int cpu)
if (phys_id < 0)
return -1;
 
+   if (!rdmsrl_safe(MSR_RAPL_POWER_UNIT, &msr_rapl_power_unit_bits))
+   return -1;
+
pmu = kzalloc_node(sizeof(*pmu), GFP_KERNEL, cpu_to_node(cpu));
if (!pmu)
return -1;
@@ -531,8 +535,7 @@ static int rapl_cpu_prepare(int cpu)
 *
 * we cache in local PMU instance
 */
-   rdmsrl(MSR_RAPL_POWER_UNIT, pmu->hw_unit);
-   pmu->hw_unit = (pmu->hw_unit >> 8) & 0x1FULL;
+   pmu->hw_unit = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
	pmu->pmu = &rapl_pmu_class;
 
/*
@@ -649,7 +652,9 @@ static int __init rapl_pmu_init(void)
get_online_cpus();
 
for_each_online_cpu(cpu) {
-   rapl_cpu_prepare(cpu);
+   ret = rapl_cpu_prepare(cpu);
+   if (ret)
+   goto out;
rapl_cpu_init(cpu);
}
 
@@ -672,6 +677,7 @@ static int __init rapl_pmu_init(void)
hweight32(rapl_cntr_mask),
ktime_to_ms(pmu->timer_interval));
 
+out:
put_online_cpus();
 
return 0;
-- 
1.9.0.279.gdc9e3eb




[PATCH] perf/x86/intel: Use rdmsrl_safe when initializing RAPL PMU.

2014-03-06 Thread Venkatesh Srinivas
CPUs which should support the RAPL counters according to
Family/Model/Stepping may still issue #GP when attempting to access
the RAPL MSRs. This may happen when Linux is running under KVM and
we are passing-through host F/M/S data, for example. Use rdmsrl_safe
to first access the RAPL_POWER_UNIT MSR; if this fails, do not
attempt to use this PMU.

Signed-off-by: Venkatesh Srinivas 
---
 arch/x86/kernel/cpu/perf_event_intel_rapl.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_rapl.c 
b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
index 5ad35ad..95700e5 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_rapl.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_rapl.c
@@ -511,6 +511,7 @@ static int rapl_cpu_prepare(int cpu)
struct rapl_pmu *pmu = per_cpu(rapl_pmu, cpu);
int phys_id = topology_physical_package_id(cpu);
u64 ms;
+   u64 msr_rapl_power_unit_bits;
 
if (pmu)
return 0;
@@ -518,6 +519,9 @@ static int rapl_cpu_prepare(int cpu)
if (phys_id < 0)
return -1;
 
+   if (!rdmsrl_safe(MSR_RAPL_POWER_UNIT, &msr_rapl_power_unit_bits))
+   return -1;
+
pmu = kzalloc_node(sizeof(*pmu), GFP_KERNEL, cpu_to_node(cpu));
if (!pmu)
return -1;
@@ -531,8 +535,7 @@ static int rapl_cpu_prepare(int cpu)
 *
 * we cache in local PMU instance
 */
-   rdmsrl(MSR_RAPL_POWER_UNIT, pmu->hw_unit);
-   pmu->hw_unit = (pmu->hw_unit >> 8) & 0x1FULL;
+   pmu->hw_unit = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
	pmu->pmu = &rapl_pmu_class;
 
/*
@@ -649,7 +652,9 @@ static int __init rapl_pmu_init(void)
get_online_cpus();
 
for_each_online_cpu(cpu) {
-   rapl_cpu_prepare(cpu);
+   ret = rapl_cpu_prepare(cpu);
+   if (ret)
+   goto out;
rapl_cpu_init(cpu);
}
 
@@ -672,6 +677,7 @@ static int __init rapl_pmu_init(void)
hweight32(rapl_cntr_mask),
ktime_to_ms(pmu->timer_interval));
 
+out:
put_online_cpus();
 
return 0;
-- 
1.9.0.279.gdc9e3eb




[PATCH] vhost/scsi: Check LUN structure byte 0 is set to 1, per spec

2014-02-24 Thread Venkatesh Srinivas

The virtio spec requires byte 0 of the virtio-scsi LUN structure
to be '1'.

Signed-off-by: Venkatesh Srinivas 
---
 drivers/vhost/scsi.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/vhost/scsi.c b/drivers/vhost/scsi.c
index 0a025b8..e48d4a6 100644
--- a/drivers/vhost/scsi.c
+++ b/drivers/vhost/scsi.c
@@ -1001,6 +1001,12 @@ vhost_scsi_handle_vq(struct vhost_scsi *vs, struct 
vhost_virtqueue *vq)
break;
}
 
+		/* virtio-scsi spec requires byte 0 of the lun to be 1 */
+   if (unlikely(v_req.lun[0] != 1)) {
+   vhost_scsi_send_bad_target(vs, vq, head, out);
+   continue;
+   }
+
/* Extract the tpgt */
target = v_req.lun[1];
tpg = ACCESS_ONCE(vs_tpg[target]);
--
1.8.5.3



Re: [PATCH V6 4/5] virtio-scsi: introduce multiqueue support

2013-03-20 Thread Venkatesh Srinivas

On Wed, Mar 20, 2013 at 03:01:23PM +0800, Wanlong Gao wrote:

From: Paolo Bonzini 

This patch adds queue steering to virtio-scsi.  When a target is sent
multiple requests, we always drive them to the same queue so that FIFO
processing order is kept.  However, if a target was idle, we can choose
a queue arbitrarily.  In this case the queue is chosen according to the
current VCPU, so the driver expects the number of request queues to be
equal to the number of VCPUs.  This makes it easy and fast to select
the queue, and also lets the driver optimize the IRQ affinity for the
virtqueues (each virtqueue's affinity is set to the CPU that "owns"
the queue).

The speedup comes from improving cache locality and giving CPU affinity
to the virtqueues, which is why this scheme was selected.  Assuming that
the thread that is sending requests to the device is I/O-bound, it is
likely to be sleeping at the time the ISR is executed, and thus executing
the ISR on the same processor that sent the requests is cheap.

However, the kernel will not execute the ISR on the "best" processor
unless you explicitly set the affinity.  This is because in practice
you will have many such I/O-bound processes and thus many otherwise
idle processors.  Then the kernel will execute the ISR on a random
processor, rather than the one that is sending requests to the device.

The alternative to per-CPU virtqueues is per-target virtqueues.  To
achieve the same locality, we could dynamically choose the virtqueue's
affinity based on the CPU of the last task that sent a request.  This
is less appealing because we do not set the affinity directly---we only
provide a hint to the irqbalanced running in userspace.  Dynamically
changing the affinity only works if the userspace applies the hint
fast enough.


Looks good! Tested as V5.

Tested-by: Venkatesh Srinivas 

-- vs;



Re: [PATCH V5 4/5] virtio-scsi: introduce multiqueue support

2013-03-19 Thread Venkatesh Srinivas

This looks pretty good!

I rather like the (lack of) locking in I/O completion (around the req
count vs. target/queue binding). It is unfortunate that you need to 
hold the per-target lock in virtscsi_pick_vq() though; have any idea
how much that lock hurts?

Just two minor comments:

(in struct virtio_scsi_target_data):
+   /* This spinlock never help at the same time as vq_lock. */
    held?

(in struct virtio_scsi):
+   /* Does the affinity hint is set for virtqueues? */
Could you rephrase that, please?

Tested on qemu and w/ Google Compute Engine's virtio-scsi device.

Reviewed-and-tested-by: Venkatesh Srinivas 

Thanks,
-- vs;

