date:20170128

Re: "isert: isert_setup_id: rdma_bind_addr() failed: -19" spam, followed by Recursive Fault on reboot

2017-01-28 Thread Sagi Grimberg




Hello


Hey Steve,


I'm trying (failing) to get iSER working. After rebooting with some settings
saved in targetcli, I got an endless stream of messages like this:

[  192.701299] isert: isert_setup_id: rdma_bind_addr() failed: -19
[  192.702733] isert: isert_setup_id: rdma_bind_addr() failed: -19
[  192.704021] isert: isert_setup_id: rdma_bind_addr() failed: -19
[  192.705458] isert: isert_setup_id: rdma_bind_addr() failed: -19
[  192.706979] isert: isert_setup_id: rdma_bind_addr() failed: -19


You get -ENODEV errors because you don't have an RDMA device.
This is probably due to the fact that the mlx5_ib (or mlx4_ib, depending
on your device) is not loaded.

Can you try loading mlx[4|5]_ib module before you enable iser on
a network portal?

I do see that mlx5 and mlx4 are requesting the mlx_ib module at probe
time, I wonder how that didn't happen on your system.

I didn't see an endless loop of this error. Can you share your
targetcli json?
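
For reference, -19 is -ENODEV: kernel functions report failure as a
negative errno value. A minimal user-space sketch of that mapping (the
helper names are made up for illustration and are not part of isert):

```c
#include <errno.h>
#include <string.h>

/* Kernel-style return codes are negative errno values; on Linux,
 * ENODEV is 19, so the -19 in the log is -ENODEV. */
static int is_no_device(int ret)
{
	return ret == -ENODEV;
}

/* Human-readable text for a negative kernel-style return code.
 * Hypothetical helper: it just negates the code and asks libc. */
static const char *kernel_err_text(int ret)
{
	return strerror(-ret);
}
```

Passing -19 to is_no_device() returns true on Linux, which matches the
"no RDMA device" reading of the log above.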


I tried deleting everything from targetcli, but the flood would not stop. The
ib_isert module did not unload. When rebooting I got a "Recursive Fault"
with a stacktrace inside configfs.

I hope this is enough information to fix this bug. I assumed the stacktrace
would be saved to the log so I didn't write it down, and I haven't been able to
retrace all the wrong stuff I did trying to make iSER work.

Linux Version: Linux 4.8.15-2~bpo8+2 (Debian 8 Backports)


Would it be possible to try with upstream kernel and report what you
are seeing?

Re: [PATCH] Staging: omap4iss: fix coding style issues

2017-01-28 Thread Ozgur Karatas



28.01.2017, 20:11, "Avraham Shukron" :
> This is a patch that fixes issues in omap4iss/iss_video.c
> Specifically, it fixes "line over 80 characters" issues

Hello,

Have you sent this patch before?
Greg KH already answered you; have you read his reply?

Please send a change only once; there is no need to repeat it.

> Signed-off-by: Avraham Shukron
>
> ---
>  drivers/staging/media/omap4iss/iss_video.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/staging/media/omap4iss/iss_video.c b/drivers/staging/media/omap4iss/iss_video.c
> index c16927a..cdab053 100644
> --- a/drivers/staging/media/omap4iss/iss_video.c
> +++ b/drivers/staging/media/omap4iss/iss_video.c
> @@ -298,7 +298,8 @@ iss_video_check_format(struct iss_video *video, struct iss_video_fh *vfh)
>
>  static int iss_video_queue_setup(struct vb2_queue *vq,
>   unsigned int *count, unsigned int *num_planes,
> - unsigned int sizes[], struct device *alloc_devs[])
> + unsigned int sizes[],
> + struct device *alloc_devs[])

It should stay on the same line; the maintainers allow up to 80 characters.
Does this "alloc_devs" variable start with int?

Example:

struct device {
  int (struct device *alloc_devs[);

Check the lines at the top of the code.


>  {
>  struct iss_video_fh *vfh = vb2_get_drv_priv(vq);
>  struct iss_video *video = vfh->video;
> @@ -678,8 +679,8 @@ iss_video_get_selection(struct file *file, void *fh, struct v4l2_selection *sel)
>  if (subdev == NULL)
>  return -EINVAL;
>
> - /* Try the get selection operation first and fallback to get format if not
> - * implemented.
> + /* Try the get selection operation first and fallback to get format if
> + * not implemented.
>   */

There is no change here; the comment still opens with /* and closes with */.
Please read the submitting-patches document.

Regards,

>  sdsel.pad = pad;
>  ret = v4l2_subdev_call(subdev, pad, get_selection, NULL, &sdsel);
> --
> 2.7.4

~Ozgur

[RFC PATCH] scsi, block: fix duplicate bdi name registration crashes

2017-01-28 Thread Dan Williams

Warnings of the following form occur because scsi reuses a devt number
while the block layer still has it referenced as the name of the bdi
[1]:

 WARNING: CPU: 1 PID: 93 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x62/0x80
 sysfs: cannot create duplicate filename '/devices/virtual/bdi/8:192'
 [..]
 Call Trace:
  dump_stack+0x86/0xc3
  __warn+0xcb/0xf0
  warn_slowpath_fmt+0x5f/0x80
  ? kernfs_path_from_node+0x4f/0x60
  sysfs_warn_dup+0x62/0x80
  sysfs_create_dir_ns+0x77/0x90
  kobject_add_internal+0xb2/0x350
  kobject_add+0x75/0xd0
  device_add+0x15a/0x650
  device_create_groups_vargs+0xe0/0xf0
  device_create_vargs+0x1c/0x20
  bdi_register+0x90/0x240
  ? lockdep_init_map+0x57/0x200
  bdi_register_owner+0x36/0x60
  device_add_disk+0x1bb/0x4e0
  ? __pm_runtime_use_autosuspend+0x5c/0x70
  sd_probe_async+0x10d/0x1c0
  async_run_entry_fn+0x39/0x170

This is a brute-force fix to pass the devt release information from
sd_probe() to the locations where we register the bdi,
device_add_disk(), and unregister the bdi, blk_cleanup_queue().

Thanks to Omar for the quick reproducer script [2]. This patch survives
where an unmodified kernel fails in a few seconds.
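
The idea can be sketched in user space as follows. This is only an
illustration under simplifying assumptions: a plain integer stands in
for struct kref, a small array stands in for the sd index ida, and all
names (devt_ref, index_alloc, ...) are hypothetical. The point is that
the index is handed back to the allocator only when the last holder
drops its reference, so it cannot be reused while the bdi is still
registered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_INDEX 8

/* Hypothetical stand-in for the sd index allocator (an ida in the kernel). */
static bool index_in_use[MAX_INDEX];

struct devt_ref {
	int refcount;                        /* stands in for struct kref */
	int idx;                             /* index backing the devt */
	void (*release)(struct devt_ref *);  /* called on the final put */
};

/* Return the lowest free index, or -1 if none is left. */
static int index_alloc(void)
{
	for (int i = 0; i < MAX_INDEX; i++) {
		if (!index_in_use[i]) {
			index_in_use[i] = true;
			return i;
		}
	}
	return -1;
}

/* Release callback: only here does the index become reusable. */
static void index_release(struct devt_ref *ref)
{
	index_in_use[ref->idx] = false;
}

/* Start with one reference, held by the creator (sd in the patch). */
static void devt_init(struct devt_ref *ref, int idx)
{
	ref->refcount = 1;
	ref->idx = idx;
	ref->release = index_release;
}

/* Take an extra reference, e.g. when the bdi is registered. */
static void devt_get(struct devt_ref *ref)
{
	assert(ref->refcount > 0);
	ref->refcount++;
}

/* Drop a reference; the last put triggers the release callback. */
static void devt_put(struct devt_ref *ref)
{
	assert(ref->refcount > 0);
	if (--ref->refcount == 0)
		ref->release(ref);
}
```

Taking a second reference for the bdi and then dropping the creator's
reference leaves the index allocated; it is freed only after the final
devt_put().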

[1]: https://marc.info/?l=linux-scsi&m=147116857810716&w=4
[2]: http://marc.info/?l=linux-block&m=148554717109098&w=2

Cc: James Bottomley 
Cc: Bart Van Assche 
Cc: "Martin K. Petersen" 
Cc: Christoph Hellwig 
Cc: Jens Axboe 
Reported-by: Omar Sandoval 
Signed-off-by: Dan Williams 
---
 block/blk-core.c   |1 +
 block/genhd.c  |7 +++
 drivers/scsi/sd.c  |   41 +
 include/linux/blkdev.h |1 +
 include/linux/genhd.h  |   17 +
 5 files changed, 59 insertions(+), 8 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 61ba08c58b64..950cea1e202e 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -597,6 +597,7 @@ void blk_cleanup_queue(struct request_queue *q)
 	spin_unlock_irq(lock);
 
 	bdi_unregister(&q->backing_dev_info);
+	put_disk_devt(q->disk_devt);
 
 	/* @q is and will stay empty, shutdown and put */
 	blk_put_queue(q);
diff --git a/block/genhd.c b/block/genhd.c
index fcd6d4fae657..eb8009e928f5 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -612,6 +612,13 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
 
 	disk_alloc_events(disk);
 
+	/*
+	 * Take a reference on the devt and assign it to queue since it
+	 * must not be reallocated while the bdi is registered
+	 */
+	disk->queue->disk_devt = disk->disk_devt;
+	get_disk_devt(disk->disk_devt);
+
 	/* Register BDI before referencing it from bdev */
 	bdi = &disk->queue->backing_dev_info;
 	bdi_register_owner(bdi, disk_to_dev(disk));
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 0b09638fa39b..09405351577c 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3067,6 +3067,23 @@ static void sd_probe_async(void *data, async_cookie_t cookie)
 	put_device(&sdkp->dev);
 }
 
+struct sd_devt {
+	int idx;
+	struct disk_devt disk_devt;
+};
+
+void sd_devt_release(struct kref *kref)
+{
+	struct sd_devt *sd_devt = container_of(kref, struct sd_devt,
+			disk_devt.kref);
+
+	spin_lock(&sd_index_lock);
+	ida_remove(&sd_index_ida, sd_devt->idx);
+	spin_unlock(&sd_index_lock);
+
+	kfree(sd_devt);
+}
+
 /**
  * sd_probe - called during driver initialization and whenever a
  * new scsi device is attached to the system. It is called once
@@ -3088,6 +3105,7 @@ static void sd_probe_async(void *data, async_cookie_t cookie)
 static int sd_probe(struct device *dev)
 {
 	struct scsi_device *sdp = to_scsi_device(dev);
+	struct sd_devt *sd_devt;
 	struct scsi_disk *sdkp;
 	struct gendisk *gd;
 	int index;
@@ -3113,9 +3131,13 @@ static int sd_probe(struct device *dev)
 	if (!sdkp)
 		goto out;
 
+	sd_devt = kzalloc(sizeof(*sd_devt), GFP_KERNEL);
+	if (!sd_devt)
+		goto out_free;
+
 	gd = alloc_disk(SD_MINORS);
 	if (!gd)
-		goto out_free;
+		goto out_free_devt;
 
 	do {
 		if (!ida_pre_get(&sd_index_ida, GFP_KERNEL))
@@ -3131,6 +3153,11 @@ static int sd_probe(struct device *dev)
 		goto out_put;
 	}
 
+	kref_init(&sd_devt->disk_devt.kref);
+	sd_devt->disk_devt.release = sd_devt_release;
+	sd_devt->idx = index;
+	gd->disk_devt = &sd_devt->disk_devt;
+
 	error = sd_format_disk_name("sd", index, gd->disk_name, DISK_NAME_LEN);
 	if (error) {
 		sdev_printk(KERN_WARNING, sdp, "SCSI disk (sd) name length exceeded.\n");
@@ -3170,13 +3197,14 @@ static int sd_probe(struct device *dev)
 	return 0;
 
  out_free_index:
-	spin_lock(&sd_index_lock);
-	ida_remove(&sd_index_ida, index);
-	spin_unlock(&sd_index_lock);
+	put_disk_devt(&sd_devt->disk_devt);
+	sd_devt = NULL;

[PATCH 40/60] staging: ptlrpc: leaked rs on difficult reply

2017-01-28 Thread James Simmons

From: Niu Yawei 

reply_out_callback() should call ptlrpc_schedule_difficult_reply()
to finalize the rs if it is no longer on the uncommitted list; otherwise,
the rs and the export held by the rs could be leaked:

- target_send_reply() sends a difficult reply before the transaction
  is committed, and the reply is linked to scp_rep_active;

- the export gets disconnected by umount or for whatever reason, and
  server_disconnect_export() is called to complete all outstanding
  replies, which calls into ptlrpc_handle_rs() to dispose of
  the rs, so the rs is removed from the uncommitted list and
  LNetMDUnlink() is called to unlink the reply buffer and generate
  an unlink event;

- reply_out_callback() is called to process the above unlink event, and
  ptlrpc_schedule_difficult_reply() is supposed to be called to
  finally dispose of the rs. However, it could be skipped because of
  the following flawed code snippet:

  if (!rs->rs_no_ack ||
  rs->rs_transno <= rs->rs_export->exp_obd->obd_last_committed)
ptlrpc_schedule_difficult_reply(rs);

The intention of the above code is: if rs_no_ack is true (COS enabled)
and the transaction is not committed, we should rely on the commit
callback to release the rs. However, it overlooked the case where the rs
has already been removed from the uncommitted list by a disconnecting
export.
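
The difference between the old and the new condition can be sketched as
two small predicates. This is a user-space illustration only: the
function and parameter names are hypothetical, and plain values stand in
for the fields of the rs and its export.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old condition: schedule only if an ACK is expected or the
 * transaction is already committed. */
static bool should_schedule_old(bool no_ack, uint64_t transno,
				uint64_t last_committed)
{
	return !no_ack || transno <= last_committed;
}

/* New condition: additionally schedule when the rs is no longer on
 * the uncommitted list (list_empty() in the patch), because then no
 * commit callback will ever release it. */
static bool should_schedule_new(bool no_ack, uint64_t transno,
				uint64_t last_committed,
				bool on_uncommitted_list)
{
	return !no_ack || transno <= last_committed ||
	       !on_uncommitted_list;
}
```

With no_ack set, an uncommitted transaction and the rs already removed
from the list, the old predicate returns false (the rs leaks) while the
new one returns true.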

Signed-off-by: Niu Yawei 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7903
Reviewed-on: http://review.whamcloud.com/22696
Reviewed-by: Andreas Dilger 
Reviewed-by: Lai Siyao 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/ptlrpc/events.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/events.c b/drivers/staging/lustre/lustre/ptlrpc/events.c
index ae1650d..dc0fe9d 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/events.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/events.c
@@ -420,7 +420,8 @@ void reply_out_callback(lnet_event_t *ev)
 		rs->rs_on_net = 0;
 		if (!rs->rs_no_ack ||
 		    rs->rs_transno <=
-		    rs->rs_export->exp_obd->obd_last_committed)
+		    rs->rs_export->exp_obd->obd_last_committed ||
+		    list_empty(&rs->rs_obd_list))
 			ptlrpc_schedule_difficult_reply(rs);
 
 		spin_unlock(&rs->rs_lock);
-- 
1.8.3.1

[tip:WIP.x86/boot 11/55] arch/x86/include/asm/xen/page.h:302:7: warning: 'struct device' declared inside parameter list

2017-01-28 Thread kbuild test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/boot
head:   0c6fc11ac343c82d4a2f8348fa6f829e07c12554
commit: 5520b7e7d2d20ae2ab6e07b46c42cd43df9d2799 [11/55] x86/boot/e820: Remove spurious asm/e820/api.h inclusions
config: x86_64-randconfig-n0-01291212 (attached as .config)
compiler: gcc-4.8 (Debian 4.8.4-1) 4.8.4
reproduce:
git checkout 5520b7e7d2d20ae2ab6e07b46c42cd43df9d2799
# save the attached .config to linux build tree
make ARCH=x86_64 

All warnings (new ones prefixed by >>):

   In file included from include/xen/page.h:28:0,
                    from arch/x86/xen/grant-table.c:43:
>> arch/x86/include/asm/xen/page.h:302:7: warning: 'struct device' declared inside parameter list [enabled by default]
          dma_addr_t dev_addr)
          ^
>> arch/x86/include/asm/xen/page.h:302:7: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
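
This warning generally means that no declaration of 'struct device' is
visible at the point where the prototype is parsed, most likely because
the header that used to provide it indirectly is no longer pulled in. A
forward declaration before the prototype is one common remedy; including
the defining header is another. A minimal sketch, assuming stand-in
typedefs for phys_addr_t and dma_addr_t (in the kernel these come from
linux/types.h):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Forward declaration: gives 'struct device' file scope before it is
 * used in a parameter list, which avoids the warning. */
struct device;

/* Stand-ins for the kernel types, for illustration only. */
typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;

static inline bool xen_arch_need_swiotlb(struct device *dev,
					 phys_addr_t phys,
					 dma_addr_t dev_addr)
{
	/* Parameters are unused in this stub, as in the original. */
	(void)dev;
	(void)phys;
	(void)dev_addr;
	return false;
}
```

Only a pointer to the struct is passed, so the incomplete type is enough
here and the full definition is not required.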

vim +302 arch/x86/include/asm/xen/page.h

20e71f2ed include/asm-x86/xen/page.h      Isaku Yamahata        2008-04-02  286  #define __pmd_ma(x) ((pmd_t) { (x) } )
20e71f2ed include/asm-x86/xen/page.h      Isaku Yamahata        2008-04-02  287  
20e71f2ed include/asm-x86/xen/page.h      Isaku Yamahata        2008-04-02  288  #define pgd_val_ma(x)  ((x).pgd)
20e71f2ed include/asm-x86/xen/page.h      Isaku Yamahata        2008-04-02  289  
eba3ff8b9 arch/x86/include/asm/xen/page.h Jeremy Fitzhardinge   2009-02-09  290  void xen_set_domain_pte(pte_t *ptep, pte_t pteval, unsigned domid);
20e71f2ed include/asm-x86/xen/page.h      Isaku Yamahata        2008-04-02  291  
ce803e705 include/asm-x86/xen/page.h      Jeremy Fitzhardinge   2008-07-08  292  xmaddr_t arbitrary_virt_to_machine(void *address);
9976b39b5 arch/x86/include/asm/xen/page.h Jeremy Fitzhardinge   2009-02-27  293  unsigned long arbitrary_virt_to_mfn(void *vaddr);
20e71f2ed include/asm-x86/xen/page.h      Isaku Yamahata        2008-04-02  294  void make_lowmem_page_readonly(void *vaddr);
20e71f2ed include/asm-x86/xen/page.h      Isaku Yamahata        2008-04-02  295  void make_lowmem_page_readwrite(void *vaddr);
20e71f2ed include/asm-x86/xen/page.h      Isaku Yamahata        2008-04-02  296  
3216dceb3 arch/x86/include/asm/xen/page.h Stefano Stabellini    2013-02-19  297  #define xen_remap(cookie, size) ioremap((cookie), (size));
efaf30a33 arch/x86/include/asm/xen/page.h Konrad Rzeszutek Wilk 2014-01-06  298  #define xen_unmap(cookie) iounmap((cookie))
3216dceb3 arch/x86/include/asm/xen/page.h Stefano Stabellini    2013-02-19  299  
a4dba1308 arch/x86/include/asm/xen/page.h Stefano Stabellini    2014-11-21  300  static inline bool xen_arch_need_swiotlb(struct device *dev,
291be10fd arch/x86/include/asm/xen/page.h Julien Grall          2015-09-09  301  phys_addr_t phys,
291be10fd arch/x86/include/asm/xen/page.h Julien Grall          2015-09-09 @302  dma_addr_t dev_addr)
a4dba1308 arch/x86/include/asm/xen/page.h Stefano Stabellini    2014-11-21  303  {
a4dba1308 arch/x86/include/asm/xen/page.h Stefano Stabellini    2014-11-21  304  return false;
a4dba1308 arch/x86/include/asm/xen/page.h Stefano Stabellini    2014-11-21  305  }
a4dba1308 arch/x86/include/asm/xen/page.h Stefano Stabellini    2014-11-21  306  
8746515d7 arch/x86/include/asm/xen/page.h Stefano Stabellini    2015-04-24  307  static inline unsigned long xen_get_swiotlb_free_pages(unsigned int order)
8746515d7 arch/x86/include/asm/xen/page.h Stefano Stabellini    2015-04-24  308  {
8746515d7 arch/x86/include/asm/xen/page.h Stefano Stabellini    2015-04-24  309  return __get_free_pages(__GFP_NOWARN, order);
8746515d7 arch/x86/include/asm/xen/page.h Stefano Stabellini    2015-04-24  310  }

:: The code at line 302 was first introduced by commit
:: 291be10fd7511101d44cf98166d049bd31bc7600 xen/swiotlb: Pass addresses rather than frame numbers to xen_arch_need_swiotlb

:: TO: Julien Grall 
:: CC: David Vrabel 

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation



Re: [PATCH] f2fs: enhance lookup xattr

2017-01-28 Thread Jaegeuk Kim

Hi Chao,

On 01/24, Chao Yu wrote:

...

>  
> - error = read_all_xattrs(inode, ipage, &base_addr);
> + error = lookup_all_xattrs(inode, ipage, index, len, name,
> + &entry, &base_addr);
>   if (error)
>   return error;
>  
> - entry = __find_xattr(base_addr, index, len, name);
> - if (IS_XATTR_LAST_ENTRY(entry)) {
> - error = -ENODATA;
> - goto cleanup;
> - }
> -
> - size = le16_to_cpu(entry->e_value_size);
> + size = __le16_to_cpu(entry->e_value_size);

Looks good to me, except __le16_to_cpu() here.
Do we need to use this instead of le16_to_cpu()?
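
For context: as far as I can tell, include/linux/byteorder/generic.h
defines le16_to_cpu as an alias for __le16_to_cpu, so both yield the
same value and the plain name is the usual spelling in filesystem code.
What either one computes can be sketched in user space like this (the
helper name le16_decode is hypothetical, not a kernel function):

```c
#include <stdint.h>

/* Decode a 16-bit little-endian value from its two stored bytes.
 * Works the same on little- and big-endian hosts because it never
 * reinterprets memory, it only combines the bytes arithmetically. */
static uint16_t le16_decode(const uint8_t bytes[2])
{
	return (uint16_t)(bytes[0] | ((uint16_t)bytes[1] << 8));
}
```

The low byte comes first in storage, so the bytes 0x34 0x12 decode to
0x1234.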

Thanks,

>  
>   if (buffer && size > buffer_size) {
>   error = -ERANGE;
> - goto cleanup;
> + goto out;
>   }
>  
> + pval = entry->e_name + entry->e_name_len;
> +
>   if (buffer) {
>   char *pval = entry->e_name + entry->e_name_len;
>   memcpy(buffer, pval, size);
>   }
>   error = size;
> -
> -cleanup:
> +out:
>   kzfree(base_addr);
>   return error;
>  }
> diff --git a/fs/f2fs/xattr.h b/fs/f2fs/xattr.h
> index f990de20cdcd..d5a94928c116 100644
> --- a/fs/f2fs/xattr.h
> +++ b/fs/f2fs/xattr.h
> @@ -72,9 +72,10 @@ struct f2fs_xattr_entry {
>   for (entry = XATTR_FIRST_ENTRY(addr);\
>   !IS_XATTR_LAST_ENTRY(entry);\
>   entry = XATTR_NEXT_ENTRY(entry))
> -
> -#define MIN_OFFSET(i)XATTR_ALIGN(inline_xattr_size(i) + PAGE_SIZE -  
> \
> - sizeof(struct node_footer) - sizeof(__u32))
> +#define MAX_XATTR_BLOCK_SIZE (PAGE_SIZE - sizeof(struct node_footer))
> +#define VALID_XATTR_BLOCK_SIZE   (MAX_XATTR_BLOCK_SIZE - sizeof(__u32))
> +#define MIN_OFFSET(i)XATTR_ALIGN(inline_xattr_size(i) +  
> \
> + VALID_XATTR_BLOCK_SIZE)
>  
>  #define MAX_VALUE_LEN(i) (MIN_OFFSET(i) -\
>   sizeof(struct f2fs_xattr_header) -  \
> -- 
> 2.8.2.295.g3f1c1d0

[PATCH] mips: audit and remove any unnecessary uses of module.h

2017-01-28 Thread Paul Gortmaker

Historically a lot of these existed because we did not have
a distinction between what was modular code and what was providing
support to modules via EXPORT_SYMBOL and friends.  That changed
when we forked out support for the latter into the export.h file.

This means we should be able to reduce the usage of module.h
in code that is obj-y Makefile or bool Kconfig.  In the case of
some code where it is modular, we can extend that to also include
files that are building basic support functionality but not related
to loading or registering the final module; such files also have
no need whatsoever for module.h

The advantage in removing such instances is that module.h itself
sources about 15 other headers; adding significantly to what we feed
cpp, and it can obscure what headers we are effectively using.

Since module.h might have been the implicit source for init.h
(for __init) and for export.h (for EXPORT_SYMBOL) we consider each
instance for the presence of either and replace/add as needed.

Also note that MODULE_DEVICE_TABLE is a no-op for non-modular code.

Build coverage of all the mips defconfigs revealed the module.h
header was masking a couple of implicit include instances, so
we add the appropriate headers there.

Cc: David Daney 
Cc: John Crispin 
Cc: Ralf Baechle 
Cc: "Steven J. Hill" 
Cc: linux-m...@linux-mips.org
Signed-off-by: Paul Gortmaker 
---

[I had this split along platform lines into 17 separate commits, but
 when I realized there were only 3 platform specific maintainers called
 out, I thought that was probably overkill.  If the split version is
 desired, let me know.

 Also, this is against v4.10-rc5.  The same patch against linux-next
 will have a trivial conflict in xway dma.c due to a spinlock.h
 addition there. ]

 arch/mips/alchemy/common/dbdma.c                       | 2 +-
 arch/mips/alchemy/common/dma.c                         | 2 +-
 arch/mips/alchemy/common/gpiolib.c                     | 1 -
 arch/mips/alchemy/common/prom.c                        | 1 -
 arch/mips/alchemy/common/usb.c                         | 2 +-
 arch/mips/alchemy/common/vss.c                         | 2 +-
 arch/mips/alchemy/devboards/bcsr.c                     | 3 ++-
 arch/mips/ar7/clock.c                                  | 2 +-
 arch/mips/ar7/gpio.c                                   | 3 ++-
 arch/mips/ar7/memory.c                                 | 1 -
 arch/mips/ar7/platform.c                               | 1 -
 arch/mips/ar7/prom.c                                   | 2 +-
 arch/mips/ath79/clock.c                                | 1 -
 arch/mips/ath79/common.c                               | 2 +-
 arch/mips/bcm63xx/clk.c                                | 3 ++-
 arch/mips/bcm63xx/cpu.c                                | 2 +-
 arch/mips/bcm63xx/cs.c                                 | 3 ++-
 arch/mips/bcm63xx/gpio.c                               | 2 +-
 arch/mips/bcm63xx/irq.c                                | 1 -
 arch/mips/bcm63xx/reset.c                              | 3 ++-
 arch/mips/bcm63xx/timer.c                              | 3 ++-
 arch/mips/cavium-octeon/crypto/octeon-crypto.c         | 2 +-
 arch/mips/cavium-octeon/executive/cvmx-bootmem.c       | 2 +-
 arch/mips/cavium-octeon/executive/cvmx-helper-errata.c | 2 +-
 arch/mips/cavium-octeon/executive/cvmx-sysinfo.c       | 2 +-
 arch/mips/cavium-octeon/smp.c                          | 3 ++-
 arch/mips/dec/prom/identify.c                          | 2 +-
 arch/mips/dec/setup.c                                  | 2 +-
 arch/mips/dec/wbflush.c                                | 4 +---
 arch/mips/jazz/jazzdma.c                               | 2 +-
 arch/mips/jz4740/gpio.c                                | 2 +-
 arch/mips/jz4740/prom.c                                | 1 -
 arch/mips/jz4740/timer.c                               | 3 ++-
 arch/mips/lantiq/xway/dma.c                            | 3 +--
 arch/mips/lantiq/xway/gptu.c                           | 3 +--
 arch/mips/lasat/at93c.c                                | 1 -
 arch/mips/lasat/sysctl.c                               | 1 -
 arch/mips/loongson64/common/cs5536/cs5536_mfgpt.c      | 2 +-
 arch/mips/loongson64/common/env.c                      | 2 +-
 arch/mips/loongson64/common/setup.c                    | 3 ++-
 arch/mips/loongson64/common/uart_base.c                | 2 +-
 arch/mips/loongson64/lemote-2f/ec_kb3310b.c            | 3 ++-
 arch/mips/loongson64/lemote-2f/irq.c                   | 3 ++-
 arch/mips/loongson64/lemote-2f/pm.c                    | 2 +-
 arch/mips/loongson64/loongson-3/irq.c                  | 2 +-
 arch/mips/loongson64/loongson-3/numa.c                 | 2 +-
 arch/mips/mti-malta/malta-platform.c                   | 1 -
 arch/mips/pmcs-msp71xx/msp_prom.c                      | 2 +-
 arch/mips/pmcs-msp71xx/msp_time.c                      | 1 -
 arch/mips/ralink/clk.c                                 | 3 ++-
 arch/mips/ralink/mt7620.c                              | 1 -

Re: [RFC PATCH 0/4] Fast noirq bulk page allocator v2r7

2017-01-28 Thread Andy Lutomirski


On 01/09/2017 08:35 AM, Mel Gorman wrote:

> The
> fourth patch introduces a bulk page allocator with no in-kernel users as
> an example for Jesper and others who want to build a page allocator for
> DMA-coherent pages.


If you want an in-kernel user as a test, to validate the API's sanity, 
and to improve performance, how about __vmalloc_area_node()?  :)


--Andy

Re: [PATCH 0/2] blackfin: Remove dead DSA code

2017-01-28 Thread Florian Fainelli

On 01/01/2017 02:42 PM, Florian Fainelli wrote:
> Hi all,
> 
> This patch series removes dead DSA code in the blackfin board specific
> code. There is no in tree driver for the KSZ8893M, and clearly this
> would not compile anymore.
> 
> Preparatory patch to help remove the legacy DSA platform device code from
> the tree.

Ping, is anyone maintaining blackfin these days?

> 
> Florian Fainelli (2):
>   blackfin: tcm-bf518: Remove dsa.h inclusion
>   blackfin: ezbrd: Remove non-functional DSA/KSZ8893M code
> 
>  arch/blackfin/mach-bf518/boards/ezbrd.c     | 47 -
>  arch/blackfin/mach-bf518/boards/tcm-bf518.c |  1 -
>  2 files changed, 48 deletions(-)
> 

-- 
Florian

Re: [PATCH] ARM: dts: imx53-qsb-common: fix FEC pinmux config

2017-01-28 Thread Shawn Guo

On Wed, Jan 25, 2017 at 06:25:48AM +0100, linux-kernel-...@beckhoff.com wrote:
> From: Patrick Bruenn 
> 
> The pinmux configuration in device tree was different from manual
> muxing in /board/freescale/mx53loco/mx53loco.c
> All pins were configured as NO_PAD_CTL(1 << 31), which was fine as the
> bootloader already did the correct pinmuxing for us.
> But recently u-boot is migrating to reuse device tree files from the
> kernel tree, so it seems to be better to have the correct pinmuxing in
> our files, too.
> 
> Signed-off-by: Patrick Bruenn 

Applied, thanks.

[PATCH 47/60] staging: lustre: mdc: avoid returning freed request

2017-01-28 Thread James Simmons

From: "John L. Hammond" 

In mdc_close() if ptlrpc_request_pack() fails then set req to NULL so
that an already freed request is not returned in *request.

Signed-off-by: John L. Hammond 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8811
Reviewed-on: https://review.whamcloud.com/23843
Reviewed-by: Patrick Farrell 
Reviewed-by: Andreas Dilger 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/mdc/mdc_request.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/lustre/lustre/mdc/mdc_request.c b/drivers/staging/lustre/lustre/mdc/mdc_request.c
index 02f57d8..a12035d 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_request.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_request.c
@@ -762,6 +762,7 @@ static int mdc_close(struct obd_export *exp, struct md_op_data *op_data,
 	rc = ptlrpc_request_pack(req, LUSTRE_MDS_VERSION, MDS_CLOSE);
 	if (rc) {
 		ptlrpc_request_free(req);
+		req = NULL;
 		goto out;
 	}
 
-- 
1.8.3.1
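
The pattern this patch fixes can be sketched outside the kernel. Below is a
minimal userspace illustration. The names struct request, pack_request() and
do_close() are made up for the example and stand in for the Lustre types and
calls; this is not the mdc_close() code, only the shape of the bug and of the
fix. The error path frees the request, and the shared exit path hands
whatever is left in req back to the caller.

```c
#include <stdlib.h>

/* Hypothetical stand-in for struct ptlrpc_request. */
struct request {
	int opcode;
};

/* Hypothetical stand-in for ptlrpc_request_pack(); fails on demand. */
static int pack_request(struct request *req, int fail)
{
	if (fail)
		return -1;
	req->opcode = 1;
	return 0;
}

/*
 * Allocates a request and returns it through *request.
 * On a pack failure the request is freed, so req has to be reset to NULL
 * before jumping to the shared exit path. Without that reset the caller
 * would receive a pointer to freed memory.
 */
static int do_close(struct request **request, int fail)
{
	struct request *req;
	int rc;

	req = malloc(sizeof(*req));
	if (!req) {
		*request = NULL;
		return -1;
	}

	rc = pack_request(req, fail);
	if (rc) {
		free(req);
		req = NULL;	/* the one-line fix */
		goto out;
	}

out:
	*request = req;
	return rc;
}
```

With the reset in place, a caller that checks *request after a failure sees
NULL and has nothing to release.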

[PATCH 46/60] staging: lustre: mdc: Make IT_OPEN take lookup bits lock

2017-01-28 Thread James Simmons

From: Patrick Farrell 

An earlier commit accidentally changed handling of IT_OPEN,
making it take the MDS_INODELOCK_UPDATE bits lock instead of
MDS_INODELOCK_LOOKUP. This does not cause any known bugs.

Signed-off-by: Patrick Farrell 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8842
Reviewed-on: https://review.whamcloud.com/23797
Fixes: 70a251f68dea ("staging: lustre: obd: decruft md_enqueue() and md_intent_lock()")
Reviewed-by: John L. Hammond 
Reviewed-by: Lai Siyao 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/mdc/mdc_locks.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/mdc/mdc_locks.c b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
index 156add7..91a7243 100644
--- a/drivers/staging/lustre/lustre/mdc/mdc_locks.c
+++ b/drivers/staging/lustre/lustre/mdc/mdc_locks.c
@@ -721,7 +721,7 @@ int mdc_enqueue(struct obd_export *exp, struct ldlm_enqueue_info *einfo,
 		LASSERT(!policy);
 
 		saved_flags |= LDLM_FL_HAS_INTENT;
-		if (it->it_op & (IT_OPEN | IT_UNLINK | IT_GETATTR | IT_READDIR))
+		if (it->it_op & (IT_UNLINK | IT_GETATTR | IT_READDIR))
 			policy = &update_policy;
 		else if (it->it_op & IT_LAYOUT)
 			policy = &layout_policy;
-- 
1.8.3.1
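
The effect of dropping IT_OPEN from the first mask can be shown with a small
standalone sketch. The flag values, the lock_bits enum and the function
select_lock_bits() are hypothetical and exist only for this example. The
sketch also assumes, as the commit message implies, that an opcode not
matched by an earlier branch ends up with the lookup lock.

```c
/* Illustrative flag values; the real IT_* constants live in the Lustre headers. */
enum {
	IT_OPEN    = 1 << 0,
	IT_UNLINK  = 1 << 1,
	IT_GETATTR = 1 << 2,
	IT_READDIR = 1 << 3,
	IT_LAYOUT  = 1 << 4,
};

/* Hypothetical identifiers named after the MDS_INODELOCK_* bits. */
enum lock_bits { LOCK_LOOKUP, LOCK_UPDATE, LOCK_LAYOUT };

/*
 * Pick the lock bits for an intent opcode. With IT_OPEN removed from the
 * first mask, an open intent no longer matches the update branch and
 * takes the assumed fall-through to the lookup lock.
 */
static enum lock_bits select_lock_bits(int it_op)
{
	if (it_op & (IT_UNLINK | IT_GETATTR | IT_READDIR))
		return LOCK_UPDATE;
	else if (it_op & IT_LAYOUT)
		return LOCK_LAYOUT;
	return LOCK_LOOKUP;
}
```

Putting IT_OPEN back into the first mask would make select_lock_bits(IT_OPEN)
return LOCK_UPDATE again, which is the accidental behaviour the patch
describes.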

[tip:WIP.x86/boot 51/55] arch/x86/kernel/e820.c:120:10-11: WARNING: return of 0/1 in function 'e820__mapped_all' with return type bool

2017-01-28 Thread kbuild test robot

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git WIP.x86/boot
head:   0c6fc11ac343c82d4a2f8348fa6f829e07c12554
commit: 81b3e090fa1f237d49c8feb2fa4afe2aabd3a4ff [51/55] x86/boot/e820: Use bool in query APIs


coccinelle warnings: (new ones prefixed by >>)

>> arch/x86/kernel/e820.c:120:10-11: WARNING: return of 0/1 in function 'e820__mapped_all' with return type bool
>> arch/x86/kernel/e820.c:82:9-10: WARNING: return of 0/1 in function 'e820__mapped_any' with return type bool
--
>> arch/x86/pci/mmconfig-shared.c:462:9-10: WARNING: return of 0/1 in function 'is_mmconf_reserved' with return type bool
>> arch/x86/pci/mmconfig-shared.c:502:10-11: WARNING: return of 0/1 in function 'pci_mmcfg_check_reserved' with return type bool

vim +/e820__mapped_all +120 arch/x86/kernel/e820.c

e5540f875 Ingo Molnar 2017-01-28   76  		struct e820_entry *entry = &e820_table->entries[i];
b79cd8f12 Yinghai Lu  2008-05-11   77  
e5540f875 Ingo Molnar 2017-01-28   78  		if (type && entry->type != type)
b79cd8f12 Yinghai Lu  2008-05-11   79  			continue;
e5540f875 Ingo Molnar 2017-01-28   80  		if (entry->addr >= end || entry->addr + entry->size <= start)
b79cd8f12 Yinghai Lu  2008-05-11   81  			continue;
b79cd8f12 Yinghai Lu  2008-05-11  @82  		return 1;
b79cd8f12 Yinghai Lu  2008-05-11   83  	}
b79cd8f12 Yinghai Lu  2008-05-11   84  	return 0;
b79cd8f12 Yinghai Lu  2008-05-11   85  }
3bce64f01 Ingo Molnar 2017-01-28   86  EXPORT_SYMBOL_GPL(e820__mapped_any);
b79cd8f12 Yinghai Lu  2008-05-11   87  
b79cd8f12 Yinghai Lu  2008-05-11   88  /*
640e1b38b Ingo Molnar 2017-01-28   89   * This function checks if the entire <start,end> range is mapped with 'type'.
b79cd8f12 Yinghai Lu  2008-05-11   90   *
640e1b38b Ingo Molnar 2017-01-28   91   * Note: this function only works correctly once the E820 table is sorted and
640e1b38b Ingo Molnar 2017-01-28   92   * not-overlapping (at least for the range specified), which is the case normally.
b79cd8f12 Yinghai Lu  2008-05-11   93   */
81b3e090f Ingo Molnar 2017-01-28   94  bool __init e820__mapped_all(u64 start, u64 end, enum e820_type type)
b79cd8f12 Yinghai Lu  2008-05-11   95  {
b79cd8f12 Yinghai Lu  2008-05-11   96  	int i;
b79cd8f12 Yinghai Lu  2008-05-11   97  
bf495573f Ingo Molnar 2017-01-27   98  	for (i = 0; i < e820_table->nr_entries; i++) {
e5540f875 Ingo Molnar 2017-01-28   99  		struct e820_entry *entry = &e820_table->entries[i];
b79cd8f12 Yinghai Lu  2008-05-11  100  
e5540f875 Ingo Molnar 2017-01-28  101  		if (type && entry->type != type)
b79cd8f12 Yinghai Lu  2008-05-11  102  			continue;
640e1b38b Ingo Molnar 2017-01-28  103  
640e1b38b Ingo Molnar 2017-01-28  104  		/* Is the region (part) in overlap with the current region? */
e5540f875 Ingo Molnar 2017-01-28  105  		if (entry->addr >= end || entry->addr + entry->size <= start)
b79cd8f12 Yinghai Lu  2008-05-11  106  			continue;
b79cd8f12 Yinghai Lu  2008-05-11  107  
640e1b38b Ingo Molnar 2017-01-28  108  		/*
640e1b38b Ingo Molnar 2017-01-28  109  		 * If the region is at the beginning of <start,end> we move
640e1b38b Ingo Molnar 2017-01-28  110  		 * 'start' to the end of the region since it's ok until there
b79cd8f12 Yinghai Lu  2008-05-11  111  		 */
e5540f875 Ingo Molnar 2017-01-28  112  		if (entry->addr <= start)
e5540f875 Ingo Molnar 2017-01-28  113  			start = entry->addr + entry->size;
640e1b38b Ingo Molnar 2017-01-28  114  
b79cd8f12 Yinghai Lu  2008-05-11  115  		/*
640e1b38b Ingo Molnar 2017-01-28  116  		 * If 'start' is now at or beyond 'end', we're done, full
640e1b38b Ingo Molnar 2017-01-28  117  		 * coverage of the desired range exists:
b79cd8f12 Yinghai Lu  2008-05-11  118  		 */
b79cd8f12 Yinghai Lu  2008-05-11  119  		if (start >= end)
b79cd8f12 Yinghai Lu  2008-05-11 @120  			return 1;
b79cd8f12 Yinghai Lu  2008-05-11  121  	}
b79cd8f12 Yinghai Lu  2008-05-11  122  	return 0;
b79cd8f12 Yinghai Lu  2008-05-11  123  }

:: The code at line 120 was first introduced by commit
:: b79cd8f1268bab57ff85b19d131f7f23deab2dee x86: make e820.c to have common functions

:: TO: Yinghai Lu 
:: CC: Thomas Gleixner 

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
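
The warning is about style rather than behaviour: a function declared bool
is expected to return true or false instead of the literals 1 and 0. A
minimal userspace sketch of the coverage check with bool returns is shown
below. struct range and mapped_all() are simplified stand-ins invented for
this example (the 'type' filter is left out), not the kernel definitions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct e820_entry; the type field is omitted. */
struct range {
	uint64_t addr;
	uint64_t size;
};

/*
 * Returns true if [start, end) is fully covered by the entries.
 * Like the kernel function, this assumes the table is sorted by address
 * and non-overlapping.
 */
static bool mapped_all(const struct range *tbl, size_t n,
		       uint64_t start, uint64_t end)
{
	size_t i;

	for (i = 0; i < n; i++) {
		const struct range *e = &tbl[i];

		/* Skip entries that do not overlap the remaining range. */
		if (e->addr >= end || e->addr + e->size <= start)
			continue;

		/* Entry covers the front of the range: advance 'start'. */
		if (e->addr <= start)
			start = e->addr + e->size;

		/* Nothing left uncovered. */
		if (start >= end)
			return true;	/* rather than "return 1" */
	}
	return false;			/* rather than "return 0" */
}
```

The generated code is the same either way; spelling the returns as true and
false simply matches the declared return type.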

RE: [PATCH v2 0/2] scsi: storvsc: Add support for FC lightweight host.

2017-01-28 Thread KY Srinivasan



> -----Original Message-----
> From: Christoph Hellwig [mailto:h...@infradead.org]
> Sent: Thursday, January 26, 2017 6:52 AM
> To: Cathy Avery 
> Cc: KY Srinivasan ; h...@infradead.org; Haiyang Zhang
> ; j...@linux.vnet.ibm.com;
> martin.peter...@oracle.com; dan.carpen...@oracle.com;
> de...@linuxdriverproject.org; linux-kernel@vger.kernel.org; linux-
> s...@vger.kernel.org; f...@redhat.com
> Subject: Re: [PATCH v2 0/2] scsi: storvsc: Add support for FC lightweight host.
> 
> On Thu, Jan 26, 2017 at 08:38:58AM -0500, Cathy Avery wrote:
> > Included in the current storvsc driver for Hyper-V is the ability
> > to access luns on an FC fabric via a virtualized fiber channel
> > adapter exposed by the Hyper-V host. This was done to provide an
> > interface for existing customer tools that was more consistent with
> > a conventional FC device. The driver attaches to the FC transport
> > to allow host and port names to be published under
> > /sys/class/fc_host/hostX.
> >
> > A problem arose when attaching to the FC transport. The scsi_scan code
> > attempts to call fc_user_scan which has basically become a no-op
> > due to the virtualized nature of the FC host
> > ( missing rports, vports, etc ). At this point you cannot refresh
> > the scsi bus after mapping or unmapping luns on the SAN without
> > a reboot.
> 
> I don't think a device without rports or vports is a FC device, plain and
> simple.  So as far as I'm concerned we should remove the code from storvsc
> that pretends to be FC, and not add it to virtio to start with.
> 
> And again I think lightweight is a very confusing name -
> what exactly is light or heavy?   It's really fake or dummy
> in the current version.

Windows has chosen this model for virtualizing FC devices to the guest -
without rports (or vports). As I noted in my earlier email, James came up with
this notion of a lightweight template almost a year ago. We can certainly pick
a more appropriate name and include better documentation.
> 
> >
> > 2) Removes an original workaround dealing with replacing
> > the eh_timed_out function. Patch 1 will not set the
> > scsi_transport_template.eh_timed_out function directly during
> > lightweight fc_attach_transport(). It instead relies on
> > whatever was indicated as the scsi_host_template timeout handler
> > during scsi_times_out() scsi_error.c. So the workaround is
> > no longer necessary.
> 
> Can you send a patch that gets rid of the transport class timeout handler
> entirely?  I think it's simply the wrong layering we have here - the
> driver needs to be in control of timeouts, and if it wants it can
> optionally call into library code in the transport class.

We will address this concern.
> 
> 
> FYI, all the long-term relevant explanation need to go into the patches
> themselves (patch description or code comments), not in the cover
> letter.

We will address this.

Regards,

K. Y

[PATCH 38/60] staging: lustre: llite: Adding timed wait in ll_umount_begin

2017-01-28 Thread James Simmons

From: Rahul Deshmukh 

A timing race exists between umount and any other thread that
increments the reference count on mnt, e.g. getattr. If the
umount thread loses the race, umount fails with an EBUSY
error. To avoid this, a timed wait is added so that the umount
thread waits for the user to decrement the mnt reference count.

Signed-off-by: Rahul Deshmukh 
Signed-off-by: Lokesh Nagappa Jaliminche 
Signed-off-by: Jian Yu 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-1882
Seagate-bug-id: MRP-1192
Reviewed-on: http://review.whamcloud.com/20061
Reviewed-by: Andreas Dilger 
Reviewed-by: Lai Siyao 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/llite/llite_internal.h |  1 +
 drivers/staging/lustre/lustre/llite/llite_lib.c  | 12 ++--
 2 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 501957c..ecdfd0c 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -506,6 +506,7 @@ struct ll_sb_info {
 */
/* root squash */
struct root_squash_info   ll_squash;
+   struct path  ll_mnt;
 
__kernel_fsid_t   ll_fsid;
struct kobject   ll_kobj; /* sysfs object */
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 0a87058..b229cbc 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -304,6 +304,7 @@ static int client_common_fill_super(struct super_block *sb, char *md, char *dt,
sb->s_magic = LL_SUPER_MAGIC;
sb->s_maxbytes = MAX_LFS_FILESIZE;
sbi->ll_namelen = osfs->os_namelen;
+   sbi->ll_mnt.mnt = current->fs->root.mnt;
 
if ((sbi->ll_flags & LL_SBI_USER_XATTR) &&
!(data->ocd_connect_flags & OBD_CONNECT_XATTR)) {
@@ -1990,6 +1991,8 @@ void ll_umount_begin(struct super_block *sb)
struct ll_sb_info *sbi = ll_s2sbi(sb);
struct obd_device *obd;
struct obd_ioctl_data *ioc_data;
+   wait_queue_head_t waitq;
+   struct l_wait_info lwi;
 
CDEBUG(D_VFSTRACE, "VFS Op: superblock %p count %d active %d\n", sb,
   sb->s_count, atomic_read(&sb->s_active));
@@ -2022,9 +2025,14 @@ void ll_umount_begin(struct super_block *sb)
}
 
/* Really, we'd like to wait until there are no requests outstanding,
-* and then continue.  For now, we just invalidate the requests,
-* schedule() and sleep one second if needed, and hope.
+* and then continue. For now, we just periodically checking for vfs
+* to decrement mnt_cnt and hope to finish it within 10sec.
 */
+   init_waitqueue_head(&waitq);
+   lwi = LWI_TIMEOUT_INTERVAL(cfs_time_seconds(10),
+  cfs_time_seconds(1), NULL, NULL);
+   l_wait_event(waitq, may_umount(sbi->ll_mnt.mnt), &lwi);
+
schedule();
 }
 
-- 
1.8.3.1
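
The timed wait this patch adds (re-check a condition once per interval, give up after a total timeout) can be pictured with a small userspace sketch. This is only an illustration of the pattern, not the Lustre implementation: wait_for_condition(), cond_fn and refs_dropped() are hypothetical names, and nanosleep() stands in for the kernel wait queue that l_wait_event() with LWI_TIMEOUT_INTERVAL uses in the patch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical predicate type; plays the role of may_umount() in the patch. */
typedef bool (*cond_fn)(void *arg);

/*
 * Re-check cond(arg) every interval_ms until it holds or timeout_ms has
 * been waited. Returns true if the condition held, false on timeout.
 */
static bool wait_for_condition(cond_fn cond, void *arg,
                               long timeout_ms, long interval_ms)
{
    long waited = 0;

    assert(interval_ms > 0);

    for (;;) {
        struct timespec ts;

        if (cond(arg))
            return true;
        if (waited >= timeout_ms)
            return false;

        ts.tv_sec = interval_ms / 1000;
        ts.tv_nsec = (interval_ms % 1000) * 1000000L;
        nanosleep(&ts, NULL);
        waited += interval_ms;
    }
}

/* Toy condition: each check simulates another thread dropping one reference. */
static bool refs_dropped(void *arg)
{
    int *refs = arg;

    if (*refs > 0)
        (*refs)--;
    return *refs == 0;
}
```

With a 10000 ms timeout and a 1000 ms interval this mirrors the 10 second / 1 second values passed to LWI_TIMEOUT_INTERVAL in the hunk above; the caller still has to decide what to do when the wait times out.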

[PATCH 32/50] x86/boot/e820: Create coherent API function names for E820 range operations

2017-01-28 Thread Ingo Molnar

We have these three related functions:

 extern void e820_add_region(u64 start, u64 size, int type);
 extern u64  e820_update_range(u64 start, u64 size, unsigned old_type, unsigned new_type);
 extern u64  e820_remove_range(u64 start, u64 size, unsigned old_type, int checktype);

But it's not clear from the naming that they are 3 operations based around the
same 'memory range' concept. Rename them to better signal this, and move
the prototypes next to each other:

 extern void e820__range_add   (u64 start, u64 size, int type);
 extern u64  e820__range_update(u64 start, u64 size, unsigned old_type, unsigned new_type);
 extern u64  e820__range_remove(u64 start, u64 size, unsigned old_type, int checktype);

Note that this improved organization of the functions shows another problem that
was easy to miss before: sometimes the E820 entry type is 'int', sometimes
'unsigned int' - but this will be fixed in a separate patch.

No change in functionality.

Cc: Alex Thorlton 
Cc: Andy Lutomirski 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Dan Williams 
Cc: Denys Vlasenko 
Cc: H. Peter Anvin 
Cc: Huang, Ying 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Linus Torvalds 
Cc: Paul Jackson 
Cc: Peter Zijlstra 
Cc: Rafael J. Wysocki 
Cc: Tejun Heo 
Cc: Thomas Gleixner 
Cc: Wei Yang 
Cc: Yinghai Lu 
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar 
---
 arch/x86/include/asm/e820/api.h    |  8 +---
 arch/x86/kernel/acpi/boot.c        |  2 +-
 arch/x86/kernel/aperture_64.c      |  2 +-
 arch/x86/kernel/cpu/mtrr/cleanup.c |  2 +-
 arch/x86/kernel/e820.c             | 48
 arch/x86/kernel/early-quirks.c     |  2 +-
 arch/x86/kernel/setup.c            | 10 +-
 arch/x86/lguest/boot.c             |  2 +-
 arch/x86/platform/efi/efi.c        |  2 +-
 arch/x86/xen/setup.c               |  6 +++---
 drivers/acpi/tables.c              |  2 +-
 11 files changed, 44 insertions(+), 42 deletions(-)

diff --git a/arch/x86/include/asm/e820/api.h b/arch/x86/include/asm/e820/api.h
index d4374ba26472..1c3615825115 100644
--- a/arch/x86/include/asm/e820/api.h
+++ b/arch/x86/include/asm/e820/api.h
@@ -10,11 +10,13 @@ extern unsigned long pci_mem_start;
 
 extern int  e820__mapped_any(u64 start, u64 end, unsigned type);
 extern int  e820__mapped_all(u64 start, u64 end, unsigned type);
-extern void e820_add_region(u64 start, u64 size, int type);
+
+extern void e820__range_add   (u64 start, u64 size, int type);
+extern u64  e820__range_update(u64 start, u64 size, unsigned old_type, unsigned new_type);
+extern u64  e820__range_remove(u64 start, u64 size, unsigned old_type, int checktype);
+
 extern void e820_print_map(char *who);
 extern int  e820__update_table(struct e820_entry *biosmap, int max_nr_map, u32 *pnr_map);
-extern u64  e820_update_range(u64 start, u64 size, unsigned old_type, unsigned new_type);
-extern u64  e820_remove_range(u64 start, u64 size, unsigned old_type, int checktype);
 extern void e820__update_table_print(void);
 extern void e820__setup_pci_gap(void);
 extern void e820__memory_setup_extended(u64 phys_addr, u32 data_len);
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 8590f4891760..31b350c6a3b1 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -1715,6 +1715,6 @@ int __acpi_release_global_lock(unsigned int *lock)
 
 void __init arch_reserve_mem_area(acpi_physical_address addr, size_t size)
 {
-   e820_add_region(addr, size, E820_ACPI);
+   e820__range_add(addr, size, E820_ACPI);
e820__update_table_print();
 }
diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
index d027858a306e..883485684435 100644
--- a/arch/x86/kernel/aperture_64.c
+++ b/arch/x86/kernel/aperture_64.c
@@ -311,7 +311,7 @@ void __init early_gart_iommu_check(void)
/* reserve it, so we can reuse it in second kernel */
pr_info("e820: reserve [mem %#010Lx-%#010Lx] for GART\n",
aper_base, aper_base + aper_size - 1);
-   e820_add_region(aper_base, aper_size, E820_RESERVED);
+   e820__range_add(aper_base, aper_size, E820_RESERVED);
e820__update_table_print();
}
}
diff --git a/arch/x86/kernel/cpu/mtrr/cleanup.c b/arch/x86/kernel/cpu/mtrr/cleanup.c
index e201401a7ced..244aaa988ecd 100644
--- a/arch/x86/kernel/cpu/mtrr/cleanup.c
+++ b/arch/x86/kernel/cpu/mtrr/cleanup.c
@@ -860,7 +860,7 @@ real_trim_memory(unsigned long start_pfn, unsigned long limit_pfn)
trim_size <<= PAGE_SHIFT;
trim_size -= trim_start;
 
-   return e820_update_range(trim_start, trim_size, E820_RAM, E820_RESERVED);
+   return e820__range_update(trim_start, trim_size, E820_RAM, E820_RESERVED);
 }
 
 /**
diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c
index cb25c7248656..f91901ab9263 100644
--- a/arch/x86/kernel/e820.c
+++

[PATCH 42/50] xen, x86/boot/e820: Simplify Xen's xen_e820_table construct

2017-01-28 Thread Ingo Molnar

The Xen guest memory setup code has:

static struct e820_entry xen_e820_table[E820_MAX_ENTRIES] __initdata;
static u32 xen_e820_table_entries __initdata;

... which is really a 'struct e820_table', open-coded.

Convert the Xen code over to use a single struct e820_table, as this
will allow the simplification of the e820__update_table() API.

No intended change in functionality, but not runtime tested.
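
As a rough userspace illustration of the conversion described above, the sketch below bundles the entry array and its count into one object and iterates it the same way the converted Xen loops do. All toy_* names and the TOY_* constants are stand-ins invented for this example; the real struct e820_entry and struct e820_table are defined in the kernel's e820 headers and may differ in layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for the kernel definitions (names are hypothetical). */
#define TOY_E820_MAX_ENTRIES 8

enum toy_e820_type {
    TOY_E820_TYPE_RAM = 1,
    TOY_E820_TYPE_RESERVED = 2
};

struct toy_e820_entry {
    uint64_t addr;
    uint64_t size;
    enum toy_e820_type type;
};

/* The array and its fill count travel together in a single object. */
struct toy_e820_table {
    uint32_t nr_entries;
    struct toy_e820_entry entries[TOY_E820_MAX_ENTRIES];
};

/* Sum the sizes of all RAM entries in the table. */
static uint64_t toy_total_ram(const struct toy_e820_table *table)
{
    const struct toy_e820_entry *entry = table->entries;
    uint64_t total = 0;
    uint32_t i;

    assert(table->nr_entries <= TOY_E820_MAX_ENTRIES);

    for (i = 0; i < table->nr_entries; i++, entry++) {
        if (entry->type == TOY_E820_TYPE_RAM)
            total += entry->size;
    }
    return total;
}
```

Passing one pointer instead of an array plus a separately tracked count is what lets a helper take a single table argument, which is the kind of API simplification the commit message has in mind for e820__update_table().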

Cc: Alex Thorlton 
Cc: Andy Lutomirski 
Cc: Borislav Petkov 
Cc: Brian Gerst 
Cc: Dan Williams 
Cc: Denys Vlasenko 
Cc: H. Peter Anvin 
Cc: Huang, Ying 
Cc: Josh Poimboeuf 
Cc: Juergen Gross 
Cc: Linus Torvalds 
Cc: Paul Jackson 
Cc: Peter Zijlstra 
Cc: Rafael J. Wysocki 
Cc: Tejun Heo 
Cc: Thomas Gleixner 
Cc: Wei Yang 
Cc: Yinghai Lu 
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar 
---
 arch/x86/xen/setup.c | 70 +-
 1 file changed, 33 insertions(+), 37 deletions(-)

diff --git a/arch/x86/xen/setup.c b/arch/x86/xen/setup.c
index 381a0d3577a7..cf29abfc392c 100644
--- a/arch/x86/xen/setup.c
+++ b/arch/x86/xen/setup.c
@@ -41,8 +41,7 @@ struct xen_memory_region xen_extra_mem[XEN_EXTRA_MEM_MAX_REGIONS] __initdata;
 unsigned long xen_released_pages;
 
 /* E820 map used during setting up memory. */
-static struct e820_entry xen_e820_table[E820_MAX_ENTRIES] __initdata;
-static u32 xen_e820_table_entries __initdata;
+static struct e820_table xen_e820_table __initdata;
 
 /*
  * Buffer used to remap identity mapped pages. We only need the virtual space.
@@ -198,11 +197,11 @@ void __init xen_inv_extra_mem(void)
  */
 static unsigned long __init xen_find_pfn_range(unsigned long *min_pfn)
 {
-   const struct e820_entry *entry = xen_e820_table;
+   const struct e820_entry *entry = xen_e820_table.entries;
unsigned int i;
unsigned long done = 0;
 
-   for (i = 0; i < xen_e820_table_entries; i++, entry++) {
+   for (i = 0; i < xen_e820_table.nr_entries; i++, entry++) {
unsigned long s_pfn;
unsigned long e_pfn;
 
@@ -457,7 +456,7 @@ static unsigned long __init xen_foreach_remap_area(unsigned long nr_pages,
 {
phys_addr_t start = 0;
unsigned long ret_val = 0;
-   const struct e820_entry *entry = xen_e820_table;
+   const struct e820_entry *entry = xen_e820_table.entries;
int i;
 
/*
@@ -471,9 +470,9 @@ static unsigned long __init xen_foreach_remap_area(unsigned long nr_pages,
 * example) the DMI tables in a reserved region that begins on
 * a non-page boundary.
 */
-   for (i = 0; i < xen_e820_table_entries; i++, entry++) {
+   for (i = 0; i < xen_e820_table.nr_entries; i++, entry++) {
phys_addr_t end = entry->addr + entry->size;
-   if (entry->type == E820_TYPE_RAM || i == xen_e820_table_entries - 1) {
+   if (entry->type == E820_TYPE_RAM || i == xen_e820_table.nr_entries - 1) {
unsigned long start_pfn = PFN_DOWN(start);
unsigned long end_pfn = PFN_UP(end);
 
@@ -601,10 +600,10 @@ static void __init xen_align_and_add_e820_region(phys_addr_t start,
 
 static void __init xen_ignore_unusable(void)
 {
-   struct e820_entry *entry = xen_e820_table;
+   struct e820_entry *entry = xen_e820_table.entries;
unsigned int i;
 
-   for (i = 0; i < xen_e820_table_entries; i++, entry++) {
+   for (i = 0; i < xen_e820_table.nr_entries; i++, entry++) {
if (entry->type == E820_TYPE_UNUSABLE)
entry->type = E820_TYPE_RAM;
}
@@ -620,9 +619,9 @@ bool __init xen_is_e820_reserved(phys_addr_t start, phys_addr_t size)
return false;
 
end = start + size;
-   entry = xen_e820_table;
+   entry = xen_e820_table.entries;
 
-   for (mapcnt = 0; mapcnt < xen_e820_table_entries; mapcnt++) {
+   for (mapcnt = 0; mapcnt < xen_e820_table.nr_entries; mapcnt++) {
if (entry->type == E820_TYPE_RAM && entry->addr <= start &&
(entry->addr + entry->size) >= end)
return false;
@@ -645,9 +644,9 @@ phys_addr_t __init xen_find_free_area(phys_addr_t size)
 {
unsigned mapcnt;
phys_addr_t addr, start;
-   struct e820_entry *entry = xen_e820_table;
+   struct e820_entry *entry = xen_e820_table.entries;
 
-   for (mapcnt = 0; mapcnt < xen_e820_table_entries; mapcnt++, entry++) {
+   for (mapcnt = 0; mapcnt < xen_e820_table.nr_entries; mapcnt++, entry++) {
if (entry->type != E820_TYPE_RAM || entry->size < size)
continue;
start = entry->addr;
@@ -750,8 +749,8 @@ char * __init xen_memory_setup(void)
max_pfn = min(max_pfn, xen_start_info->nr_pages);
mem_end = PFN_PHYS(max_pfn);
 
-   memmap.nr_entries = ARRAY_SIZE(xen_e820_table);
-   set_xen_guest_handle(memmap.buffer,

Re: [PATCH 0/6] ncr5380: Miscellaneous minor patches

2017-01-28 Thread Finn Thain


On Sat, 28 Jan 2017, Ondrej Zary wrote:

> On Monday 16 January 2017 00:50:57 Finn Thain wrote:
> > This series removes some unused code and related comments, addresses 
> > the warnings generated by 'make W=1' and 'make C=1' and fixes a 
> > theoretical bug in the bus reset method in atari_scsi.
> >
> > There's also a patch to add a missing error check during target 
> > selection. The only target I tested was a QUANTUM DAYTONA514S disk as 
> > that's all I have access to right now. Some testing with other targets 
> > would be prudent.
> >
> > Michael, Ondrej, can I get you to review/test please?
> 
> Tested on HP C2502 (53C400A chip), Canon FG2-5202 (53C400 chip), 
> DTC-3181L (DTCT-436P chip) and MS-PNR (53C400A chip) ISA cards - 
> everything works fine!
> 
> Targets tested:
> QUANTUM  LP240S GM240S01X
> IBM  DORS-32160
> IBM  0663L12
> 
> Thanks.
> 
> Tested-by: Ondrej Zary 
> 

Very helpful. Thank you, Ondrej.

--

Re: [PATCH v5 1/8] PCI: Recognize Thunderbolt devices

2017-01-28 Thread Lukas Wunner

On Sat, Jan 28, 2017 at 03:52:08PM -0600, Bjorn Helgaas wrote:
> On Sun, Jan 15, 2017 at 09:03:45PM +0100, Lukas Wunner wrote:
> > We're about to allow runtime PM on Thunderbolt ports in
> > pci_bridge_d3_possible() and unblock runtime PM for Thunderbolt host
> > hotplug ports in pci_dev_check_d3cold().  In both cases we need to
> > uniquely identify if a PCI device belongs to a Thunderbolt controller.
> 
> Sounds like "a device belongs to a Thunderbolt controller" means the
> device is part of a Thunderbolt controller or part of the hierarchy
> below it?

The above paragraph and the following two in the commit message are
intended to explain the need for this additional bit in struct pci_dev.

Yes, the bit is set on all PCI devices that are part of the Thunderbolt
controller (upstream bridge, downstream bridges, NHI and on Thunderbolt 3
there's also an XHCI) as well as on all PCI devices below it.

That's why it says /* part of Thunderbolt daisy chain */ in the line
added to pci.h at the bottom of this patch.
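
One way to picture "part of the controller or anywhere below it" is a walk up the bridge hierarchy. The sketch below is a userspace toy under stated assumptions, not the code from this patch: struct toy_pci_dev, its is_tbt_controller field and toy_on_thunderbolt_chain() are hypothetical names, with is_tbt_controller standing for "this device carries the Thunderbolt vendor-specific capability".

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct pci_dev: only the fields this sketch needs. */
struct toy_pci_dev {
    struct toy_pci_dev *parent;  /* upstream bridge, NULL at the root */
    bool is_tbt_controller;      /* hypothetical: device has the Thunderbolt VSEC */
};

/* True if dev is part of a Thunderbolt controller or sits below one. */
static bool toy_on_thunderbolt_chain(const struct toy_pci_dev *dev)
{
    for (; dev != NULL; dev = dev->parent) {
        if (dev->is_tbt_controller)
            return true;
    }
    return false;
}
```

A GPU hanging off a Thunderbolt downstream bridge would be reported as on the chain, while a device attached directly to the root complex would not.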

> > We also have the need to detect presence of a Thunderbolt controller in
> > drivers/platform/x86/apple-gmux.c because dual GPU MacBook Pros cannot
> > switch external DP/HDMI ports between GPUs if they have Thunderbolt.
> 
> This series doesn't touch apple-gmux.c, and I don't know anything
> about this MacBook Pro topology, so I can't tell why Thunderbolt is
> relevant here.

It's just another example why this bit in struct pci_dev is needed:

Dual GPU MacBook Pros introduced before 2011 are able to switch the
external DisplayPort between GPUs.  All newer models lost this ability
and the external port can only be driven by the discrete GPU.  That's
because the port is no longer just used for DisplayPort, it's become a
combined DP/Thunderbolt port.  I guess the wiring would have been too
complicated to keep the external port switchable between GPUs and also
use it for Thunderbolt.  They already had to go to great lengths and
put various redrivers on the logic board to support the combined
DP/thunderbolt port.

We need to recognize if the model has Thunderbolt and in that case keep
the external port switched to the discrete GPU.  I have a patch for this
in the pipeline but this one needs to go in first.

> > Furthermore, in multiple places in the DRM subsystem we need to detect
> > whether a GPU is on-board or attached with Thunderbolt.  As an example,
> > Thunderbolt-attached GPUs shall not be registered with vga_switcheroo.
> 
> Why?  The connection between vga_switcheroo and Thunderbolt is not
> obvious, at least to this non-GPU person.

nouveau, radeon and amdgpu register any GPU they find with vga_switcheroo,
but vga_switcheroo only becomes enabled if the system has Optimus, AMD
PowerXpress or an apple-gmux controller.

If the user connects an external GPU to a dual GPU laptop, that external
GPU will be registered with vga_switcheroo as well.  When that external
GPU runtime suspends, vga_switcheroo will invoke the callback to cut
power to the internal discrete GPU and obviously things go south at that
point.

The solution is to not register external GPUs with vga_switcheroo at all.
For this I need the is_thunderbolt bit.

[snip]
> > --- a/drivers/pci/probe.c
> > +++ b/drivers/pci/probe.c
> > @@ -1206,6 +1206,37 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev)
> > pdev->is_hotplug_bridge = 1;
> >  }
> >  
> > +static void set_pcie_vendor_specific(struct pci_dev *dev)
> 
> This is very specific to Thunderbolt, so let's name it something that
> conveys that information.  The fact that we use a vendor-specific
> capability to figure it out isn't really relevant in the caller.

I thought that we may have the necessity in the future to parse other
VSECs on device probe, so I gave the function this generic name.

Think about it, every VSEC that needs to be parsed needs the while loop
below.  It's more efficient to have only a single while loop that handles
*all* VSECs at once.

If someone needs to parse another VSEC, they just add it to this function.
So IMO the way I've solved it is preferable to just adding a Thunderbolt-
specific function.

Are you sure you want this renamed? (y/n)

> > +{
> > +   int vsec = 0;
> > +   u32 header;
> > +
> > +   while ((vsec = pci_find_next_ext_capability(dev, vsec,
> > +   PCI_EXT_CAP_ID_VNDR))) {
> > +   pci_read_config_dword(dev, vsec + PCI_VNDR_HEADER, );
> > +
> > +   /* Is the device part of a Thunderbolt controller? */
> > +   if (dev->vendor == PCI_VENDOR_ID_INTEL &&
> > +   PCI_VNDR_HEADER_ID(header) == PCI_VSEC_ID_INTEL_TBT)
> > +   dev->is_thunderbolt = 1;
> 
>   return;

Well, see above.  I don't want to return here to allow parsing other VSECs.

Thanks,

Lukas

> > +   }
> > +
> > +   /*
> > +* Is the device attached with Thunderbolt?  Walk upwards and check for
> > +* each encountered bridge if it's part of

[PATCH 44/60] staging: lustre: libcfs: fix error messages

2017-01-28 Thread James Simmons

From: Dmitry Eremin 

Don't treat the inability to set CPU partition affinity as an error.
Improve the related warning messages.

Signed-off-by: Dmitry Eremin 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8703
Reviewed-on: https://review.whamcloud.com/23307
Reviewed-by: Patrick Farrell 
Reviewed-by: Doug Oucharek 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 2 +-
 drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c | 4 ++--
 drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c   | 5 +++--
 drivers/staging/lustre/lnet/libcfs/workitem.c  | 2 +-
 drivers/staging/lustre/lnet/selftest/module.c  | 3 ++-
 5 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c 
b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 14dbc53..e2f3f72 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -3546,7 +3546,7 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid,
 
rc = cfs_cpt_bind(lnet_cpt_table(), sched->ibs_cpt);
if (rc) {
-   CWARN("Failed to bind on CPT %d, please verify whether all CPUs 
are healthy and reload modules if necessary, otherwise your system might under 
risk of low performance\n",
+   CWARN("Unable to bind on CPU partition %d, please verify 
whether all CPUs are healthy and reload modules if necessary, otherwise your 
system might under risk of low performance\n",
  sched->ibs_cpt);
}
 
diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c 
b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
index 3531e7d..df4f55e 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
@@ -1414,8 +1414,8 @@ int ksocknal_scheduler(void *arg)
 
rc = cfs_cpt_bind(lnet_cpt_table(), info->ksi_cpt);
if (rc) {
-   CERROR("Can't set CPT affinity to %d: %d\n",
-  info->ksi_cpt, rc);
+   CWARN("Can't set CPU partition affinity to %d: %d\n",
+ info->ksi_cpt, rc);
}
 
spin_lock_bh(>kss_lock);
diff --git a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c 
b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
index 62ab76e..4d35a37 100644
--- a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
@@ -1082,8 +1082,9 @@ static int cfs_cpu_dead(unsigned int cpu)
}
spin_unlock(_data.cpt_lock);
 
-   LCONSOLE(0, "HW CPU cores: %d, npartitions: %d\n",
-num_online_cpus(), cfs_cpt_number(cfs_cpt_table));
+   LCONSOLE(0, "HW nodes: %d, HW CPU cores: %d, npartitions: %d\n",
+num_online_nodes(), num_online_cpus(),
+cfs_cpt_number(cfs_cpt_table));
return 0;
 
  failed:
diff --git a/drivers/staging/lustre/lnet/libcfs/workitem.c 
b/drivers/staging/lustre/lnet/libcfs/workitem.c
index d0512da..dbc2a9b 100644
--- a/drivers/staging/lustre/lnet/libcfs/workitem.c
+++ b/drivers/staging/lustre/lnet/libcfs/workitem.c
@@ -209,7 +209,7 @@ static int cfs_wi_scheduler(void *arg)
/* CPT affinity scheduler? */
if (sched->ws_cptab)
if (cfs_cpt_bind(sched->ws_cptab, sched->ws_cpt))
-   CWARN("Failed to bind %s on CPT %d\n",
+   CWARN("Unable to bind %s on CPU partition %d\n",
  sched->ws_name, sched->ws_cpt);
 
spin_lock(_wi_data.wi_glock);
diff --git a/drivers/staging/lustre/lnet/selftest/module.c 
b/drivers/staging/lustre/lnet/selftest/module.c
index 71485f9..b5d556f 100644
--- a/drivers/staging/lustre/lnet/selftest/module.c
+++ b/drivers/staging/lustre/lnet/selftest/module.c
@@ -112,7 +112,8 @@ enum {
rc = cfs_wi_sched_create("lst_t", lnet_cpt_table(), i,
 nthrs, _sched_test[i]);
if (rc) {
-   CERROR("Failed to create CPT affinity WI scheduler %d 
for LST\n", i);
+   CWARN("Failed to create CPU partition affinity WI 
scheduler %d for LST\n",
+ i);
goto error;
}
}
-- 
1.8.3.1

BUG at net/sctp/socket.c:7425

2017-01-28 Thread Alexander Popov

Hello,

I'm running the syzkaller fuzzer on v4.10-rc4
(0aa0313f9d576affd7747cc3f179feb097d28990)
and hit the following crash in the sctp code:

[   38.423932] ------------[ cut here ]------------
[   38.424298] kernel BUG at net/sctp/socket.c:7425!
[   38.424583] invalid opcode:  [#1] SMP KASAN
[   38.424839] Dumping ftrace buffer:
[   38.425031](ftrace buffer empty)
[   38.425232] Modules linked in: sctp libcrc32c snd_hda_codec_generic 
snd_hda_intel
snd_hda_codec snd_hda_core snd_intel8x0 snd_ens1370 snd_ac97_codec gameport 
snd_rawmidi
snd_hwdep snd_seq_device ac97_bus snd_pcm hid_generic joydev usbmouse snd_timer 
psmouse
usbhid e1000 snd hid parport_pc i2c_piix4 soundcore serio_raw parport 
input_leds pcspkr
floppy evbug mac_hid
[   38.427058] CPU: 0 PID: 1930 Comm: syz-executor12 Not tainted 4.10.0-rc4+ #2
[   38.427457] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
Ubuntu-1.8.2-1ubuntu1 04/01/2014
[   38.427999] task: 88006945ca00 task.stack: 880053e4
[   38.428364] RIP: 0010:sctp_sendmsg+0x29b3/0x3030 [sctp]
[   38.428719] RSP: 0018:880053e478f8 EFLAGS: 00010297
[   38.429062] RAX: 88006945ca00 RBX: 880048d148c0 RCX: 
[   38.429636] RDX:  RSI:  RDI: 88006d022c88
[   38.430051] RBP: 880053e47b70 R08: 0560 R09: 88007ffda680
[   38.430473] R10: 000a R11: 1d400032be05 R12: dc00
[   38.430915] R13: 880048d148c0 R14:  R15: 880059ad9160
[   38.431390] FS:  7f984a645700() GS:88006d00() 
knlGS:
[   38.431979] CS:  0010 DS:  ES:  CR0: 80050033
[   38.432405] CR2: 20005fe0 CR3: 6400a000 CR4: 06f0
[   38.432827] DR0:  DR1:  DR2: 
[   38.433253] DR3:  DR6: fffe0ff0 DR7: 0400
[   38.433765] Call Trace:
[   38.433938]  ? sctp_id2assoc+0x330/0x330 [sctp]
[   38.434245]  ? wake_atomic_t_function+0x2b0/0x2b0
[   38.434545]  inet_sendmsg+0x128/0x3a0
[   38.434758]  ? inet_recvmsg+0x420/0x420
[   38.434983]  sock_sendmsg+0xcf/0x110
[   38.435192]  sock_write_iter+0x222/0x3c0
[   38.435421]  ? sock_sendmsg+0x110/0x110
[   38.435644]  ? iov_iter_init+0xaf/0x1d0
[   38.435867]  __vfs_write+0x3cb/0x640
[   38.436075]  ? do_iter_readv_writev+0x4c0/0x4c0
[   38.436338]  ? apparmor_file_permission+0x27/0x30
[   38.436618]  ? rw_verify_area+0xea/0x2b0
[   38.436853]  vfs_write+0x175/0x4e0
[   38.437053]  SyS_write+0xd8/0x1b0
[   38.437283]  ? SyS_read+0x1b0/0x1b0
[   38.437522]  entry_SYSCALL_64_fastpath+0x1e/0xad
[   38.437820] RIP: 0033:0x44f869
[   38.438013] RSP: 002b:7f984a644b58 EFLAGS: 0212 ORIG_RAX: 
0001
[   38.438464] RAX: ffda RBX: 7f984a645700 RCX: 0044f869
[   38.438886] RDX: 0018 RSI: 20ac4fe8 RDI: 0004
[   38.439305] RBP: 7ffe1d7be490 R08:  R09: 
[   38.439712] R10:  R11: 0212 R12: 
[   38.440145] R13: 7ffe1d7be40f R14: 7f984a6459c0 R15: 
[   38.440563] Code: c7 c7 10 1a 5c a0 e8 4d fb 76 e1 c6 44 24 68 01 e9 a2 f2 
ff ff e8 be
34 e1 e0 8b 9c 24 98 00 00 00 e9 06 fd ff ff e8 ad 34 e1 e0 <0f> 0b e8 a6 34 e1 
e0 4c 8b
4c 24 78 4c 8b 44 24 68 4c 89 f9 48
[   38.441881] RIP: sctp_sendmsg+0x29b3/0x3030 [sctp] RSP: 880053e478f8
[   38.442341] ---[ end trace c704b04c884389c0 ]---
[   38.442634] Kernel panic - not syncing: Fatal exception
[   38.443084] Dumping ftrace buffer:
[   38.443335](ftrace buffer empty)
[   38.443590] Kernel Offset: disabled


Unfortunately, I didn't manage to get a C program reproducing the crash
(it looks like a race).
However, I hit it reliably on my setup, so I can help with fixing the issue.

The crash happens here:
/* Let another process have a go.  Since we are going
 * to sleep anyway.
 */
release_sock(sk);
current_timeo = schedule_timeout(current_timeo);
>   BUG_ON(sk != asoc->base.sk);
lock_sock(sk);

I've added some debugging output and see that the original value of
asoc->base.sk is changed to the address of another struct sock, which
appeared in sctp_endpoint_init() shortly before the crash.
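
The pattern behind this kind of failure can be shown with a minimal
userspace sketch. This is not the SCTP code and not a proposed fix: the
types fake_sock and fake_assoc, the sleep hooks and wait_unlocked() are
all hypothetical names made up for illustration. It only shows why a
pointer that was valid before the lock was dropped has to be re-checked
after the sleep, which is the condition the BUG_ON() above asserts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct sock and struct sctp_association. */
struct fake_sock { int locked; };
struct fake_assoc { struct fake_sock *owner; };

/* Callback that models whatever runs while the caller sleeps. */
typedef void (*sleep_fn)(struct fake_assoc *asoc, void *arg);

/* Example sleep hook: another actor re-parents the association. */
static void migrate_assoc(struct fake_assoc *asoc, void *arg)
{
	asoc->owner = arg;
}

/* Example sleep hook: nothing happens during the sleep. */
static void no_change(struct fake_assoc *asoc, void *arg)
{
	(void)asoc;
	(void)arg;
}

/*
 * Drop the lock, "sleep", then re-check ownership before re-locking.
 * Returns 0 if sk still owns asoc, -1 if ownership changed during the
 * sleep (the situation BUG_ON(sk != asoc->base.sk) trips on).
 */
static int wait_unlocked(struct fake_sock *sk, struct fake_assoc *asoc,
			 sleep_fn during_sleep, void *arg)
{
	sk->locked = 0;            /* models release_sock(sk) */
	during_sleep(asoc, arg);   /* models schedule_timeout() */
	if (asoc->owner != sk)
		return -1;         /* stale: sk no longer owns asoc */
	sk->locked = 1;            /* models lock_sock(sk) */
	return 0;
}
```

Calling wait_unlocked() with no_change leaves the socket locked and
returns 0; calling it with migrate_assoc and a second fake_sock returns
-1, because the association was re-parented while the lock was dropped.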

Hope for some assistance.
Best regards,
Alexander

[PATCH 60/60] staging: lustre: libcfs: fix minimum size check for libcfs ioctl

2017-01-28 Thread James Simmons

The check for the smallest ioctl data in libcfs_ioctl_getdata()
is incorrect. Instead of checking against struct libcfs_ioctl_data,
compare the size to struct libcfs_ioctl_hdr.

Reported-by: Doug Oucharek 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lnet/libcfs/linux/linux-module.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c 
b/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c
index 3f5d58b..bda6c16 100644
--- a/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c
+++ b/drivers/staging/lustre/lnet/libcfs/linux/linux-module.c
@@ -134,7 +134,7 @@ int libcfs_ioctl_getdata(struct libcfs_ioctl_hdr **hdr_pp,
return -EINVAL;
}
 
-   if (hdr.ioc_len < sizeof(struct libcfs_ioctl_data)) {
+   if (hdr.ioc_len < sizeof(hdr)) {
CERROR("libcfs ioctl: user buffer too small for ioctl\n");
return -EINVAL;
}
-- 
1.8.3.1
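
A minimal userspace sketch of the idea behind this check, under stated
assumptions: the struct layouts, the demo_ names and the max_len
parameter are hypothetical and much simpler than the real libcfs
definitions. The point is only that the smallest valid request holds
just the header, so the lower bound should be the header size and not
the size of one particular, larger payload structure.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified layouts; the real structures have more fields. */
struct demo_ioctl_hdr {
	uint32_t ioc_len;      /* total length claimed by the caller */
	uint32_t ioc_version;
};

struct demo_ioctl_data {
	struct demo_ioctl_hdr ioc_hdr;
	char payload[64];      /* one of several possible payloads */
};

/*
 * Validate the length claimed in a header that was already copied in.
 * Every request must at least contain the header itself; requiring
 * sizeof(struct demo_ioctl_data) would reject valid smaller requests.
 * The upper bound max_len is an assumption of this sketch.
 */
static int demo_check_len(const struct demo_ioctl_hdr *hdr, size_t max_len)
{
	if (hdr->ioc_len < sizeof(*hdr))
		return -EINVAL;   /* too small to even hold the header */
	if (hdr->ioc_len > max_len)
		return -EINVAL;   /* larger than we are willing to copy */
	return 0;
}
```

With this check, a request whose ioc_len equals the header size is
accepted, while anything shorter than the header or longer than max_len
is rejected with -EINVAL.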

[PATCH 01/60] staging: lustre: llite: Remove access of stripe in ll_setattr_raw

2017-01-28 Thread James Simmons

From: Jinshan Xiong 

ll_setattr_raw() needs to know whether a file is released
when the file is being truncated. It used to get this information
by accessing lov_stripe_md. This turns out not to be necessary. This
patch removes the access to lov_stripe_md and solves the problem
in lov_io_init_released().

Signed-off-by: Jinshan Xiong 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5823
Reviewed-on: http://review.whamcloud.com/13514
Reviewed-by: James Simmons 
Reviewed-by: Henri Doreau 
Reviewed-by: John L. Hammond 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/include/cl_object.h  |   6 --
 drivers/staging/lustre/lustre/llite/file.c |   2 +-
 drivers/staging/lustre/lustre/llite/lcommon_cl.c   |   9 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |   1 +
 drivers/staging/lustre/lustre/llite/llite_lib.c| 109 ++---
 drivers/staging/lustre/lustre/llite/vvp_io.c   |  10 +-
 drivers/staging/lustre/lustre/lov/lov_io.c |   7 +-
 drivers/staging/lustre/lustre/lov/lov_object.c |   3 -
 8 files changed, 68 insertions(+), 79 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h 
b/drivers/staging/lustre/lustre/include/cl_object.h
index dc68561..a1b8301 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -284,12 +284,6 @@ struct cl_layout {
size_t  cl_size;
/** Layout generation. */
u32 cl_layout_gen;
-   /**
-* True if this is a released file.
-* Temporarily added for released file truncate in ll_setattr_raw().
-* It will be removed later. -Jinshan
-*/
-   boolcl_is_released;
 };
 
 /**
diff --git a/drivers/staging/lustre/lustre/llite/file.c 
b/drivers/staging/lustre/lustre/llite/file.c
index a171188..0ee02f1 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -1821,7 +1821,7 @@ static int ll_swap_layouts(struct file *file1, struct 
file *file2,
return rc;
 }
 
-static int ll_hsm_state_set(struct inode *inode, struct hsm_state_set *hss)
+int ll_hsm_state_set(struct inode *inode, struct hsm_state_set *hss)
 {
struct md_op_data   *op_data;
int  rc;
diff --git a/drivers/staging/lustre/lustre/llite/lcommon_cl.c 
b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
index dd1cfd8..f1036f4 100644
--- a/drivers/staging/lustre/lustre/llite/lcommon_cl.c
+++ b/drivers/staging/lustre/lustre/llite/lcommon_cl.c
@@ -94,6 +94,7 @@ int cl_setattr_ost(struct cl_object *obj, const struct iattr 
*attr,
 
io = vvp_env_thread_io(env);
io->ci_obj = obj;
+   io->ci_verify_layout = 1;
 
io->u.ci_setattr.sa_attr.lvb_atime = LTIME_S(attr->ia_atime);
io->u.ci_setattr.sa_attr.lvb_mtime = LTIME_S(attr->ia_mtime);
@@ -120,13 +121,7 @@ int cl_setattr_ost(struct cl_object *obj, const struct 
iattr *attr,
cl_io_fini(env, io);
if (unlikely(io->ci_need_restart))
goto again;
-   /* HSM import case: file is released, cannot be restored
-* no need to fail except if restore registration failed
-* with -ENODATA
-*/
-   if (result == -ENODATA && io->ci_restore_needed &&
-   io->ci_result != -ENODATA)
-   result = 0;
+
cl_env_put(env, );
return result;
 }
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h 
b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 065a9a7..2c72177 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -766,6 +766,7 @@ int ll_dir_getstripe(struct inode *inode, void **lmmp, int 
*lmm_size,
 int ll_fid2path(struct inode *inode, void __user *arg);
 int ll_data_version(struct inode *inode, __u64 *data_version, int flags);
 int ll_hsm_release(struct inode *inode);
+int ll_hsm_state_set(struct inode *inode, struct hsm_state_set *hss);
 
 /* llite/dcache.c */
 
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c 
b/drivers/staging/lustre/lustre/llite/llite_lib.c
index 9cb4909..769b307 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -1402,7 +1402,11 @@ static int ll_md_setattr(struct dentry *dentry, struct 
md_op_data *op_data)
 * cache is not cleared yet.
 */
op_data->op_attr.ia_valid &= ~(TIMES_SET_FLAGS | ATTR_SIZE);
+   if (S_ISREG(inode->i_mode))
+   inode_lock(inode);
rc = simple_setattr(dentry, _data->op_attr);
+   if (S_ISREG(inode->i_mode))
+   inode_unlock(inode);
op_data->op_attr.ia_valid = ia_valid;
 
rc = ll_update_inode(inode, );
@@ -1431,7 +1435,6 @@ int ll_setattr_raw(struct dentry *dentry, struct iattr 
*attr, bool hsm_import)
struct inode *inode =

[PATCH 58/60] staging: lustre: osc: avoid 64 divide in osc_cache_too_much

2017-01-28 Thread James Simmons

The use of 64 bit time introduces an expensive 64 bit
division operation. Since the time lapse being calculated
in osc_cache_too_much will never be more than seventy years
we can cast the time lapse to a long and perform a normal
32 bit division operation instead.

Signed-off-by: James Simmons 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8835
Reviewed-on: https://review.whamcloud.com/23814
Reviewed-by: Andreas Dilger 
Reviewed-by: Dmitry Eremin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/osc/osc_page.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_page.c 
b/drivers/staging/lustre/lustre/osc/osc_page.c
index 0461408..ab9d0d7 100644
--- a/drivers/staging/lustre/lustre/osc/osc_page.c
+++ b/drivers/staging/lustre/lustre/osc/osc_page.c
@@ -370,12 +370,17 @@ static int osc_cache_too_much(struct client_obd *cli)
return lru_shrink_min(cli);
} else {
time64_t duration = ktime_get_real_seconds();
+   long timediff;
 
/* knock out pages by duration of no IO activity */
duration -= cli->cl_lru_last_used;
-   duration >>= 6; /* approximately 1 minute */
-   if (duration > 0 &&
-   pages >= div64_s64((s64)budget, duration))
+   /*
+* The difference shouldn't be more than 70 years
+* so we can safely case to a long. Round to
+* approximately 1 minute.
+*/
+   timediff = (long)(duration >> 6);
+   if (timediff > 0 && pages >= budget / timediff)
return lru_shrink_min(cli);
}
return 0;
-- 
1.8.3.1
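
The arithmetic in this change can be tried out with a small userspace
sketch. demo_too_many_pages() and its parameters are hypothetical names
for illustration, not the osc code; the sketch assumes the idle time
fits in a long once it has been shifted down to roughly one-minute
units, as the commit message argues.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 1 when the cached page count reaches
 * budget divided by the idle time (in units of 64 seconds), else 0.
 */
static int demo_too_many_pages(long pages, long budget,
			       int64_t now, int64_t last_used)
{
	int64_t duration = now - last_used;    /* seconds without I/O */
	long timediff = (long)(duration >> 6); /* /64, roughly one minute */

	/* Native long division; no 64-bit division helper is needed. */
	if (timediff > 0 && pages >= budget / timediff)
		return 1;
	return 0;
}
```

For example, 640 idle seconds give timediff = 10, so with a budget of
1000 the threshold is 100 pages; an idle time under 64 seconds gives
timediff = 0 and the check is skipped.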

[PATCH 59/60] staging: lustre: ptlrpc : remove userland usage from ptlrpc

2017-01-28 Thread James Simmons

The reason for __REQ_LAYOUT_USER__ was to expose a
section of code in layout.c to userland for a utility
similar to wireshark. This was done before wireshark
existed, but now that it does, we no longer need
this type of hack. This also reduces lustre_acl.h to
strictly a kernel header now.

Signed-off-by: James Simmons 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8945
Reviewed-on: https://review.whamcloud.com/24396
Reviewed-by: Dmitry Eremin 
Reviewed-by: Andreas Dilger 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/include/lustre_req_layout.h | 10 ++
 drivers/staging/lustre/lustre/ptlrpc/layout.c |  9 -
 2 files changed, 2 insertions(+), 17 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_req_layout.h b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
index fbcd395..cd62ccd 100644
--- a/drivers/staging/lustre/lustre/include/lustre_req_layout.h
+++ b/drivers/staging/lustre/lustre/include/lustre_req_layout.h
@@ -39,6 +39,8 @@
 #ifndef _LUSTRE_REQ_LAYOUT_H__
 #define _LUSTRE_REQ_LAYOUT_H__
 
+#include 
+
 /** \defgroup req_layout req_layout
  *
  * @{
@@ -66,11 +68,6 @@ struct req_capsule {
__u32   rc_area[RCL_NR][REQ_MAX_FIELD_NR];
 };
 
-#if !defined(__REQ_LAYOUT_USER__)
-
-/* struct ptlrpc_request, lustre_msg* */
-#include "lustre_net.h"
-
 void req_capsule_init(struct req_capsule *pill, struct ptlrpc_request *req,
  enum req_location location);
 void req_capsule_fini(struct req_capsule *pill);
@@ -120,9 +117,6 @@ void req_capsule_shrink(struct req_capsule *pill,
 int  req_layout_init(void);
 void req_layout_fini(void);
 
-/* __REQ_LAYOUT_USER__ */
-#endif
-
 extern struct req_format RQF_OBD_PING;
 extern struct req_format RQF_OBD_SET_INFO;
 extern struct req_format RQF_SEC_CTX;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/layout.c b/drivers/staging/lustre/lustre/ptlrpc/layout.c
index 2052848..356d735 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/layout.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/layout.c
@@ -42,8 +42,6 @@
  * of the format that the request conforms to.
  */
 
-#if !defined(__REQ_LAYOUT_USER__)
-
 #define DEBUG_SUBSYSTEM S_RPC
 
 #include 
@@ -57,8 +55,6 @@
 #include "../include/obd.h"
 #include "../include/obd_support.h"
 
-/* __REQ_LAYOUT_USER__ */
-#endif
 /* struct ptlrpc_request, lustre_msg* */
 #include "../include/lustre_req_layout.h"
 #include "../include/lustre_acl.h"
@@ -1558,8 +1554,6 @@ struct req_format RQF_OST_GET_INFO_FIEMAP =
ost_get_fiemap_server);
 EXPORT_SYMBOL(RQF_OST_GET_INFO_FIEMAP);
 
-#if !defined(__REQ_LAYOUT_USER__)
-
 /* Convenience macro */
 #define FMT_FIELD(fmt, i, j) (fmt)->rf_fields[(i)].d[(j)]
 
@@ -2238,6 +2232,3 @@ void req_capsule_shrink(struct req_capsule *pill,
1);
 }
 EXPORT_SYMBOL(req_capsule_shrink);
-
-/* __REQ_LAYOUT_USER__ */
-#endif
-- 
1.8.3.1

[PATCH 57/60] staging: lustre: lmv: remove nlink check in lmv_revalidate_slaves

2017-01-28 Thread James Simmons

From: wang di 

If an application attempts to remove millions of files in a
single directory it will fail. This failure was tracked down to
the nlink < 2 check in lmv_revalidate_slaves: once nlink
reaches the maximum value of LDISKFS_LINK_MAX (65000), the
nlink reported back from the server is one. A value of 1 is
therefore not invalid, so let's remove the check.

Signed-off-by: wang di 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6984
Reviewed-on: http://review.whamcloud.com/16490
Reviewed-by: James Simmons 
Reviewed-by: Jian Yu 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/lmv/lmv_intent.c | 16 +---
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/drivers/staging/lustre/lustre/lmv/lmv_intent.c b/drivers/staging/lustre/lustre/lmv/lmv_intent.c
index b1071cf..aa42066 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_intent.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_intent.c
@@ -220,21 +220,7 @@ int lmv_revalidate_slaves(struct obd_export *exp,
/* refresh slave from server */
body = req_capsule_server_get(>rq_pill,
  _MDT_BODY);
-   LASSERT(body);
-
-   if (unlikely(body->mbo_nlink < 2)) {
-   /*
-* If this is bad stripe, most likely due
-* to the race between close(unlink) and
-* getattr, let's return -EONENT, so llite
-* will revalidate the dentry see
-* ll_inode_revalidate_fini()
-*/
-   CDEBUG(D_INODE, "%s: nlink %d < 2 corrupt 
stripe %d "DFID":" DFID"\n",
-  obd->obd_name, body->mbo_nlink, i,
-  PFID(>lsm_md_oinfo[i].lmo_fid),
-  PFID(>lsm_md_oinfo[0].lmo_fid));
-
+   if (!body) {
if (it.it_lock_mode && lockh) {
ldlm_lock_decref(lockh, it.it_lock_mode);
it.it_lock_mode = 0;
-- 
1.8.3.1

[PATCH 56/60] staging: lustre: llite: don't invoke direct_IO for the EOF case

2017-01-28 Thread James Simmons

From: Yang Sheng 

The function generic_file_read_iter() does not check for EOF
before invoking the direct_IO callback, so we have to check
it ourselves.

Signed-off-by: Yang Sheng 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8969
Reviewed-on: https://review.whamcloud.com/24552
Reviewed-by: Bob Glossman 
Reviewed-by: Bobi Jam 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/llite/rw26.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/staging/lustre/lustre/llite/rw26.c b/drivers/staging/lustre/lustre/llite/rw26.c
index 21e06e5..d89e795 100644
--- a/drivers/staging/lustre/lustre/llite/rw26.c
+++ b/drivers/staging/lustre/lustre/llite/rw26.c
@@ -345,6 +345,10 @@ static ssize_t ll_direct_IO_26(struct kiocb *iocb, struct iov_iter *iter)
ssize_t tot_bytes = 0, result = 0;
long size = MAX_DIO_SIZE;
 
+   /* Check EOF by ourselves */
+   if (iov_iter_rw(iter) == READ && file_offset >= i_size_read(inode))
+   return 0;
+
/* FIXME: io smaller than PAGE_SIZE is broken on ia64 ??? */
if ((file_offset & ~PAGE_MASK) || (count & ~PAGE_MASK))
return -EINVAL;
-- 
1.8.3.1
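
The ordering introduced by the patch above can be sketched in plain C: a read that starts at or past the end of the file returns 0 before the alignment test has a chance to reject it. Everything named `sketch_*` or `SKETCH_*` is hypothetical, and the 4096-byte page size is an assumption for illustration; the kernel code uses iov_iter_rw(), i_size_read() and PAGE_MASK instead.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096L	/* assumed page size, illustration only */

enum sketch_io_dir { SKETCH_READ, SKETCH_WRITE };

/*
 * Hypothetical pre-check modelled on the start of ll_direct_IO_26().
 * Returns 0 for a read at or beyond EOF, -EINVAL for unaligned I/O,
 * and the byte count otherwise.
 */
static long sketch_direct_io_precheck(enum sketch_io_dir dir,
				      int64_t file_offset, long count,
				      int64_t file_size)
{
	/* A read that starts at or past EOF has nothing to transfer. */
	if (dir == SKETCH_READ && file_offset >= file_size)
		return 0;

	/* Offset and length must both be page aligned. */
	if ((file_offset & (SKETCH_PAGE_SIZE - 1)) ||
	    (count & (SKETCH_PAGE_SIZE - 1)))
		return -EINVAL;

	return count;
}
```

Writes are deliberately left out of the EOF test, since a direct write at or beyond the current size is how a file is extended.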

[PATCH 54/60] staging: lustre: fid: Change positional struct initializers to C99

2017-01-28 Thread James Simmons

From: Steve Guminski 

This patch makes no functional changes.  Struct initializers in the
fid directory that use C89 or GCC-only syntax are updated to C99
syntax.

The C99 syntax prevents incorrect initialization if values are
accidentally placed in the wrong position, allows changes in the struct
definition, and clears any members that are not given an explicit
value.

The following struct initializers have been updated:

lustre/fid/fid_lib.c:
const struct lu_seq_range LUSTRE_SEQ_SPACE_RANGE
const struct lu_seq_range LUSTRE_SEQ_ZERO_RANGE
lustre/fid/lproc_fid.c:
struct lprocfs_vars seq_client_debugfs_list

Signed-off-by: Steve Guminski 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6210
Reviewed-on: https://review.whamcloud.com/23789
Reviewed-by: Nathaniel Clark 
Reviewed-by: James Simmons 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/fid/fid_lib.c   |  7 +++
 drivers/staging/lustre/lustre/fid/lproc_fid.c | 12 
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/lustre/fid/fid_lib.c b/drivers/staging/lustre/lustre/fid/fid_lib.c
index 4e49cb3..9eb4059 100644
--- a/drivers/staging/lustre/lustre/fid/fid_lib.c
+++ b/drivers/staging/lustre/lustre/fid/fid_lib.c
@@ -60,14 +60,13 @@
  * FID_SEQ_START + 2 is for .lustre directory and its objects
  */
 const struct lu_seq_range LUSTRE_SEQ_SPACE_RANGE = {
-   FID_SEQ_NORMAL,
-   (__u64)~0ULL
+   .lsr_start  = FID_SEQ_NORMAL,
+   .lsr_end= (__u64)~0ULL,
 };
 
 /* Zero range, used for init and other purposes. */
 const struct lu_seq_range LUSTRE_SEQ_ZERO_RANGE = {
-   0,
-   0
+   .lsr_start  = 0,
 };
 
 /* Lustre Big Fs Lock fid. */
diff --git a/drivers/staging/lustre/lustre/fid/lproc_fid.c b/drivers/staging/lustre/lustre/fid/lproc_fid.c
index 97d4849..3eed838 100644
--- a/drivers/staging/lustre/lustre/fid/lproc_fid.c
+++ b/drivers/staging/lustre/lustre/fid/lproc_fid.c
@@ -203,9 +203,13 @@
 LPROC_SEQ_FOPS_RO(ldebugfs_fid_fid);
 
 struct lprocfs_vars seq_client_debugfs_list[] = {
-   { "space", _fid_space_fops },
-   { "width", _fid_width_fops },
-   { "server", _fid_server_fops },
-   { "fid", _fid_fid_fops },
+   { .name =   "space",
+ .fops =   _fid_space_fops },
+   { .name =   "width",
+ .fops =   _fid_width_fops },
+   { .name =   "server",
+ .fops =   _fid_server_fops },
+   { .name =   "fid",
+ .fops =   _fid_fid_fops },
{ NULL }
 };
-- 
1.8.3.1
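
For readers less familiar with the two initializer styles the patch above contrasts, here is a small standalone comparison. `struct seq_range` and its members are hypothetical and only loosely modelled on a sequence range; this is not the Lustre definition of struct lu_seq_range.

```c
#include <stdint.h>

/* Hypothetical struct for illustration; not the Lustre definition. */
struct seq_range {
	uint64_t start;
	uint64_t end;
	uint32_t index;
	uint32_t flags;
};

/* Positional (C89) form: values are bound to members by position. */
static const struct seq_range positional_range = { 1, 100 };

/* Designated (C99) form: values are bound by name, in any order. */
static const struct seq_range designated_range = {
	.end	= 100,
	.start	= 1,
};

/* Only one member is named; the remaining members are zeroed. */
static const struct seq_range zero_range = {
	.start	= 0,
};
```

If a new member were inserted between `start` and `end`, the positional form would silently assign 100 to the wrong field, while the designated form would keep its meaning.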

[PATCH 55/60] staging: lustre: obd: move s3 in lmd_parse to inner loop

2017-01-28 Thread James Simmons

Building the lustre client with W=1 reports the following
warning:

obdclass/obd_mount.c: In function lmd_parse:
obdclass/obd_mount.c:880: warning: variable set but not used

The solution is to move s3 to the inner loop,
which is the only place it is used.

Signed-off-by: James Simmons 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8378
Reviewed-on: https://review.whamcloud.com/23820
Reviewed-by: Andreas Dilger 
Reviewed-by: Jinshan Xiong 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/obdclass/obd_mount.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/obdclass/obd_mount.c b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
index 2283e92..8e0d4b1 100644
--- a/drivers/staging/lustre/lustre/obdclass/obd_mount.c
+++ b/drivers/staging/lustre/lustre/obdclass/obd_mount.c
@@ -877,7 +877,7 @@ static int lmd_parse_mgs(struct lustre_mount_data *lmd, char **ptr)
  */
 static int lmd_parse(char *options, struct lustre_mount_data *lmd)
 {
-   char *s1, *s2, *s3, *devname = NULL;
+   char *s1, *s2, *devname = NULL;
struct lustre_mount_data *raw = (struct lustre_mount_data *)options;
int rc = 0;
 
@@ -906,6 +906,7 @@ static int lmd_parse(char *options, struct lustre_mount_data *lmd)
while (*s1) {
int clear = 0;
int time_min = OBD_RECOVERY_TIME_MIN;
+   char *s3;
 
/* Skip whitespace and extra commas */
while (*s1 == ' ' || *s1 == ',')
-- 
1.8.3.1

[PATCH 52/60] staging: lustre: linkea: linkEA size limitation

2017-01-28 Thread James Simmons

From: Fan Yong 

Under DNE mode, if we do not restrict the linkEA size and there
are too many cross-MDT hard links to the same object, the llog
will overflow. In addition, too many entries in the linkEA will
seriously affect linkEA performance, because a linkEA entry can
only be located by scanning the entries one after another.

So we need to restrict the linkEA size. Currently the limit is 4096
bytes, independent of the backend. If too many hard links cause the
linkEA to overflow, we add an overflow timestamp to the linkEA
header. This overflow timestamp serves two purposes:

1. It prevents the object from being migrated to another MDT, because
   some name entries may not be in the linkEA, so we cannot update
   these name entries for the migration.

2. It tells the namespace LFSCK that the 'nlink' attribute may
   be more trustworthy than the linkEA, which avoids misguiding the
   namespace LFSCK into repairing the 'nlink' attribute based on the
   linkEA.

There will be subsequent patch(es) for namespace LFSCK to handle the
linkEA size limitation and overflow cases.

Signed-off-by: Fan Yong 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8569
Reviewed-on: https://review.whamcloud.com/23500
Reviewed-by: Andreas Dilger 
Reviewed-by: wangdi 
Reviewed-by: Lai Siyao 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 .../lustre/lustre/include/lustre/lustre_idl.h  |  5 +-
 .../staging/lustre/lustre/include/lustre_linkea.h  | 15 -
 drivers/staging/lustre/lustre/llite/llite_lib.c|  2 +-
 drivers/staging/lustre/lustre/obdclass/linkea.c| 70 +-
 drivers/staging/lustre/lustre/ptlrpc/wiretest.c| 16 ++---
 5 files changed, 81 insertions(+), 27 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
index b0eb80d..fc960da 100644
--- a/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
+++ b/drivers/staging/lustre/lustre/include/lustre/lustre_idl.h
@@ -3217,9 +3217,8 @@ struct link_ea_header {
__u32 leh_magic;
__u32 leh_reccount;
__u64 leh_len;  /* total size */
-   /* future use */
-   __u32 padding1;
-   __u32 padding2;
+   __u32 leh_overflow_time;
+   __u32 leh_padding;
 };
 
 /** Hardlink data is name and parent fid.
diff --git a/drivers/staging/lustre/lustre/include/lustre_linkea.h b/drivers/staging/lustre/lustre/include/lustre_linkea.h
index 249e8bf..3ff008f 100644
--- a/drivers/staging/lustre/lustre/include/lustre_linkea.h
+++ b/drivers/staging/lustre/lustre/include/lustre_linkea.h
@@ -26,7 +26,19 @@
  * Author: di wang 
  */
 
-#define DEFAULT_LINKEA_SIZE4096
+/* There are several reasons to restrict the linkEA size:
+ *
+ * 1. Under DNE mode, if we do not restrict the linkEA size, and if there
+ *are too many cross-MDTs hard links to the same object, then it will
+ *casue the llog overflow.
+ *
+ * 2. Some backend has limited size for EA. For example, if without large
+ *EA enabled, the ldiskfs will make all EAs to share one (4K) EA block.
+ *
+ * 3. Too many entries in linkEA will seriously affect linkEA performance
+ *because we only support to locate linkEA entry consecutively.
+ */
+#define MAX_LINKEA_SIZE4096
 
 struct linkea_data {
/**
@@ -43,6 +55,7 @@ struct linkea_data {
 
 int linkea_data_new(struct linkea_data *ldata, struct lu_buf *buf);
 int linkea_init(struct linkea_data *ldata);
+int linkea_init_with_rec(struct linkea_data *ldata);
 void linkea_entry_unpack(const struct link_ea_entry *lee, int *reclen,
 struct lu_name *lname, struct lu_fid *pfid);
 int linkea_entry_pack(struct link_ea_entry *lee, const struct lu_name *lname,
diff --git a/drivers/staging/lustre/lustre/llite/llite_lib.c b/drivers/staging/lustre/lustre/llite/llite_lib.c
index b229cbc..9a9cdb0 100644
--- a/drivers/staging/lustre/lustre/llite/llite_lib.c
+++ b/drivers/staging/lustre/lustre/llite/llite_lib.c
@@ -2553,7 +2553,7 @@ static int ll_linkea_decode(struct linkea_data *ldata, unsigned int linkno,
unsigned int idx;
int rc;
 
-   rc = linkea_init(ldata);
+   rc = linkea_init_with_rec(ldata);
if (rc < 0)
return rc;
 
diff --git a/drivers/staging/lustre/lustre/obdclass/linkea.c b/drivers/staging/lustre/lustre/obdclass/linkea.c
index 0b1d2f0..0c4 100644
--- a/drivers/staging/lustre/lustre/obdclass/linkea.c
+++ b/drivers/staging/lustre/lustre/obdclass/linkea.c
@@ -39,6 +39,8 @@ int linkea_data_new(struct linkea_data *ldata, struct lu_buf *buf)
ldata->ld_leh->leh_magic = LINK_EA_MAGIC;
ldata->ld_leh->leh_len = sizeof(struct link_ea_header);
ldata->ld_leh->leh_reccount = 0;
+   ldata->ld_leh->leh_overflow_time = 0;
+   ldata->ld_leh->leh_padding = 0;
return 0;
 }
 EXPORT_SYMBOL(linkea_data_new);
@@ -53,11 +55,15 @@ int linkea_init(struct linkea_data *ldata)
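
The rest of this diff is cut off in the archive, so the hunk that enforces the limit is not visible here. Purely as a guess at the shape of the behaviour the commit message describes, the sketch below refuses a record that would push the linkEA past 4096 bytes and stamps the header with an overflow time. The `sketch_*` names, the simplified header, the caller-supplied `now`, and the choice of -EOVERFLOW as the return value are all assumptions, not taken from the patch.

```c
#include <errno.h>
#include <stdint.h>

#define SKETCH_MAX_LINKEA_SIZE 4096	/* same value as MAX_LINKEA_SIZE */

/* Hypothetical, simplified header; field names follow the patch. */
struct sketch_linkea_header {
	uint32_t leh_reccount;
	uint64_t leh_len;		/* total size in bytes */
	uint32_t leh_overflow_time;	/* 0 means no overflow recorded */
};

/*
 * Account for one more record of reclen bytes. When the record does
 * not fit, leave the contents alone, remember when the overflow
 * happened, and report it (the error code is an assumption).
 */
static int sketch_linkea_add(struct sketch_linkea_header *leh,
			     uint32_t reclen, uint32_t now)
{
	if (leh->leh_len + reclen > SKETCH_MAX_LINKEA_SIZE) {
		leh->leh_overflow_time = now;
		return -EOVERFLOW;
	}

	leh->leh_len += reclen;
	leh->leh_reccount++;
	return 0;
}
```

A nonzero leh_overflow_time is then the signal the commit message refers to: the linkEA may be missing name entries, so migration is refused and nlink is treated as the more reliable count.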

[PATCH 50/60] staging: lustre: ptlrpc: remove unused pc->pc_env

2017-01-28 Thread James Simmons

From: Dmitry Eremin 

Environment for request interpreters is not used any more.

Signed-off-by: Dmitry Eremin 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8887
Reviewed-on: https://review.whamcloud.com/24061
Reviewed-by: John L. Hammond 
Reviewed-by: Bob Glossman 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/include/lustre_net.h |  4 
 drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c | 13 -
 2 files changed, 17 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/lustre_net.h b/drivers/staging/lustre/lustre/include/lustre_net.h
index 411eb0d..a73f168 100644
--- a/drivers/staging/lustre/lustre/include/lustre_net.h
+++ b/drivers/staging/lustre/lustre/include/lustre_net.h
@@ -1661,10 +1661,6 @@ struct ptlrpcd_ctl {
 */
charpc_name[16];
/**
-* Environment for request interpreters to run in.
-*/
-   struct lu_env  pc_env;
-   /**
 * CPT the thread is bound on.
 */
int pc_cpt;
diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
index 1f55d64..84c5551 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
@@ -562,15 +562,6 @@ int ptlrpcd_start(struct ptlrpcd_ctl *pc)
return 0;
}
 
-   /*
-* So far only "client" ptlrpcd uses an environment. In the future,
-* ptlrpcd thread (or a thread-set) has to be given an argument,
-* describing its "scope".
-*/
-   rc = lu_context_init(&pc->pc_env.le_ctx, LCT_CL_THREAD | LCT_REMEMBER);
-   if (rc != 0)
-   goto out;
-
task = kthread_run(ptlrpcd, pc, "%s", pc->pc_name);
if (IS_ERR(task)) {
rc = PTR_ERR(task);
@@ -593,9 +584,6 @@ int ptlrpcd_start(struct ptlrpcd_ctl *pc)
spin_unlock(&pc->pc_lock);
ptlrpc_set_destroy(set);
}
-   lu_context_fini(&pc->pc_env.le_ctx);
-
-out:
clear_bit(LIOD_START, &pc->pc_flags);
return rc;
 }
@@ -623,7 +611,6 @@ void ptlrpcd_free(struct ptlrpcd_ctl *pc)
}
 
wait_for_completion(&pc->pc_finishing);
-   lu_context_fini(&pc->pc_env.le_ctx);
 
spin_lock(&pc->pc_lock);
pc->pc_set = NULL;
-- 
1.8.3.1

[PATCH 51/60] staging: lustre: ptlrpc: update MODULE_PARAM_DESC in ptlrpcd.c

2017-01-28 Thread James Simmons

From: Dmitry Eremin 

Update the max_ptlrpcds module parameter description to let
users know it is obsolete. Change cpt to CPT in the
ptlrpcd_per_cpt_max module parameter description so it matches
the documentation.

Signed-off-by: Dmitry Eremin 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8890
Reviewed-on: https://review.whamcloud.com/24065
Reviewed-by: John L. Hammond 
Reviewed-by: Andreas Dilger 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
index 84c5551..59b5813 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/ptlrpcd.c
@@ -82,7 +82,8 @@ struct ptlrpcd {
  */
 static int max_ptlrpcds;
 module_param(max_ptlrpcds, int, 0644);
-MODULE_PARM_DESC(max_ptlrpcds, "Max ptlrpcd thread count to be started.");
+MODULE_PARM_DESC(max_ptlrpcds,
+"Max ptlrpcd thread count to be started (obsolete).");
 
 /*
  * ptlrpcd_bind_policy is obsolete, but retained to ensure that
@@ -102,7 +103,7 @@ struct ptlrpcd {
 static int ptlrpcd_per_cpt_max;
 module_param(ptlrpcd_per_cpt_max, int, 0644);
 MODULE_PARM_DESC(ptlrpcd_per_cpt_max,
-"Max ptlrpcd thread count to be started per cpt.");
+"Max ptlrpcd thread count to be started per CPT.");
 
 /*
  * ptlrpcd_partner_group_size: The desired number of threads in each
-- 
1.8.3.1

[PATCH 53/60] staging: lustre: ptlrpc: update replay cursor when close during replay

2017-01-28 Thread James Simmons

From: Niu Yawei 

The replay cursor must be updated properly when a close happens
during replay; otherwise ptlrpc_replay_next() could run into a
dead loop because of an invalid replay cursor:

- the replay cursor is moved to an open request during replay;
- the application closes that open file, so the rq_replay of the
  open request is cleared;
- ptlrpc_replay_next() calls ptlrpc_free_committed() to free
  committed/closed requests; the open request is removed from
  the committed list, so the replay cursor now points to an
  empty list_head. The open request is not freed yet because
  it is still held by the pending close request;
- ptlrpc_replay_next() continues to move the replay cursor to
  the next entry and ends up in a dead loop.

This patch also removes the out-of-date comments in
ptlrpc_replay_next() and covers the whole process of finding
the replay request with imp_lock, because:

1. With the two separate replay lists and the replay cursor in
   place, finding the replay request no longer takes as much time
   as before, so the "lock -> unlock -> lock -> unlock" trick is
   no longer necessary;

2. Various kinds of non-replay requests are now allowed during
   recovery, so ptlrpc_free_committed() may run in parallel and
   remove an open request while ptlrpc_replay_next() is iterating
   the open requests list.
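
As a rough illustration of the cursor problem described above, here is
a standalone userspace sketch. It is not the Lustre code: the list type
only loosely imitates the kernel's struct list_head, and every name in
it (struct node, node_del_init(), remove_with_cursor()) is hypothetical.
It shows why a cursor resting on an entry has to be advanced before that
entry is unlinked and left self-linked.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on the kernel's
 * struct list_head. All names here are hypothetical. */
struct node {
	struct node *next;
	struct node *prev;
};

/* Make n an empty, self-linked list. */
static void node_init(struct node *n)
{
	n->next = n;
	n->prev = n;
}

/* Insert n just before head, i.e. at the tail of the list. */
static void node_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink n and leave it pointing at itself. */
static void node_del_init(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	node_init(n);
}

/* Remove victim from its list. If the cursor rests on victim, advance it
 * first; otherwise it would be left on a self-linked node whose next
 * pointer leads back to itself, so walking from it never terminates. */
static void remove_with_cursor(struct node **cursor, struct node *victim)
{
	assert(cursor != NULL && victim != NULL);
	if (*cursor == victim)
		*cursor = victim->next;
	node_del_init(victim);
}
```

With a list head -> a -> b and the cursor on a, calling
remove_with_cursor(&cursor, &a) leaves the cursor on b. Without the
check the cursor would stay on a, and a.next == &a after the unlink.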

Signed-off-by: Niu Yawei 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8765
Reviewed-on: https://review.whamcloud.com/23418
Reviewed-by: Yang Sheng 
Reviewed-by: John L. Hammond 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/ptlrpc/client.c  | 15 ++-
 drivers/staging/lustre/lustre/ptlrpc/recover.c | 23 +--
 2 files changed, 11 insertions(+), 27 deletions(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 332b360..8dfb40f 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -2662,11 +2662,16 @@ void ptlrpc_free_committed(struct obd_import *imp)
list_for_each_entry_safe(req, saved, &imp->imp_committed_list,
 rq_replay_list) {
LASSERT(req->rq_transno != 0);
-   if (req->rq_import_generation < imp->imp_generation) {
-   DEBUG_REQ(D_RPCTRACE, req, "free stale open request");
-   ptlrpc_free_request(req);
-   } else if (!req->rq_replay) {
-   DEBUG_REQ(D_RPCTRACE, req, "free closed open request");
+   if (req->rq_import_generation < imp->imp_generation ||
+   !req->rq_replay) {
+   DEBUG_REQ(D_RPCTRACE, req, "free %s open request",
+ req->rq_import_generation <
+ imp->imp_generation ? "stale" : "closed");
+
+   if (imp->imp_replay_cursor == &req->rq_replay_list)
+   imp->imp_replay_cursor =
+   req->rq_replay_list.next;
+
ptlrpc_free_request(req);
}
}
diff --git a/drivers/staging/lustre/lustre/ptlrpc/recover.c b/drivers/staging/lustre/lustre/ptlrpc/recover.c
index c03e113..7b58545 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/recover.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/recover.c
@@ -78,28 +78,11 @@ int ptlrpc_replay_next(struct obd_import *imp, int *inflight)
imp->imp_last_transno_checked = 0;
ptlrpc_free_committed(imp);
last_transno = imp->imp_last_replay_transno;
-   spin_unlock(&imp->imp_lock);
 
CDEBUG(D_HA, "import %p from %s committed %llu last %llu\n",
   imp, obd2cli_tgt(imp->imp_obd),
   imp->imp_peer_committed_transno, last_transno);
 
-   /* Do I need to hold a lock across this iteration?  We shouldn't be
-* racing with any additions to the list, because we're in recovery
-* and are therefore not processing additional requests to add.  Calls
-* to ptlrpc_free_committed might commit requests, but nothing "newer"
-* than the one we're replaying (it can't be committed until it's
-* replayed, and we're doing that here).  l_f_e_safe protects against
-* problems with the current request being committed, in the unlikely
-* event of that race.  So, in conclusion, I think that it's safe to
-* perform this list-walk without the imp_lock held.
-*
-* But, the {mdc,osc}_replay_open callbacks both iterate
-* request lists, and have comments saying they assume the
-* imp_lock is being held by ptlrpc_replay, but it's not. it's
-* just a little race...
-*/
-
/* Replay all the committed open requests on committed_list first */
if (!list_empty(&imp->imp_committed_list)) {
tmp = imp->imp_committed_list.prev;

[PATCH 49/60] staging: lustre: socklnd: remove socklnd_init_msg

2017-01-28 Thread James Simmons

Remove the inline function socklnd_init_msg.
It is only used by kernel code, so there is no point
keeping it in a UAPI header.

Signed-off-by: James Simmons 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6142
Reviewed-on: https://review.whamcloud.com/18506
Reviewed-by: Dmitry Eremin 
Reviewed-by: Doug Oucharek 
Reviewed-by: John L. Hammond 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/include/linux/lnet/socklnd.h| 9 -
 drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c | 9 +++--
 2 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/lnet/socklnd.h b/drivers/staging/lustre/include/linux/lnet/socklnd.h
index 7d24a91..acf20ce 100644
--- a/drivers/staging/lustre/include/linux/lnet/socklnd.h
+++ b/drivers/staging/lustre/include/linux/lnet/socklnd.h
@@ -80,15 +80,6 @@
} WIRE_ATTR ksm_u;
 } WIRE_ATTR ksock_msg_t;
 
-static inline void
-socklnd_init_msg(ksock_msg_t *msg, int type)
-{
-   msg->ksm_csum = 0;
-   msg->ksm_type = type;
-   msg->ksm_zc_cookies[0] = 0;
-   msg->ksm_zc_cookies[1] = 0;
-}
-
 #define KSOCK_MSG_NOOP 0xC0/* ksm_u empty */
 #define KSOCK_MSG_LNET 0xC1/* lnet msg */
 
diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
index b7043e2..b161c2b 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
@@ -80,7 +80,9 @@ struct ksock_tx *
tx->tx_niov= 1;
tx->tx_nonblk  = nonblk;
 
-   socklnd_init_msg(&tx->tx_msg, KSOCK_MSG_NOOP);
+   tx->tx_msg.ksm_csum = 0;
+   tx->tx_msg.ksm_type = KSOCK_MSG_NOOP;
+   tx->tx_msg.ksm_zc_cookies[0] = 0;
tx->tx_msg.ksm_zc_cookies[1] = cookie;
 
return tx;
@@ -1004,7 +1006,10 @@ struct ksock_route *
tx->tx_zc_capable = 1;
}
 
-   socklnd_init_msg(&tx->tx_msg, KSOCK_MSG_LNET);
+   tx->tx_msg.ksm_csum = 0;
+   tx->tx_msg.ksm_type = KSOCK_MSG_LNET;
+   tx->tx_msg.ksm_zc_cookies[0] = 0;
+   tx->tx_msg.ksm_zc_cookies[1] = 0;
 
/* The first fragment will be set later in pro_pack */
rc = ksocknal_launch_packet(ni, tx, target);
-- 
1.8.3.1

[PATCH 48/60] staging: lustre: ksocklnd: ignore timedout TX on closing connection

2017-01-28 Thread James Simmons

From: Liang Zhen 

The ksocklnd reaper thread always tries to close the connection for
the first timed-out zero-copy TX. This is wrong if that connection is
already being closed, because the reaper will see the same TX again
and again and will never find the other timed-out zero-copy TXs or
close their connections.
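
The selection rule can be sketched in plain userspace C as below. This
is only an illustration under stated assumptions, not the ksocklnd
code: the structs, the array-based queue, and first_stale_tx() are
hypothetical, and the sketch assumes the queue is ordered oldest
deadline first so that the scan can stop at the first entry that has
not timed out yet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a connection and a queued zero-copy TX. */
struct conn {
	bool closing;
};

struct tx {
	long deadline;		/* time at which the TX counts as timed out */
	struct conn *conn;
};

/* Scan a queue ordered by deadline (oldest first) and return the first
 * timed-out TX whose connection is not already being closed. The number
 * of such TXs is stored in *n_stale. Returns NULL if there is none. */
static const struct tx *first_stale_tx(const struct tx *txs, size_t count,
				       long now, size_t *n_stale)
{
	const struct tx *stale = NULL;
	size_t i;

	assert(n_stale != NULL);
	*n_stale = 0;

	for (i = 0; i < count; i++) {
		if (now < txs[i].deadline)
			break;		/* the rest have not timed out yet */
		if (txs[i].conn->closing)
			continue;	/* already being torn down, skip it */
		if (!stale)
			stale = &txs[i];
		(*n_stale)++;
	}
	return stale;
}
```

Always picking the head of the queue, as the old code did, would keep
returning a TX on a closing connection; remembering the first eligible
entry during the scan avoids that.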

Signed-off-by: Liang Zhen 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8867
Reviewed-on: https://review.whamcloud.com/23973
Reviewed-by: Doug Oucharek 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c | 16 +---
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
index df4f55e..b7043e2 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd_cb.c
@@ -2456,6 +2456,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
 
list_for_each_entry(peer, peers, ksnp_list) {
unsigned long deadline = 0;
+   struct ksock_tx *tx_stale;
int resid = 0;
int n = 0;
 
@@ -2503,6 +2504,7 @@ void ksocknal_write_callback(struct ksock_conn *conn)
if (list_empty(&peer->ksnp_zc_req_list))
continue;
 
+   tx_stale = NULL;
spin_lock(&peer->ksnp_lock);
list_for_each_entry(tx, &peer->ksnp_zc_req_list, tx_zc_list) {
if (!cfs_time_aftereq(cfs_time_current(),
@@ -2511,26 +2513,26 @@ void ksocknal_write_callback(struct ksock_conn *conn)
/* ignore the TX if connection is being closed */
if (tx->tx_conn->ksnc_closing)
continue;
+   if (!tx_stale)
+   tx_stale = tx;
n++;
}
 
-   if (!n) {
+   if (!tx_stale) {
spin_unlock(&peer->ksnp_lock);
continue;
}
 
-   tx = list_entry(peer->ksnp_zc_req_list.next,
-   struct ksock_tx, tx_zc_list);
-   deadline = tx->tx_deadline;
-   resid = tx->tx_resid;
-   conn = tx->tx_conn;
+   deadline = tx_stale->tx_deadline;
+   resid = tx_stale->tx_resid;
+   conn = tx_stale->tx_conn;
ksocknal_conn_addref(conn);
 
spin_unlock(&peer->ksnp_lock);
read_unlock(&ksocknal_data.ksnd_global_lock);
 
CERROR("Total %d stale ZC_REQs for peer %s detected; the oldest(%p) timed out %ld secs ago, resid: %d, wmem: %d\n",
-  n, libcfs_nid2str(peer->ksnp_id.nid), tx,
+  n, libcfs_nid2str(peer->ksnp_id.nid), tx_stale,
   cfs_duration_sec(cfs_time_current() - deadline),
   resid, conn->ksnc_sock->sk->sk_wmem_queued);
 
-- 
1.8.3.1

[PATCH 43/60] staging: lustre: obd: remove OBD_NOTIFY_CREATE

2017-01-28 Thread James Simmons

From: "John L. Hammond" 

None of the obd_notify() handlers listen for the OBD_NOTIFY_CREATE
event, so remove it and its sole use in lov_add_target().

Signed-off-by: John L. Hammond 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8403
Reviewed-on: https://review.whamcloud.com/21420
Reviewed-by: Ben Evans 
Reviewed-by: Andreas Dilger 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/include/obd.h | 2 --
 drivers/staging/lustre/lustre/lov/lov_obd.c | 2 --
 2 files changed, 4 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index ab47078..4ce8506 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -475,8 +475,6 @@ struct niobuf_local {
  * Events signalled through obd_notify() upcall-chain.
  */
 enum obd_notify_event {
-   /* target added */
-   OBD_NOTIFY_CREATE,
/* Device connect start */
OBD_NOTIFY_CONNECT,
/* Device activated */
diff --git a/drivers/staging/lustre/lustre/lov/lov_obd.c b/drivers/staging/lustre/lustre/lov/lov_obd.c
index 63b0645..b3161fb 100644
--- a/drivers/staging/lustre/lustre/lov/lov_obd.c
+++ b/drivers/staging/lustre/lustre/lov/lov_obd.c
@@ -592,8 +592,6 @@ static int lov_add_target(struct obd_device *obd, struct obd_uuid *uuidp,
CDEBUG(D_CONFIG, "idx=%d ltd_gen=%d ld_tgt_count=%d\n",
   index, tgt->ltd_gen, lov->desc.ld_tgt_count);
 
-   rc = obd_notify(obd, tgt_obd, OBD_NOTIFY_CREATE, );
-
if (lov->lov_connects == 0) {
/* lov_connect hasn't been called yet. We'll do the
 * lov_connect_obd on this target when that fn first runs,
-- 
1.8.3.1

[PATCH 45/60] staging: lustre: libcfs: Change positional struct initializers to C99

2017-01-28 Thread James Simmons

From: Steve Guminski 

This patch makes no functional changes. Struct initializers in the
libcfs directory that use C89 or GCC-only syntax are updated to C99
syntax.

The C99 syntax prevents incorrect initialization if values are
accidentally placed in the wrong position, tolerates changes to the
struct definition, and clears any members that are not given an
explicit value.
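
For illustration only, here is a small standalone sketch of those
properties. The enum, struct, and hash_type_get() below are hypothetical
and deliberately smaller than the real cfs_crypto_hash_type table; they
are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down algorithm list used only for this sketch. */
enum hash_alg {
	ALG_NULL,
	ALG_CRC32,
	ALG_MD5,
	ALG_MAX
};

struct hash_type {
	const char *name;
	unsigned int key;
	unsigned int size;
};

static const struct hash_type hash_types[] = {
	[ALG_NULL] = {
		.name = "null",
		.key  = 0,
		.size = 0
	},
	/* Members can be listed in any order; each value is tied to its
	 * member name rather than to its position. */
	[ALG_CRC32] = {
		.size = 4,
		.name = "crc32",
		.key  = ~0u
	},
	/* .key is not mentioned here, so it is initialized to zero. */
	[ALG_MD5] = {
		.name = "md5",
		.size = 16
	},
};

/* Look up one table entry; alg must be a real algorithm, not ALG_MAX. */
static const struct hash_type *hash_type_get(enum hash_alg alg)
{
	assert(alg < ALG_MAX);
	return &hash_types[alg];
}
```

With positional initializers, reordering or inserting a member in
struct hash_type would silently shift every value in the table; with
designated initializers the table above keeps its meaning.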

The following struct initializers have been updated:

libcfs/include/libcfs/libcfs_crypto.h:
static struct cfs_crypto_hash_type hash_types[]

Signed-off-by: Steve Guminski 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6210
Reviewed-on: https://review.whamcloud.com/23332
Reviewed-by: Frank Zago 
Reviewed-by: Dmitry Eremin 
Reviewed-by: James Simmons 
Reviewed-by: Nathaniel Clark 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 .../lustre/include/linux/libcfs/libcfs_crypto.h| 60 ++
 1 file changed, 50 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/libcfs_crypto.h b/drivers/staging/lustre/include/linux/libcfs/libcfs_crypto.h
index 8f34c5d..3f773a4 100644
--- a/drivers/staging/lustre/include/linux/libcfs/libcfs_crypto.h
+++ b/drivers/staging/lustre/include/linux/libcfs/libcfs_crypto.h
@@ -53,16 +53,56 @@ enum cfs_crypto_hash_alg {
 };
 
 static struct cfs_crypto_hash_type hash_types[] = {
-   [CFS_HASH_ALG_NULL]= { "null", 0,  0 },
-   [CFS_HASH_ALG_ADLER32] = { "adler32",  1,  4 },
-   [CFS_HASH_ALG_CRC32]   = { "crc32",   ~0,  4 },
-   [CFS_HASH_ALG_CRC32C]  = { "crc32c",  ~0,  4 },
-   [CFS_HASH_ALG_MD5] = { "md5",  0, 16 },
-   [CFS_HASH_ALG_SHA1]= { "sha1", 0, 20 },
-   [CFS_HASH_ALG_SHA256]  = { "sha256",   0, 32 },
-   [CFS_HASH_ALG_SHA384]  = { "sha384",   0, 48 },
-   [CFS_HASH_ALG_SHA512]  = { "sha512",   0, 64 },
-   [CFS_HASH_ALG_MAX]  = { NULL,   0,  64 },
+   [CFS_HASH_ALG_NULL] = {
+   .cht_name   = "null",
+   .cht_key= 0,
+   .cht_size   = 0
+   },
+   [CFS_HASH_ALG_ADLER32] = {
+   .cht_name   = "adler32",
+   .cht_key= 1,
+   .cht_size   = 4
+   },
+   [CFS_HASH_ALG_CRC32] = {
+   .cht_name   = "crc32",
+   .cht_key= ~0,
+   .cht_size   = 4
+   },
+   [CFS_HASH_ALG_CRC32C] = {
+   .cht_name   = "crc32c",
+   .cht_key= ~0,
+   .cht_size   = 4
+   },
+   [CFS_HASH_ALG_MD5] = {
+   .cht_name   = "md5",
+   .cht_key= 0,
+   .cht_size   = 16
+   },
+   [CFS_HASH_ALG_SHA1] = {
+   .cht_name   = "sha1",
+   .cht_key= 0,
+   .cht_size   = 20
+   },
+   [CFS_HASH_ALG_SHA256] = {
+   .cht_name   = "sha256",
+   .cht_key= 0,
+   .cht_size   = 32
+   },
+   [CFS_HASH_ALG_SHA384] = {
+   .cht_name   = "sha384",
+   .cht_key= 0,
+   .cht_size   = 48
+   },
+   [CFS_HASH_ALG_SHA512] = {
+   .cht_name   = "sha512",
+   .cht_key= 0,
+   .cht_size   = 64
+   },
+   [CFS_HASH_ALG_MAX] = {
+   .cht_name   = NULL,
+   .cht_key= 0,
+   .cht_size   = 64
+   },
 };
 
 /* Maximum size of hash_types[].cht_size */
-- 
1.8.3.1

[PATCH 41/60] staging: lustre: osc: osc_match_base prototype differs from declaration

2017-01-28 Thread James Simmons

From: Steve Guminski 

The patch updates the prototype in osc_internal.h to match the
enums used in the definition.

The osc_match_base definition in lustre/osc/osc_request.c uses
enums for stricter checking on the type and mode parameters:

int osc_match_base(struct obd_export *exp,
   ...
-->enum ldlm_type type,
   union ldlm_policy_data *policy,
-->enum ldlm_mode mode,
   ...  int unref)

The prototype in lustre/osc/osc_internal.h instead used unsigned ints:

int osc_match_base(struct obd_export *exp,
   ...
-->__u32 type,
   union ldlm_policy_data *policy,
-->__u32 mode,
   ...  int unref);

Signed-off-by: Steve Guminski 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8189
Reviewed-on: http://review.whamcloud.com/23167
Reviewed-by: Frank Zago 
Reviewed-by: Bob Glossman 
Reviewed-by: James Simmons 
Reviewed-by: John L. Hammond 
Reviewed-by: Dmitry Eremin 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/osc/osc_internal.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_internal.h b/drivers/staging/lustre/lustre/osc/osc_internal.h
index 43a43e4..8abd83f 100644
--- a/drivers/staging/lustre/lustre/osc/osc_internal.h
+++ b/drivers/staging/lustre/lustre/osc/osc_internal.h
@@ -114,9 +114,9 @@ int osc_enqueue_base(struct obd_export *exp, struct ldlm_res_id *res_id,
 struct ptlrpc_request_set *rqset, int async, int agl);
 
 int osc_match_base(struct obd_export *exp, struct ldlm_res_id *res_id,
-  __u32 type, union ldlm_policy_data *policy, __u32 mode,
-  __u64 *flags, void *data, struct lustre_handle *lockh,
-  int unref);
+  enum ldlm_type type, union ldlm_policy_data *policy,
+  enum ldlm_mode mode, __u64 *flags, void *data,
+  struct lustre_handle *lockh, int unref);
 
 int osc_setattr_async(struct obd_export *exp, struct obdo *oa,
  obd_enqueue_update_f upcall, void *cookie,
-- 
1.8.3.1
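
For readers following along, here is a small sketch of the end state this patch aims for: a prototype and a definition that spell their parameters with the same enum types. The enum and function names are hypothetical, not the Lustre ones. As far as I know, GCC normally treats an enum without negative values as compatible with unsigned int, which would explain why the old __u32 prototype compiled without complaint; treat that as background rather than a guarantee for every compiler.

```c
#include <assert.h>

/* Hypothetical enums standing in for enum ldlm_type and enum ldlm_mode. */
enum lock_type { LOCK_PLAIN = 10, LOCK_EXTENT = 11 };
enum lock_mode { MODE_READ = 1, MODE_WRITE = 2 };

/* Prototype, as it would appear in a shared header. */
int match_base(enum lock_type type, enum lock_mode mode);

/* Definition with exactly the same parameter types as the prototype. */
int match_base(enum lock_type type, enum lock_mode mode)
{
	return type == LOCK_EXTENT && (mode & MODE_WRITE) != 0;
}

void check_match_base(void)
{
	assert(match_base(LOCK_EXTENT, MODE_WRITE) == 1);
	assert(match_base(LOCK_PLAIN, MODE_WRITE) == 0);
}
```

Keeping both declarations on the enum types lets static checkers flag callers that pass an unrelated integer.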

[PATCH 42/60] staging: lustre: ptlrpc: allow blocking asts to be delayed

2017-01-28 Thread James Simmons

From: Vladimir Saveliev 

ptlrpc_import_delay_req() refuses to delay blocking ASTs when the
import is not yet in LUSTRE_IMP_FULL. The client is then evicted on
the assumption that it failed to respond.

Allow blocking ASTs that are being resent to be delayed.

Signed-off-by: Vladimir Saveliev 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8351
Seagate-bug-id: MRP-3500
Reviewed-on: https://review.whamcloud.com/21065
Reviewed-by: Bobi Jam 
Reviewed-by: John L. Hammond 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/ptlrpc/client.c  | 2 +-
 drivers/staging/lustre/lustre/ptlrpc/recover.c | 1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/ptlrpc/client.c b/drivers/staging/lustre/lustre/ptlrpc/client.c
index 3c18ab6..332b360 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/client.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/client.c
@@ -1160,7 +1160,7 @@ static int ptlrpc_import_delay_req(struct obd_import *imp,
if (atomic_read(&imp->imp_inval_count) != 0) {
DEBUG_REQ(D_ERROR, req, "invalidate in flight");
*status = -EIO;
-   } else if (imp->imp_dlm_fake || req->rq_no_delay) {
+   } else if (req->rq_no_delay) {
*status = -EWOULDBLOCK;
} else if (req->rq_allow_replay &&
  (imp->imp_state == LUSTRE_IMP_REPLAY ||
diff --git a/drivers/staging/lustre/lustre/ptlrpc/recover.c b/drivers/staging/lustre/lustre/ptlrpc/recover.c
index c004490..c03e113 100644
--- a/drivers/staging/lustre/lustre/ptlrpc/recover.c
+++ b/drivers/staging/lustre/lustre/ptlrpc/recover.c
@@ -221,6 +221,7 @@ int ptlrpc_resend(struct obd_import *imp)
}
spin_unlock(&imp->imp_lock);
 
+   OBD_FAIL_TIMEOUT(OBD_FAIL_LDLM_ENQUEUE_OLD_EXPORT, 2);
return 0;
 }
 
-- 
1.8.3.1

[PATCH 39/60] staging: libcfs: remove integer types abstraction from libcfs

2017-01-28 Thread James Simmons

Replace the ulong_ptr_t and long_ptr_t with standard
kernel types.

Signed-off-by: James Simmons 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6245
Reviewed-on: http://review.whamcloud.com/20204
Reviewed-by: Frank Zago 
Reviewed-by: Dmitry Eremin 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h | 4 
 drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c| 2 +-
 drivers/staging/lustre/lnet/libcfs/debug.c | 2 +-
 drivers/staging/lustre/lnet/lnet/acceptor.c| 4 ++--
 4 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h b/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h
index e8695e4..fa0808d 100644
--- a/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h
+++ b/drivers/staging/lustre/include/linux/libcfs/linux/libcfs.h
@@ -125,10 +125,6 @@
 
 #include 
 
-/* long integer with size equal to pointer */
-typedef unsigned long ulong_ptr_t;
-typedef long long_ptr_t;
-
 #ifndef WITH_WATCHDOG
 #define WITH_WATCHDOG
 #endif
diff --git a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
index 2181c67..8aab001 100644
--- a/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
+++ b/drivers/staging/lustre/lnet/klnds/socklnd/socklnd.c
@@ -2507,7 +2507,7 @@ static int ksocknal_push(lnet_ni_t *ni, lnet_process_id_t id)
 
snprintf(name, sizeof(name), "socknal_cd%02d", i);
rc = ksocknal_thread_start(ksocknal_connd,
-  (void *)((ulong_ptr_t)i), name);
+  (void *)((uintptr_t)i), name);
if (rc) {
spin_lock_bh(&ksocknal_data.ksnd_connd_lock);
ksocknal_data.ksnd_connd_starting--;
diff --git a/drivers/staging/lustre/lnet/libcfs/debug.c b/drivers/staging/lustre/lnet/libcfs/debug.c
index a38db23..3408041 100644
--- a/drivers/staging/lustre/lnet/libcfs/debug.c
+++ b/drivers/staging/lustre/lnet/libcfs/debug.c
@@ -343,7 +343,7 @@ void libcfs_debug_dumplog_internal(void *arg)
last_dump_time = current_time;
snprintf(debug_file_name, sizeof(debug_file_name) - 1,
 "%s.%lld.%ld", libcfs_debug_file_path_arr,
-(s64)current_time, (long_ptr_t)arg);
+(s64)current_time, (long)arg);
pr_alert("LustreError: dumping log to %s\n", debug_file_name);
cfs_tracefile_dump_all_pages(debug_file_name);
libcfs_run_debug_log_upcall(debug_file_name);
diff --git a/drivers/staging/lustre/lnet/lnet/acceptor.c b/drivers/staging/lustre/lnet/lnet/acceptor.c
index a55c6cd..b43a994 100644
--- a/drivers/staging/lustre/lnet/lnet/acceptor.c
+++ b/drivers/staging/lustre/lnet/lnet/acceptor.c
@@ -330,7 +330,7 @@
__u32 magic;
__u32 peer_ip;
int peer_port;
-   int secure = (int)((long_ptr_t)arg);
+   int secure = (int)((long)arg);
 
LASSERT(!lnet_acceptor_state.pta_sock);
 
@@ -459,7 +459,7 @@
if (!lnet_count_acceptor_nis())  /* not required */
return 0;
 
-   task = kthread_run(lnet_acceptor, (void *)(ulong_ptr_t)secure,
+   task = kthread_run(lnet_acceptor, (void *)(uintptr_t)secure,
   "acceptor_%03ld", secure);
if (IS_ERR(task)) {
rc2 = PTR_ERR(task);
-- 
1.8.3.1
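
The casts touched by this patch all follow one idiom: carrying a small integer through the void * argument of a thread function. A minimal userspace sketch of that round trip is below; the helper names are hypothetical, and <stdint.h> stands in for the kernel's own definition of uintptr_t.

```c
#include <assert.h>
#include <stdint.h>

/* Pack a small non-negative integer into a thread function's void * argument. */
void *int_to_arg(int value)
{
	return (void *)(uintptr_t)value;
}

/* Recover the integer inside the thread function. */
int arg_to_int(void *arg)
{
	return (int)(uintptr_t)arg;
}

void check_round_trip(void)
{
	assert(arg_to_int(int_to_arg(0)) == 0);
	assert(arg_to_int(int_to_arg(7)) == 7);
}
```

Going through uintptr_t first avoids the warning a direct int-to-pointer cast would raise on platforms where the two differ in size.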

[PATCH 37/60] staging: lustre: llite: specify READA debug mask for ras_update

2017-01-28 Thread James Simmons

From: Bobi Jam 

Specify the D_READA debug mask for the read-ahead miss message in
ras_update() so that the debug log contains only the messages
relevant for debugging.

Signed-off-by: Bobi Jam 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8413
Reviewed-on: http://review.whamcloud.com/22753
Reviewed-by: Andreas Dilger 
Reviewed-by: Fan Yong 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/llite/rw.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/staging/lustre/lustre/llite/rw.c b/drivers/staging/lustre/lustre/llite/rw.c
index 18d3ccb..50d027e 100644
--- a/drivers/staging/lustre/lustre/llite/rw.c
+++ b/drivers/staging/lustre/lustre/llite/rw.c
@@ -729,6 +729,10 @@ static void ras_update(struct ll_sb_info *sbi, struct inode *inode,
 
spin_lock(&ras->ras_lock);
 
+   if (!hit)
+   CDEBUG(D_READA, DFID " pages at %lu miss.\n",
+  PFID(ll_inode2fid(inode)), index);
+
ll_ra_stats_inc_sbi(sbi, hit ? RA_STAT_HIT : RA_STAT_MISS);
 
/* reset the read-ahead window in two cases.  First when the app seeks
-- 
1.8.3.1
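
To illustrate the idea of tagging a message with a debug mask, here is a simplified userspace sketch of mask-gated logging. It is not the libcfs CDEBUG implementation; the mask values, the debug_mask variable, and the helper functions are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>

#define DBG_READA 0x1u	/* hypothetical read-ahead category */
#define DBG_INFO  0x2u	/* hypothetical general category */

/* Hypothetical runtime mask; only categories whose bit is set are logged. */
static unsigned int debug_mask = DBG_READA;

void set_debug_mask(unsigned int mask)
{
	debug_mask = mask;
}

/* Returns 1 if the message was emitted, 0 if it was filtered out. */
int debug_log(unsigned int category, const char *msg)
{
	if (!(debug_mask & category))
		return 0;
	fprintf(stderr, "%s\n", msg);
	return 1;
}

void check_debug_mask(void)
{
	set_debug_mask(DBG_READA);
	assert(debug_log(DBG_READA, "pages at 12 miss.") == 1);
	assert(debug_log(DBG_INFO, "unrelated message") == 0);
}
```

With the miss message filed under its own category, someone debugging read-ahead can enable just that bit and leave the rest off.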

[PATCH 34/60] staging: lustre: libcfs: default CPT matches NUMA topology

2017-01-28 Thread James Simmons

From: Dmitry Eremin 

Change the default value of the CPT pattern so that it matches the
NUMA topology.

Signed-off-by: Liang Zhen 
Signed-off-by: Dmitry Eremin 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5050
Reviewed-on: http://review.whamcloud.com/22377
Reviewed-by: James Simmons 
Reviewed-by: Olaf Weber 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
index 71a5b19..62ab76e 100644
--- a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
@@ -59,7 +59,7 @@
  *
  * NB: If user specified cpu_pattern, cpu_npartitions will be ignored
  */
-static char*cpu_pattern = "";
+static char*cpu_pattern = "N";
 module_param(cpu_pattern, charp, 0444);
 MODULE_PARM_DESC(cpu_pattern, "CPU partitions pattern");
 
-- 
1.8.3.1

[PATCH 36/60] staging: lustre: header: remove assert from interval_set()

2017-01-28 Thread James Simmons

In interval_tree.h, only interval_set() uses LASSERT. This patch
removes that assertion, and interval_set() now reports a real
error instead. The libcfs.h header is therefore no longer needed
by interval_tree.h, so we can just use the standard Linux kernel
headers instead.

Signed-off-by: James Simmons 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6401
Reviewed-on: https://review.whamcloud.com/22522
Reviewed-on: https://review.whamcloud.com/24323
Reviewed-by: Frank Zago 
Reviewed-by: Dmitry Eremin 
Reviewed-by: John L. Hammond 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/include/interval_tree.h | 12 
 drivers/staging/lustre/lustre/ldlm/ldlm_extent.c  |  5 +++--
 drivers/staging/lustre/lustre/llite/range_lock.c  | 10 --
 drivers/staging/lustre/lustre/llite/range_lock.h  |  2 +-
 4 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/interval_tree.h b/drivers/staging/lustre/lustre/include/interval_tree.h
index 5d387d3..0d4f92e 100644
--- a/drivers/staging/lustre/lustre/include/interval_tree.h
+++ b/drivers/staging/lustre/lustre/include/interval_tree.h
@@ -36,7 +36,9 @@
 #ifndef _INTERVAL_H__
 #define _INTERVAL_H__
 
-#include "../../include/linux/libcfs/libcfs.h" /* LASSERT. */
+#include 
+#include 
+#include 
 
 struct interval_node {
struct interval_node   *in_left;
@@ -73,13 +75,15 @@ static inline __u64 interval_high(struct interval_node *node)
return node->in_extent.end;
 }
 
-static inline void interval_set(struct interval_node *node,
-   __u64 start, __u64 end)
+static inline int interval_set(struct interval_node *node,
+  __u64 start, __u64 end)
 {
-   LASSERT(start <= end);
+   if (start > end)
+   return -ERANGE;
node->in_extent.start = start;
node->in_extent.end = end;
node->in_max_high = end;
+   return 0;
 }
 
 /*
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c b/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
index 5616ea4..08f97e2 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
@@ -162,7 +162,7 @@ void ldlm_extent_add_lock(struct ldlm_resource *res,
struct interval_node *found, **root;
struct ldlm_interval *node;
struct ldlm_extent *extent;
-   int idx;
+   int idx, rc;
 
LASSERT(lock->l_granted_mode == lock->l_req_mode);
 
@@ -176,7 +176,8 @@ void ldlm_extent_add_lock(struct ldlm_resource *res,
 
/* node extent initialize */
extent = &lock->l_policy_data.l_extent;
-   interval_set(&node->li_node, extent->start, extent->end);
+   rc = interval_set(&node->li_node, extent->start, extent->end);
+   LASSERT(!rc);
 
root = &res->lr_itree[idx].lit_root;
found = interval_insert(&node->li_node, root);
diff --git a/drivers/staging/lustre/lustre/llite/range_lock.c b/drivers/staging/lustre/lustre/llite/range_lock.c
index 94c818f..14148a0 100644
--- a/drivers/staging/lustre/lustre/llite/range_lock.c
+++ b/drivers/staging/lustre/lustre/llite/range_lock.c
@@ -61,17 +61,23 @@ void range_lock_tree_init(struct range_lock_tree *tree)
  * Pre:  Caller should have allocated the range lock node.
  * Post: The range lock node is meant to cover [start, end] region
  */
-void range_lock_init(struct range_lock *lock, __u64 start, __u64 end)
+int range_lock_init(struct range_lock *lock, __u64 start, __u64 end)
 {
+   int rc;
+
memset(&lock->rl_node, 0, sizeof(lock->rl_node));
if (end != LUSTRE_EOF)
end >>= PAGE_SHIFT;
-   interval_set(&lock->rl_node, start >> PAGE_SHIFT, end);
+   rc = interval_set(&lock->rl_node, start >> PAGE_SHIFT, end);
+   if (rc)
+   return rc;
+
INIT_LIST_HEAD(&lock->rl_next_lock);
lock->rl_task = NULL;
lock->rl_lock_count = 0;
lock->rl_blocking_ranges = 0;
lock->rl_sequence = 0;
+   return rc;
 }
 
 static inline struct range_lock *next_lock(struct range_lock *lock)
diff --git a/drivers/staging/lustre/lustre/llite/range_lock.h b/drivers/staging/lustre/lustre/llite/range_lock.h
index c6d04a6..779091c 100644
--- a/drivers/staging/lustre/lustre/llite/range_lock.h
+++ b/drivers/staging/lustre/lustre/llite/range_lock.h
@@ -76,7 +76,7 @@ struct range_lock_tree {
 };
 
 void range_lock_tree_init(struct range_lock_tree *tree);
-void range_lock_init(struct range_lock *lock, __u64 start, __u64 end);
+int range_lock_init(struct range_lock *lock, __u64 start, __u64 end);
 int  range_lock(struct range_lock_tree *tree, struct range_lock *lock);
 void range_unlock(struct range_lock_tree *tree, struct range_lock *lock);
 #endif
-- 
1.8.3.1
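
The pattern in this patch, returning an error code from a setter instead of asserting and then propagating it through the caller, can be sketched in userspace as follows. The struct layout and the range_init() helper are simplified, hypothetical stand-ins for interval_node and range_lock_init(); uint64_t replaces __u64.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified, hypothetical stand-in for struct interval_node. */
struct interval {
	uint64_t start;
	uint64_t end;
	uint64_t max_high;
};

/* Reject an inverted range with -ERANGE instead of asserting. */
int interval_set(struct interval *node, uint64_t start, uint64_t end)
{
	if (start > end)
		return -ERANGE;
	node->start = start;
	node->end = end;
	node->max_high = end;
	return 0;
}

/* Caller propagates the error rather than continuing with a bad node. */
int range_init(struct interval *node, uint64_t start, uint64_t end)
{
	int rc = interval_set(node, start, end);

	if (rc)
		return rc;
	/* The remaining initialization would go here. */
	return 0;
}

void check_interval_set(void)
{
	struct interval n = { 0, 0, 0 };

	assert(interval_set(&n, 1, 5) == 0);
	assert(n.max_high == 5);
	assert(range_init(&n, 9, 3) == -ERANGE);
	assert(n.end == 5);	/* node is left untouched on error */
}
```

A caller that genuinely cannot handle the failure can still assert on the return value, as ldlm_extent_add_lock() does in the patch.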

[PATCH 35/60] staging: lustre: lov: ld_target could be NULL

2017-01-28 Thread James Simmons

From: Bobi Jam 

lov_device::ld_target[ost_idx] could be NULL if the OST target is
not filled in lov_device::ld_lov::lov_tgt_desc[ost_idx] yet.

Signed-off-by: Bobi Jam 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8018
Reviewed-on: http://review.whamcloud.com/21411
Reviewed-by: Jinshan Xiong 
Reviewed-by: John L. Hammond 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/lov/lov_object.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c b/drivers/staging/lustre/lustre/lov/lov_object.c
index 9c4b5ab..977579c 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -266,6 +266,13 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
if (result != 0)
goto out;
 
+   if (!dev->ld_target[ost_idx]) {
+   CERROR("%s: OST %04x is not initialized\n",
+   lov2obd(dev->ld_lov)->obd_name, ost_idx);
+   result = -EIO;
+   goto out;
+   }
+
subdev = lovsub2cl_dev(dev->ld_target[ost_idx]);
subconf->u.coc_oinfo = oinfo;
LASSERTF(subdev, "not init ost %d\n", ost_idx);
-- 
1.8.3.1
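
The check added here follows the usual kernel shape: test the table slot for NULL, set an error code, and jump to the common exit label before anything dereferences the pointer. A minimal userspace sketch, with hypothetical types and names rather than the lov ones:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_TARGETS 4

/* Hypothetical stand-ins for the target table in lov_device. */
struct target { int id; };
struct device { struct target *targets[NR_TARGETS]; };

/* Look up a target, failing with -EIO if the slot is not filled in yet. */
int init_stripe(const struct device *dev, unsigned int idx, int *id_out)
{
	int result = 0;

	if (idx >= NR_TARGETS) {
		result = -EINVAL;
		goto out;
	}
	if (!dev->targets[idx]) {
		result = -EIO;
		goto out;
	}
	*id_out = dev->targets[idx]->id;
out:
	return result;
}

void check_init_stripe(void)
{
	struct target t = { 42 };
	struct device d = { { &t, NULL, NULL, NULL } };
	int id = 0;

	assert(init_stripe(&d, 0, &id) == 0);
	assert(id == 42);
	assert(init_stripe(&d, 1, &id) == -EIO);
	assert(id == 42);	/* output is untouched on failure */
}
```

Returning an error for an unfilled slot lets the caller fail the one operation instead of crashing on a NULL dereference.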

[PATCH 36/60] staging: lustre: header: remove assert from interval_set()

2017-01-28 Thread James Simmons

In the case of interval_tree.h only interval_set()
uses LASSERT which is removed in this patch and
interval_set() instead reports a real error. The
header libcfs.h for interval_tree.h is not needed
anymore so we can just use the standard linux
kernel headers instead.h

Signed-off-by: James Simmons 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6401
Reviewed-on: https://review.whamcloud.com/22522
Reviewed-on: https://review.whamcloud.com/24323
Reviewed-by: Frank Zago 
Reviewed-by: Dmitry Eremin 
Reviewed-by: John L. Hammond 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/include/interval_tree.h | 12 
 drivers/staging/lustre/lustre/ldlm/ldlm_extent.c  |  5 +++--
 drivers/staging/lustre/lustre/llite/range_lock.c  | 10 --
 drivers/staging/lustre/lustre/llite/range_lock.h  |  2 +-
 4 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/interval_tree.h 
b/drivers/staging/lustre/lustre/include/interval_tree.h
index 5d387d3..0d4f92e 100644
--- a/drivers/staging/lustre/lustre/include/interval_tree.h
+++ b/drivers/staging/lustre/lustre/include/interval_tree.h
@@ -36,7 +36,9 @@
 #ifndef _INTERVAL_H__
 #define _INTERVAL_H__
 
-#include "../../include/linux/libcfs/libcfs.h" /* LASSERT. */
+#include 
+#include 
+#include 
 
 struct interval_node {
struct interval_node   *in_left;
@@ -73,13 +75,15 @@ static inline __u64 interval_high(struct interval_node 
*node)
return node->in_extent.end;
 }
 
-static inline void interval_set(struct interval_node *node,
-   __u64 start, __u64 end)
+static inline int interval_set(struct interval_node *node,
+  __u64 start, __u64 end)
 {
-   LASSERT(start <= end);
+   if (start > end)
+   return -ERANGE;
node->in_extent.start = start;
node->in_extent.end = end;
node->in_max_high = end;
+   return 0;
 }
 
 /*
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
index 5616ea4..08f97e2 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_extent.c
@@ -162,7 +162,7 @@ void ldlm_extent_add_lock(struct ldlm_resource *res,
struct interval_node *found, **root;
struct ldlm_interval *node;
struct ldlm_extent *extent;
-   int idx;
+   int idx, rc;
 
LASSERT(lock->l_granted_mode == lock->l_req_mode);
 
@@ -176,7 +176,8 @@ void ldlm_extent_add_lock(struct ldlm_resource *res,
 
/* node extent initialize */
extent = >l_policy_data.l_extent;
-   interval_set(>li_node, extent->start, extent->end);
+   rc = interval_set(>li_node, extent->start, extent->end);
+   LASSERT(!rc);
 
root = >lr_itree[idx].lit_root;
found = interval_insert(>li_node, root);
diff --git a/drivers/staging/lustre/lustre/llite/range_lock.c 
b/drivers/staging/lustre/lustre/llite/range_lock.c
index 94c818f..14148a0 100644
--- a/drivers/staging/lustre/lustre/llite/range_lock.c
+++ b/drivers/staging/lustre/lustre/llite/range_lock.c
@@ -61,17 +61,23 @@ void range_lock_tree_init(struct range_lock_tree *tree)
  * Pre:  Caller should have allocated the range lock node.
  * Post: The range lock node is meant to cover [start, end] region
  */
-void range_lock_init(struct range_lock *lock, __u64 start, __u64 end)
+int range_lock_init(struct range_lock *lock, __u64 start, __u64 end)
 {
+   int rc;
+
memset(>rl_node, 0, sizeof(lock->rl_node));
if (end != LUSTRE_EOF)
end >>= PAGE_SHIFT;
-   interval_set(>rl_node, start >> PAGE_SHIFT, end);
+   rc = interval_set(>rl_node, start >> PAGE_SHIFT, end);
+   if (rc)
+   return rc;
+
INIT_LIST_HEAD(>rl_next_lock);
lock->rl_task = NULL;
lock->rl_lock_count = 0;
lock->rl_blocking_ranges = 0;
lock->rl_sequence = 0;
+   return rc;
 }
 
 static inline struct range_lock *next_lock(struct range_lock *lock)
diff --git a/drivers/staging/lustre/lustre/llite/range_lock.h 
b/drivers/staging/lustre/lustre/llite/range_lock.h
index c6d04a6..779091c 100644
--- a/drivers/staging/lustre/lustre/llite/range_lock.h
+++ b/drivers/staging/lustre/lustre/llite/range_lock.h
@@ -76,7 +76,7 @@ struct range_lock_tree {
 };
 
 void range_lock_tree_init(struct range_lock_tree *tree);
-void range_lock_init(struct range_lock *lock, __u64 start, __u64 end);
+int range_lock_init(struct range_lock *lock, __u64 start, __u64 end);
 int  range_lock(struct range_lock_tree *tree, struct range_lock *lock);
 void range_unlock(struct range_lock_tree *tree, struct range_lock *lock);
 #endif
-- 
1.8.3.1

[PATCH 35/60] staging: lustre: lov: ld_target could be NULL

2017-01-28 Thread James Simmons

From: Bobi Jam 

lov_device::ld_target[ost_idx] could be NULL if the OST target is
not filled in lov_device::ld_lov::lov_tgt_desc[ost_idx] yet.

Signed-off-by: Bobi Jam 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8018
Reviewed-on: http://review.whamcloud.com/21411
Reviewed-by: Jinshan Xiong 
Reviewed-by: John L. Hammond 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/lov/lov_object.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/staging/lustre/lustre/lov/lov_object.c 
b/drivers/staging/lustre/lustre/lov/lov_object.c
index 9c4b5ab..977579c 100644
--- a/drivers/staging/lustre/lustre/lov/lov_object.c
+++ b/drivers/staging/lustre/lustre/lov/lov_object.c
@@ -266,6 +266,13 @@ static int lov_init_raid0(const struct lu_env *env, struct lov_device *dev,
if (result != 0)
goto out;
 
+   if (!dev->ld_target[ost_idx]) {
+   CERROR("%s: OST %04x is not initialized\n",
+   lov2obd(dev->ld_lov)->obd_name, ost_idx);
+   result = -EIO;
+   goto out;
+   }
+
subdev = lovsub2cl_dev(dev->ld_target[ost_idx]);
subconf->u.coc_oinfo = oinfo;
LASSERTF(subdev, "not init ost %d\n", ost_idx);
-- 
1.8.3.1
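
The guard added by this patch follows a common shape: check that a table slot is populated before dereferencing it, and fail with an I/O error otherwise. A minimal userspace sketch, with hypothetical `sketch_` names; the extra bounds check on the index is an assumption added for the example and is not part of the patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

struct sketch_target {
	const char *name;
};

/*
 * Look up targets[idx]. Returns 0 and stores the target in *out, or -EIO
 * if the slot is out of range or has not been filled in yet.
 */
static int sketch_pick_target(struct sketch_target *const *targets,
			      size_t count, size_t idx,
			      const struct sketch_target **out)
{
	if (idx >= count || !targets[idx]) {
		fprintf(stderr, "target %04zx is not initialized\n", idx);
		return -EIO;
	}
	*out = targets[idx];
	return 0;
}
```

Returning an error here lets the caller unwind normally instead of hitting an assertion or a NULL dereference further down.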

[PATCH 04/60] staging: lustre: mdc: quiet console message for known -EINTR

2017-01-28 Thread James Simmons

From: Andreas Dilger 

If a user process is waiting for MDS recovery during close, but the
process is interrupted, the file is still closed but it prints a
message on the console. Quiet the console message for -EINTR, since
this is expected behaviour.

Signed-off-by: Andreas Dilger 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-6627
Reviewed-on: http://review.whamcloud.com/14911
Reviewed-by: Frank Zago 
Reviewed-by: Emoly Liu 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/llite/file.c | 29 +
 1 file changed, 13 insertions(+), 16 deletions(-)

diff --git a/drivers/staging/lustre/lustre/llite/file.c b/drivers/staging/lustre/lustre/llite/file.c
index 0ee02f1..a1e51a5 100644
--- a/drivers/staging/lustre/lustre/llite/file.c
+++ b/drivers/staging/lustre/lustre/llite/file.c
@@ -122,26 +122,25 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
 enum mds_op_bias bias,
 void *data)
 {
-   struct obd_export *exp = ll_i2mdexp(inode);
+   const struct ll_inode_info *lli = ll_i2info(inode);
struct md_op_data *op_data;
struct ptlrpc_request *req = NULL;
-   struct obd_device *obd = class_exp2obd(exp);
int rc;
 
-   if (!obd) {
-   /*
-* XXX: in case of LMV, is this correct to access
-* ->exp_handle?
-*/
-   CERROR("Invalid MDC connection handle %#llx\n",
-  ll_i2mdexp(inode)->exp_handle.h_cookie);
+   if (!class_exp2obd(md_exp)) {
+   CERROR("%s: invalid MDC connection handle closing " DFID "\n",
+  ll_get_fsname(inode->i_sb, NULL, 0),
+  PFID(&lli->lli_fid));
rc = 0;
goto out;
}
 
op_data = kzalloc(sizeof(*op_data), GFP_NOFS);
+   /*
+* We leak openhandle and request here on error, but not much to be
+* done in OOM case since app won't retry close on error either.
+*/
if (!op_data) {
-   /* XXX We leak openhandle and request here. */
rc = -ENOMEM;
goto out;
}
@@ -170,10 +169,9 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
}
 
rc = md_close(md_exp, op_data, och->och_mod, &req);
-   if (rc) {
-   CERROR("%s: inode "DFID" mdc close failed: rc = %d\n",
-  ll_i2mdexp(inode)->exp_obd->obd_name,
-  PFID(ll_inode2fid(inode)), rc);
+   if (rc && rc != -EINTR) {
+   CERROR("%s: inode " DFID " mdc close failed: rc = %d\n",
+  md_exp->exp_obd->obd_name, PFID(&lli->lli_fid), rc);
}
 
if (op_data->op_bias & (MDS_HSM_RELEASE | MDS_CLOSE_LAYOUT_SWAP) &&
@@ -192,8 +190,7 @@ static int ll_close_inode_openhandle(struct obd_export *md_exp,
och->och_fh.cookie = DEAD_HANDLE_MAGIC;
kfree(och);
 
-   if (req) /* This is close request */
-   ptlrpc_req_finished(req);
+   ptlrpc_req_finished(req);
return rc;
 }
 
-- 
1.8.3.1
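
The logging change in this patch reduces to one predicate: report a close failure unless the return code is zero or -EINTR. A minimal sketch with a hypothetical helper name:

```c
#include <errno.h>
#include <stdbool.h>

/*
 * True when a close() result deserves a console message. An interrupted
 * wait (-EINTR) is treated as expected and stays quiet.
 */
static bool sketch_should_log_close_error(int rc)
{
	return rc != 0 && rc != -EINTR;
}
```

The return code itself is still passed back to the caller unchanged; only the console message is suppressed.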

[PATCH 32/60] staging: lustre: osc: limits the number of chunks in write RPC

2017-01-28 Thread James Simmons

From: Jinshan Xiong 

OSC has to make sure that it won't issue write RPCs with too many
chunks; otherwise it will cause ZFS to create transactions much
bigger than DMU_MAX_ACCESS in size, which will end in write
failure.

Signed-off-by: Jinshan Xiong 
Signed-off-by: Dmitry Eremin 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8135
Reviewed-on: http://review.whamcloud.com/22369
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-8632
Reviewed-on: http://review.whamcloud.com/22654
Reviewed-by: Andreas Dilger 
Reviewed-by: Patrick Farrell 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/osc/osc_cache.c | 124 ++
 1 file changed, 87 insertions(+), 37 deletions(-)

diff --git a/drivers/staging/lustre/lustre/osc/osc_cache.c b/drivers/staging/lustre/lustre/osc/osc_cache.c
index 72dd554..0490478 100644
--- a/drivers/staging/lustre/lustre/osc/osc_cache.c
+++ b/drivers/staging/lustre/lustre/osc/osc_cache.c
@@ -1882,16 +1882,32 @@ static void osc_ap_completion(const struct lu_env *env, struct client_obd *cli,
   oap, osc, rc);
 }
 
+struct extent_rpc_data {
+   struct list_head   *erd_rpc_list;
+   unsigned int        erd_page_count;
+   unsigned int        erd_max_pages;
+   unsigned int        erd_max_chunks;
+};
+
+static inline unsigned osc_extent_chunks(const struct osc_extent *ext)
+{
+   struct client_obd *cli = osc_cli(ext->oe_obj);
+   unsigned ppc_bits = cli->cl_chunkbits - PAGE_SHIFT;
+
+   return (ext->oe_end >> ppc_bits) - (ext->oe_start >> ppc_bits) + 1;
+}
+
 /**
  * Try to add extent to one RPC. We need to think about the following things:
  * - # of pages must not be over max_pages_per_rpc
  * - extent must be compatible with previous ones
  */
 static int try_to_add_extent_for_io(struct client_obd *cli,
-   struct osc_extent *ext, struct list_head *rpclist,
-   unsigned int *pc, unsigned int *max_pages)
+   struct osc_extent *ext,
+   struct extent_rpc_data *data)
 {
struct osc_extent *tmp;
+   unsigned int chunk_count;
struct osc_async_page *oap = list_first_entry(&ext->oe_pages,
  struct osc_async_page,
  oap_pending_item);
@@ -1899,19 +1915,22 @@ static int try_to_add_extent_for_io(struct client_obd *cli,
EASSERT((ext->oe_state == OES_CACHE || ext->oe_state == OES_LOCK_DONE),
ext);
 
-   *max_pages = max(ext->oe_mppr, *max_pages);
-   if (*pc + ext->oe_nr_pages > *max_pages)
+   chunk_count = osc_extent_chunks(ext);
+   if (chunk_count > data->erd_max_chunks)
+   return 0;
+
+   data->erd_max_pages = max(ext->oe_mppr, data->erd_max_pages);
+   if (data->erd_page_count + ext->oe_nr_pages > data->erd_max_pages)
return 0;
 
-   list_for_each_entry(tmp, rpclist, oe_link) {
+   list_for_each_entry(tmp, data->erd_rpc_list, oe_link) {
struct osc_async_page *oap2;
 
oap2 = list_first_entry(&tmp->oe_pages, struct osc_async_page,
oap_pending_item);
EASSERT(tmp->oe_owner == current, tmp);
if (oap2cl_page(oap)->cp_type != oap2cl_page(oap2)->cp_type) {
-   CDEBUG(D_CACHE, "Do not permit different type of IO"
-   " for a same RPC\n");
+   CDEBUG(D_CACHE, "Do not permit different type of IO in one RPC\n");
return 0;
}
 
@@ -1924,12 +1943,41 @@ static int try_to_add_extent_for_io(struct client_obd *cli,
break;
}
 
-   *pc += ext->oe_nr_pages;
-   list_move_tail(&ext->oe_link, rpclist);
+   data->erd_max_chunks -= chunk_count;
+   data->erd_page_count += ext->oe_nr_pages;
+   list_move_tail(&ext->oe_link, data->erd_rpc_list);
ext->oe_owner = current;
return 1;
 }
 
+static inline unsigned osc_max_write_chunks(const struct client_obd *cli)
+{
+   /*
+* LU-8135:
+*
+* The maximum size of a single transaction is about 64MB in ZFS.
+* #define DMU_MAX_ACCESS (64 * 1024 * 1024)
+*
+* Since ZFS is a copy-on-write file system, a single dirty page in
+* a chunk will result in the rewrite of the whole chunk, therefore
+* an RPC shouldn't be allowed to contain too many chunks otherwise
+* it will make transaction size much bigger than 64MB, especially
+* with big block size for ZFS.
+*
+* This piece of code is to make sure that OSC won't send write RPCs
+* with too many chunks. The maximum chunk size that an RPC can cover
+* is set to PTLRPC_MAX_BRW_SIZE, which is defined to 16MB.

[PATCH 06/60] staging: lustre: clio: revise readahead to support 16MB IO

2017-01-28 Thread James Simmons

From: Jinshan Xiong 

Read ahead currently doesn't handle 16MB RPC packets correctly
by assuming the packets are a default size instead of querying
the size. This work adjust the read ahead policy to issue
read ahead RPC by the underlying RPC size.

Signed-off-by: Jinshan Xiong 
Signed-off-by: Gu Zheng 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7990
Reviewed-on: http://review.whamcloud.com/19368
Reviewed-by: Andreas Dilger 
Reviewed-by: Li Xi 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/include/cl_object.h  |   4 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c  |  10 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |  14 +-
 drivers/staging/lustre/lustre/llite/rw.c   | 195 ++---
 drivers/staging/lustre/lustre/osc/osc_io.c |   3 +-
 5 files changed, 114 insertions(+), 112 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h 
b/drivers/staging/lustre/lustre/include/cl_object.h
index a1b8301..813e71d 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -1452,8 +1452,10 @@ struct cl_read_ahead {
 * cra_end is included.
 */
pgoff_t cra_end;
+   /* optimal RPC size for this read, by pages */
+   unsigned long cra_rpc_size;
/*
-* Release routine. If readahead holds resources underneath, this
+* Release callback. If readahead holds resources underneath, this
 * function should be called to release it.
 */
void (*cra_release)(const struct lu_env *env, void *cbdata);
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c 
b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index 675e25b..95b8c76 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -351,13 +351,11 @@ int client_obd_setup(struct obd_device *obddev, struct 
lustre_cfg *lcfg)
cli->cl_supp_cksum_types = OBD_CKSUM_CRC32;
atomic_set(>cl_resends, OSC_DEFAULT_RESENDS);
 
-   /* This value may be reduced at connect time in
-* ptlrpc_connect_interpret() . We initialize it to only
-* 1MB until we know what the performance looks like.
-* In the future this should likely be increased. LU-1431
+   /*
+* Set it to possible maximum size. It may be reduced by ocd_brw_size
+* from OFD after connecting.
 */
-   cli->cl_max_pages_per_rpc = min_t(int, PTLRPC_MAX_BRW_PAGES,
- LNET_MTU >> PAGE_SHIFT);
+   cli->cl_max_pages_per_rpc = PTLRPC_MAX_BRW_PAGES;
 
/*
 * set cl_chunkbits default value to PAGE_CACHE_SHIFT,
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h 
b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 2c72177..501957c 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -281,10 +281,8 @@ static inline struct ll_inode_info *ll_i2info(struct inode 
*inode)
return container_of(inode, struct ll_inode_info, lli_vfs_inode);
 }
 
-/* default to about 40meg of readahead on a given system.  That much tied
- * up in 512k readahead requests serviced at 40ms each is about 1GB/s.
- */
-#define SBI_DEFAULT_READAHEAD_MAX (40UL << (20 - PAGE_SHIFT))
+/* default to about 64M of readahead on a given system. */
+#define SBI_DEFAULT_READAHEAD_MAX  (64UL << (20 - PAGE_SHIFT))
 
 /* default to read-ahead full files smaller than 2MB on the second read */
 #define SBI_DEFAULT_READAHEAD_WHOLE_MAX (2UL << (20 - PAGE_SHIFT))
@@ -321,6 +319,9 @@ struct ll_ra_info {
 struct ra_io_arg {
unsigned long ria_start;  /* start offset of read-ahead*/
unsigned long ria_end;/* end offset of read-ahead*/
+   unsigned long ria_reserved; /* reserved pages for read-ahead */
+   unsigned long ria_end_min;  /* minimum end to cover current read */
+   bool ria_eof;   /* reach end of file */
/* If stride read pattern is detected, ria_stoff means where
 * stride read is started. Note: for normal read-ahead, the
 * value here is meaningless, and also it will not be accessed
@@ -551,6 +552,11 @@ struct ll_readahead_state {
 */
unsigned long   ras_window_start, ras_window_len;
/*
+* Optimal RPC size. It decides how many pages will be sent
+* for each read-ahead.
+*/
+   unsigned long   ras_rpc_size;
+   /*
 * Where next read-ahead should start at. This lies within read-ahead
 * window. Read-ahead window is read in pieces rather than at once
 * because: 1. lustre limits total number of pages under read-ahead by
diff --git

[PATCH 33/60] staging: lustre: libcfs: avoid stomping on module param cpu_pattern

2017-01-28 Thread James Simmons

From: Dmitry Eremin 

The function cfs_cpt_table_create_pattern() alters the string
passed to it. Currently we are passing in the module parameter
string cpu_pattern, which is incorrect. Instead, let's duplicate
the module parameter string and pass that to the function
cfs_cpt_table_create_pattern().

Signed-off-by: Liang Zhen 
Signed-off-by: Dmitry Eremin 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-5050
Reviewed-on: http://review.whamcloud.com/22377
Reviewed-by: James Simmons 
Reviewed-by: Olaf Weber 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
index 427e219..71a5b19 100644
--- a/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
+++ b/drivers/staging/lustre/lnet/libcfs/linux/linux-cpu.c
@@ -1050,7 +1050,15 @@ static int cfs_cpu_dead(unsigned int cpu)
ret = -EINVAL;
 
if (*cpu_pattern) {
-   cfs_cpt_table = cfs_cpt_table_create_pattern(cpu_pattern);
+   char *cpu_pattern_dup = kstrdup(cpu_pattern, GFP_KERNEL);
+
+   if (!cpu_pattern_dup) {
+   CERROR("Failed to duplicate cpu_pattern\n");
+   goto failed;
+   }
+
+   cfs_cpt_table = cfs_cpt_table_create_pattern(cpu_pattern_dup);
+   kfree(cpu_pattern_dup);
if (!cfs_cpt_table) {
CERROR("Failed to create cptab from pattern %s\n",
   cpu_pattern);
-- 
1.8.3.1
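
The fix in this patch is the usual "copy before a destructive parse" pattern. A userspace sketch, assuming a comma-separated pattern and a hypothetical field counter; `strtok()` stands in for the parser that writes into its input, and `malloc()` plus `memcpy()` stand in for `kstrdup()`.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Count comma-separated fields without modifying the caller's string.
 * strtok() writes NUL bytes into its input, so it is given a private copy.
 * Returns the field count, or -1 if the copy cannot be allocated.
 */
static int sketch_count_fields(const char *pattern)
{
	size_t len = strlen(pattern) + 1;
	char *dup = malloc(len);
	char *tok;
	int n = 0;

	if (!dup)
		return -1;
	memcpy(dup, pattern, len);
	for (tok = strtok(dup, ","); tok; tok = strtok(NULL, ","))
		n++;
	free(dup);
	return n;
}
```

Because the original string is untouched, it can still be printed in full afterwards, which matters for an error message that reports the pattern.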

[PATCH 23/60] staging: lustre: lmv: remove unused placement parameter

2017-01-28 Thread James Simmons

From: "John L. Hammond" 

Remove the unused lmv.*.placement parameter along with supporting
functions and struct members.

Signed-off-by: John L. Hammond 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7674
Reviewed-on: http://review.whamcloud.com/18019
Reviewed-by: Ben Evans 
Reviewed-by: Frank Zago 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/include/obd.h   |  8 
 drivers/staging/lustre/lustre/lmv/lmv_obd.c   |  1 -
 drivers/staging/lustre/lustre/lmv/lproc_lmv.c | 68 ---
 3 files changed, 77 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 6d3bd05..5c217c0 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -403,18 +403,10 @@ struct lmv_tgt_desc {
unsigned long   ltd_active:1; /* target up for requests */
 };
 
-enum placement_policy {
-   PLACEMENT_CHAR_POLICY   = 0,
-   PLACEMENT_NID_POLICY= 1,
-   PLACEMENT_INVAL_POLICY  = 2,
-   PLACEMENT_MAX_POLICY
-};
-
 struct lmv_obd {
int refcount;
struct lu_client_fld lmv_fld;
spinlock_t  lmv_lock;
-   enum placement_policy   lmv_placement;
struct lmv_desc desc;
struct obd_uuid cluuid;
struct obd_export   *exp;
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 915415c..5926461 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -1284,7 +1284,6 @@ static int lmv_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
lmv->desc.ld_active_tgt_count = 0;
lmv->max_def_easize = 0;
lmv->max_easize = 0;
-   lmv->lmv_placement = PLACEMENT_CHAR_POLICY;
 
spin_lock_init(&lmv->lmv_lock);
mutex_init(&lmv->lmv_init_mutex);
diff --git a/drivers/staging/lustre/lustre/lmv/lproc_lmv.c b/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
index 14fbc9c..ff45802 100644
--- a/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
+++ b/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
@@ -50,73 +50,6 @@ static ssize_t numobd_show(struct kobject *kobj, struct attribute *attr,
 }
 LUSTRE_RO_ATTR(numobd);
 
-static const char *placement_name[] = {
-   [PLACEMENT_CHAR_POLICY] = "CHAR",
-   [PLACEMENT_NID_POLICY]  = "NID",
-   [PLACEMENT_INVAL_POLICY]  = "INVAL"
-};
-
-static enum placement_policy placement_name2policy(char *name, int len)
-{
-   int  i;
-
-   for (i = 0; i < PLACEMENT_MAX_POLICY; i++) {
-   if (!strncmp(placement_name[i], name, len))
-   return i;
-   }
-   return PLACEMENT_INVAL_POLICY;
-}
-
-static const char *placement_policy2name(enum placement_policy placement)
-{
-   LASSERT(placement < PLACEMENT_MAX_POLICY);
-   return placement_name[placement];
-}
-
-static ssize_t placement_show(struct kobject *kobj, struct attribute *attr,
- char *buf)
-{
-   struct obd_device *dev = container_of(kobj, struct obd_device,
- obd_kobj);
-   struct lmv_obd *lmv;
-
-   lmv = &dev->u.lmv;
-   return sprintf(buf, "%s\n", placement_policy2name(lmv->lmv_placement));
-}
-
-#define MAX_POLICY_STRING_SIZE 64
-
-static ssize_t placement_store(struct kobject *kobj, struct attribute *attr,
-  const char *buffer,
-  size_t count)
-{
-   struct obd_device *dev = container_of(kobj, struct obd_device,
- obd_kobj);
-   char dummy[MAX_POLICY_STRING_SIZE + 1];
-   enum placement_policy policy;
-   struct lmv_obd *lmv = &dev->u.lmv;
-
-   memcpy(dummy, buffer, MAX_POLICY_STRING_SIZE);
-
-   if (count > MAX_POLICY_STRING_SIZE)
-   count = MAX_POLICY_STRING_SIZE;
-
-   if (dummy[count - 1] == '\n')
-   count--;
-   dummy[count] = '\0';
-
-   policy = placement_name2policy(dummy, count);
-   if (policy != PLACEMENT_INVAL_POLICY) {
-   spin_lock(&lmv->lmv_lock);
-   lmv->lmv_placement = policy;
-   spin_unlock(&lmv->lmv_lock);
-   } else {
-   return -EINVAL;
-   }
-   return count;
-}
-LUSTRE_RW_ATTR(placement);
-
 static ssize_t activeobd_show(struct kobject *kobj, struct attribute *attr,
  char *buf)
 {
@@ -226,7 +159,6 @@ static int lmv_target_seq_open(struct inode *inode, struct file *file)
 static struct attribute *lmv_attrs[] = {
&lustre_attr_activeobd.attr,
&lustre_attr_numobd.attr,
-   &lustre_attr_placement.attr,
NULL,
 };
 
-- 
1.8.3.1

[PATCH 06/60] staging: lustre: clio: revise readahead to support 16MB IO

2017-01-28 Thread James Simmons

From: Jinshan Xiong 

Read ahead currently doesn't handle 16MB RPC packets correctly
because it assumes the packets are a default size instead of
querying the size. This work adjusts the read ahead policy to
issue read ahead RPCs sized by the underlying RPC size.

Signed-off-by: Jinshan Xiong 
Signed-off-by: Gu Zheng 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7990
Reviewed-on: http://review.whamcloud.com/19368
Reviewed-by: Andreas Dilger 
Reviewed-by: Li Xi 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/include/cl_object.h  |   4 +-
 drivers/staging/lustre/lustre/ldlm/ldlm_lib.c  |  10 +-
 .../staging/lustre/lustre/llite/llite_internal.h   |  14 +-
 drivers/staging/lustre/lustre/llite/rw.c   | 195 ++---
 drivers/staging/lustre/lustre/osc/osc_io.c |   3 +-
 5 files changed, 114 insertions(+), 112 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/cl_object.h b/drivers/staging/lustre/lustre/include/cl_object.h
index a1b8301..813e71d 100644
--- a/drivers/staging/lustre/lustre/include/cl_object.h
+++ b/drivers/staging/lustre/lustre/include/cl_object.h
@@ -1452,8 +1452,10 @@ struct cl_read_ahead {
 * cra_end is included.
 */
pgoff_t cra_end;
+   /* optimal RPC size for this read, by pages */
+   unsigned long cra_rpc_size;
/*
-* Release routine. If readahead holds resources underneath, this
+* Release callback. If readahead holds resources underneath, this
 * function should be called to release it.
 */
void (*cra_release)(const struct lu_env *env, void *cbdata);
diff --git a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
index 675e25b..95b8c76 100644
--- a/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
+++ b/drivers/staging/lustre/lustre/ldlm/ldlm_lib.c
@@ -351,13 +351,11 @@ int client_obd_setup(struct obd_device *obddev, struct lustre_cfg *lcfg)
cli->cl_supp_cksum_types = OBD_CKSUM_CRC32;
atomic_set(&cli->cl_resends, OSC_DEFAULT_RESENDS);
 
-   /* This value may be reduced at connect time in
-* ptlrpc_connect_interpret() . We initialize it to only
-* 1MB until we know what the performance looks like.
-* In the future this should likely be increased. LU-1431
+   /*
+* Set it to possible maximum size. It may be reduced by ocd_brw_size
+* from OFD after connecting.
 */
-   cli->cl_max_pages_per_rpc = min_t(int, PTLRPC_MAX_BRW_PAGES,
- LNET_MTU >> PAGE_SHIFT);
+   cli->cl_max_pages_per_rpc = PTLRPC_MAX_BRW_PAGES;
 
/*
 * set cl_chunkbits default value to PAGE_CACHE_SHIFT,
diff --git a/drivers/staging/lustre/lustre/llite/llite_internal.h b/drivers/staging/lustre/lustre/llite/llite_internal.h
index 2c72177..501957c 100644
--- a/drivers/staging/lustre/lustre/llite/llite_internal.h
+++ b/drivers/staging/lustre/lustre/llite/llite_internal.h
@@ -281,10 +281,8 @@ static inline struct ll_inode_info *ll_i2info(struct inode *inode)
return container_of(inode, struct ll_inode_info, lli_vfs_inode);
 }
 
-/* default to about 40meg of readahead on a given system.  That much tied
- * up in 512k readahead requests serviced at 40ms each is about 1GB/s.
- */
-#define SBI_DEFAULT_READAHEAD_MAX (40UL << (20 - PAGE_SHIFT))
+/* default to about 64M of readahead on a given system. */
+#define SBI_DEFAULT_READAHEAD_MAX  (64UL << (20 - PAGE_SHIFT))
 
 /* default to read-ahead full files smaller than 2MB on the second read */
 #define SBI_DEFAULT_READAHEAD_WHOLE_MAX (2UL << (20 - PAGE_SHIFT))
@@ -321,6 +319,9 @@ struct ll_ra_info {
 struct ra_io_arg {
unsigned long ria_start;  /* start offset of read-ahead*/
unsigned long ria_end;/* end offset of read-ahead*/
+   unsigned long ria_reserved; /* reserved pages for read-ahead */
+   unsigned long ria_end_min;  /* minimum end to cover current read */
+   bool ria_eof;   /* reach end of file */
/* If stride read pattern is detected, ria_stoff means where
 * stride read is started. Note: for normal read-ahead, the
 * value here is meaningless, and also it will not be accessed
@@ -551,6 +552,11 @@ struct ll_readahead_state {
 */
unsigned long   ras_window_start, ras_window_len;
/*
+* Optimal RPC size. It decides how many pages will be sent
+* for each read-ahead.
+*/
+   unsigned long   ras_rpc_size;
+   /*
 * Where next read-ahead should start at. This lies within read-ahead
 * window. Read-ahead window is read in pieces rather than at once
 * because: 1. lustre limits total number of pages under read-ahead by
diff --git a/drivers/staging/lustre/lustre/llite/rw.c b/drivers/staging/lustre/lustre/llite/rw.c
index f10e092..18d3ccb 100644
--- a/drivers/staging/lustre/lustre/llite/rw.c
+++
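
The readahead limits in this patch are expressed in pages using the form `N << (20 - PAGE_SHIFT)`, which converts N MiB into a page count. A small sketch of that conversion, with a hypothetical helper name and the assumption that the page shift is at most 20:

```c
/*
 * Convert a size in MiB to a page count, where page_shift is log2 of the
 * page size in bytes. 1 MiB is 2^20 bytes, so it holds 2^(20 - page_shift)
 * pages. Assumes page_shift <= 20.
 */
static unsigned long sketch_mib_to_pages(unsigned long mib,
					 unsigned int page_shift)
{
	return mib << (20 - page_shift);
}
```

With 4 KiB pages (page_shift 12), the old 40 MiB default is 10240 pages and the new 64 MiB default is 16384 pages; with 64 KiB pages (page_shift 16), 64 MiB is 1024 pages.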

[PATCH 23/60] staging: lustre: lmv: remove unused placement parameter

2017-01-28 Thread James Simmons

From: "John L. Hammond" 

Remove the unused lmv.*.placement parameter along with supporting
functions and struct members.

Signed-off-by: John L. Hammond 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7674
Reviewed-on: http://review.whamcloud.com/18019
Reviewed-by: Ben Evans 
Reviewed-by: Frank Zago 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/include/obd.h   |  8 
 drivers/staging/lustre/lustre/lmv/lmv_obd.c   |  1 -
 drivers/staging/lustre/lustre/lmv/lproc_lmv.c | 68 ---
 3 files changed, 77 deletions(-)

diff --git a/drivers/staging/lustre/lustre/include/obd.h b/drivers/staging/lustre/lustre/include/obd.h
index 6d3bd05..5c217c0 100644
--- a/drivers/staging/lustre/lustre/include/obd.h
+++ b/drivers/staging/lustre/lustre/include/obd.h
@@ -403,18 +403,10 @@ struct lmv_tgt_desc {
unsigned long   ltd_active:1; /* target up for requests */
 };
 
-enum placement_policy {
-   PLACEMENT_CHAR_POLICY   = 0,
-   PLACEMENT_NID_POLICY= 1,
-   PLACEMENT_INVAL_POLICY  = 2,
-   PLACEMENT_MAX_POLICY
-};
-
 struct lmv_obd {
int refcount;
struct lu_client_fldlmv_fld;
spinlock_t  lmv_lock;
-   enum placement_policy   lmv_placement;
struct lmv_desc desc;
struct obd_uuid cluuid;
struct obd_export   *exp;
diff --git a/drivers/staging/lustre/lustre/lmv/lmv_obd.c b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
index 915415c..5926461 100644
--- a/drivers/staging/lustre/lustre/lmv/lmv_obd.c
+++ b/drivers/staging/lustre/lustre/lmv/lmv_obd.c
@@ -1284,7 +1284,6 @@ static int lmv_setup(struct obd_device *obd, struct lustre_cfg *lcfg)
lmv->desc.ld_active_tgt_count = 0;
lmv->max_def_easize = 0;
lmv->max_easize = 0;
-   lmv->lmv_placement = PLACEMENT_CHAR_POLICY;
 
spin_lock_init(&lmv->lmv_lock);
mutex_init(&lmv->lmv_init_mutex);
diff --git a/drivers/staging/lustre/lustre/lmv/lproc_lmv.c b/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
index 14fbc9c..ff45802 100644
--- a/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
+++ b/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
@@ -50,73 +50,6 @@ static ssize_t numobd_show(struct kobject *kobj, struct attribute *attr,
 }
 LUSTRE_RO_ATTR(numobd);
 
-static const char *placement_name[] = {
-   [PLACEMENT_CHAR_POLICY] = "CHAR",
-   [PLACEMENT_NID_POLICY]  = "NID",
-   [PLACEMENT_INVAL_POLICY]  = "INVAL"
-};
-
-static enum placement_policy placement_name2policy(char *name, int len)
-{
-   int  i;
-
-   for (i = 0; i < PLACEMENT_MAX_POLICY; i++) {
-   if (!strncmp(placement_name[i], name, len))
-   return i;
-   }
-   return PLACEMENT_INVAL_POLICY;
-}
-
-static const char *placement_policy2name(enum placement_policy placement)
-{
-   LASSERT(placement < PLACEMENT_MAX_POLICY);
-   return placement_name[placement];
-}
-
-static ssize_t placement_show(struct kobject *kobj, struct attribute *attr,
- char *buf)
-{
-   struct obd_device *dev = container_of(kobj, struct obd_device,
- obd_kobj);
-   struct lmv_obd *lmv;
-
-   lmv = &dev->u.lmv;
-   return sprintf(buf, "%s\n", placement_policy2name(lmv->lmv_placement));
-}
-
-#define MAX_POLICY_STRING_SIZE 64
-
-static ssize_t placement_store(struct kobject *kobj, struct attribute *attr,
-  const char *buffer,
-  size_t count)
-{
-   struct obd_device *dev = container_of(kobj, struct obd_device,
- obd_kobj);
-   char dummy[MAX_POLICY_STRING_SIZE + 1];
-   enum placement_policy policy;
-   struct lmv_obd *lmv = &dev->u.lmv;
-
-   memcpy(dummy, buffer, MAX_POLICY_STRING_SIZE);
-
-   if (count > MAX_POLICY_STRING_SIZE)
-   count = MAX_POLICY_STRING_SIZE;
-
-   if (dummy[count - 1] == '\n')
-   count--;
-   dummy[count] = '\0';
-
-   policy = placement_name2policy(dummy, count);
-   if (policy != PLACEMENT_INVAL_POLICY) {
-   spin_lock(&lmv->lmv_lock);
-   lmv->lmv_placement = policy;
-   spin_unlock(&lmv->lmv_lock);
-   } else {
-   return -EINVAL;
-   }
-   return count;
-}
-LUSTRE_RW_ATTR(placement);
-
 static ssize_t activeobd_show(struct kobject *kobj, struct attribute *attr,
  char *buf)
 {
@@ -226,7 +159,6 @@ static int lmv_target_seq_open(struct inode *inode, struct file *file)
 static struct attribute *lmv_attrs[] = {
&lustre_attr_activeobd.attr,
&lustre_attr_numobd.attr,
-   &lustre_attr_placement.attr,
NULL,
 };
 
-- 
1.8.3.1

[PATCH 31/60] staging: lustre: clio: sync write should update mtime

2017-01-28 Thread James Simmons

From: Niu Yawei 

Sync write should update m/ctime promptly, otherwise, stale m/ctime
could be updated on the OST object by the sync write RPC.

Signed-off-by: Niu Yawei 
Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-7310
Reviewed-on: http://review.whamcloud.com/21063
Reviewed-by: John L. Hammond 
Reviewed-by: Bobi Jam 
Reviewed-by: Oleg Drokin 
Signed-off-by: James Simmons 
---
 drivers/staging/lustre/lustre/osc/osc_io.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/drivers/staging/lustre/lustre/osc/osc_io.c b/drivers/staging/lustre/lustre/osc/osc_io.c
index 7e5cd3a..3e61f5e 100644
--- a/drivers/staging/lustre/lustre/osc/osc_io.c
+++ b/drivers/staging/lustre/lustre/osc/osc_io.c
@@ -210,6 +210,18 @@ static int osc_io_submit(const struct lu_env *env,
if (queued > 0)
result = osc_queue_sync_pages(env, osc, &list, cmd, brw_flags);
 
+   /* Update c/mtime for sync write. LU-7310 */
+   if (qout->pl_nr > 0 && !result) {
+   struct cl_attr *attr = &osc_env_info(env)->oti_attr;
+   struct cl_object *obj = ios->cis_obj;
+
+   cl_object_attr_lock(obj);
+   attr->cat_mtime = LTIME_S(CURRENT_TIME);
+   attr->cat_ctime = attr->cat_mtime;
+   cl_object_attr_update(env, obj, attr, CAT_MTIME | CAT_CTIME);
+   cl_object_attr_unlock(obj);
+   }
+
CDEBUG(D_INFO, "%d/%d %d\n", qin->pl_nr, qout->pl_nr, result);
return qout->pl_nr > 0 ? 0 : result;
 }
-- 
1.8.3.1
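
As a rough illustration of the hunk above: once at least one page has gone out and no error was recorded, mtime is set to the current time, ctime is set to the same value, and only those two fields are pushed into the cached attributes through a validity mask. The sketch below models that flow in plain userspace C. struct file_attr, the ATTR_* flags, attr_update() and finish_sync_write() are hypothetical stand-ins and not the cl_object API, and the locking done by cl_object_attr_lock()/cl_object_attr_unlock() is left out.

```c
#include <time.h>

/* Hypothetical validity flags, standing in for CAT_MTIME/CAT_CTIME. */
enum {
	ATTR_MTIME = 1 << 0,
	ATTR_CTIME = 1 << 1
};

/* Hypothetical cached attributes of one object. */
struct file_attr {
	time_t mtime;
	time_t ctime;
};

/* Copy only the fields selected by the valid mask. */
static void attr_update(struct file_attr *dst, const struct file_attr *src,
			unsigned int valid)
{
	if (valid & ATTR_MTIME)
		dst->mtime = src->mtime;
	if (valid & ATTR_CTIME)
		dst->ctime = src->ctime;
}

/* Mirrors the tail of the submit path in the hunk: refresh the
 * timestamps only when pages were sent and no error is pending,
 * then report success if anything was sent at all. */
static int finish_sync_write(struct file_attr *cached, int pages_sent,
			     int result, time_t now)
{
	if (pages_sent > 0 && !result) {
		struct file_attr attr;

		attr.mtime = now;
		attr.ctime = attr.mtime;
		attr_update(cached, &attr, ATTR_MTIME | ATTR_CTIME);
	}
	return pages_sent > 0 ? 0 : result;
}
```

Passing the timestamp in as a parameter instead of reading the clock inside the function is only a choice made here to keep the sketch easy to test; the patch itself reads CURRENT_TIME at the point of update.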

[PATCH 00/60] staging: lustre: batches of fixes for lustre client

2017-01-28 Thread James Simmons

Batch of missing fixes for lustre for the upstream client.

Alex Zhuravlev (1):
  staging: lustre: obdclass: do not call lu_site_purge() for single object exceed

Alexander Boyko (1):
  staging: lustre: ptlrpc: skip lock if export failed

Andreas Dilger (3):
  staging: lustre: mdc: quiet console message for known -EINTR
  staging: lustre: obdclass: add more info to sysfs version string
  staging: lustre: llite: handle inactive OSTs better in statfs

Andriy Skulysh (1):
  staging: lustre: ldlm: ASSERTION(flock->blocking_export!=0) failed

Ann Koehler (1):
  staging: lustre: obd: RCU stalls in lu_cache_shrink_count()

Ben Evans (1):
  staging: lustre: lustre: Remove old commented out code

Bobi Jam (3):
  staging: lustre: clio: add cl_page LRU shrinker
  staging: lustre: lov: ld_target could be NULL
  staging: lustre: llite: specify READA debug mask for ras_update

Bruno Faccini (1):
  staging: lustre: obdclass: health_check to report unhealthy upon LBUG

Dmitry Eremin (6):
  staging: lustre: llite: Setting xattr are properly checked with and without ACLs
  staging: lustre: libcfs: avoid stomping on module param cpu_pattern
  staging: lustre: libcfs: default CPT matches NUMA topology
  staging: lustre: libcfs: fix error messages
  staging: lustre: ptlrpc: remove unused pc->pc_env
  staging: lustre: ptlrpc: update MODULE_PARAM_DESC in ptlrpcd.c

Fan Yong (4):
  staging: lustre: fid: fix race in fid allocation
  staging: lustre: mgc: handle config_llog_data::cld_refcount properly
  staging: lustre: ptlrpc: comment for FLD_QUERY RPC reply swab
  staging: lustre: linkea: linkEA size limitation

Giuseppe Di Natale (1):
  staging: lustre: lmv: Correctly generate target_obd

James Simmons (7):
  staging: lustre: header: remove assert from interval_set()
  staging: libcfs: remove integer types abstraction from libcfs
  staging: lustre: socklnd: remove socklnd_init_msg
  staging: lustre: obd: move s3 in lmd_parse to inner loop
  staging: lustre: osc: avoid 64 divide in osc_cache_too_much
  staging: lustre: ptlrpc : remove userland usage from ptlrpc
  staging: lustre: libcfs: fix minimum size check for libcfs ioctl

Jeremy Filizetti (1):
  staging: lustre: ldlm: Restore connect flags on failure

Jinshan Xiong (4):
  staging: lustre: llite: Remove access of stripe in ll_setattr_raw
  staging: lustre: clio: revise readahead to support 16MB IO
  staging: lustre: llite: don't ignore layout for group lock request
  staging: lustre: osc: limits the number of chunks in write RPC

John L. Hammond (5):
  staging: lustre: llite: remove obsolete comment for ll_unlink()
  staging: lustre: ptlrpc: correct use of list_add_tail()
  staging: lustre: lmv: remove unused placement parameter
  staging: lustre: obd: remove OBD_NOTIFY_CREATE
  staging: lustre: mdc: avoid returning freed request

Lai Siyao (2):
  staging: lustre: statahead: drop support for remote entry
  staging: lustre: llite: normal user can't set FS default stripe

Liang Zhen (1):
  staging: lustre: ksocklnd: ignore timedout TX on closing connection

Nathaniel Clark (1):
  staging: lustre: lov: Ensure correct operation for large object sizes

Niu Yawei (4):
  staging: lustre: ptlrpc: set proper mbits for EINPROGRESS resend
  staging: lustre: clio: sync write should update mtime
  staging: ptlrpc: leaked rs on difficult reply
  staging: lustre: ptlrpc: update replay cursor when close during replay

Oleg Drokin (1):
  staging: lustre: llite: Trust creates in revalidate too.

Patrick Farrell (1):
  staging: lustre: mdc: Make IT_OPEN take lookup bits lock

Rahul Deshmukh (1):
  staging: lustre: llite: Adding timed wait in ll_umount_begin

Steve Guminski (3):
  staging: lustre: osc: osc_match_base prototype differs from declaration
  staging: lustre: libcfs: Change positional struct initializers to C99
  staging: lustre: fid: Change positional struct initializers to C99

Ulka Vaze (1):
  staging: lustre: lmv: Error not handled for lmv_find_target

Vladimir Saveliev (1):
  staging: lustre: ptlrpc: allow blocking asts to be delayed

Yang Sheng (1):
  staging: lustre: llite: don't invoke direct_IO for the EOF case

frank zago (1):
  staging: lustre: hsm: stack overrun in hai_dump_data_field

wang di (2):
  staging: lustre: llite: check request != NULL in ll_migrate
  staging: lustre: lmv: remove nlink check in lmv_revalidate_slaves

 .../lustre/include/linux/libcfs/libcfs_crypto.h|  60 +--
 .../lustre/include/linux/libcfs/linux/libcfs.h |   4 -
 .../staging/lustre/include/linux/lnet/socklnd.h|   9 -
 .../staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c |   2 +-
 .../staging/lustre/lnet/klnds/socklnd/socklnd.c|   2 +-
 .../staging/lustre/lnet/klnds/socklnd/socklnd_cb.c |  29 +--
 drivers/staging/lustre/lnet/libcfs/debug.c |   2 +-
 .../staging/lustre/lnet/libcfs/linux/linux-cpu.c   |  17 +-
 .../lustre/lnet/libcfs/linux/linux-module.c|   2 +-
 drivers/staging/lustre/lnet/libcfs/workitem.c  |   2 +-
