On Thu, Sep 10, 2015 at 10:19 PM, Tejun Heo wrote:
> Hello, Parav.
>
> On Wed, Sep 09, 2015 at 09:27:40AM +0530, Parav Pandit wrote:
>> This is one old white paper, but most of the reasoning still holds true on
>> RDMA.
>> http://h10032.www1.hp.com/ctg/Manual/
On Fri, Sep 11, 2015 at 9:34 AM, Tejun Heo <t...@kernel.org> wrote:
> Hello, Parav.
>
> On Fri, Sep 11, 2015 at 09:09:58AM +0530, Parav Pandit wrote:
>> The fact is that user level application uses hardware resources.
>> Verbs layer is software abstraction for
On Fri, Sep 11, 2015 at 1:52 AM, Tejun Heo <t...@kernel.org> wrote:
> Hello, Parav.
>
> On Thu, Sep 10, 2015 at 11:16:49PM +0530, Parav Pandit wrote:
>> >> These resources include: QP (queue pair) to transfer data, CQ
>> >> (Completion queue) to indicat
On Tue, Sep 8, 2015 at 8:53 PM, Tejun Heo wrote:
> Hello, Parav.
>
> On Tue, Sep 08, 2015 at 02:08:16AM +0530, Parav Pandit wrote:
>> Currently user space applications can easily take away all the rdma
>> device specific resources such as AH, CQ, QP, MR etc. Due to which
On Tue, Sep 8, 2015 at 7:20 PM, Haggai Eran wrote:
> On 08/09/2015 13:18, Parav Pandit wrote:
>>> >
>>>> >> + * RDMA resource limits are hierarchical, so the highest configured
>>>> >> limit of
>>>> >> + * the hierarchy i
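The quoted comment about hierarchical RDMA limits is cut off mid-sentence. As a rough user-space sketch of what hierarchical enforcement usually means (this is not the patch's code; `demo_cgroup` and `effective_limit` are invented names), the effective limit for a cgroup is the tightest limit configured anywhere on its path up to the root:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical per-cgroup node: 'limit' is the configured max for one
 * RDMA resource type; INT_MAX means "no limit configured here". */
struct demo_cgroup {
    struct demo_cgroup *parent;
    int limit;
};

/* Effective limit is the smallest limit configured anywhere on the
 * path from this cgroup up to the root. */
static int effective_limit(const struct demo_cgroup *cg)
{
    int eff = INT_MAX;

    for (; cg; cg = cg->parent)
        if (cg->limit < eff)
            eff = cg->limit;
    return eff;
}
```

With a root limited to 100, an unconfigured child, and a leaf limited to 20, the leaf's effective limit is 20 and the child's is 100.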
On Tue, Sep 8, 2015 at 2:06 PM, Haggai Eran wrote:
> On 07/09/2015 23:38, Parav Pandit wrote:
>> +void devcgroup_rdma_uncharge_resource(struct ib_ucontext *ucontext,
>> + enum devcgroup_rdma_rt type, int num)
>> +{
>> + st
On Tue, Sep 8, 2015 at 2:10 PM, Haggai Eran wrote:
> On 07/09/2015 23:38, Parav Pandit wrote:
>> +static void init_ucontext_lists(struct ib_ucontext *ucontext)
>> +{
>> + INIT_LIST_HEAD(&ucontext->pd_list);
>> + INIT_LIST_HEAD(&ucontext->mr_list);
>> + INIT_LIST_
On Tue, Sep 8, 2015 at 1:52 PM, Haggai Eran wrote:
> On 07/09/2015 23:38, Parav Pandit wrote:
>> +/* RDMA resources from device cgroup perspective */
>> +enum devcgroup_rdma_rt {
>> + DEVCG_RDMA_RES_TYPE_UCTX,
>> + DEVCG_RDMA_RES_TYPE_CQ,
>
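A minimal user-space sketch of the charge/uncharge accounting an enum like the one quoted above typically feeds (all `demo_*` names are invented; the real patch tracks per-cgroup state under locking that is elided here):

```c
#include <assert.h>
#include <stdbool.h>

/* Resource types mirrored from the quoted enum (which is truncated in
 * the snippet; only two types are shown here). */
enum demo_rdma_rt {
    DEMO_RDMA_RES_TYPE_UCTX,
    DEMO_RDMA_RES_TYPE_CQ,
    DEMO_RDMA_RES_TYPE_MAX,
};

/* Hypothetical per-cgroup accounting state. */
struct demo_rdma_acct {
    int usage[DEMO_RDMA_RES_TYPE_MAX];
    int limit[DEMO_RDMA_RES_TYPE_MAX];
};

/* A charge fails when it would push usage past the configured limit. */
static bool demo_try_charge(struct demo_rdma_acct *a,
                            enum demo_rdma_rt t, int num)
{
    if (a->usage[t] + num > a->limit[t])
        return false;
    a->usage[t] += num;
    return true;
}

static void demo_uncharge(struct demo_rdma_acct *a,
                          enum demo_rdma_rt t, int num)
{
    a->usage[t] -= num;
}
```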
On Tue, Sep 8, 2015 at 1:54 PM, Haggai Eran wrote:
> On 08/09/2015 10:04, Parav Pandit wrote:
>> On Tue, Sep 8, 2015 at 11:18 AM, Haggai Eran wrote:
>>> On 07/09/2015 23:38, Parav Pandit wrote:
>>>> @@ -2676,7 +2686,7 @@ static inline int thread_group_emp
On Tue, Sep 8, 2015 at 11:18 AM, Haggai Eran wrote:
> On 07/09/2015 23:38, Parav Pandit wrote:
>> @@ -2676,7 +2686,7 @@ static inline int thread_group_empty(struct
>> task_struct *p)
>> * Protects ->fs, ->files, ->mm, ->group_info, ->comm, keyring
&
On Tue, Sep 8, 2015 at 11:01 AM, Haggai Eran wrote:
> On 07/09/2015 23:38, Parav Pandit wrote:
>> diff --git a/include/linux/device_cgroup.h b/include/linux/device_cgroup.h
>> index 8b64221..cdbdd60 100644
>> --- a/include/linux/device_cgroup.h
>> +++ b/include/linux
.
Parav
On Tue, Sep 8, 2015 at 2:08 AM, Parav Pandit wrote:
> Currently user space applications can easily take away all the rdma
> device specific resources such as AH, CQ, QP, MR etc. Due to which other
> applications in other cgroup or kernel space ULPs may not even get chance
> to alloc
Modified device cgroup documentation to reflect its dual purpose
without creating a new cgroup subsystem for rdma.
Added documentation to describe functionality and usage of device cgroup
extension for RDMA.
Signed-off-by: Parav Pandit
---
Documentation/cgroups/devices.txt | 32
-by: Parav Pandit
---
include/linux/device_rdma_cgroup.h | 83
security/device_rdma_cgroup.c | 422 +
2 files changed, 505 insertions(+)
create mode 100644 include/linux/device_rdma_cgroup.h
create mode 100644 security/device_rdma_cgroup.c
diff
for configuring the max limit of each rdma
resource and one file for querying the controller's current resource usage.
Signed-off-by: Parav Pandit
---
include/linux/device_cgroup.h | 53 +++
security/device_cgroup.c | 119 +-
2 files changed, 136
Added RDMA device resource tracking object per task.
Added comments to capture usage of task lock by device cgroup
for rdma.
Signed-off-by: Parav Pandit
---
include/linux/sched.h | 12 +++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/include/linux/sched.h b/include
),
it passes the associated ucontext pointer during uncharge, so that the
rdma cgroup controller can correctly free the resource of the right
task and the right cgroup.
Signed-off-by: Parav Pandit
---
drivers/infiniband/core/uverbs_cmd.c | 139 +-
drivers/infiniband/core/uverbs_main.c
Added RDMA resource tracking object of device cgroup.
Signed-off-by: Parav Pandit
---
security/Makefile | 1 +
1 file changed, 1 insertion(+)
diff --git a/security/Makefile b/security/Makefile
index c9bfbc8..c9ad56d 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -23,6 +23,7 @@ obj
Added a user configuration option to enable/disable the RDMA resource
tracking feature of device cgroup as a sub-module.
Signed-off-by: Parav Pandit
---
init/Kconfig | 12
1 file changed, 12 insertions(+)
diff --git a/init/Kconfig b/init/Kconfig
index 2184b34..089db85 100644
--- a/init
of other resources and capabilities.
Parav Pandit (7):
devcg: Added user option to rdma resource tracking.
devcg: Added rdma resource tracking module.
devcg: Added infrastructure for rdma device cgroup.
devcg: Added rdma resource tracker object per task
devcg: device cgroup's extension
On Thu, Jun 18, 2015 at 9:29 PM, Jon Derrick wrote:
> On Thu, Jun 18, 2015 at 04:13:50PM +0530, Parav Pandit wrote:
>> Kernel thread nvme_thread and driver load process can be executing
>> in parallel on different CPU. This leads to race condition whenever
>> nvme_alloc
that it maintains the
order, and a data dependency read barrier in the reader thread ensures
that the CPU cache is synced.
Signed-off-by: Parav Pandit
---
drivers/block/nvme-core.c | 12 ++--
1 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme
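The ordering argument above, a write barrier on the publishing side paired with a dependent read in the reader thread, can be sketched in user space with C11 release/acquire atomics standing in for the kernel's smp_wmb()/read barrier (all `demo_*` names are invented; this is not the driver's code):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_nvmeq { int q_id; };

static struct demo_nvmeq queue_storage;
static _Atomic(struct demo_nvmeq *) queues[4];
static atomic_int queue_count;

/* Publisher: fully initialize the queue, then publish the pointer and
 * bump the count with release semantics (playing the role of the write
 * barrier in the patch description). */
static void publish_queue(int qid)
{
    queue_storage.q_id = qid;
    atomic_store_explicit(&queues[qid], &queue_storage,
                          memory_order_release);
    atomic_fetch_add_explicit(&queue_count, 1, memory_order_release);
}

/* Reader (the kthread's role): an acquire load pairs with the release
 * store, so a non-NULL pointer is guaranteed to point at fully
 * initialized memory. */
static struct demo_nvmeq *read_queue(int qid)
{
    return atomic_load_explicit(&queues[qid], memory_order_acquire);
}
```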
I am sorry. By mistake I sent the same patch which was already sent a few
days back. It's pending merge.
On Tue, Jun 2, 2015 at 7:41 PM, Parav Pandit wrote:
> From: Parav Pandit
>
> Moved code for reuse at a few places:
> 1. Moved lba_shift related calculation code to macro for conv
From: Parav Pandit
Moved SQ doorbell pressing code to a smaller function to avoid
code duplication at 3 places.
nvme_submit_cmd is a low-level posting function which never fails. Removed
checks around its return status, which was always success.
Signed-off-by: Parav Pandit
---
drivers/block/nvme
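A user-space sketch of the refactor this commit describes, one helper that advances the SQ tail, wraps it at the queue depth, and "rings" the doorbell, with plain memory standing in for the MMIO doorbell register (invented `demo_*` names, not the driver's actual code):

```c
#include <assert.h>

/* Hypothetical stand-ins for the submission-queue state; in the driver
 * the doorbell is an MMIO register written with writel(), modeled here
 * as an ordinary field. */
struct demo_sq {
    unsigned short sq_tail;
    unsigned short q_depth;
    unsigned int   doorbell;  /* stands in for the writel() target */
};

/* One helper replaces the three duplicated "advance tail and ring the
 * doorbell" sequences: bump the tail, wrap at queue depth, write the
 * new tail value to the doorbell. */
static void demo_ring_sq_doorbell(struct demo_sq *q)
{
    if (++q->sq_tail == q->q_depth)
        q->sq_tail = 0;
    q->doorbell = q->sq_tail;
}
```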
From: Parav Pandit
Moved code for reuse at a few places:
1. Moved lba_shift related calculation code to macro for converting block
to/from len.
2. Moved req_len to nlb calculation to inline function.
Signed-off-by: Parav Pandit
---
drivers/block/nvme-core.c | 10 +-
drivers/block
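The two moves this commit describes both amount to shifting by `lba_shift` (log2 of the logical block size) in one direction or the other; a hedged standalone sketch of that centralization (invented `demo_*` names, not the driver's actual macros):

```c
#include <assert.h>

/* lba_shift is log2 of the logical block size (9 for 512-byte LBAs). */
struct demo_ns { int lba_shift; };

/* Block-count <-> byte-length conversions, centralized instead of
 * open-coded shifts scattered through the driver. */
#define demo_nlb_to_len(ns, nlb)  ((nlb) << (ns)->lba_shift)
#define demo_len_to_nlb(ns, len)  ((len) >> (ns)->lba_shift)
```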
On Fri, May 22, 2015 at 10:22 AM, Parav Pandit
wrote:
> On Fri, May 22, 2015 at 2:15 AM, J Freyensee
> wrote:
>> On Wed, 2015-05-20 at 16:43 -0400, Parav Pandit wrote:
>>> nvme_queue structure made 64B cache friendly so that majority of the
>>> data eleme
On Fri, May 22, 2015 at 11:17 PM, Keith Busch wrote:
> On Fri, 22 May 2015, Parav Pandit wrote:
>>
>> I agree to it that nvmeq won't be null after mb(); That alone is not
>> sufficient.
>>
>> What I have proposed in previous email is,
>>
>> Converting,
On Fri, May 22, 2015 at 10:37 PM, Keith Busch wrote:
> On Fri, 22 May 2015, Parav Pandit wrote:
>>
>> On Fri, May 22, 2015 at 9:53 PM, Keith Busch
>> wrote:
>>>
>>> A memory barrier before incrementing the dev->queue_count (and assigning
>>> t
On Fri, May 22, 2015 at 9:53 PM, Keith Busch wrote:
> On Fri, 22 May 2015, Parav Pandit wrote:
>>
>> During normal positive path probe,
>> (a) device is added to dev_list in nvme_dev_start()
>> (b) nvme_kthread got created, which will eventually refer to
>> d
On Fri, May 22, 2015 at 8:41 PM, Keith Busch wrote:
> On Fri, 22 May 2015, Parav Pandit wrote:
>>
>> On Fri, May 22, 2015 at 8:18 PM, Keith Busch
>> wrote:
>>>
>>> The rcu protection on nvme queues was removed with the blk-mq conversion
>>> as we
On Fri, May 22, 2015 at 8:18 PM, Keith Busch wrote:
> On Thu, 21 May 2015, Parav Pandit wrote:
>>
>> On Fri, May 22, 2015 at 1:04 AM, Keith Busch
>> wrote:
>>>
>>> The q_lock is held to protect polling from reading inconsistent data.
>>
>>
>
On Fri, May 22, 2015 at 2:15 AM, J Freyensee
wrote:
> On Wed, 2015-05-20 at 16:43 -0400, Parav Pandit wrote:
>> nvme_queue structure made 64B cache friendly so that majority of the
>> data elements of the structure during IO and completion path can be
>> found in typical
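The layout idea in this thread, keep the members touched on every IO in the first 64-byte cache line and push setup-only members to the end, can be sketched like this (field names are illustrative, not the real `struct nvme_queue`):

```c
#include <assert.h>
#include <stddef.h>

/* Members used on every submission/completion are grouped at the front
 * so they share the first 64-byte cache line; cold setup-time members
 * follow. */
struct demo_nvmeq_layout {
    /* hot: IO and completion path */
    void          *sq_cmds;
    volatile void *cqes;
    unsigned int  *q_db;
    unsigned short sq_tail;
    unsigned short cq_head;
    unsigned short q_depth;
    unsigned char  cq_phase;
    /* cold: setup/teardown only */
    char           irqname[24];
    int            q_id;
};
```

On a 64-bit build the hot members occupy roughly the first 32 bytes, so they all land in one cache line.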
On Fri, May 22, 2015 at 1:04 AM, Keith Busch wrote:
> On Thu, 21 May 2015, Parav Pandit wrote:
>>
>> Avoid disabling interrupt and holding q_lock for the queue
>> which is just getting initialized.
>>
>> With this change, online_queues is also incremented without
On Fri, May 22, 2015 at 12:09 AM, Jens Axboe wrote:
> On 05/21/2015 06:12 PM, Parav Pandit wrote:
>>
>> Avoid disabling interrupt and holding q_lock for the queue
>> which is just getting initialized.
>>
>> With this change, online_queues is also incremented w
protect device wide
online_queues variable anyway.
Signed-off-by: Parav Pandit
---
drivers/block/nvme-core.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
index 58041c7..7f09e5e 100644
--- a/drivers/block/nvme-core.c
+++ b/drivers/block/nvme
of the structure.
Elements which are not used in the frequent IO path are moved to the
end of the structure.
Signed-off-by: Parav Pandit
---
drivers/block/nvme-core.c | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/block/nvme-core.c b/drivers/block/nvme-core.c
On Wed, May 20, 2015 at 6:50 PM, Matthew Wilcox wrote:
> On Wed, May 20, 2015 at 02:01:03PM -0400, Parav Pandit wrote:
>> nvme_queue structure made 64B cache friendly so that majority of the
>> data elements of the structure during IO and completion path can be
>> found
Moved code for reuse at a few places:
1. Moved lba_shift related calculation code to macro for converting block
to/from len.
2. Moved req_len to nlb calculation to inline function.
Signed-off-by: Parav Pandit
---
drivers/block/nvme-core.c | 10 +-
drivers/block/nvme-scsi.c | 10
tand that if my interrupt handler is going to
be called most of the time then it is very likely to
happen that OS will flush the same, but there is no
guarantee for it.
Regards,
Parav Pandit