Re: [PATCH 1/4] zsmalloc: remove x86 dependency

2012-07-11 Thread Seth Jennings
On 07/11/2012 05:42 PM, Nitin Gupta wrote:
> On Wed, Jul 11, 2012 at 1:32 PM, Seth Jennings
> <sjenn...@linux.vnet.ibm.com> wrote:
>> On 07/11/2012 01:26 PM, Nitin Gupta wrote:

>>> Now obj-1 lies completely within page-2, so can be kmap'ed as usual. On
>>> zs_unmap_object() we would just do the reverse and restore objects as in
>>> figure-1.
>>
>> Hey Nitin, thanks for the feedback.
>>
>> Correct me if I'm wrong, but it seems like you wouldn't be able to map
>> ob2 while ob1 was mapped with this design.  You'd need some sort of
>> zspage level protection against concurrent object mappings.  The
>> code for that protection might cancel any benefit you would gain by
>> doing it this way.
>>
> 
> Do you think blocking access of just one particular object (or
> blocking an entire zspage, for simplicity) for a short time would be
> an issue, apart from the complexity of implementing per zspage
> locking?

It would only need to prevent the mapping of the temporarily displaced
object, but I said zspage because I don't know how we would do
per-object locking.  I actually don't know how we would do zspage
locking either unless there is a lock in the struct page we can use.
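
If we did go down that road, the closest thing I can think of is a bit
spinlock stored in the first page's flags.  A rough sketch only; the
lock_zspage()/unlock_zspage() names are made up and PG_owner_priv_1 is
just my guess at a usable bit, not something zsmalloc defines today:

#include <linux/bit_spinlock.h>
#include <linux/page-flags.h>
#include <linux/mm_types.h>

/* hypothetical per-zspage lock living in the first page's flags */
static void lock_zspage(struct page *first_page)
{
        bit_spin_lock(PG_owner_priv_1, &first_page->flags);
}

static void unlock_zspage(struct page *first_page)
{
        bit_spin_unlock(PG_owner_priv_1, &first_page->flags);
}

zs_map_object()/zs_unmap_object() would then have to take and drop that
lock around the displace/restore step.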

Either way, I think it's a complexity we'd be better off avoiding for
now.  I'm trying to get zsmalloc in shape to bring into mainline, so
I'm really focusing on portability first and low-hanging performance
fruit second.  This optimization would be more like top-of-the-tree
performance fruit :-/

However, if you want to try it out, don't let me stop you :)

Thanks,
Seth



Re: [PATCH 1/4] zsmalloc: remove x86 dependency

2012-07-11 Thread Nitin Gupta
On Wed, Jul 11, 2012 at 1:32 PM, Seth Jennings
<sjenn...@linux.vnet.ibm.com> wrote:
> On 07/11/2012 01:26 PM, Nitin Gupta wrote:
>> On 07/02/2012 02:15 PM, Seth Jennings wrote:
>>> This patch replaces the page table assisted object mapping
>>> method, which has x86 dependencies, with an arch-independent
>>> method that does a simple copy into a temporary per-cpu
>>> buffer.
>>>
>>> While a copy seems like it would be worse than mapping the pages,
>>> tests demonstrate the copying is always faster and, in the case of
>>> running inside a KVM guest, roughly 4x faster.
>>>
>>> Signed-off-by: Seth Jennings <sjenn...@linux.vnet.ibm.com>
>>> ---
>>>  drivers/staging/zsmalloc/Kconfig |4 --
>>>  drivers/staging/zsmalloc/zsmalloc-main.c |   99 +-
>>>  drivers/staging/zsmalloc/zsmalloc_int.h  |5 +-
>>>  3 files changed, 72 insertions(+), 36 deletions(-)
>>>
>>
>>
>>>  struct mapping_area {
>>> -struct vm_struct *vm;
>>> -pte_t *vm_ptes[2];
>>> -char *vm_addr;
>>> +char *vm_buf; /* copy buffer for objects that span pages */
>>> +char *vm_addr; /* address of kmap_atomic()'ed pages */
>>>  };
>>>
>>
>> I think we can reduce the copying overhead by not copying an entire
>> compressed object to another (per-cpu) buffer. The basic idea of the
>> method below is to:
>>  - Copy only the amount of data that spills over into the next page
>>  - No need for a separate buffer to copy into
>>
>> Currently, we store objects that split across pages as:
>>
>> +-Page1-+
>> |       |
>> |       |
>> |-------| <-- obj-1 off: 0
>> | ob1'  |
>> +-------+ <-- obj-1 off: s'
>>
>> +-Page2-+ <-- obj-1 off: s'
>> | ob1'' |
>> |-------| <-- obj-1 off: obj1_size, obj-2 off: 0
>> | ob2   |
>> |-------| <-- obj-2 off: obj2_size
>> +-------+
>>
>> But now we would store it as:
>>
>> +-Page1-+
>> |       |
>> |-------| <-- obj-1 off: s''
>> |       |
>> | ob1'  |
>> +-------+ <-- obj-1 off: obj1_size
>>
>> +-Page2-+ <-- obj-1 off: 0
>> | ob1'' |
>> |-------| <-- obj-1 off: s'', obj-2 off: 0
>> | ob2   |
>> |-------| <-- obj-2 off: obj2_size
>> +-------+
>>
>> When object-1 (ob1) is to be mapped, part (size: s'-0) of object-2 will
>> be swapped with ob1'. This swapping can be done in-place using simple
>> xor swap algorithm. So, after swap, page-1 and page-2 will look like:
>>
>> +-Page1-+
>> |       |
>> |-------| <-- obj-2 off: 0
>> |       |
>> | ob2'' |
>> +-------+ <-- obj-2 off: (obj1_size - s'')
>>
>> +-Page2-+ <-- obj-1 off: 0
>> |       |
>> | ob1   |
>> |-------| <-- obj-1 off: obj1_size, obj-2 off: (obj1_size - s'')
>> | ob2'  |
>> +-------+ <-- obj-2 off: obj2_size
>>
>> Now obj-1 lies completely within page-2, so can be kmap'ed as usual. On
>> zs_unmap_object() we would just do the reverse and restore objects as in
>> figure-1.
>
> Hey Nitin, thanks for the feedback.
>
> Correct me if I'm wrong, but it seems like you wouldn't be able to map
> ob2 while ob1 was mapped with this design.  You'd need some sort of
> zspage level protection against concurrent object mappings.  The
> code for that protection might cancel any benefit you would gain by
> doing it this way.
>

Do you think blocking access to just one particular object (or
blocking an entire zspage, for simplicity) for a short time would be
an issue, apart from the complexity of implementing per-zspage
locking?

Thanks,
Nitin


Re: [PATCH 1/4] zsmalloc: remove x86 dependency

2012-07-11 Thread Seth Jennings
On 07/11/2012 01:26 PM, Nitin Gupta wrote:
> On 07/02/2012 02:15 PM, Seth Jennings wrote:
>> This patch replaces the page table assisted object mapping
>> method, which has x86 dependencies, with an arch-independent
>> method that does a simple copy into a temporary per-cpu
>> buffer.
>>
>> While a copy seems like it would be worse than mapping the pages,
>> tests demonstrate the copying is always faster and, in the case of
>> running inside a KVM guest, roughly 4x faster.
>>
>> Signed-off-by: Seth Jennings <sjenn...@linux.vnet.ibm.com>
>> ---
>>  drivers/staging/zsmalloc/Kconfig |4 --
>>  drivers/staging/zsmalloc/zsmalloc-main.c |   99 +-
>>  drivers/staging/zsmalloc/zsmalloc_int.h  |5 +-
>>  3 files changed, 72 insertions(+), 36 deletions(-)
>>
> 
> 
>>  struct mapping_area {
>> -struct vm_struct *vm;
>> -pte_t *vm_ptes[2];
>> -char *vm_addr;
>> +char *vm_buf; /* copy buffer for objects that span pages */
>> +char *vm_addr; /* address of kmap_atomic()'ed pages */
>>  };
>>  
> 
> I think we can reduce the copying overhead by not copying an entire
> compressed object to another (per-cpu) buffer. The basic idea of the
> method below is to:
>  - Copy only the amount of data that spills over into the next page
>  - No need for a separate buffer to copy into
> 
> Currently, we store objects that split across pages as:
> 
> +-Page1-+
> |       |
> |       |
> |-------| <-- obj-1 off: 0
> | ob1'  |
> +-------+ <-- obj-1 off: s'
> 
> +-Page2-+ <-- obj-1 off: s'
> | ob1'' |
> |-------| <-- obj-1 off: obj1_size, obj-2 off: 0
> | ob2   |
> |-------| <-- obj-2 off: obj2_size
> +-------+
> 
> But now we would store it as:
> 
> +-Page1-+
> |       |
> |-------| <-- obj-1 off: s''
> |       |
> | ob1'  |
> +-------+ <-- obj-1 off: obj1_size
> 
> +-Page2-+ <-- obj-1 off: 0
> | ob1'' |
> |-------| <-- obj-1 off: s'', obj-2 off: 0
> | ob2   |
> |-------| <-- obj-2 off: obj2_size
> +-------+
> 
> When object-1 (ob1) is to be mapped, part (size: s'-0) of object-2 will
> be swapped with ob1'. This swapping can be done in-place using simple
> xor swap algorithm. So, after swap, page-1 and page-2 will look like:
> 
> +-Page1-+
> |       |
> |-------| <-- obj-2 off: 0
> |       |
> | ob2'' |
> +-------+ <-- obj-2 off: (obj1_size - s'')
> 
> +-Page2-+ <-- obj-1 off: 0
> |       |
> | ob1   |
> |-------| <-- obj-1 off: obj1_size, obj-2 off: (obj1_size - s'')
> | ob2'  |
> +-------+ <-- obj-2 off: obj2_size
> 
> Now obj-1 lies completely within page-2, so can be kmap'ed as usual. On
> zs_unmap_object() we would just do the reverse and restore objects as in
> figure-1.

Hey Nitin, thanks for the feedback.

Correct me if I'm wrong, but it seems like you wouldn't be able to map
ob2 while ob1 was mapped with this design.  You'd need some sort of
zspage level protection against concurrent object mappings.  The
code for that protection might cancel any benefit you would gain by
doing it this way.

Thanks,
Seth



Re: [PATCH 1/4] zsmalloc: remove x86 dependency

2012-07-11 Thread Nitin Gupta
On 07/02/2012 02:15 PM, Seth Jennings wrote:
> This patch replaces the page table assisted object mapping
> method, which has x86 dependencies, with an arch-independent
> method that does a simple copy into a temporary per-cpu
> buffer.
> 
> While a copy seems like it would be worse than mapping the pages,
> tests demonstrate the copying is always faster and, in the case of
> running inside a KVM guest, roughly 4x faster.
>
> Signed-off-by: Seth Jennings <sjenn...@linux.vnet.ibm.com>
> ---
>  drivers/staging/zsmalloc/Kconfig |4 --
>  drivers/staging/zsmalloc/zsmalloc-main.c |   99 +-
>  drivers/staging/zsmalloc/zsmalloc_int.h  |5 +-
>  3 files changed, 72 insertions(+), 36 deletions(-)
> 


>  struct mapping_area {
> - struct vm_struct *vm;
> - pte_t *vm_ptes[2];
> - char *vm_addr;
> + char *vm_buf; /* copy buffer for objects that span pages */
> + char *vm_addr; /* address of kmap_atomic()'ed pages */
>  };
>  

I think we can reduce the copying overhead by not copying an entire
compressed object to another (per-cpu) buffer. The basic idea of the
method below is to:
 - Copy only the amount of data that spills over into the next page
 - No need for a separate buffer to copy into

Currently, we store objects that split across pages as:

+-Page1-+
|       |
|       |
|-------| <-- obj-1 off: 0
| ob1'  |
+-------+ <-- obj-1 off: s'

+-Page2-+ <-- obj-1 off: s'
| ob1'' |
|-------| <-- obj-1 off: obj1_size, obj-2 off: 0
| ob2   |
|-------| <-- obj-2 off: obj2_size
+-------+

But now we would store it as:

+-Page1-+
|       |
|-------| <-- obj-1 off: s''
|       |
| ob1'  |
+-------+ <-- obj-1 off: obj1_size

+-Page2-+ <-- obj-1 off: 0
| ob1'' |
|-------| <-- obj-1 off: s'', obj-2 off: 0
| ob2   |
|-------| <-- obj-2 off: obj2_size
+-------+

When object-1 (ob1) is to be mapped, part (size: s'-0) of object-2 will
be swapped with ob1'. This swapping can be done in place using a simple
XOR swap. So, after the swap, page-1 and page-2 will look like:

+-Page1-+
|       |
|-------| <-- obj-2 off: 0
|       |
| ob2'' |
+-------+ <-- obj-2 off: (obj1_size - s'')

+-Page2-+ <-- obj-1 off: 0
|       |
| ob1   |
|-------| <-- obj-1 off: obj1_size, obj-2 off: (obj1_size - s'')
| ob2'  |
+-------+ <-- obj-2 off: obj2_size

Now obj-1 lies completely within page-2, so it can be kmap'ed as usual.
On zs_unmap_object() we would just do the reverse and restore the
objects as in figure-1.

We can reduce the overhead even further by removing the need to
"restore" the object. For this, we would have to remove the assumption
in zsmalloc that the spilled part begins at offset 0 in the next page.
Not sure how feasible this would be.

Note that doing this transform is always possible since we ensure that
an object can never exceed system page size.
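
Just to make the swap step concrete, it could look something like the
sketch below.  This is only an illustration; zs_swap_spilled() and its
parameters are made-up names, not existing zsmalloc code, and the
offsets and length would come from the figures above:

#include <linux/highmem.h>
#include <linux/mm.h>

/*
 * Swap 'len' bytes at page1+off1 (ob1', at the end of page-1) with
 * 'len' bytes at page2+off2 (the leading part of obj-2), in place via
 * XOR, so no temporary buffer is needed.  The two ranges cannot
 * overlap since they live in different pages.
 */
static void zs_swap_spilled(struct page *page1, unsigned long off1,
                            struct page *page2, unsigned long off2,
                            size_t len)
{
        unsigned char *a = (unsigned char *)kmap_atomic(page1) + off1;
        unsigned char *b = (unsigned char *)kmap_atomic(page2) + off2;
        size_t i;

        for (i = 0; i < len; i++) {
                a[i] ^= b[i];
                b[i] ^= a[i];
                a[i] ^= b[i];
        }

        kunmap_atomic(b - off2);
        kunmap_atomic(a - off1);
}

zs_map_object() would call this once with len = (obj1_size - s'') to
make obj-1 contiguous in page-2, and zs_unmap_object() would call it
again to restore figure-1.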

Thanks,
Nitin




Re: [PATCH 1/4] zsmalloc: remove x86 dependency

2012-07-11 Thread Minchan Kim
On 07/11/2012 12:29 AM, Seth Jennings wrote:
> On 07/09/2012 09:21 PM, Minchan Kim wrote:
>> On 07/03/2012 06:15 AM, Seth Jennings wrote:
> 
>>> +static void zs_copy_map_object(char *buf, struct page *firstpage,
>>> +   int off, int size)
>>
>> firstpage is rather misleading.
>> As you know, we use firstpage term for real firstpage of zspage but
>> in case of zs_copy_map_object, it could be a middle page of zspage.
>> So I would like to use "page" instead of firstpage.
> 
> Accepted.
> 
>>> +{
>>> +   struct page *pages[2];
>>> +   int sizes[2];
>>> +   void *addr;
>>> +
>>> +   pages[0] = firstpage;
>>> +   pages[1] = get_next_page(firstpage);
>>> +   BUG_ON(!pages[1]);
>>> +
>>> +   sizes[0] = PAGE_SIZE - off;
>>> +   sizes[1] = size - sizes[0];
>>> +
>>> +   /* disable page faults to match kmap_atomic() return conditions */
>>> +   pagefault_disable();
>>
>> If I understand your intention correctly, you want to prevent calling
>> this function on non-atomic context. Right?
> 
> This is moved to zs_map_object() in a later patch, but the
> point is to provide uniform return conditions, regardless of
> whether the object to be mapped is contained in a single
> page or spans two pages.  kmap_atomic() disables page
> faults, so I did it here to create symmetry.  The result is

The thing I wanted a comment on is why we should disable page faults at
all, i.e., what happens if we don't?

As I read the comment on kmap_atomic() about pagefault_disable(), it
seems to be for preventing a reentrancy bug in a preemptible kernel when
a page fault is caught during an atomic context in a non-preemptible
kernel.  But I'm not sure, so Cc'ing Peter.


> that zs_map_object always returns with preemption and page
> faults disabled.
> 
> Also, Greg already merged these patches so I'll have to
> incorporate these changes as a separate patch.
> 
> Thanks,
> Seth
> 


-- 
Kind regards,
Minchan Kim


Re: [PATCH 1/4] zsmalloc: remove x86 dependency

2012-07-10 Thread Seth Jennings
On 07/09/2012 09:21 PM, Minchan Kim wrote:
> On 07/03/2012 06:15 AM, Seth Jennings wrote:

>> +static void zs_copy_map_object(char *buf, struct page *firstpage,
>> +int off, int size)
> 
> firstpage is rather misleading.
> As you know, we use firstpage term for real firstpage of zspage but
> in case of zs_copy_map_object, it could be a middle page of zspage.
> So I would like to use "page" instead of firstpage.

Accepted.

>> +{
>> +struct page *pages[2];
>> +int sizes[2];
>> +void *addr;
>> +
>> +pages[0] = firstpage;
>> +pages[1] = get_next_page(firstpage);
>> +BUG_ON(!pages[1]);
>> +
>> +sizes[0] = PAGE_SIZE - off;
>> +sizes[1] = size - sizes[0];
>> +
>> +/* disable page faults to match kmap_atomic() return conditions */
>> +pagefault_disable();
> 
> If I understand your intention correctly, you want to prevent calling
> this function on non-atomic context. Right?

This is moved to zs_map_object() in a later patch, but the
point is to provide uniform return conditions, regardless of
whether the object to be mapped is contained in a single
page or spans two pages.  kmap_atomic() disables page
faults, so I did it here to create symmetry.  The result is
that zs_map_object always returns with preemption and page
faults disabled.
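
To make that concrete, the shape I'm going for is roughly the sketch
below.  It is not the merged code; the wrapper name and signature are
simplified here, but struct mapping_area and zs_copy_map_object() are
the ones from the patch:

/*
 * Sketch only: both exit paths leave preemption and page faults
 * disabled, so the unmap side can undo them unconditionally.
 */
static void *map_object_sketch(struct mapping_area *area,
                               struct page *page, int off, int size)
{
        if (off + size <= PAGE_SIZE) {
                /* fits in one page: kmap_atomic() itself disables
                 * preemption and page faults */
                area->vm_addr = kmap_atomic(page);
                return area->vm_addr + off;
        }

        /* spans two pages: nothing stays kmap'ed across the return,
         * so disable page faults explicitly to match the path above
         * (preemption is assumed off via the per-cpu buffer access) */
        pagefault_disable();
        zs_copy_map_object(area->vm_buf, page, off, size);
        return area->vm_buf;
}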

Also, Greg already merged these patches so I'll have to
incorporate these changes as a separate patch.

Thanks,
Seth



Re: [PATCH 1/4] zsmalloc: remove x86 dependency

2012-07-09 Thread Minchan Kim
On 07/03/2012 06:15 AM, Seth Jennings wrote:
> This patch replaces the page table assisted object mapping
> method, which has x86 dependencies, with an arch-independent
> method that does a simple copy into a temporary per-cpu
> buffer.
> 
> While a copy seems like it would be worse than mapping the pages,
> tests demonstrate the copying is always faster and, in the case of
> running inside a KVM guest, roughly 4x faster.
> 
> Signed-off-by: Seth Jennings <sjenn...@linux.vnet.ibm.com>
> ---
>  drivers/staging/zsmalloc/Kconfig |4 --
>  drivers/staging/zsmalloc/zsmalloc-main.c |   99 +-
>  drivers/staging/zsmalloc/zsmalloc_int.h  |5 +-
>  3 files changed, 72 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/staging/zsmalloc/Kconfig b/drivers/staging/zsmalloc/Kconfig
> index a5ab720..9084565 100644
> --- a/drivers/staging/zsmalloc/Kconfig
> +++ b/drivers/staging/zsmalloc/Kconfig
> @@ -1,9 +1,5 @@
>  config ZSMALLOC
>   tristate "Memory allocator for compressed pages"
> - # X86 dependency is because of the use of __flush_tlb_one and set_pte
> - # in zsmalloc-main.c.
> - # TODO: convert these to portable functions
> - depends on X86
>   default n
>   help
> zsmalloc is a slab-based memory allocator designed to store
> diff --git a/drivers/staging/zsmalloc/zsmalloc-main.c b/drivers/staging/zsmalloc/zsmalloc-main.c
> index 10b0d60..a7a6f22 100644
> --- a/drivers/staging/zsmalloc/zsmalloc-main.c
> +++ b/drivers/staging/zsmalloc/zsmalloc-main.c
> @@ -470,6 +470,57 @@ static struct page *find_get_zspage(struct size_class *class)
>   return page;
>  }
>  
> +static void zs_copy_map_object(char *buf, struct page *firstpage,
> + int off, int size)

firstpage is rather misleading.
As you know, we use the firstpage term for the real first page of a
zspage, but in the case of zs_copy_map_object it could be a middle page
of the zspage.  So I would like to use "page" instead of firstpage.

> +{
> + struct page *pages[2];
> + int sizes[2];
> + void *addr;
> +
> + pages[0] = firstpage;
> + pages[1] = get_next_page(firstpage);
> + BUG_ON(!pages[1]);
> +
> + sizes[0] = PAGE_SIZE - off;
> + sizes[1] = size - sizes[0];
> +
> + /* disable page faults to match kmap_atomic() return conditions */
> + pagefault_disable();

If I understand your intention correctly, you want to prevent calling
this function in a non-atomic context, right?
Please make the comment describe more clearly what would happen if we
didn't.

> +
> + /* copy object to per-cpu buffer */
> + addr = kmap_atomic(pages[0]);
> + memcpy(buf, addr + off, sizes[0]);
> + kunmap_atomic(addr);
> + addr = kmap_atomic(pages[1]);
> + memcpy(buf + sizes[0], addr, sizes[1]);
> + kunmap_atomic(addr);
> +}
> +
> +static void zs_copy_unmap_object(char *buf, struct page *firstpage,
> + int off, int size)

Ditto firstpage.

Otherwise, Looks good to me.
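
By the way, the zs_copy_unmap_object() body is not quoted above, but it
should simply be the mirror of zs_copy_map_object(); roughly the sketch
below (using "page" as suggested, and not the literal hunk):

static void zs_copy_unmap_object(char *buf, struct page *page,
                                 int off, int size)
{
        struct page *pages[2];
        int sizes[2];
        void *addr;

        pages[0] = page;
        pages[1] = get_next_page(page);
        BUG_ON(!pages[1]);

        sizes[0] = PAGE_SIZE - off;
        sizes[1] = size - sizes[0];

        /* copy the per-cpu buffer back into the two underlying pages */
        addr = kmap_atomic(pages[0]);
        memcpy(addr + off, buf, sizes[0]);
        kunmap_atomic(addr);
        addr = kmap_atomic(pages[1]);
        memcpy(addr, buf + sizes[0], sizes[1]);
        kunmap_atomic(addr);

        /* presumably re-enables page faults to pair with the
         * pagefault_disable() in the map path */
        pagefault_enable();
}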

-- 
Kind regards,
Minchan Kim

