Re: [PATCH RFC 08/12] xen-blkback: use balloon pages for all mappings
On 04/03/13 21:22, Konrad Rzeszutek Wilk wrote:
>> @@ -194,14 +260,15 @@ static void add_persistent_gnt(struct rb_root *root,
>>  		else if (persistent_gnt->gnt > this->gnt)
>>  			new = &((*new)->rb_right);
>>  		else {
>> -			pr_alert(DRV_PFX " trying to add a gref that's already in the tree\n");
>> -			BUG();
>> +			pr_alert_ratelimited(DRV_PFX " trying to add a gref that's already in the tree\n");
>> +			return -EINVAL;
>
> That looks like a separate bug-fix patch? Especially the pr_alert_ratelimited
> part?

Not really. With the way we added granted frames before this patch, it was
never possible to add a persistent grant with the same gref twice. With the
changes introduced in this patch, we first map the grants and then try to
make them persistent by adding them to the tree. So it is possible for a
frontend to craft a malicious request that has the same gref in all
segments, and when we try to add them to the tree of persistent grants we
would hit the BUG. That's why we need to ratelimit the alert (to prevent
flooding) and return -EINVAL instead of crashing.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
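The failure mode described in this reply can be sketched outside the kernel. The following is a minimal userspace illustration under stated assumptions, not the driver's code: it uses a plain unbalanced binary search tree in place of the kernel rbtree, and `struct gnt_node` and `add_gnt` are hypothetical names. It shows only the point being argued: a duplicate gref is reported to the caller with -EINVAL instead of aborting.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a persistent grant, keyed by its gref. */
struct gnt_node {
	unsigned int gref;
	struct gnt_node *left, *right;
};

/*
 * Insert node into the tree rooted at *root.
 * Returns 0 on success, or -EINVAL if the gref is already present
 * (the case a frontend could trigger by repeating a gref).
 */
static int add_gnt(struct gnt_node **root, struct gnt_node *node)
{
	struct gnt_node **new = root;

	assert(root != NULL && node != NULL);
	while (*new) {
		if (node->gref < (*new)->gref)
			new = &(*new)->left;
		else if (node->gref > (*new)->gref)
			new = &(*new)->right;
		else
			return -EINVAL; /* duplicate: reject, do not crash */
	}
	node->left = node->right = NULL;
	*new = node;
	return 0;
}
```

With this shape the caller decides how to fail the request, so a misbehaving frontend can at worst have its own request rejected.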
Re: [PATCH RFC 08/12] xen-blkback: use balloon pages for all mappings
On 04/03/13 21:22, Konrad Rzeszutek Wilk wrote:
[...]
>> @@ -535,13 +604,17 @@ purge_gnt_list:
>>  				msecs_to_jiffies(xen_blkif_lru_interval);
>>  		}
>>
>> +		remove_free_pages(blkif, xen_blkif_max_buffer_pages);
>> +
>>  		if (log_stats && time_after(jiffies, blkif->st_print))
>>  			print_stats(blkif);
>>  	}
>>
>> +	remove_free_pages(blkif, 0);
>
> What purpose does that have?

This removes all the pages from the pool before closing down.
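The two calls quoted above differ only in their target: one trims the pool to a cap, the other empties it. A minimal userspace sketch of that idea follows. It is an illustration under assumptions, not the patch itself: `struct fake_page`, `struct page_pool`, `pool_put` and `pool_shrink` are hypothetical names, there is no locking, and where a real backend would hand pages back to the balloon driver this sketch only unlinks them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct page and the per-instance pool. */
struct fake_page {
	struct fake_page *next;
};

struct page_pool {
	struct fake_page *head; /* singly linked list of cached pages */
	unsigned int count;     /* number of pages on the list */
};

/* Return a page to the pool so a later request can reuse it. */
static void pool_put(struct page_pool *pool, struct fake_page *page)
{
	page->next = pool->head;
	pool->head = page;
	pool->count++;
}

/*
 * Drop cached pages until at most `target` remain and return how many
 * were dropped. This sketch only unlinks each page.
 */
static unsigned int pool_shrink(struct page_pool *pool, unsigned int target)
{
	unsigned int dropped = 0;

	while (pool->count > target) {
		struct fake_page *page = pool->head;

		assert(page != NULL); /* count and list must agree */
		pool->head = page->next;
		page->next = NULL;
		pool->count--;
		dropped++;
	}
	return dropped;
}
```

Calling `pool_shrink(&pool, max)` from the main loop corresponds to the periodic trim, and `pool_shrink(&pool, 0)` after the loop corresponds to the final drain before shutdown.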
Re: [PATCH RFC 08/12] xen-blkback: use balloon pages for all mappings
On Thu, Feb 28, 2013 at 11:28:51AM +0100, Roger Pau Monne wrote:
> Using balloon pages for all granted pages allows us to simplify the
> logic in blkback, specially in the xen_blkbk_map function, since now

especially

> we can decide if we want to map a grant persistently or not after we
> have actually mapped it. This could not be done before because
> persistent grants used ballooned pages, and non-persistent grants used
                                          ^^^whereas
> pages from the kernel.
>
> This patch also introduces several changes, the first one is that the
> list of free pages is no longer global, now each blkback instance has
> its own list of free pages that can be used to map grants. Also, a
> run time parameter (max_buffer_pages) has been added in order to tune
> the maximum number of free pages each blkback instance will keep in
> its buffer.
>
> Signed-off-by: Roger Pau Monné
> Cc: xen-de...@lists.xen.org
> Cc: Konrad Rzeszutek Wilk
> ---
>  drivers/block/xen-blkback/blkback.c |  278 +++
>  drivers/block/xen-blkback/common.h  |    5 +
>  drivers/block/xen-blkback/xenbus.c  |    3 +

You also need a Documentation/ABI/sysfs-stable/... file.

>  3 files changed, 159 insertions(+), 127 deletions(-)
>
> diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
> index b5e7495..ba27fc3 100644
> --- a/drivers/block/xen-blkback/blkback.c
> +++ b/drivers/block/xen-blkback/blkback.c
> @@ -101,6 +101,21 @@ module_param_named(lru_num_clean, xen_blkif_lru_num_clean, int, 0644);
>  MODULE_PARM_DESC(lru_num_clean,
>  		"Number of persistent grants to unmap when the list is full");
>
> +/*
> + * Maximum number of unused free pages to keep in the internal buffer.
> + * Setting this to a value too low will reduce memory used in each backend,
> + * but can have a performance penalty.
> + *
> + * A sane value is xen_blkif_reqs * BLKIF_MAX_SEGMENTS_PER_REQUEST, but can

Should that value be the default value then?

> + * be set to a lower value that might degrade performance on some intensive
> + * IO workloads.
> + */
> +
> +static int xen_blkif_max_buffer_pages = 1024;
> +module_param_named(max_buffer_pages, xen_blkif_max_buffer_pages, int, 0644);
> +MODULE_PARM_DESC(max_buffer_pages,
> +"Maximum number of free pages to keep in each block backend buffer");
> +
>  /* Run-time switchable: /sys/module/blkback/parameters/ */
>  static unsigned int log_stats;
>  module_param(log_stats, int, 0644);
> @@ -120,6 +135,7 @@ struct pending_req {
>  	int status;
>  	struct list_head free_list;
>  	struct persistent_gnt *persistent_gnts[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> +	struct page *pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
>  };
>
>  #define BLKBACK_INVALID_HANDLE (~0)
> @@ -131,8 +147,6 @@ struct xen_blkbk {
>  	/* And its spinlock. */
>  	spinlock_t pending_free_lock;
>  	wait_queue_head_t pending_free_wq;
> -	/* The list of all pages that are available. */
> -	struct page **pending_pages;
>  	/* And the grant handles that are available. */
>  	grant_handle_t *pending_grant_handles;
>  };
> @@ -151,14 +165,66 @@ static inline int vaddr_pagenr(struct pending_req *req, int seg)
>  		BLKIF_MAX_SEGMENTS_PER_REQUEST + seg;
>  }
>
> -#define pending_page(req, seg) pending_pages[vaddr_pagenr(req, seg)]
> +static inline int get_free_page(struct xen_blkif *blkif, struct page **page)
> +{
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&blkif->free_pages_lock, flags);
> +	if (list_empty(&blkif->free_pages)) {
> +		BUG_ON(blkif->free_pages_num != 0);
> +		spin_unlock_irqrestore(&blkif->free_pages_lock, flags);
> +		return alloc_xenballooned_pages(1, page, false);
> +	}
> +	BUG_ON(blkif->free_pages_num == 0);
> +	page[0] = list_first_entry(&blkif->free_pages, struct page, lru);
> +	list_del(&page[0]->lru);
> +	blkif->free_pages_num--;
> +	spin_unlock_irqrestore(&blkif->free_pages_lock, flags);
> +
> +	return 0;
> +}
> +
> +static inline void put_free_pages(struct xen_blkif *blkif, struct page **page,
> +                                  int num)
> +{
> +	unsigned long flags;
> +	int i;
> +
> +	spin_lock_irqsave(&blkif->free_pages_lock, flags);
> +	for (i = 0; i < num; i++)
> +		list_add(&page[i]->lru, &blkif->free_pages);
> +	blkif->free_pages_num += num;
> +	spin_unlock_irqrestore(&blkif->free_pages_lock, flags);
> +}
>
> -static inline unsigned long vaddr(struct pending_req *req, int seg)
> +static inline void remove_free_pages(struct xen_blkif *blkif, int num)

Perhaps 'shrink_free_pagepool'?

>  {
> -	unsigned long pfn = page_to_pfn(blkbk->pending_page(req, seg));
> -	return (unsigned long)pfn_to_kaddr(pfn);
> +	/* Remove requested pages in batches of 10 */
> +	struct page *page[10];

Hrmp. #define!

> +	unsigned long flags;
> +	int num_pages = 0;

unsigned int

> +
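The last two review comments (use a #define for the batch size, and make the counter unsigned) can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: `NUM_BATCH_FREE_PAGES`, `struct page_stack`, `release_fn`, `shrink_batched` and `count_released` are hypothetical names, the pool is a simple array-backed stack rather than a kernel list, and there is no locking. It shows only the batching pattern: collect pages into a fixed-size array, release a full batch, and flush any partial batch at the end.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BATCH_FREE_PAGES 10 /* hypothetical name for the batch size */

/* Hypothetical pool: a stack of opaque page pointers. */
struct page_stack {
	void **pages;
	unsigned int count;
};

/* Hypothetical release hook; receives one batch at a time. */
typedef void (*release_fn)(void **batch, unsigned int n, void *ctx);

/*
 * Shrink the stack to at most `target` entries, releasing pages in
 * batches of NUM_BATCH_FREE_PAGES. Returns the number of batches.
 */
static unsigned int shrink_batched(struct page_stack *st, unsigned int target,
				   release_fn release, void *ctx)
{
	void *batch[NUM_BATCH_FREE_PAGES];
	unsigned int num_pages = 0; /* never negative, so unsigned */
	unsigned int batches = 0;

	assert(st != NULL && release != NULL);
	while (st->count > target) {
		batch[num_pages] = st->pages[--st->count];
		if (++num_pages == NUM_BATCH_FREE_PAGES) {
			release(batch, num_pages, ctx);
			batches++;
			num_pages = 0;
		}
	}
	if (num_pages != 0) { /* flush the final partial batch */
		release(batch, num_pages, ctx);
		batches++;
	}
	return batches;
}

/* Example hook: add the batch size to a running total. */
static void count_released(void **batch, unsigned int n, void *ctx)
{
	(void)batch;
	*(unsigned int *)ctx += n;
}
```

Naming the constant keeps the array size and the flush condition in sync if the batch size is ever changed.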
[PATCH RFC 08/12] xen-blkback: use balloon pages for all mappings
Using balloon pages for all granted pages allows us to simplify the
logic in blkback, specially in the xen_blkbk_map function, since now
we can decide if we want to map a grant persistently or not after we
have actually mapped it. This could not be done before because
persistent grants used ballooned pages, and non-persistent grants used
pages from the kernel.

This patch also introduces several changes, the first one is that the
list of free pages is no longer global, now each blkback instance has
its own list of free pages that can be used to map grants. Also, a
run time parameter (max_buffer_pages) has been added in order to tune
the maximum number of free pages each blkback instance will keep in
its buffer.

Signed-off-by: Roger Pau Monné
Cc: xen-de...@lists.xen.org
Cc: Konrad Rzeszutek Wilk
---
 drivers/block/xen-blkback/blkback.c |  278 +++
 drivers/block/xen-blkback/common.h  |    5 +
 drivers/block/xen-blkback/xenbus.c  |    3 +
 3 files changed, 159 insertions(+), 127 deletions(-)

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index b5e7495..ba27fc3 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -101,6 +101,21 @@ module_param_named(lru_num_clean, xen_blkif_lru_num_clean, int, 0644);
 MODULE_PARM_DESC(lru_num_clean,
 		"Number of persistent grants to unmap when the list is full");

+/*
+ * Maximum number of unused free pages to keep in the internal buffer.
+ * Setting this to a value too low will reduce memory used in each backend,
+ * but can have a performance penalty.
+ *
+ * A sane value is xen_blkif_reqs * BLKIF_MAX_SEGMENTS_PER_REQUEST, but can
+ * be set to a lower value that might degrade performance on some intensive
+ * IO workloads.
+ */
+
+static int xen_blkif_max_buffer_pages = 1024;
+module_param_named(max_buffer_pages, xen_blkif_max_buffer_pages, int, 0644);
+MODULE_PARM_DESC(max_buffer_pages,
+"Maximum number of free pages to keep in each block backend buffer");
+
 /* Run-time switchable: /sys/module/blkback/parameters/ */
 static unsigned int log_stats;
 module_param(log_stats, int, 0644);
@@ -120,6 +135,7 @@ struct pending_req {
 	int status;
 	struct list_head free_list;
 	struct persistent_gnt *persistent_gnts[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+	struct page *pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 };

 #define BLKBACK_INVALID_HANDLE (~0)
@@ -131,8 +147,6 @@ struct xen_blkbk {
 	/* And its spinlock. */
 	spinlock_t pending_free_lock;
 	wait_queue_head_t pending_free_wq;
-	/* The list of all pages that are available. */
-	struct page **pending_pages;
 	/* And the grant handles that are available. */
 	grant_handle_t *pending_grant_handles;
 };
@@ -151,14 +165,66 @@ static inline int vaddr_pagenr(struct pending_req *req, int seg)
 		BLKIF_MAX_SEGMENTS_PER_REQUEST + seg;
 }

-#define pending_page(req, seg) pending_pages[vaddr_pagenr(req, seg)]
+static inline int get_free_page(struct xen_blkif *blkif, struct page **page)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&blkif->free_pages_lock, flags);
+	if (list_empty(&blkif->free_pages)) {
+		BUG_ON(blkif->free_pages_num != 0);
+		spin_unlock_irqrestore(&blkif->free_pages_lock, flags);
+		return alloc_xenballooned_pages(1, page, false);
+	}
+	BUG_ON(blkif->free_pages_num == 0);
+	page[0] = list_first_entry(&blkif->free_pages, struct page, lru);
+	list_del(&page[0]->lru);
+	blkif->free_pages_num--;
+	spin_unlock_irqrestore(&blkif->free_pages_lock, flags);
+
+	return 0;
+}
+
+static inline void put_free_pages(struct xen_blkif *blkif, struct page **page,
+                                  int num)
+{
+	unsigned long flags;
+	int i;
+
+	spin_lock_irqsave(&blkif->free_pages_lock, flags);
+	for (i = 0; i < num; i++)
+		list_add(&page[i]->lru, &blkif->free_pages);
+	blkif->free_pages_num += num;
+	spin_unlock_irqrestore(&blkif->free_pages_lock, flags);
+}

-static inline unsigned long vaddr(struct pending_req *req, int seg)
+static inline void remove_free_pages(struct xen_blkif *blkif, int num)
 {
-	unsigned long pfn = page_to_pfn(blkbk->pending_page(req, seg));
-	return (unsigned long)pfn_to_kaddr(pfn);
+	/* Remove requested pages in batches of 10 */
+	struct page *page[10];
+	unsigned long flags;
+	int num_pages = 0;
+
+	spin_lock_irqsave(&blkif->free_pages_lock, flags);
+	while (blkif->free_pages_num > num) {
+		BUG_ON(list_empty(&blkif->free_pages));
+		page[num_pages] = list_first_entry(&blkif->free_pages,
+		                                   struct page, lru);
+		list_del(&page[num_pages]->lru);
+		blkif->free_pages_num--;
+		if (++num_pages == 10) {
+