Re: [PATCH v12 1/2] tty: hvc: pass DMA capable memory to put_chars()
在 2021/11/2 下午2:33, Jiri Slaby 写道: On 28. 10. 21, 17:09, Xianting Tian wrote: As well known, hvc backend can register its opertions to hvc backend. the operations contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its operations. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. In this patch, add 'char cons_outbuf[]' as part of 'struct hvc_struct', so hp->cons_outbuf is no longer the stack memory, we can use it in above cases safely. We also add lock to protect cons_outbuf instead of using the global lock of hvc. Introduce array cons_hvcs[] for hvc pointers next to the cons_ops[] and vtermnos[] arrays. With the array, we can easily find hvc's cons_outbuf and its lock. Hi, this is still overly complicated IMO. As I already noted in: https://lore.kernel.org/all/5b728c71-a754-b3ef-4ad3-6e33db1b7...@kernel.org/ this: = In fact, you need only a single char for the poll case (hvc_poll_put_char), so hvc_struct needs to contain only c, not an array. I ever did so in v10, and Greg suggested: So you have a lock for a character and a different one for a longer string? Why can they not just use the same lock? Why are 2 needed at all, can't you just use the first character of cons_outbuf[] instead? Surely you do not have 2 sends happening at the same time, right? https://lkml.org/lkml/2021/10/9/214 <https://lkml.org/lkml/2021/10/9/214> So I change to use one outbuf. OTOH, you need c[N_OUTBUF] in the console case (hvc_console_print), but not whole hvc_struct. So cons_hvcs should be an array of structs composed of only the lock and the buffer. It is ok for me. = And I would do it even simpler now. One c[N_OUTBUF] buffer for all consoles and a single lock. And "char c" in struct hvc_struct. No need for the complex logic in hvc_console_print. So you will implement this? I don't need to send further patches? Introduce array cons_early_outbuf[] to ensure the mechanism of early console still work normally. thanks,
[PATCH v12 0/2] make hvc pass dma capable memory to its backend
Dear all, This patch series make hvc framework pass DMA capable memory to put_chars() of hvc backend(eg, virtio-console), and revert commit c4baad5029 ("virtio-console: avoid DMA from stack”) V1 virtio-console: avoid DMA from vmalloc area https://lkml.org/lkml/2021/7/27/494 For v1 patch, Arnd Bergmann suggests to fix the issue in the first place: Make hvc pass DMA capable memory to put_chars() The fix suggestion is included in v2. V2 [PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/1/8 [PATCH 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/1/9 For v2 patch, Arnd Bergmann suggests to make new buf part of the hvc_struct structure, and fix the compile issue. The fix suggestion is included in v3. V3 [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/3/1347 [PATCH v3 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/3/1348 For v3 patch, Jiri Slaby suggests to make 'char c[N_OUTBUF]' part of hvc_struct, and make 'hp->outbuf' aligned and use struct_size() to calculate the size of hvc_struct. The fix suggestion is included in v4. V4 [PATCH v4 0/2] make hvc pass dma capable memory to its backend https://lkml.org/lkml/2021/8/5/1350 [PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/5/1351 [PATCH v4 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/5/1352 For v4 patch, Arnd Bergmann suggests to introduce another array(cons_outbuf[]) for the buffer pointers next to the cons_ops[] and vtermnos[] arrays. This fix included in this v5 patch. V5 Arnd Bergmann suggests to use "L1_CACHE_BYTES" as dma alignment, use 'sizeof(long)' as dma alignment is wrong. fix it in v6. V6 It contains coding error, fix it in v7 and it worked normally according to test result. V7 Greg KH suggests to add test and code review developer, Jiri Slaby suggests to use lockless buffer and fix dma alignment in separate patch. fix above things in v8. V8 This contains coding error when switch to use new buffer. fix it in v9. V9 It didn't make things much clearer, it needs add more comments for new added buf. Add use lock to protect new added buffer. fix in v10. V10 Remove 'char outchar' and its lock from hvc_struct, adjust hvc_struct and use pahole to display the struct layout. fix it in v11. V11 Fix early console print issue, which is broken by v11, in v12. TEST STEPS* 1, config guest console=hvc0 2, start guest 3, login guest Welcome to Buildroot buildroot login: root # # cat /proc/cmdline console=hvc0 root=/dev/vda rw init=/sbin/init # drivers/tty/hvc/hvc_console.c | 48 --- drivers/tty/hvc/hvc_console.h | 30 +- drivers/char/virtio_console.c | 12 ++-- 3 file changed
[PATCH v12 2/2] virtio-console: remove unnecessary kmemdup()
This revert commit c4baad5029 ("virtio-console: avoid DMA from stack") hvc framework will never pass stack memory to the put_chars() function, So the calling of kmemdup() is unnecessary, we can remove it. Signed-off-by: Xianting Tian Reviewed-by: Shile Zhang --- drivers/char/virtio_console.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 7eaf303a7..4ed3ffb1d 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count) { struct port *port; struct scatterlist sg[1]; - void *data; - int ret; if (unlikely(early_put_chars)) return early_put_chars(vtermno, buf, count); @@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count) if (!port) return -EPIPE; - data = kmemdup(buf, count, GFP_ATOMIC); - if (!data) - return -ENOMEM; - - sg_init_one(sg, data, count); - ret = __send_to_port(port, sg, 1, count, data, false); - kfree(data); - return ret; + sg_init_one(sg, buf, count); + return __send_to_port(port, sg, 1, count, (void *)buf, false); } /* -- 2.17.1
[PATCH v12 1/2] tty: hvc: pass DMA capable memory to put_chars()
As well known, hvc backend can register its opertions to hvc backend. the operations contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its operations. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. In this patch, add 'char cons_outbuf[]' as part of 'struct hvc_struct', so hp->cons_outbuf is no longer the stack memory, we can use it in above cases safely. We also add lock to protect cons_outbuf instead of using the global lock of hvc. Introduce array cons_hvcs[] for hvc pointers next to the cons_ops[] and vtermnos[] arrays. With the array, we can easily find hvc's cons_outbuf and its lock. Introduce array cons_early_outbuf[] to ensure the mechanism of early console still work normally. With the patch, we can revert the fix c4baad5029. Signed-off-by: Xianting Tian Reviewed-by: Shile Zhang --- drivers/tty/hvc/hvc_console.c | 55 +-- drivers/tty/hvc/hvc_console.h | 30 ++- 2 files changed, 69 insertions(+), 16 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5957ab728..b9521f0f0 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -41,16 +41,6 @@ */ #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */ -/* - * These sizes are most efficient for vio, because they are the - * native transfer size. We could make them selectable in the - * future to better deal with backends that want other buffer sizes. - */ -#define N_OUTBUF 16 -#define N_INBUF16 - -#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES))) - static struct tty_driver *hvc_driver; static struct task_struct *hvc_task; @@ -142,6 +132,8 @@ static int hvc_flush(struct hvc_struct *hp) static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES]; static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1}; +static struct hvc_struct *cons_hvcs[MAX_NR_HVC_CONSOLES]; +static struct hvc_early_outbuf *cons_early_outbufs[MAX_NR_HVC_CONSOLES]; /* * Console APIs, NOT TTY. These APIs are available immediately when @@ -151,9 +143,12 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; + unsigned long flags; + struct hvc_struct *hp; + spinlock_t *lock; /* Console access attempt outside of acceptable console range. */ if (index >= MAX_NR_HVC_CONSOLES) @@ -163,6 +158,27 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + hp = cons_hvcs[index]; + if (hp) { + c = hp->cons_outbuf; + lock = &hp->cons_outbuf_lock; + + /* +* To make free early console outbuf locklessly, +* free it here when we start to use hp's outbuf. +* The actual free action is executed only once. +*/ + if (cons_early_outbufs[index]) { + kfree(cons_early_outbufs[index]); + cons_early_outbufs[index] = NULL; + } + } else { + /* For early console print */ + c = cons_early_outbufs[index]->outbuf; + lock = &cons_early_outbufs[index]->lock; + } + + spin_lock_irqsave(lock, flags); while (count > 0 || i > 0) { if (count > 0 && i < sizeof(c)) { if (b[n] ==
Re: [PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()
在 2021/10/26 下午2:10, Greg KH 写道: On Tue, Oct 26, 2021 at 02:02:21PM +0800, Xianting Tian wrote: 在 2021/10/26 下午1:10, Jiri Slaby 写道: On 15. 10. 21, 4:46, Xianting Tian wrote: @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; + unsigned long flags; + struct hvc_struct *hp; /* Console access attempt outside of acceptable console range. */ if (index >= MAX_NR_HVC_CONSOLES) @@ -163,6 +156,13 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + hp = cons_hvcs[index]; + if (!hp) + return; You effectively make the console unusable until someone calls hvc_alloc() for this device, correct? This doesn't look right. Neither you describe this change of behaviour in the commit log. I mentioned such info in the commit log: 'Introduce another array(cons_hvcs[]) for hvc pointers next to the cons_ops[] and vtermnos[] arrays. With the array, we can easily find hvc's cons_outbuf and its lock.' After you pointed it out, I just found what you said make sense, I checked the code hvc_console_print() can support print before hvc_alloc() is called when someone use hvc_instantiate() for an early console discovery method. I send a patch to fix the issue? or these serial pathches reverted fisrtly then I resend new version patches? thanks Let me revert these now and you can send an updated version. OK, thanks. thanks, greg k-h
Re: [PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()
在 2021/10/26 下午1:10, Jiri Slaby 写道: On 15. 10. 21, 4:46, Xianting Tian wrote: @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; + unsigned long flags; + struct hvc_struct *hp; /* Console access attempt outside of acceptable console range. */ if (index >= MAX_NR_HVC_CONSOLES) @@ -163,6 +156,13 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + hp = cons_hvcs[index]; + if (!hp) + return; You effectively make the console unusable until someone calls hvc_alloc() for this device, correct? This doesn't look right. Neither you describe this change of behaviour in the commit log. I mentioned such info in the commit log: 'Introduce another array(cons_hvcs[]) for hvc pointers next to the cons_ops[] and vtermnos[] arrays. With the array, we can easily find hvc's cons_outbuf and its lock.' After you pointed it out, I just found what you said make sense, I checked the code hvc_console_print() can support print before hvc_alloc() is called when someone use hvc_instantiate() for an early console discovery method. I send a patch to fix the issue? or these serial pathches reverted fisrtly then I resend new version patches? thanks regards,
Re: [PATCH v11 0/3] make hvc pass dma capable memory to its backend
I am very glad to get this reply:) Thank you and other experts' kindly help and guide, which improved me a lot. 在 2021/10/21 下午4:35, Greg KH 写道: On Fri, Oct 15, 2021 at 10:46:55AM +0800, Xianting Tian wrote: Dear all, This patch series make hvc framework pass DMA capable memory to put_chars() of hvc backend(eg, virtio-console), and revert commit c4baad5029 ("virtio-console: avoid DMA from stack”) Thanks for sticking with this, looks much better now, all now queued up. greg k-h
Re: [PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()
hi Greg, Could I get your comments of this new version patches? thanks 在 2021/10/15 上午10:46, Xianting Tian 写道: As well known, hvc backend can register its opertions to hvc backend. the operations contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its operations. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. In this patch, add 'char cons_outbuf[]' as part of 'struct hvc_struct', so hp->cons_outbuf is no longer the stack memory, we can use it in above cases safely. We also add lock to protect cons_outbuf instead of using the global lock of hvc. Introduce another array(cons_hvcs[]) for hvc pointers next to the cons_ops[] and vtermnos[] arrays. With the array, we can easily find hvc's cons_outbuf and its lock. With the patch, we can revert the fix c4baad5029. Signed-off-by: Xianting Tian Signed-off-by: Shile Zhang --- drivers/tty/hvc/hvc_console.c | 36 --- drivers/tty/hvc/hvc_console.h | 21 +++- 2 files changed, 41 insertions(+), 16 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5957ab728..11f2463a1 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -41,16 +41,6 @@ */ #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */ -/* - * These sizes are most efficient for vio, because they are the - * native transfer size. We could make them selectable in the - * future to better deal with backends that want other buffer sizes. - */ -#define N_OUTBUF 16 -#define N_INBUF16 - -#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES))) - static struct tty_driver *hvc_driver; static struct task_struct *hvc_task; @@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp) static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES]; static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1}; +static struct hvc_struct *cons_hvcs[MAX_NR_HVC_CONSOLES]; /* * Console APIs, NOT TTY. These APIs are available immediately when @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; + unsigned long flags; + struct hvc_struct *hp; /* Console access attempt outside of acceptable console range. */ if (index >= MAX_NR_HVC_CONSOLES) @@ -163,6 +156,13 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + hp = cons_hvcs[index]; + if (!hp) + return; + + c = hp->cons_outbuf; + + spin_lock_irqsave(&hp->cons_outbuf_lock, flags); while (count > 0 || i > 0) { if (count > 0 && i < sizeof(c)) { if (b[n] == '\n' && !donecr) { @@ -191,6 +191,7 @@ static void hvc_console_print(struct console *co, const char *b, } } } + spin_unlock_irqrestore(&hp->cons_outbuf_lock, flags); hvc_console_flush(cons_ops[index], vtermnos[index]); } @@ -878,9 +879,13 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch) struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; + unsigned long flags; do { - n = hp->ops->put_chars(hp->vtermno, &ch, 1); + spin_lock_irqsave(&hp->c
Re: [PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()
Hi Greg and experts Is this version ok for you? thanks 在 2021/10/15 上午10:46, Xianting Tian 写道: As well known, hvc backend can register its opertions to hvc backend. the operations contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its operations. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. In this patch, add 'char cons_outbuf[]' as part of 'struct hvc_struct', so hp->cons_outbuf is no longer the stack memory, we can use it in above cases safely. We also add lock to protect cons_outbuf instead of using the global lock of hvc. Introduce another array(cons_hvcs[]) for hvc pointers next to the cons_ops[] and vtermnos[] arrays. With the array, we can easily find hvc's cons_outbuf and its lock. With the patch, we can revert the fix c4baad5029. Signed-off-by: Xianting Tian Signed-off-by: Shile Zhang --- drivers/tty/hvc/hvc_console.c | 36 --- drivers/tty/hvc/hvc_console.h | 21 +++- 2 files changed, 41 insertions(+), 16 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5957ab728..11f2463a1 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -41,16 +41,6 @@ */ #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */ -/* - * These sizes are most efficient for vio, because they are the - * native transfer size. We could make them selectable in the - * future to better deal with backends that want other buffer sizes. - */ -#define N_OUTBUF 16 -#define N_INBUF16 - -#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES))) - static struct tty_driver *hvc_driver; static struct task_struct *hvc_task; @@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp) static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES]; static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1}; +static struct hvc_struct *cons_hvcs[MAX_NR_HVC_CONSOLES]; /* * Console APIs, NOT TTY. These APIs are available immediately when @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; + unsigned long flags; + struct hvc_struct *hp; /* Console access attempt outside of acceptable console range. */ if (index >= MAX_NR_HVC_CONSOLES) @@ -163,6 +156,13 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + hp = cons_hvcs[index]; + if (!hp) + return; + + c = hp->cons_outbuf; + + spin_lock_irqsave(&hp->cons_outbuf_lock, flags); while (count > 0 || i > 0) { if (count > 0 && i < sizeof(c)) { if (b[n] == '\n' && !donecr) { @@ -191,6 +191,7 @@ static void hvc_console_print(struct console *co, const char *b, } } } + spin_unlock_irqrestore(&hp->cons_outbuf_lock, flags); hvc_console_flush(cons_ops[index], vtermnos[index]); } @@ -878,9 +879,13 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch) struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; + unsigned long flags; do { - n = hp->ops->put_chars(hp->vtermno, &ch, 1); + spin_lock_irqsave(&hp->c
[PATCH v11 2/3] tty: hvc: pass DMA capable memory to put_chars()
As well known, hvc backend can register its opertions to hvc backend. the operations contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its operations. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. In this patch, add 'char cons_outbuf[]' as part of 'struct hvc_struct', so hp->cons_outbuf is no longer the stack memory, we can use it in above cases safely. We also add lock to protect cons_outbuf instead of using the global lock of hvc. Introduce another array(cons_hvcs[]) for hvc pointers next to the cons_ops[] and vtermnos[] arrays. With the array, we can easily find hvc's cons_outbuf and its lock. With the patch, we can revert the fix c4baad5029. Signed-off-by: Xianting Tian Signed-off-by: Shile Zhang --- drivers/tty/hvc/hvc_console.c | 36 --- drivers/tty/hvc/hvc_console.h | 21 +++- 2 files changed, 41 insertions(+), 16 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5957ab728..11f2463a1 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -41,16 +41,6 @@ */ #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */ -/* - * These sizes are most efficient for vio, because they are the - * native transfer size. We could make them selectable in the - * future to better deal with backends that want other buffer sizes. - */ -#define N_OUTBUF 16 -#define N_INBUF16 - -#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES))) - static struct tty_driver *hvc_driver; static struct task_struct *hvc_task; @@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp) static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES]; static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1}; +static struct hvc_struct *cons_hvcs[MAX_NR_HVC_CONSOLES]; /* * Console APIs, NOT TTY. These APIs are available immediately when @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; + unsigned long flags; + struct hvc_struct *hp; /* Console access attempt outside of acceptable console range. */ if (index >= MAX_NR_HVC_CONSOLES) @@ -163,6 +156,13 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + hp = cons_hvcs[index]; + if (!hp) + return; + + c = hp->cons_outbuf; + + spin_lock_irqsave(&hp->cons_outbuf_lock, flags); while (count > 0 || i > 0) { if (count > 0 && i < sizeof(c)) { if (b[n] == '\n' && !donecr) { @@ -191,6 +191,7 @@ static void hvc_console_print(struct console *co, const char *b, } } } + spin_unlock_irqrestore(&hp->cons_outbuf_lock, flags); hvc_console_flush(cons_ops[index], vtermnos[index]); } @@ -878,9 +879,13 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch) struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; + unsigned long flags; do { - n = hp->ops->put_chars(hp->vtermno, &ch, 1); + spin_lock_irqsave(&hp->cons_outbuf_lock, flags); + hp->cons_outbuf[0] = ch; + n = hp->ops->put_chars(hp->vtermno, &hp->cons_outbuf[0],
[PATCH v11 3/3] virtio-console: remove unnecessary kmemdup()
This revert commit c4baad5029 ("virtio-console: avoid DMA from stack") hvc framework will never pass stack memory to the put_chars() function, So the calling of kmemdup() is unnecessary, we can remove it. Signed-off-by: Xianting Tian Reviewed-by: Shile Zhang --- drivers/char/virtio_console.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 7eaf303a7..4ed3ffb1d 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count) { struct port *port; struct scatterlist sg[1]; - void *data; - int ret; if (unlikely(early_put_chars)) return early_put_chars(vtermno, buf, count); @@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count) if (!port) return -EPIPE; - data = kmemdup(buf, count, GFP_ATOMIC); - if (!data) - return -ENOMEM; - - sg_init_one(sg, data, count); - ret = __send_to_port(port, sg, 1, count, data, false); - kfree(data); - return ret; + sg_init_one(sg, buf, count); + return __send_to_port(port, sg, 1, count, (void *)buf, false); } /* -- 2.17.1
[PATCH v11 0/3] make hvc pass dma capable memory to its backend
Dear all, This patch series make hvc framework pass DMA capable memory to put_chars() of hvc backend(eg, virtio-console), and revert commit c4baad5029 ("virtio-console: avoid DMA from stack”) V1 virtio-console: avoid DMA from vmalloc area https://lkml.org/lkml/2021/7/27/494 For v1 patch, Arnd Bergmann suggests to fix the issue in the first place: Make hvc pass DMA capable memory to put_chars() The fix suggestion is included in v2. V2 [PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/1/8 [PATCH 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/1/9 For v2 patch, Arnd Bergmann suggests to make new buf part of the hvc_struct structure, and fix the compile issue. The fix suggestion is included in v3. V3 [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/3/1347 [PATCH v3 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/3/1348 For v3 patch, Jiri Slaby suggests to make 'char c[N_OUTBUF]' part of hvc_struct, and make 'hp->outbuf' aligned and use struct_size() to calculate the size of hvc_struct. The fix suggestion is included in v4. V4 [PATCH v4 0/2] make hvc pass dma capable memory to its backend https://lkml.org/lkml/2021/8/5/1350 [PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/5/1351 [PATCH v4 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/5/1352 For v4 patch, Arnd Bergmann suggests to introduce another array(cons_outbuf[]) for the buffer pointers next to the cons_ops[] and vtermnos[] arrays. This fix included in this v5 patch. V5 Arnd Bergmann suggests to use "L1_CACHE_BYTES" as dma alignment, use 'sizeof(long)' as dma alignment is wrong. fix it in v6. V6 It contains coding error, fix it in v7 and it worked normally according to test result. V7 Greg KH suggests to add test and code review developer, Jiri Slaby suggests to use lockless buffer and fix dma alignment in separate patch. fix above things in v8. V8 This contains coding error when switch to use new buffer. fix it in v9. V9 It didn't make things much clearer, it needs add more comments for new added buf. Add use lock to protect new added buffer. fix in v10. V10 Remove 'char outchar' and its lock from hvc_struct, adjust hvc_struct and use pahole to display the struct layout. fix it in v11. TEST STEPS* 1, config guest console=hvc0 2, start guest 3, login guest Welcome to Buildroot buildroot login: root # # cat /proc/cmdline console=hvc0 root=/dev/vda rw init=/sbin/init # drivers/tty/hvc/hvc_console.c | 36 --- drivers/tty/hvc/hvc_console.h | 21 +++- drivers/char/virtio_console.c | 12 ++-- 3 file changed
[PATCH v11 1/3] tty: hvc: use correct dma alignment size
Use L1_CACHE_BYTES as the dma alignment size, use 'sizeof(long)' as dma alignment is wrong. Signed-off-by: Xianting Tian Signed-off-by: Shile Zhang --- drivers/tty/hvc/hvc_console.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5bb8c4e44..5957ab728 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -49,7 +49,7 @@ #define N_OUTBUF 16 #define N_INBUF16 -#define __ALIGNED__ __attribute__((__aligned__(sizeof(long +#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES))) static struct tty_driver *hvc_driver; static struct task_struct *hvc_task; -- 2.17.1
Re: [PATCH v10 2/3] tty: hvc: pass DMA capable memory to put_chars()
在 2021/10/14 下午4:41, Greg KH 写道: On Thu, Oct 14, 2021 at 04:34:59PM +0800, Xianting Tian wrote: 在 2021/10/10 下午1:33, Greg KH 写道: On Sat, Oct 09, 2021 at 11:45:23PM +0800, Xianting Tian wrote: 在 2021/10/9 下午7:58, Greg KH 写道: Did you look at the placement using pahole as to how this structure now looks? thanks for all your commnts. for this one, do you mean I need to remove the blank line? thanks No, I mean to use the tool 'pahole' to see the structure layout that you just created and determine if it really is the best way to add these new fields, especially as you are adding huge buffers with odd alignment. thanks, Based on your comments, I removed 'char outchar', remian the position of 'int outbuf_size' unchanged to keep outbuf_size and lock in the same cache line. Now hvc_struct change as below, struct hvc_struct { struct tty_port port; spinlock_t lock; int index; int do_wakeup; - char *outbuf; int outbuf_size; int n_outbuf; uint32_t vtermno; @@ -48,6 +57,16 @@ struct hvc_struct { struct work_struct tty_resize; struct list_head next; unsigned long flags; + + /* + * the buf is used in hvc console api for putting chars, + * and also used in hvc_poll_put_char() for putting single char. + */ + char cons_outbuf[N_OUTBUF] __ALIGNED__; + spinlock_t cons_outbuf_lock; + + /* the buf is used for putting chars to tty */ + char outbuf[] __ALIGNED__; }; pahole for above hvc_struct as below, is it ok for you? do we need to pack the hole? thanks struct hvc_struct { struct tty_port port; /* 0 352 */ /* --- cacheline 5 boundary (320 bytes) was 32 bytes ago --- */ spinlock_t lock; /* 352 4 */ int index; /* 356 4 */ int do_wakeup; /* 360 4 */ int outbuf_size; /* 364 4 */ int n_outbuf; /* 368 4 */ uint32_t vtermno; /* 372 4 */ const struct hv_ops * ops; /* 376 8 */ /* --- cacheline 6 boundary (384 bytes) --- */ int irq_requested; /* 384 4 */ int data; /* 388 4 */ struct winsize ws; /* 392 8 */ struct work_struct tty_resize; /* 400 32 */ struct list_head next; /* 432 16 */ /* --- cacheline 7 boundary (448 bytes) --- */ long unsigned int flags; /* 448 8 */ /* XXX 56 bytes hole, try to pack */ /* --- cacheline 8 boundary (512 bytes) --- */ char cons_outbuf[16]; /* 512 16 */ spinlock_t cons_outbuf_lock; /* 528 4 */ /* XXX 44 bytes hole, try to pack */ Why not move the spinlock up above the cons_outbuf? Will that not be a bit better? thanks, I will move it avove cons_outbuf, and send v11 patches soon. Anyway, this is all fine, and much better than before, thanks. greg k-h
Re: [PATCH v10 2/3] tty: hvc: pass DMA capable memory to put_chars()
在 2021/10/10 下午1:33, Greg KH 写道: On Sat, Oct 09, 2021 at 11:45:23PM +0800, Xianting Tian wrote: 在 2021/10/9 下午7:58, Greg KH 写道: Did you look at the placement using pahole as to how this structure now looks? thanks for all your commnts. for this one, do you mean I need to remove the blank line? thanks No, I mean to use the tool 'pahole' to see the structure layout that you just created and determine if it really is the best way to add these new fields, especially as you are adding huge buffers with odd alignment. thanks, Based on your comments, I removed 'char outchar', remian the position of 'int outbuf_size' unchanged to keep outbuf_size and lock in the same cache line. Now hvc_struct change as below, struct hvc_struct { struct tty_port port; spinlock_t lock; int index; int do_wakeup; - char *outbuf; int outbuf_size; int n_outbuf; uint32_t vtermno; @@ -48,6 +57,16 @@ struct hvc_struct { struct work_struct tty_resize; struct list_head next; unsigned long flags; + + /* + * the buf is used in hvc console api for putting chars, + * and also used in hvc_poll_put_char() for putting single char. + */ + char cons_outbuf[N_OUTBUF] __ALIGNED__; + spinlock_t cons_outbuf_lock; + + /* the buf is used for putting chars to tty */ + char outbuf[] __ALIGNED__; }; pahole for above hvc_struct as below, is it ok for you? do we need to pack the hole? thanks struct hvc_struct { struct tty_port port; /* 0 352 */ /* --- cacheline 5 boundary (320 bytes) was 32 bytes ago --- */ spinlock_t lock; /* 352 4 */ int index; /* 356 4 */ int do_wakeup; /* 360 4 */ int outbuf_size; /* 364 4 */ int n_outbuf; /* 368 4 */ uint32_t vtermno; /* 372 4 */ const struct hv_ops * ops; /* 376 8 */ /* --- cacheline 6 boundary (384 bytes) --- */ int irq_requested; /* 384 4 */ int data; /* 388 4 */ struct winsize ws; /* 392 8 */ struct work_struct tty_resize; /* 400 32 */ struct list_head next; /* 432 16 */ /* --- cacheline 7 boundary (448 bytes) --- */ long unsigned int flags; /* 448 8 */ /* XXX 56 bytes hole, try to pack */ /* --- cacheline 8 boundary (512 bytes) --- */ char cons_outbuf[16]; /* 512 16 */ spinlock_t cons_outbuf_lock; /* 528 4 */ /* XXX 44 bytes hole, try to pack */ /* --- cacheline 9 boundary (576 bytes) --- */ char outbuf[0]; /* 576 0 */ /* size: 576, cachelines: 9, members: 17 */ /* sum members: 476, holes: 2, sum holes: 100 */ }; thanks, greg k-h
Re: [PATCH v10 2/3] tty: hvc: pass DMA capable memory to put_chars()
在 2021/10/9 下午7:58, Greg KH 写道: Did you look at the placement using pahole as to how this structure now looks? thanks for all your commnts. for this one, do you mean I need to remove the blank line? thanks
[PATCH v10 2/3] tty: hvc: pass DMA capable memory to put_chars()
As well known, hvc backend can register its opertions to hvc backend. the operations contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its operations. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. In this patch, add 'char cons_outbuf[]' as part of 'struct hvc_struct', so hp->cons_outbuf is no longer the stack memory, we can use it in above case 1. Add 'char outchar' as part of 'struct hvc_struct', we can use it in above case 2. We also add lock for each above buf to protect them separately instead of using the global lock of hvc. Introduce another array(cons_hvcs[]) for hvc pointers next to the cons_ops[] and vtermnos[] arrays. With the array, we can easily find hvc's cons_outbuf and its lock. With the patch, we can revert the fix c4baad5029. Signed-off-by: Xianting Tian Signed-off-by: Shile Zhang --- drivers/tty/hvc/hvc_console.c | 37 +-- drivers/tty/hvc/hvc_console.h | 24 +-- 2 files changed, 44 insertions(+), 17 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5bb8c4e44..4d8f112f2 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -41,16 +41,6 @@ */ #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */ -/* - * These sizes are most efficient for vio, because they are the - * native transfer size. We could make them selectable in the - * future to better deal with backends that want other buffer sizes. - */ -#define N_OUTBUF 16 -#define N_INBUF16 - -#define __ALIGNED__ __attribute__((__aligned__(sizeof(long - static struct tty_driver *hvc_driver; static struct task_struct *hvc_task; @@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp) static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES]; static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1}; +static struct hvc_struct *cons_hvcs[MAX_NR_HVC_CONSOLES]; /* * Console APIs, NOT TTY. These APIs are available immediately when @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; + unsigned long flags; + struct hvc_struct *hp; /* Console access attempt outside of acceptable console range. */ if (index >= MAX_NR_HVC_CONSOLES) @@ -163,6 +156,13 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + hp = cons_hvcs[index]; + if (!hp) + return; + + c = hp->cons_outbuf; + + spin_lock_irqsave(&hp->cons_outbuf_lock, flags); while (count > 0 || i > 0) { if (count > 0 && i < sizeof(c)) { if (b[n] == '\n' && !donecr) { @@ -191,6 +191,7 @@ static void hvc_console_print(struct console *co, const char *b, } } } + spin_unlock_irqrestore(&hp->cons_outbuf_lock, flags); hvc_console_flush(cons_ops[index], vtermnos[index]); } @@ -878,9 +879,13 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch) struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; + unsigned long flags; do { - n = hp->ops->put_chars(hp->vtermno, &ch, 1); + spin_lock_irqsave(&hp->outchar_lock, flags); +
[PATCH v10 3/3] virtio-console: remove unnecessary kmemdup()
This revert commit c4baad5029 ("virtio-console: avoid DMA from stack") hvc framework will never pass stack memory to the put_chars() function, So the calling of kmemdup() is unnecessary, we can remove it. Signed-off-by: Xianting Tian Reviewed-by: Shile Zhang --- drivers/char/virtio_console.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 7eaf303a7..4ed3ffb1d 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count) { struct port *port; struct scatterlist sg[1]; - void *data; - int ret; if (unlikely(early_put_chars)) return early_put_chars(vtermno, buf, count); @@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count) if (!port) return -EPIPE; - data = kmemdup(buf, count, GFP_ATOMIC); - if (!data) - return -ENOMEM; - - sg_init_one(sg, data, count); - ret = __send_to_port(port, sg, 1, count, data, false); - kfree(data); - return ret; + sg_init_one(sg, buf, count); + return __send_to_port(port, sg, 1, count, (void *)buf, false); } /* -- 2.17.1
[PATCH v10 0/3] make hvc pass dma capable memory to its backend
Dear all, This patch series make hvc framework pass DMA capable memory to put_chars() of hvc backend(eg, virtio-console), and revert commit c4baad5029 ("virtio-console: avoid DMA from stack”) V1 virtio-console: avoid DMA from vmalloc area https://lkml.org/lkml/2021/7/27/494 For v1 patch, Arnd Bergmann suggests to fix the issue in the first place: Make hvc pass DMA capable memory to put_chars() The fix suggestion is included in v2. V2 [PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/1/8 [PATCH 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/1/9 For v2 patch, Arnd Bergmann suggests to make new buf part of the hvc_struct structure, and fix the compile issue. The fix suggestion is included in v3. V3 [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/3/1347 [PATCH v3 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/3/1348 For v3 patch, Jiri Slaby suggests to make 'char c[N_OUTBUF]' part of hvc_struct, and make 'hp->outbuf' aligned and use struct_size() to calculate the size of hvc_struct. The fix suggestion is included in v4. V4 [PATCH v4 0/2] make hvc pass dma capable memory to its backend https://lkml.org/lkml/2021/8/5/1350 [PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/5/1351 [PATCH v4 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/5/1352 For v4 patch, Arnd Bergmann suggests to introduce another array(cons_outbuf[]) for the buffer pointers next to the cons_ops[] and vtermnos[] arrays. This fix included in this v5 patch. V5 Arnd Bergmann suggests to use "L1_CACHE_BYTES" as dma alignment, use 'sizeof(long)' as dma alignment is wrong. fix it in v6. V6 It contains coding error, fix it in v7 and it worked normally according to test result. V7 Greg KH suggests to add test and code review developer, Jiri Slaby suggests to use lockless buffer and fix dma alignment in separate patch. fix above things in v8. V8 This contains coding error when switch to use new buffer. fix it in v9. V9 It didn't make things much clearer, it needs add more comments for new added buf. Add use lock to protect new added buffer. fix in v10. TEST STEPS* 1, config guest console=hvc0 2, start guest 3, login guest Welcome to Buildroot buildroot login: root # # cat /proc/cmdline console=hvc0,115200 # drivers/tty/hvc/hvc_console.c | 38 +-- drivers/tty/hvc/hvc_console.h | 24 -- drivers/char/virtio_console.c | 12 ++-- 3 file changed
[PATCH v10 1/3] tty: hvc: use correct dma alignment size
Use L1_CACHE_BYTES as the dma alignment size, use 'sizeof(long)' is wrong. Signed-off-by: Xianting Tian Reviewed-by: Shile Zhang --- drivers/tty/hvc/hvc_console.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5bb8c4e44..5957ab728 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -49,7 +49,7 @@ #define N_OUTBUF 16 #define N_INBUF16 -#define __ALIGNED__ __attribute__((__aligned__(sizeof(long +#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES))) static struct tty_driver *hvc_driver; static struct task_struct *hvc_task; -- 2.17.1
Re: [PATCH v8 2/3] tty: hvc: pass DMA capable memory to put_chars()
thanks Greg, I will submit v9 patch for reviewing. Before, I was waiting for a new reply:( 在 2021/9/18 下午8:40, Greg KH 写道: On Sat, Sep 18, 2021 at 08:32:01PM +0800, Xianting Tian wrote: hi Will you consider to continue the disscussion of this patch? thanks I do not see a newer version of this series. thanks, greg k-h
Re: [PATCH v8 2/3] tty: hvc: pass DMA capable memory to put_chars()
hi Will you consider to continue the disscussion of this patch? thanks 在 2021/8/20 下午4:43, Xianting TIan 写道: 在 2021/8/20 下午2:49, Daniel Axtens 写道: Xianting Tian writes: As well known, hvc backend driver(eg, virtio-console) can register its operations to hvc framework. The operations can contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its operations. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): We could also run into issues on powerpc where Andrew is working on adding vmap-stack but the opal hvc driver assumes that it is passed a buffer which is not in vmalloc space but in the linear mapping. So it would be good to fix this (or more clearly document what drivers can expect). 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the future. In this patch, we make 'char out_buf[N_OUTBUF]' and 'chat out_ch' part of 'struct hvc_struct', so both two buf are no longer the stack memory. we can use it in above two cases separately. Introduce another array(cons_outbufs[]) for buffer pointers next to the cons_ops[] and vtermnos[] arrays. With the array, we can easily find the buffer, instead of traversing hp list. With the patch, we can remove the fix c4baad5029. Signed-off-by: Xianting Tian Reviewed-by: Shile Zhang struct hvc_struct { struct tty_port port; spinlock_t lock; int index; int do_wakeup; - char *outbuf; - int outbuf_size; int n_outbuf; uint32_t vtermno; const struct hv_ops *ops; @@ -48,6 +56,10 @@ struct hvc_struct { struct work_struct tty_resize; struct list_head next; unsigned long flags; + char out_ch; + char out_buf[N_OUTBUF] __ALIGNED__; + int outbuf_size; + char outbuf[0] __ALIGNED__; I'm trying to understand this patch but I am finding it very difficult to understand what the difference between `out_buf` and `outbuf` (without the underscore) is supposed to be. `out_buf` is statically sized and the size of `outbuf` is supposed to depend on the arguments to hvc_alloc(), but I can't quite figure out what the roles of each one are and their names are confusingly similiar! I looked briefly at the older revisions of the series but it didn't make things much clearer. Could you give them clearer names? thanks for the comments, It is indeed not easy to understand by the name. I will change it to a proper name if we have next version patch. Jiri Slaby is worring about the performance, because we need add two locks to protect 'out_ch' and 'out_buf' separately, the origin on-stack buffer is lockless. I don't know whether this solution can be accepted, just waiting for Jiri's further commtents. Also, looking at Documentation/process/deprecated.rst, it looks like maybe we want to use a 'flexible array member' instead: .. note:: If you are using struct_size() on a structure containing a zero-length or a one-element array as a trailing array member, please refactor such array usage and switch to a `flexible array member <#zero-length-and-one-element-arrays>`_ instead. I think we want: thanks, we should use [], not [0]. + char outbuf[] __ALIGNED__; Kind regards, Daniel
Re: [PATCH v8 2/3] tty: hvc: pass DMA capable memory to put_chars()
在 2021/8/20 下午2:49, Daniel Axtens 写道: Xianting Tian writes: As well known, hvc backend driver(eg, virtio-console) can register its operations to hvc framework. The operations can contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its operations. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): We could also run into issues on powerpc where Andrew is working on adding vmap-stack but the opal hvc driver assumes that it is passed a buffer which is not in vmalloc space but in the linear mapping. So it would be good to fix this (or more clearly document what drivers can expect). 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the future. In this patch, we make 'char out_buf[N_OUTBUF]' and 'chat out_ch' part of 'struct hvc_struct', so both two buf are no longer the stack memory. we can use it in above two cases separately. Introduce another array(cons_outbufs[]) for buffer pointers next to the cons_ops[] and vtermnos[] arrays. With the array, we can easily find the buffer, instead of traversing hp list. With the patch, we can remove the fix c4baad5029. Signed-off-by: Xianting Tian Reviewed-by: Shile Zhang struct hvc_struct { struct tty_port port; spinlock_t lock; int index; int do_wakeup; - char *outbuf; - int outbuf_size; int n_outbuf; uint32_t vtermno; const struct hv_ops *ops; @@ -48,6 +56,10 @@ struct hvc_struct { struct work_struct tty_resize; struct list_head next; unsigned long flags; + char out_ch; + char out_buf[N_OUTBUF] __ALIGNED__; + int outbuf_size; + char outbuf[0] __ALIGNED__; I'm trying to understand this patch but I am finding it very difficult to understand what the difference between `out_buf` and `outbuf` (without the underscore) is supposed to be. `out_buf` is statically sized and the size of `outbuf` is supposed to depend on the arguments to hvc_alloc(), but I can't quite figure out what the roles of each one are and their names are confusingly similiar! I looked briefly at the older revisions of the series but it didn't make things much clearer. Could you give them clearer names? thanks for the comments, It is indeed not easy to understand by the name. I will change it to a proper name if we have next version patch. Jiri Slaby is worring about the performance, because we need add two locks to protect 'out_ch' and 'out_buf' separately, the origin on-stack buffer is lockless. I don't know whether this solution can be accepted, just waiting for Jiri's further commtents. Also, looking at Documentation/process/deprecated.rst, it looks like maybe we want to use a 'flexible array member' instead: .. note:: If you are using struct_size() on a structure containing a zero-length or a one-element array as a trailing array member, please refactor such array usage and switch to a `flexible array member <#zero-length-and-one-element-arrays>`_ instead. I think we want: thanks, we should use [], not [0]. + char outbuf[] __ALIGNED__; Kind regards, Daniel
Re: [PATCH v7 1/2] tty: hvc: pass DMA capable memory to put_chars()
在 2021/8/18 上午11:17, Jiri Slaby 写道: Hi, On 17. 08. 21, 15:22, Xianting Tian wrote: As well known, hvc backend can register its opertions to hvc backend. the opertions contain put_chars(), get_chars() and so on. "operations". And there too: Some hvc backend may do dma in its opertions. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. We make 'char c[N_OUTBUF]' part of 'struct hvc_struct', so hp->c is no longer the stack memory. we can use it in above two cases. In fact, you need only a single char for the poll case (hvc_poll_put_char), so hvc_struct needs to contain only c, not an array. OTOH, you need c[N_OUTBUF] in the console case (hvc_console_print), but not whole hvc_struct. So cons_hvcs should be an array of structs composed of only the lock and the buffer. Hum. Or maybe rethink and take care of the console case by kmemdup and be done with that case? What problem do you have with allocating 16 bytes? It should be quite easy and really fast (lockless) in most cases. On the contrary, your solution has to take a spinlock to access the global buffer. As I replyed before, this issue may can be solved just by adjust the alignment to L1_CACHE_BYTES or at least 16: #define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES))) Then, c[16] won't cross the pages, that is to say c[16]'s physical address is continuous. Could you comment this? I submitted v8, I found it still can't solve ths issue, even we create 'char out_buf[N_OUTBUF]' and 'chat out_ch' be part of 'struct hvc_struct', and use it separately, we still need lock to protect each buf. When we invloced lock, it will impact the hvc performance. So we can back to the original intention of this solution, just fix the kmemdup issue in virtio_console driver? Other fix is use L1_CACHE_BYTES as the alignment, use 'sizeof(long)' as dma alignment is wrong. And use struct_size() to calculate size of hvc_struct. This ought to be in separate patches. thanks,
[PATCH v8 3/3] virtio-console: remove unnecessary kmemdup()
This revert commit c4baad5029 ("virtio-console: avoid DMA from stack") hvc framework will never pass stack memory to the put_chars() function, So the calling of kmemdup() is unnecessary, we can remove it. Signed-off-by: Xianting Tian Reviewed-by: Shile Zhang --- drivers/char/virtio_console.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 7eaf303a7..4ed3ffb1d 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count) { struct port *port; struct scatterlist sg[1]; - void *data; - int ret; if (unlikely(early_put_chars)) return early_put_chars(vtermno, buf, count); @@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count) if (!port) return -EPIPE; - data = kmemdup(buf, count, GFP_ATOMIC); - if (!data) - return -ENOMEM; - - sg_init_one(sg, data, count); - ret = __send_to_port(port, sg, 1, count, data, false); - kfree(data); - return ret; + sg_init_one(sg, buf, count); + return __send_to_port(port, sg, 1, count, (void *)buf, false); } /* -- 2.17.1
[PATCH v8 2/3] tty: hvc: pass DMA capable memory to put_chars()
As well known, hvc backend driver(eg, virtio-console) can register its operations to hvc framework. The operations can contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its operations. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the future. In this patch, we make 'char out_buf[N_OUTBUF]' and 'chat out_ch' part of 'struct hvc_struct', so both two buf are no longer the stack memory. we can use it in above two cases separately. Introduce another array(cons_outbufs[]) for buffer pointers next to the cons_ops[] and vtermnos[] arrays. With the array, we can easily find the buffer, instead of traversing hp list. With the patch, we can remove the fix c4baad5029. Signed-off-by: Xianting Tian Reviewed-by: Shile Zhang --- drivers/tty/hvc/hvc_console.c | 27 --- drivers/tty/hvc/hvc_console.h | 16 ++-- 2 files changed, 26 insertions(+), 17 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5bb8c4e44..300e9c037 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -41,16 +41,6 @@ */ #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */ -/* - * These sizes are most efficient for vio, because they are the - * native transfer size. We could make them selectable in the - * future to better deal with backends that want other buffer sizes. - */ -#define N_OUTBUF 16 -#define N_INBUF16 - -#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES))) - static struct tty_driver *hvc_driver; static struct task_struct *hvc_task; @@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp) static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES]; static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1}; +static char *cons_outbufs[MAX_NR_HVC_CONSOLES]; /* * Console APIs, NOT TTY. These APIs are available immediately when @@ -151,7 +142,7 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; @@ -163,6 +154,10 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + c = cons_outbufs[index]; + if (!c) + return; + while (count > 0 || i > 0) { if (count > 0 && i < sizeof(c)) { if (b[n] == '\n' && !donecr) { @@ -879,8 +874,10 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch) struct hvc_struct *hp = tty->driver_data; int n; + hp->out_ch = ch; + do { - n = hp->ops->put_chars(hp->vtermno, &ch, 1); + n = hp->ops->put_chars(hp->vtermno, hp->out_ch, 1); } while (n <= 0); } #endif @@ -922,8 +919,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data, return ERR_PTR(err); } - hp = kzalloc(ALIGN(sizeof(*hp), sizeof(long)) + outbuf_size, - GFP_KERNEL); + hp = kzalloc(struct_size(hp, outbuf, outbuf_size), GFP_KERNEL); if (!hp) return ERR_PTR(-ENOMEM); @@ -931,7 +927,6 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data, hp->data = data; hp->ops = ops; hp->outbuf_size = outbuf_size; - hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))]; tty_port_init(&hp->port); hp->port.ops = &hvc
[PATCH v8 0/3] make hvc pass dma capable memory to its backend
Dear all, This patch series make hvc framework pass DMA capable memory to put_chars() of hvc backend(eg, virtio-console), and revert commit c4baad5029 ("virtio-console: avoid DMA from stack”) V1 virtio-console: avoid DMA from vmalloc area https://lkml.org/lkml/2021/7/27/494 For v1 patch, Arnd Bergmann suggests to fix the issue in the first place: Make hvc pass DMA capable memory to put_chars() The fix suggestion is included in v2. V2 [PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/1/8 [PATCH 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/1/9 For v2 patch, Arnd Bergmann suggests to make new buf part of the hvc_struct structure, and fix the compile issue. The fix suggestion is included in v3. V3 [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/3/1347 [PATCH v3 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/3/1348 For v3 patch, Jiri Slaby suggests to make 'char c[N_OUTBUF]' part of hvc_struct, and make 'hp->outbuf' aligned and use struct_size() to calculate the size of hvc_struct. The fix suggestion is included in v4. V4 [PATCH v4 0/2] make hvc pass dma capable memory to its backend https://lkml.org/lkml/2021/8/5/1350 [PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/5/1351 [PATCH v4 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/5/1352 For v4 patch, Arnd Bergmann suggests to introduce another array(cons_outbuf[]) for the buffer pointers next to the cons_ops[] and vtermnos[] arrays. This fix included in this v5 patch. V5 Arnd Bergmann suggests to use "L1_CACHE_BYTES" as dma alignment, use 'sizeof(long)' as dma alignment is wrong. fix it in v6. V6 It contains coding error, fix it in v7 and it worked normally according to test result. V7 Greg KH suggests to add test and code review developer, Jiri Slaby suggests to use lockless buffer and fix dma alignment in separate patch. fix above things in v8. drivers/tty/hvc/hvc_console.c | 27 --- drivers/tty/hvc/hvc_console.h | 16 ++-- drivers/tty/hvc/hvc_console.h | 16 -- 3 file changed
[PATCH v8 1/3] tty: hvc: use correct dma alignment size
Use L1_CACHE_BYTES as the dma alignment size, use 'sizeof(long)' is wrong. Signed-off-by: Xianting Tian Reviewed-by: Shile Zhang --- drivers/tty/hvc/hvc_console.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5bb8c4e44..5957ab728 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -49,7 +49,7 @@ #define N_OUTBUF 16 #define N_INBUF16 -#define __ALIGNED__ __attribute__((__aligned__(sizeof(long +#define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES))) static struct tty_driver *hvc_driver; static struct task_struct *hvc_task; -- 2.17.1
Re: [PATCH v7 1/2] tty: hvc: pass DMA capable memory to put_chars()
在 2021/8/18 上午11:17, Jiri Slaby 写道: Hi, On 17. 08. 21, 15:22, Xianting Tian wrote: As well known, hvc backend can register its opertions to hvc backend. the opertions contain put_chars(), get_chars() and so on. "operations". And there too: Some hvc backend may do dma in its opertions. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. We make 'char c[N_OUTBUF]' part of 'struct hvc_struct', so hp->c is no longer the stack memory. we can use it in above two cases. In fact, you need only a single char for the poll case (hvc_poll_put_char), so hvc_struct needs to contain only c, not an array. OTOH, you need c[N_OUTBUF] in the console case (hvc_console_print), but not whole hvc_struct. So cons_hvcs should be an array of structs composed of only the lock and the buffer. Hum. Or maybe rethink and take care of the console case by kmemdup and be done with that case? What problem do you have with allocating 16 bytes? It should be quite easy and really fast (lockless) in most cases. On the contrary, your solution has to take a spinlock to access the global buffer. May be we can change hvc_struct as below, struct hvc_struct { ... char out_ch; char c[N_OUTBUF] __ALIGNED__; int outbuf_size; char outbuf[0] __ALIGNED__; }; c[N_OUTBUF] is only used for hvc_console_print(); out_ch is only used for hvc_poll_put_char(). Thus no competition exits, the spinlock can be removed. Then cons_hvcs array can only contains the buffer. Is it OK for you? thanks Other fix is use L1_CACHE_BYTES as the alignment, use 'sizeof(long)' as dma alignment is wrong. And use struct_size() to calculate size of hvc_struct. This ought to be in separate patches. OK, thanks thanks,
Re: [PATCH v7 1/2] tty: hvc: pass DMA capable memory to put_chars()
在 2021/8/17 下午11:48, Greg KH 写道: On Tue, Aug 17, 2021 at 09:22:59PM +0800, Xianting Tian wrote: We tested the patch, it worked normally. Who is "we"? Signed-off-by: Xianting Tian Like I said before, I need another developer from your company to review and sign-off on this patch (and the other one), before I am willing to look at it, based on the previous mistakes that have happened here. thanks, I will add the developer in v8 and also with fix a build warning, which I don't meet in my build process. thanks, greg k-h
[PATCH v7 2/2] virtio-console: remove unnecessary kmemdup()
hvc framework will never pass stack memory to the put_chars() function, So the calling of kmemdup() is unnecessary, we can remove it. This revert commit c4baad5029 ("virtio-console: avoid DMA from stack") Signed-off-by: Xianting Tian --- drivers/char/virtio_console.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 7eaf303a7..4ed3ffb1d 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count) { struct port *port; struct scatterlist sg[1]; - void *data; - int ret; if (unlikely(early_put_chars)) return early_put_chars(vtermno, buf, count); @@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count) if (!port) return -EPIPE; - data = kmemdup(buf, count, GFP_ATOMIC); - if (!data) - return -ENOMEM; - - sg_init_one(sg, data, count); - ret = __send_to_port(port, sg, 1, count, data, false); - kfree(data); - return ret; + sg_init_one(sg, buf, count); + return __send_to_port(port, sg, 1, count, (void *)buf, false); } /* -- 2.17.1
[PATCH v7 1/2] tty: hvc: pass DMA capable memory to put_chars()
As well known, hvc backend can register its opertions to hvc backend. the opertions contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its opertions. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. We make 'char c[N_OUTBUF]' part of 'struct hvc_struct', so hp->c is no longer the stack memory. we can use it in above two cases. Other fix is use L1_CACHE_BYTES as the alignment, use 'sizeof(long)' as dma alignment is wrong. And use struct_size() to calculate size of hvc_struct. Introduce another array(cons_hvcs[]) for hvc_struct pointers next to the cons_ops[] and vtermnos[] arrays. With the array, we can easily find the hp, instead of traversing hp list. With the patch, we can remove the fix c4baad5029. We tested the patch, it worked normally. Signed-off-by: Xianting Tian --- drivers/tty/hvc/hvc_console.c | 39 +-- drivers/tty/hvc/hvc_console.h | 16 -- 2 files changed, 38 insertions(+), 17 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5bb8c4e44..b882ceb5f 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -41,16 +41,6 @@ */ #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */ -/* - * These sizes are most efficient for vio, because they are the - * native transfer size. We could make them selectable in the - * future to better deal with backends that want other buffer sizes. - */ -#define N_OUTBUF 16 -#define N_INBUF16 - -#define __ALIGNED__ __attribute__((__aligned__(sizeof(long - static struct tty_driver *hvc_driver; static struct task_struct *hvc_task; @@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp) static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES]; static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1}; +static struct hvc_struct *cons_hvcs[MAX_NR_HVC_CONSOLES]; /* * Console APIs, NOT TTY. These APIs are available immediately when @@ -151,9 +142,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; + unsigned long flags; + struct hvc_struct *hp; /* Console access attempt outside of acceptable console range. */ if (index >= MAX_NR_HVC_CONSOLES) @@ -163,6 +156,13 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + hp = cons_hvcs[index]; + if (!hp || !hp->c) + return; + + c = hp->c; + + spin_lock_irqsave(&hp->c_lock, flags); while (count > 0 || i > 0) { if (count > 0 && i < sizeof(c)) { if (b[n] == '\n' && !donecr) { @@ -191,6 +191,7 @@ static void hvc_console_print(struct console *co, const char *b, } } } + spin_unlock_irqrestore(&hp->c_lock, flags); hvc_console_flush(cons_ops[index], vtermnos[index]); } @@ -878,9 +879,16 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch) struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; + unsigned long flags; + + if (!hp || !hp->c) + return; do { - n = hp->ops->put_chars(hp->vtermno, &ch, 1); + spin_lock_irqsave(&hp->c_lock, flags); + hp->c[0] = ch; +
[PATCH v7 0/2] make hvc pass dma capable memory to its backend
Dear all, This patch series make hvc framework pass DMA capable memory to put_chars() of hvc backend(eg, virtio-console), and revert commit c4baad5029 ("virtio-console: avoid DMA from stack”) V1 virtio-console: avoid DMA from vmalloc area https://lkml.org/lkml/2021/7/27/494 For v1 patch, Arnd Bergmann suggests to fix the issue in the first place: Make hvc pass DMA capable memory to put_chars() The fix suggestion is included in v2. V2 [PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/1/8 [PATCH 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/1/9 For v2 patch, Arnd Bergmann suggests to make new buf part of the hvc_struct structure, and fix the compile issue. The fix suggestion is included in v3. V3 [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/3/1347 [PATCH v3 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/3/1348 For v3 patch, Jiri Slaby suggests to make 'char c[N_OUTBUF]' part of hvc_struct, and make 'hp->outbuf' aligned and use struct_size() to calculate the size of hvc_struct. The fix suggestion is included in v4. V4 [PATCH v4 0/2] make hvc pass dma capable memory to its backend https://lkml.org/lkml/2021/8/5/1350 [PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/5/1351 [PATCH v4 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/5/1352 For v4 patch, Arnd Bergmann suggests to introduce another array(cons_outbuf[]) for the buffer pointers next to the cons_ops[] and vtermnos[] arrays. This fix included in this v5 patch. V5 Arnd Bergmann suggests to use "L1_CACHE_BYTES" as dma alignment, use 'sizeof(long)' as dma alignment is wrong. fix it in v6. V6 It contains coding error, fix it in v7 and it worked normally according to test result. drivers/tty/hvc/hvc_console.c | 39 +-- drivers/tty/hvc/hvc_console.h | 16 -- drivers/tty/hvc/hvc_console.h | 16 -- 3 file changed
Re: [PATCH v6 1/2] tty: hvc: pass DMA capable memory to put_chars()
在 2021/8/13 下午1:53, Jiri Slaby 写道: Hi, On 12. 08. 21, 14:26, kernel test robot wrote: drivers/tty/hvc/hvc_console.c:190:26: warning: variable 'hp' is uninitialized when used here [-Wuninitialized] spin_unlock_irqrestore(&hp->c_lock, flags); ^~ drivers/tty/hvc/hvc_console.c:149:23: note: initialize the variable 'hp' to silence this warning struct hvc_struct *hp; ^ = NULL So you clearly didn't test your change as it would crash quite instantly. I wonder, where do you intend to get hp from in the console::print() hook? thanks, According to analysis, this issue may can be solved just by adjust the alignment to L1_CACHE_BYTES: #define __ALIGNED__ __attribute__((__aligned__(L1_CACHE_BYTES))) Our analysis as below, the original __ALIGNED__ is sizeof(long) which is 8 for 64bit cpu. char c[N_OUTBUF] __ALIGNED__; //c[16] __ALIGNED__; For 4K page, c[16] may cross the page when alignemnt is 8. In the case the physical address of c[16] is noncontiguous. |--8|..|-8-| PAGE_1 ..|---16--| c[16] .|-8-|.|-8-| PAGE_2 But when the alignment is L1_CACHE_BYTES(eg, 64), or at least N_OUTBUF(16), we have no dma issue as c[16] won't cross the page, the physical address of c[16] is contiguous. |64|.|64-| PAGE_3 ..|--c[16]--| Could you help comments this? thanks
Re: [PATCH v6 1/2] tty: hvc: pass DMA capable memory to put_chars()
在 2021/8/13 下午3:27, Greg KH 写道: On Thu, Aug 12, 2021 at 05:45:31PM +0800, Xianting Tian wrote: As well known, hvc backend can register its opertions to hvc backend. the opertions contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its opertions. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. We make 'char c[N_OUTBUF]' part of 'struct hvc_struct', so hp->c is no longer the stack memory. we can use it in above two cases. Other fix is use L1_CACHE_BYTES as the alignment, use 'sizeof(long)' as dma alignment is wrong. And use struct_size() to calculate size of hvc_struct. Introduce another array(cons_outbuf[]) for the hp->c pointers next to the cons_ops[] and vtermnos[] arrays. With the patch, we can remove the fix c4baad5029. Signed-off-by: Xianting Tian Tested-by: Xianting Tian As the build shows, you obviously did not test this code :( Also, no need to add a tested-by line as that should be implicit if you wrote and signed off on it. I am going to ask you to get some help from some other developers at your company, and get them to test and sign off on this series before sending it out again, as there seems to be a bit of a disconnect as to what is actually needed to do when sending a patch for us to review. That is now a requirement for us to be able to take your changes here. thanks, Sorry for this. I tested V1-V4, But for V6, I take it for granted that there is no problem when I just switch to use array(cons_outbuf[]). I indeed didn't test it:( I will test it and find virtualization test expert to test again before sending next patch. greg k-h
Re: [PATCH v6 1/2] tty: hvc: pass DMA capable memory to put_chars()
在 2021/8/13 下午1:53, Jiri Slaby 写道: Hi, On 12. 08. 21, 14:26, kernel test robot wrote: drivers/tty/hvc/hvc_console.c:190:26: warning: variable 'hp' is uninitialized when used here [-Wuninitialized] spin_unlock_irqrestore(&hp->c_lock, flags); ^~ drivers/tty/hvc/hvc_console.c:149:23: note: initialize the variable 'hp' to silence this warning struct hvc_struct *hp; ^ = NULL So you clearly didn't test your change as it would crash quite instantly. I wonder, where do you intend to get hp from in the console::print() hook? I am very sorry for the inconvenience caused. This is caused by my carelessness:( I take it for granted that there is no problem when I just switch to use array(cons_outbuf[]). sorry agin. thanks,
[PATCH v6 2/2] virtio-console: remove unnecessary kmemdup()
hvc framework will never pass stack memory to the put_chars() function, So the calling of kmemdup() is unnecessary, we can remove it. This revert commit c4baad5029 ("virtio-console: avoid DMA from stack") Signed-off-by: Xianting Tian --- drivers/char/virtio_console.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 7eaf303a7..4ed3ffb1d 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count) { struct port *port; struct scatterlist sg[1]; - void *data; - int ret; if (unlikely(early_put_chars)) return early_put_chars(vtermno, buf, count); @@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count) if (!port) return -EPIPE; - data = kmemdup(buf, count, GFP_ATOMIC); - if (!data) - return -ENOMEM; - - sg_init_one(sg, data, count); - ret = __send_to_port(port, sg, 1, count, data, false); - kfree(data); - return ret; + sg_init_one(sg, buf, count); + return __send_to_port(port, sg, 1, count, (void *)buf, false); } /* -- 2.17.1
[PATCH v6 1/2] tty: hvc: pass DMA capable memory to put_chars()
As well known, hvc backend can register its opertions to hvc backend. the opertions contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its opertions. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. We make 'char c[N_OUTBUF]' part of 'struct hvc_struct', so hp->c is no longer the stack memory. we can use it in above two cases. Other fix is use L1_CACHE_BYTES as the alignment, use 'sizeof(long)' as dma alignment is wrong. And use struct_size() to calculate size of hvc_struct. Introduce another array(cons_outbuf[]) for the hp->c pointers next to the cons_ops[] and vtermnos[] arrays. With the patch, we can remove the fix c4baad5029. Signed-off-by: Xianting Tian Tested-by: Xianting Tian --- drivers/tty/hvc/hvc_console.c | 40 +-- drivers/tty/hvc/hvc_console.h | 16 -- 2 files changed, 38 insertions(+), 18 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5bb8c4e44..c56564eb7 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -41,16 +41,6 @@ */ #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */ -/* - * These sizes are most efficient for vio, because they are the - * native transfer size. We could make them selectable in the - * future to better deal with backends that want other buffer sizes. - */ -#define N_OUTBUF 16 -#define N_INBUF16 - -#define __ALIGNED__ __attribute__((__aligned__(sizeof(long - static struct tty_driver *hvc_driver; static struct task_struct *hvc_task; @@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp) static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES]; static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1}; +static char *cons_outbuf[MAX_NR_HVC_CONSOLES]; /* * Console APIs, NOT TTY. These APIs are available immediately when @@ -151,18 +142,23 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; + unsigned long flags; + struct hvc_struct *hp; /* Console access attempt outside of acceptable console range. */ if (index >= MAX_NR_HVC_CONSOLES) return; /* This console adapter was removed so it is not usable. */ - if (vtermnos[index] == -1) + if (vtermnos[index] == -1 || !cons_outbuf[index]) return; + c = cons_outbuf[index]; + + spin_lock_irqsave(&hp->c_lock, flags); while (count > 0 || i > 0) { if (count > 0 && i < sizeof(c)) { if (b[n] == '\n' && !donecr) { @@ -191,6 +187,7 @@ static void hvc_console_print(struct console *co, const char *b, } } } + spin_unlock_irqrestore(&hp->c_lock, flags); hvc_console_flush(cons_ops[index], vtermnos[index]); } @@ -878,9 +875,19 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch) struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; + unsigned long flags; + char *c; + + if (!hp || !cons_outbuf[hp->index]) + return; + + c = cons_outbuf[hp->index]; do { - n = hp->ops->put_chars(hp->vtermno, &ch, 1); + spin_lock_irqsave(&hp->c_lock, flags); + c[0] = ch; + n = hp->ops->put_cha
[PATCH v6 0/2] make hvc pass dma capable memory to its backend
Dear all, This patch series make hvc framework pass DMA capable memory to put_chars() of hvc backend(eg, virtio-console), and revert commit c4baad5029 ("virtio-console: avoid DMA from stack”) V1 virtio-console: avoid DMA from vmalloc area https://lkml.org/lkml/2021/7/27/494 For v1 patch, Arnd Bergmann suggests to fix the issue in the first place: Make hvc pass DMA capable memory to put_chars() The fix suggestion is included in v2. V2 [PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/1/8 [PATCH 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/1/9 For v2 patch, Arnd Bergmann suggests to make new buf part of the hvc_struct structure, and fix the compile issue. The fix suggestion is included in v3. V3 [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/3/1347 [PATCH v3 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/3/1348 For v3 patch, Jiri Slaby suggests to make 'char c[N_OUTBUF]' part of hvc_struct, and make 'hp->outbuf' aligned and use struct_size() to calculate the size of hvc_struct. The fix suggestion is included in v4. V4 [PATCH v4 0/2] make hvc pass dma capable memory to its backend https://lkml.org/lkml/2021/8/5/1350 [PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/5/1351 [PATCH v4 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/5/1352 For v4 patch, Arnd Bergmann suggests to introduce another array(cons_outbuf[]) for the buffer pointers next to the cons_ops[] and vtermnos[] arrays. This fix included in this v5 patch. V5 Arnd Bergmann suggests to use "L1_CACHE_BYTES" as dma alignment, use 'sizeof(long)' as dma alignment is wrong. fix it in v6. drivers/char/virtio_console.c | 12 ++-- drivers/tty/hvc/hvc_console.c | 40 +-- drivers/tty/hvc/hvc_console.h | 16 -- 3 file changed
Re: [PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars()
在 2021/8/12 下午4:54, Arnd Bergmann 写道: On Thu, Aug 12, 2021 at 10:08 AM Xianting TIan wrote: 在 2021/8/6 下午10:51, Arnd Bergmann 写道: On Fri, Aug 6, 2021 at 5:01 AM Xianting Tian +#define __ALIGNED__ __attribute__((__aligned__(sizeof(long I think you need a higher alignment for DMA buffers, instead of sizeof(long), I would suggest ARCH_DMA_MINALIGN. As some ARCH(eg, x86, riscv) doesn't define ARCH_DMA_MINALIG, so i think it 's better remain the code unchanged, I will send v5 patch soon. I think you could just use "L1_CACHE_BYTES" as the alignment in this case. This will make the structure slightly larger for architectures that do not have alignment constraints on DMA buffers, but using a smaller alignment is clearly wrong. Another option would be to use ARCH_KMALLOC_MINALIGN. yes, I unstand you, the align size must L1_CACHE_BYTES at least. Note that there is a patch to add ARCH_DMA_MINALIGN to riscv already, yes, I summited this patch, it is discussing, seems they don't want to apply it. as some implementations do not have coherent DMA. I had failed to realized though that on x86 you do not get an ARCH_DMA_MINALIGN definition. I didn't find the definition in arch/x86/include/asm/cache.h and other place, x86 is dma coherent, it may doesn't need it. Arnd
[PATCH v5 2/2] virtio-console: remove unnecessary kmemdup()
hvc framework will never pass stack memory to the put_chars() function, So the calling of kmemdup() is unnecessary, we can remove it. This revert commit c4baad5029 ("virtio-console: avoid DMA from stack") Signed-off-by: Xianting Tian --- drivers/char/virtio_console.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 7eaf303a7..4ed3ffb1d 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count) { struct port *port; struct scatterlist sg[1]; - void *data; - int ret; if (unlikely(early_put_chars)) return early_put_chars(vtermno, buf, count); @@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count) if (!port) return -EPIPE; - data = kmemdup(buf, count, GFP_ATOMIC); - if (!data) - return -ENOMEM; - - sg_init_one(sg, data, count); - ret = __send_to_port(port, sg, 1, count, data, false); - kfree(data); - return ret; + sg_init_one(sg, buf, count); + return __send_to_port(port, sg, 1, count, (void *)buf, false); } /* -- 2.17.1
[PATCH v5 0/2] make hvc pass dma capable memory to its backend
Dear all, This patch series make hvc framework pass DMA capable memory to put_chars() of hvc backend(eg, virtio-console), and revert commit c4baad5029 ("virtio-console: avoid DMA from stack”) V1 virtio-console: avoid DMA from vmalloc area https://lkml.org/lkml/2021/7/27/494 For v1 patch, Arnd Bergmann suggests to fix the issue in the first place: Make hvc pass DMA capable memory to put_chars() The fix suggestion is included in v2. V2 [PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/1/8 [PATCH 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/1/9 For v2 patch, Arnd Bergmann suggests to make new buf part of the hvc_struct structure, and fix the compile issue. The fix suggestion is included in v3. V3 [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/3/1347 [PATCH v3 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/3/1348 For v3 patch, Jiri Slaby suggests to make 'char c[N_OUTBUF]' part of hvc_struct, and make 'hp->outbuf' aligned and use struct_size() to calculate the size of hvc_struct. The fix suggestion is included in v4. V4 [PATCH v4 0/2] make hvc pass dma capable memory to its backend https://lkml.org/lkml/2021/8/5/1350 [PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/5/1351 [PATCH v4 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/5/1352 For v4 patch, Arnd Bergmann suggests to introduce another array(cons_outbuf[]) for the buffer pointers next to the cons_ops[] and vtermnos[] arrays. This fix included in this v5 patch. drivers/char/virtio_console.c | 12 ++-- drivers/tty/hvc/hvc_console.c | 40 +-- drivers/tty/hvc/hvc_console.h | 16 -- 3 file changed
[PATCH v5 1/2] tty: hvc: pass DMA capable memory to put_chars()
As well known, hvc backend can register its opertions to hvc backend. the opertions contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its opertions. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. We make 'char c[N_OUTBUF]' part of 'struct hvc_struct', so hp->c is no longer the stack memory. we can use it in above two cases. Other cleanup is use ARCH_DMA_MINALIGN for align, and make 'hp->outbuf' aligned and use struct_size() for the size parameter of kzalloc(). With the patch, we can remove the fix c4baad5029. Signed-off-by: Xianting Tian Tested-by: Xianting Tian --- drivers/tty/hvc/hvc_console.c | 40 +-- drivers/tty/hvc/hvc_console.h | 16 -- 2 files changed, 38 insertions(+), 18 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5bb8c4e44..c56564eb7 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -41,16 +41,6 @@ */ #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */ -/* - * These sizes are most efficient for vio, because they are the - * native transfer size. We could make them selectable in the - * future to better deal with backends that want other buffer sizes. - */ -#define N_OUTBUF 16 -#define N_INBUF16 - -#define __ALIGNED__ __attribute__((__aligned__(sizeof(long - static struct tty_driver *hvc_driver; static struct task_struct *hvc_task; @@ -142,6 +132,7 @@ static int hvc_flush(struct hvc_struct *hp) static const struct hv_ops *cons_ops[MAX_NR_HVC_CONSOLES]; static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = {[0 ... MAX_NR_HVC_CONSOLES - 1] = -1}; +static char *cons_outbuf[MAX_NR_HVC_CONSOLES]; /* * Console APIs, NOT TTY. These APIs are available immediately when @@ -151,18 +142,23 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; + unsigned long flags; + struct hvc_struct *hp; /* Console access attempt outside of acceptable console range. */ if (index >= MAX_NR_HVC_CONSOLES) return; /* This console adapter was removed so it is not usable. */ - if (vtermnos[index] == -1) + if (vtermnos[index] == -1 || !cons_outbuf[index]) return; + c = cons_outbuf[index]; + + spin_lock_irqsave(&hp->c_lock, flags); while (count > 0 || i > 0) { if (count > 0 && i < sizeof(c)) { if (b[n] == '\n' && !donecr) { @@ -191,6 +187,7 @@ static void hvc_console_print(struct console *co, const char *b, } } } + spin_unlock_irqrestore(&hp->c_lock, flags); hvc_console_flush(cons_ops[index], vtermnos[index]); } @@ -878,9 +875,19 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch) struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; + unsigned long flags; + char *c; + + if (!hp || !cons_outbuf[hp->index]) + return; + + c = cons_outbuf[hp->index]; do { - n = hp->ops->put_chars(hp->vtermno, &ch, 1); + spin_lock_irqsave(&hp->c_lock, flags); + c[0] = ch; + n = hp->ops->put_chars(hp->vtermno, c, 1); + spin_unlock_irqrestore(&hp->c_lock, flags); } while (n <= 0);
Re: [PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars()
在 2021/8/6 下午10:51, Arnd Bergmann 写道: On Fri, Aug 6, 2021 at 5:01 AM Xianting Tian wrote: @@ -163,6 +155,13 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + list_for_each_entry(hp, &hvc_structs, next) + if (hp->vtermno == vtermnos[index]) + break; + + c = hp->c; + + spin_lock_irqsave(&hp->c_lock, flags); The loop looks like it might race against changes to the list. It seems strange that the print function has to actually search for the structure here. It may be better to have yet another array for the buffer pointers next to the cons_ops[] and vtermnos[] arrays. +/* + * These sizes are most efficient for vio, because they are the + * native transfer size. We could make them selectable in the + * future to better deal with backends that want other buffer sizes. + */ +#define N_OUTBUF 16 +#define N_INBUF16 + +#define __ALIGNED__ __attribute__((__aligned__(sizeof(long I think you need a higher alignment for DMA buffers, instead of sizeof(long), I would suggest ARCH_DMA_MINALIGN. As some ARCH(eg, x86, riscv) doesn't define ARCH_DMA_MINALIG, so i think it 's better remain the code unchanged, I will send v5 patch soon. Arnd
Re: [PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars()
在 2021/8/6 下午10:51, Arnd Bergmann 写道: On Fri, Aug 6, 2021 at 5:01 AM Xianting Tian wrote: @@ -163,6 +155,13 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + list_for_each_entry(hp, &hvc_structs, next) + if (hp->vtermno == vtermnos[index]) + break; + + c = hp->c; + + spin_lock_irqsave(&hp->c_lock, flags); The loop looks like it might race against changes to the list. It seems strange that the print function has to actually search for the structure here. It may be better to have yet another array for the buffer pointers next to the cons_ops[] and vtermnos[] arrays. I will make the change in v5, thanks. +/* + * These sizes are most efficient for vio, because they are the + * native transfer size. We could make them selectable in the + * future to better deal with backends that want other buffer sizes. + */ +#define N_OUTBUF 16 +#define N_INBUF16 + +#define __ALIGNED__ __attribute__((__aligned__(sizeof(long I think you need a higher alignment for DMA buffers, instead of sizeof(long), I would suggest ARCH_DMA_MINALIGN. thanks, I will fix it in v5: #define __ALIGNED__ __attribute__((__aligned__(ARCH_DMA_MINALIGN))) Arnd
[PATCH v4 2/2] virtio-console: remove unnecessary kmemdup()
hvc framework will never pass stack memory to the put_chars() function, So the calling of kmemdup() is unnecessary, we can remove it. This revert commit c4baad5029 ("virtio-console: avoid DMA from stack") Signed-off-by: Xianting Tian --- drivers/char/virtio_console.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 7eaf303a7..4ed3ffb1d 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count) { struct port *port; struct scatterlist sg[1]; - void *data; - int ret; if (unlikely(early_put_chars)) return early_put_chars(vtermno, buf, count); @@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count) if (!port) return -EPIPE; - data = kmemdup(buf, count, GFP_ATOMIC); - if (!data) - return -ENOMEM; - - sg_init_one(sg, data, count); - ret = __send_to_port(port, sg, 1, count, data, false); - kfree(data); - return ret; + sg_init_one(sg, buf, count); + return __send_to_port(port, sg, 1, count, (void *)buf, false); } /* -- 2.17.1
[PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars()
As well known, hvc backend can register its opertions to hvc backend. the opertions contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its opertions. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. We make 'char c[N_OUTBUF]' part of 'struct hvc_struct', so hp->c is no longer the stack memory. we can use it in above two cases. Other cleanup is to make 'hp->outbuf' aligned and use struct_size() to calculate the size of hvc_struct. With the patch, we can remove the fix c4baad5029. Signed-off-by: Xianting Tian Tested-by: Xianting Tian --- drivers/tty/hvc/hvc_console.c | 33 ++--- drivers/tty/hvc/hvc_console.h | 16 ++-- 2 files changed, 32 insertions(+), 17 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5bb8c4e44..3afdb169c 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -41,16 +41,6 @@ */ #define HVC_CLOSE_WAIT (HZ/100) /* 1/10 of a second */ -/* - * These sizes are most efficient for vio, because they are the - * native transfer size. We could make them selectable in the - * future to better deal with backends that want other buffer sizes. - */ -#define N_OUTBUF 16 -#define N_INBUF16 - -#define __ALIGNED__ __attribute__((__aligned__(sizeof(long - static struct tty_driver *hvc_driver; static struct task_struct *hvc_task; @@ -151,9 +141,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; + unsigned long flags; + struct hvc_struct *hp; /* Console access attempt outside of acceptable console range. */ if (index >= MAX_NR_HVC_CONSOLES) @@ -163,6 +155,13 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + list_for_each_entry(hp, &hvc_structs, next) + if (hp->vtermno == vtermnos[index]) + break; + + c = hp->c; + + spin_lock_irqsave(&hp->c_lock, flags); while (count > 0 || i > 0) { if (count > 0 && i < sizeof(c)) { if (b[n] == '\n' && !donecr) { @@ -191,6 +190,7 @@ static void hvc_console_print(struct console *co, const char *b, } } } + spin_unlock_irqrestore(&hp->c_lock, flags); hvc_console_flush(cons_ops[index], vtermnos[index]); } @@ -878,9 +878,13 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch) struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; + unsigned long flags; do { - n = hp->ops->put_chars(hp->vtermno, &ch, 1); + spin_lock_irqsave(&hp->c_lock, flags); + hp->c[0] = ch; + n = hp->ops->put_chars(hp->vtermno, hp->c, 1); + spin_unlock_irqrestore(&hp->c_lock, flags); } while (n <= 0); } #endif @@ -922,8 +926,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data, return ERR_PTR(err); } - hp = kzalloc(ALIGN(sizeof(*hp), sizeof(long)) + outbuf_size, - GFP_KERNEL); + hp = kzalloc(struct_size(hp, outbuf, outbuf_size), GFP_KERNEL); if (!hp) return ERR_PTR(-ENOMEM); @@ -931,13 +934,13 @@ struct hvc_struct *hvc_a
[PATCH v4 0/2] make hvc pass dma capable memory to its backend
Dear all, This patch series make hvc framework pass DMA capable memory to put_chars() of hvc backend(eg, virtio-console), and revert commit c4baad5029 ("virtio-console: avoid DMA from stack”) V1 virtio-console: avoid DMA from vmalloc area https://lkml.org/lkml/2021/7/27/494 For v1 patch, Arnd Bergmann suggests to fix the issue in the first place: Make hvc pass DMA capable memory to put_chars() The fix suggestion is included in v2. V2 [PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/1/8 [PATCH 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/1/9 For v2 patch, Arnd Bergmann suggests to make new buf part of the hvc_struct structure, and fix the compile issue. The fix suggestion is included in v3. V3 [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/3/1347 [PATCH v3 2/2] virtio-console: remove unnecessary kmemdup() https://lkml.org/lkml/2021/8/3/1348 For v3 patch, Jiri Slaby suggests to make 'char c[N_OUTBUF]' part of hvc_struct, and make 'hp->outbuf' aligned and use struct_size() to calculate the size of hvc_struct. The fix suggestion is included in v4. drivers/char/virtio_console.c | 12 ++-- drivers/tty/hvc/hvc_console.c | 33 ++--- drivers/tty/hvc/hvc_console.h | 16 ++-- 3 file changed
Re: [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
在 2021/8/5 下午4:09, Jiri Slaby 写道: On 05. 08. 21, 9:58, Jiri Slaby wrote: Hi, On 04. 08. 21, 4:54, Xianting Tian wrote: @@ -933,6 +949,16 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data, hp->outbuf_size = outbuf_size; hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))]; This deserves cleanup too. Why is "outbuf" not "char outbuf[0] __ALIGNED__" at the end of the structure? The allocation would be easier (using struct_size()) and this line would be gone completely. I will make the cleanup in v4. + /* + * hvc_con_outbuf is guaranteed to be aligned at least to the + * size(N_OUTBUF) by kmalloc(). + */ + hp->hvc_con_outbuf = kzalloc(N_OUTBUF, GFP_KERNEL); + if (!hp->hvc_con_outbuf) + return ERR_PTR(-ENOMEM); This leaks hp, right? Actually, why don't you make char c[N_OUTBUF] __ALIGNED__; part of struct hvc_struct directly? thanks, it a good idea, I will change it in v4. BTW your 2 patches are still not threaded, that is hard to follow. + + spin_lock_init(&hp->hvc_con_lock); + tty_port_init(&hp->port); hp->port.ops = &hvc_port_ops; thanks,
Re: [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
在 2021/8/5 下午4:18, Greg KH 写道: On Thu, Aug 05, 2021 at 04:08:46PM +0800, Xianting Tian wrote: 在 2021/8/5 下午3:58, Jiri Slaby 写道: Hi, On 04. 08. 21, 4:54, Xianting Tian wrote: @@ -933,6 +949,16 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data, hp->outbuf_size = outbuf_size; hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))]; + /* + * hvc_con_outbuf is guaranteed to be aligned at least to the + * size(N_OUTBUF) by kmalloc(). + */ + hp->hvc_con_outbuf = kzalloc(N_OUTBUF, GFP_KERNEL); + if (!hp->hvc_con_outbuf) + return ERR_PTR(-ENOMEM); This leaks hp, right? BTW your 2 patches are still not threaded, that is hard to follow. yes, thanks, I found the bug, I am preparing to do this in v4. It is the first time I send series patches(number >1), I checked the method for sending series patch on LKML.org, I should send '0/2' which is the history info for series patches. Please use 'git send-email' to send the full series all at once, otherwise it is hard to make the emails threaded "by hand" if you do not do so. I got it, thanks for your guide:) thanks, greg k-h
Re: [PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
在 2021/8/5 下午3:58, Jiri Slaby 写道: Hi, On 04. 08. 21, 4:54, Xianting Tian wrote: @@ -933,6 +949,16 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data, hp->outbuf_size = outbuf_size; hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))]; + /* + * hvc_con_outbuf is guaranteed to be aligned at least to the + * size(N_OUTBUF) by kmalloc(). + */ + hp->hvc_con_outbuf = kzalloc(N_OUTBUF, GFP_KERNEL); + if (!hp->hvc_con_outbuf) + return ERR_PTR(-ENOMEM); This leaks hp, right? BTW your 2 patches are still not threaded, that is hard to follow. yes, thanks, I found the bug, I am preparing to do this in v4. It is the first time I send series patches(number >1), I checked the method for sending series patch on LKML.org, I should send '0/2' which is the history info for series patches. I will add 0/2 in v4, sorry again for this:( beside avove things, the solution in this patch is for you? thanks + + spin_lock_init(&hp->hvc_con_lock); + tty_port_init(&hp->port); hp->port.ops = &hvc_port_ops; thanks,
[PATCH v3 2/2] virtio-console: remove unnecessary kmemdup()
hvc framework will never pass stack memory to the put_chars() function, So the calling of kmemdup() is unnecessary, remove it. This revert commit c4baad5029 ("virtio-console: avoid DMA from stack") Signed-off-by: Xianting Tian --- drivers/char/virtio_console.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 7eaf303a7..4ed3ffb1d 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count) { struct port *port; struct scatterlist sg[1]; - void *data; - int ret; if (unlikely(early_put_chars)) return early_put_chars(vtermno, buf, count); @@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count) if (!port) return -EPIPE; - data = kmemdup(buf, count, GFP_ATOMIC); - if (!data) - return -ENOMEM; - - sg_init_one(sg, data, count); - ret = __send_to_port(port, sg, 1, count, data, false); - kfree(data); - return ret; + sg_init_one(sg, buf, count); + return __send_to_port(port, sg, 1, count, (void *)buf, false); } /* -- 2.17.1
[PATCH v3 1/2] tty: hvc: pass DMA capable memory to put_chars()
As well known, hvc backend can register its opertions to hvc backend. the opertions contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its opertions. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the same issue if a new hvc backend using dma added in the furture. Considering lock competition of hp->outbuf, we created a new buffer hp->hvc_con_outbuf, which is aligned at least to N_OUTBUF, and use it in above two cases. With the patch, we can remove the fix c4baad5029. Signed-off-by: Xianting Tian Tested-by: Xianting Tian --- drivers/tty/hvc/hvc_console.c | 30 -- drivers/tty/hvc/hvc_console.h | 2 ++ 2 files changed, 30 insertions(+), 2 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5bb8c4e44..e5862989c 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -151,9 +151,11 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; + unsigned long flags; + struct hvc_struct *hp; /* Console access attempt outside of acceptable console range. */ if (index >= MAX_NR_HVC_CONSOLES) @@ -163,6 +165,13 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + list_for_each_entry(hp, &hvc_structs, next) + if (hp->vtermno == vtermnos[index]) + break; + + c = hp->hvc_con_outbuf; + + spin_lock_irqsave(&hp->hvc_con_lock, flags); while (count > 0 || i > 0) { if (count > 0 && i < sizeof(c)) { if (b[n] == '\n' && !donecr) { @@ -191,6 +200,7 @@ static void hvc_console_print(struct console *co, const char *b, } } } + spin_unlock_irqrestore(&hp->hvc_con_lock, flags); hvc_console_flush(cons_ops[index], vtermnos[index]); } @@ -878,9 +888,15 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch) struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; + unsigned long flags; + char *c; + c = hp->hvc_con_outbuf; do { - n = hp->ops->put_chars(hp->vtermno, &ch, 1); + spin_lock_irqsave(&hp->hvc_con_lock, flags); + c[0] = ch; + n = hp->ops->put_chars(hp->vtermno, c, 1); + spin_unlock_irqrestore(&hp->hvc_con_lock, flags); } while (n <= 0); } #endif @@ -933,6 +949,16 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data, hp->outbuf_size = outbuf_size; hp->outbuf = &((char *)hp)[ALIGN(sizeof(*hp), sizeof(long))]; + /* +* hvc_con_outbuf is guaranteed to be aligned at least to the +* size(N_OUTBUF) by kmalloc(). +*/ + hp->hvc_con_outbuf = kzalloc(N_OUTBUF, GFP_KERNEL); + if (!hp->hvc_con_outbuf) + return ERR_PTR(-ENOMEM); + + spin_lock_init(&hp->hvc_con_lock); + tty_port_init(&hp->port); hp->port.ops = &hvc_port_ops; diff --git a/drivers/tty/hvc/hvc_console.h b/drivers/tty/hvc/hvc_console.h index 18d005814..8972c52de 100644 --- a/drivers/tty/hvc/hvc_console.h +++ b/drivers/tty/hvc/hvc_console.h @@ -48,6 +48,8 @@ struct hvc_struct { struct work_struct tty_resize; struct list_head next; unsigned long flags; + char *hvc_con_outbuf; + spinlock_t hvc_con_lock; }; /* implemented by a low level driver */ -- 2.17.1
Re: [PATCH 2/2] virtio-console: remove unnecessary kmemdup()
在 2021/8/2 下午3:25, Jiri Slaby 写道: Hi, why is this 2/2? I seem (Lore neither) to find 1/2. You didn't receive 1/2? [PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/1/8 <https://lkml.org/lkml/2021/8/1/8> On 01. 08. 21, 7:16, Xianting Tian wrote: hvc framework will never pass stack memory to the put_chars() function, Am I blind or missing something? hvc_console_print(...) { char c[N_OUTBUF] ... cons_ops[index]->put_chars(vtermnos[index], c, i); The same here: hvc_poll_put_char(..., char ch) { ... n = hp->ops->put_chars(hp->vtermno, &ch, 1); AFAICS both of them *pass* a pointer to stack variable. yes, I discussed the issue with Arnd before in below thread, you can get the history, thanks https://lkml.org/lkml/2021/7/27/494 <https://lkml.org/lkml/2021/7/27/494> So the calling of kmemdup() is unnecessary, remove it. Fixes: c4baad5029 ("virtio-console: avoid DMA from stack") This patch doesn't "Fix" -- it reverts the commit. You should've CCed the author too. yes, we discussed ther issue in above thread, which we CCed the author. Signed-off-by: Xianting Tian --- drivers/char/virtio_console.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 7eaf303a7..4ed3ffb1d 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count) { struct port *port; struct scatterlist sg[1]; - void *data; - int ret; if (unlikely(early_put_chars)) return early_put_chars(vtermno, buf, count); @@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count) if (!port) return -EPIPE; - data = kmemdup(buf, count, GFP_ATOMIC); - if (!data) - return -ENOMEM; - - sg_init_one(sg, data, count); - ret = __send_to_port(port, sg, 1, count, data, false); - kfree(data); - return ret; + sg_init_one(sg, buf, count); + return __send_to_port(port, sg, 1, count, (void *)buf, false); } /*
Re: [PATCH 2/2] virtio-console: remove unnecessary kmemdup()
在 2021/8/2 下午4:40, Jiri Slaby 写道: On 02. 08. 21, 10:32, Xianting Tian wrote: 在 2021/8/2 下午3:25, Jiri Slaby 写道: Hi, why is this 2/2? I seem (Lore neither) to find 1/2. You didn't receive 1/2? [PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars() https://lkml.org/lkml/2021/8/1/8 <https://lkml.org/lkml/2021/8/1/8> Oh, I did, but it's not properly threaded. PLease fix your setup. Ok, thanks On 01. 08. 21, 7:16, Xianting Tian wrote: hvc framework will never pass stack memory to the put_chars() function, Am I blind or missing something? hvc_console_print(...) { char c[N_OUTBUF] ... cons_ops[index]->put_chars(vtermnos[index], c, i); The same here: hvc_poll_put_char(..., char ch) { ... n = hp->ops->put_chars(hp->vtermno, &ch, 1); AFAICS both of them *pass* a pointer to stack variable. yes, I discussed the issue with Arnd before in below thread, you can get the history, thanks https://lkml.org/lkml/2021/7/27/494 <https://lkml.org/lkml/2021/7/27/494> So is this a v2? You should have noted that. And what changed from v1 too. I think yes, I should mentioned it in this patch, sorry for that:( So the calling of kmemdup() is unnecessary, remove it. Fixes: c4baad5029 ("virtio-console: avoid DMA from stack") This patch doesn't "Fix" -- it reverts the commit. You should've CCed the author too. yes, we discussed ther issue in above thread, which we CCed the author. I don't see any input from the author? Anyway, 1/2 does not even build, so you will send v3 with all the above fixed, hopefully. yes, I will send v3 patch after I figured out a better solution based on Arnd's comments for the patch '1/2'. Do you have any other suggestion for the solution? thanks. thanks,
[PATCH 2/2] virtio-console: remove unnecessary kmemdup()
hvc framework will never pass stack memory to the put_chars() function, So the calling of kmemdup() is unnecessary, remove it. Fixes: c4baad5029 ("virtio-console: avoid DMA from stack") Signed-off-by: Xianting Tian --- drivers/char/virtio_console.c | 12 ++-- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c index 7eaf303a7..4ed3ffb1d 100644 --- a/drivers/char/virtio_console.c +++ b/drivers/char/virtio_console.c @@ -1117,8 +1117,6 @@ static int put_chars(u32 vtermno, const char *buf, int count) { struct port *port; struct scatterlist sg[1]; - void *data; - int ret; if (unlikely(early_put_chars)) return early_put_chars(vtermno, buf, count); @@ -1127,14 +1125,8 @@ static int put_chars(u32 vtermno, const char *buf, int count) if (!port) return -EPIPE; - data = kmemdup(buf, count, GFP_ATOMIC); - if (!data) - return -ENOMEM; - - sg_init_one(sg, data, count); - ret = __send_to_port(port, sg, 1, count, data, false); - kfree(data); - return ret; + sg_init_one(sg, buf, count); + return __send_to_port(port, sg, 1, count, (void *)buf, false); } /* -- 2.17.1
[PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars()
As well known, hvc backend can register its opertions to hvc backend. the opertions contain put_chars(), get_chars() and so on. Some hvc backend may do dma in its opertions. eg, put_chars() of virtio-console. But in the code of hvc framework, it may pass DMA incapable memory to put_chars() under a specific configuration, which is explained in commit c4baad5029(virtio-console: avoid DMA from stack): 1, c[] is on stack, hvc_console_print(): char c[N_OUTBUF] __ALIGNED__; cons_ops[index]->put_chars(vtermnos[index], c, i); 2, ch is on stack, static void hvc_poll_put_char(,,char ch) { struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); } while (n <= 0); } Commit c4baad5029 is just the fix to avoid DMA from stack memory, which is passed to virtio-console by hvc framework in above code. But I think the fix is aggressive, it directly uses kmemdup() to alloc new buffer from kmalloc area and do memcpy no matter the memory is in kmalloc area or not. But most importantly, it should better be fixed in the hvc framework, by changing it to never pass stack memory to the put_chars() function in the first place. Otherwise, we still face the smae issue if a new hvc backend using dma added in the furture. Considering lock competition of hp->outbuf and the complicated logic in hvc_console_print(), I didn’t use hp->outbuf, just allocate additional memory(length N_OUTBUF) and append it to hp->outbuf. For the issue in hvc_poll_put_char(), I use a static char to replace the char in stack. With the patch, we can remove the fix c4baad5029. Signed-off-by: Xianting Tian Tested-by: Xianting Tian --- drivers/tty/hvc/hvc_console.c | 14 +++--- 1 file changed, 11 insertions(+), 3 deletions(-) diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c index 5bb8c4e44..f7a7fa083 100644 --- a/drivers/tty/hvc/hvc_console.c +++ b/drivers/tty/hvc/hvc_console.c @@ -151,9 +151,10 @@ static uint32_t vtermnos[MAX_NR_HVC_CONSOLES] = static void hvc_console_print(struct console *co, const char *b, unsigned count) { - char c[N_OUTBUF] __ALIGNED__; + char *c; unsigned i = 0, n = 0; int r, donecr = 0, index = co->index; + struct hvc_struct *hp; /* Console access attempt outside of acceptable console range. */ if (index >= MAX_NR_HVC_CONSOLES) @@ -163,8 +164,14 @@ static void hvc_console_print(struct console *co, const char *b, if (vtermnos[index] == -1) return; + list_for_each_entry(hp, &hvc_structs, next) + if (hp->vtermno == vtermnos[index]) + break; + + c = &hp->outbuf[hp->outbuf_size]; + while (count > 0 || i > 0) { - if (count > 0 && i < sizeof(c)) { + if (count > 0 && i < N_OUTBUF) { if (b[n] == '\n' && !donecr) { c[i++] = '\r'; donecr = 1; @@ -878,6 +885,7 @@ static void hvc_poll_put_char(struct tty_driver *driver, int line, char ch) struct tty_struct *tty = driver->ttys[0]; struct hvc_struct *hp = tty->driver_data; int n; + static char ch = ch; do { n = hp->ops->put_chars(hp->vtermno, &ch, 1); @@ -922,7 +930,7 @@ struct hvc_struct *hvc_alloc(uint32_t vtermno, int data, return ERR_PTR(err); } - hp = kzalloc(ALIGN(sizeof(*hp), sizeof(long)) + outbuf_size, + hp = kzalloc(ALIGN(sizeof(*hp), sizeof(long)) + outbuf_size + N_OUTBUF, GFP_KERNEL); if (!hp) return ERR_PTR(-ENOMEM); -- 2.17.1
Re: [PATCH] virtio-console: avoid DMA from vmalloc area
在 2021/7/28 下午5:01, Arnd Bergmann 写道: On Wed, Jul 28, 2021 at 10:28 AM Xianting Tian wrote: 在 2021/7/28 下午3:25, Arnd Bergmann 写道: I checked several hvc backends, like drivers/tty/hvc/hvc_riscv_sbi.c, drivers/tty/hvc/hvc_iucv.c, drivers/tty/hvc/hvc_rtas.c, they don't use dma. I not finished all hvc backends check yet. But I think even if all hvc backends don't use dma currently, it is still possible that the hvc backend using dma will be added in the furture. So I agree with you it should better be fixed in the hvc framework, solve the issue in the first place. Ok, sounds good to me, no need to check more backends then. I see the hvc-console driver is listed as 'Odd Fixes' in the maintainer list, with nobody assigned other than the ppc kernel list (added to Cc). Once you come up with a fix in hvc_console.c, please send that to the tty maintainers, the ppc list and me, and I'll review it. OK, thanks, I will submit the patch ASAP :) Arnd