RE: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed
Matt, I submitted a v3 patch based on my comment below.. Seiji > -Original Message- > From: linux-efi-ow...@vger.kernel.org > [mailto:linux-efi-ow...@vger.kernel.org] On Behalf Of Seiji Aguchi > Sent: Wednesday, October 09, 2013 12:37 PM > To: Matt Fleming > Cc: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org; > tony.l...@intel.com; matt.flem...@intel.com; dle- > deve...@lists.sourceforge.net; Tomoki Sekiyama > Subject: RE: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs > entry until the scan is completed > > Thank you for reviewing. > In my understanding, your point is that all accesses to efivar_entry should > be done while holding __efivars->lock. > > > > @@ -88,8 +103,9 @@ static int efi_pstore_read_func(struct efivar_entry > > > *entry, void *data) > > > return 0; > > > > > > entry->var.DataSize = 1024; > > > - __efivar_entry_get(entry, >var.Attributes, > > > ->var.DataSize, entry->var.Data); > > > + efivar_entry_get(entry, >var.Attributes, > > > + >var.DataSize, entry->var.Data); > > > + > > > size = entry->var.DataSize; > > > > > > *cb_data->buf = kmemdup(entry->var.Data, size, GFP_KERNEL); > > > > This isn't safe to do without holding the __efivars->lock, because > > there's the potential for someone else to update entry->var.Data and > > entry->var.DataSize while you're in the middle of copying the data in > > kmemdup(). This could leak to an information leak, though I think you're > > safe from an out-of-bounds access because DataSize is never > 1024. > > > > I see... > Bu, kmemdup() cannot be called while holding the spinlock. > > So, for protecting efivar_entry, I will call kzalloc() before holding the > lock in efi_pstore_read(). > and use memcpy() in efi_pstore_read_func(). > > The pseudo code is as below. > > static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type, > struct pstore_info *psi) > { > *data.buf = kzalloc(1024, GFP_KERNEL); > if (!*data.buf) > return -ENOMEM; > > efivar_entry_iter_begin(); > size = efi_pstore_sysfs_entry_iter(, > (struct efivar_entry > **)>data); > efivar_entry_iter_end(); > if (size <= 0) > kfree(*data.buf); > return size; > } > > static int efi_pstore_read_func(struct efivar_entry *entry, void *data) > { > [...] > entry->var.DataSize = 1024; > __efivar_entry_get(entry, >var.Attributes, >>var.DataSize, entry->var.Data); > > size = entry->var.DataSize; > memcpy(*cb_data->buf, entry->var.Data, (size_t)min_t(unsigned > long, >1024, > size)); > return size; > } > > > > This doesn't look correct to me. You can't access 'entry' outside of the > > *_iter_begin() and *_iter_end() blocks. You can't do, > > > > efivar_entry_iter_end(): > > > > if (!entry->scanning) > > efivar_unregister(entry); > > > > because 'entry' may have already been freed by another CPU. > > I will fix it as follows. > > if (!entry->scanning) { > efivar_entry_iter_end(); > efivar_unregister(entry); > } else > efivar_entry_iter_end(); > > (efivar_unregister(entry) still runs concurrently. > But, it cannot move inside spinlock because kzalloc() may run while freeing > kobject.) > > Is it your expectation? > > Seiji > > -- > To unsubscribe from this list: send the line "unsubscribe linux-efi" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed
Matt, I submitted a v3 patch based on my comment below.. Seiji -Original Message- From: linux-efi-ow...@vger.kernel.org [mailto:linux-efi-ow...@vger.kernel.org] On Behalf Of Seiji Aguchi Sent: Wednesday, October 09, 2013 12:37 PM To: Matt Fleming Cc: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org; tony.l...@intel.com; matt.flem...@intel.com; dle- deve...@lists.sourceforge.net; Tomoki Sekiyama Subject: RE: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed Thank you for reviewing. In my understanding, your point is that all accesses to efivar_entry should be done while holding __efivars-lock. @@ -88,8 +103,9 @@ static int efi_pstore_read_func(struct efivar_entry *entry, void *data) return 0; entry-var.DataSize = 1024; - __efivar_entry_get(entry, entry-var.Attributes, -entry-var.DataSize, entry-var.Data); + efivar_entry_get(entry, entry-var.Attributes, + entry-var.DataSize, entry-var.Data); + size = entry-var.DataSize; *cb_data-buf = kmemdup(entry-var.Data, size, GFP_KERNEL); This isn't safe to do without holding the __efivars-lock, because there's the potential for someone else to update entry-var.Data and entry-var.DataSize while you're in the middle of copying the data in kmemdup(). This could leak to an information leak, though I think you're safe from an out-of-bounds access because DataSize is never 1024. I see... Bu, kmemdup() cannot be called while holding the spinlock. So, for protecting efivar_entry, I will call kzalloc() before holding the lock in efi_pstore_read(). and use memcpy() in efi_pstore_read_func(). The pseudo code is as below. static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type, struct pstore_info *psi) { *data.buf = kzalloc(1024, GFP_KERNEL); if (!*data.buf) return -ENOMEM; efivar_entry_iter_begin(); size = efi_pstore_sysfs_entry_iter(data, (struct efivar_entry **)psi-data); efivar_entry_iter_end(); if (size = 0) kfree(*data.buf); return size; } static int efi_pstore_read_func(struct efivar_entry *entry, void *data) { [...] entry-var.DataSize = 1024; __efivar_entry_get(entry, entry-var.Attributes, entry-var.DataSize, entry-var.Data); size = entry-var.DataSize; memcpy(*cb_data-buf, entry-var.Data, (size_t)min_t(unsigned long, 1024, size)); return size; } This doesn't look correct to me. You can't access 'entry' outside of the *_iter_begin() and *_iter_end() blocks. You can't do, efivar_entry_iter_end(): if (!entry-scanning) efivar_unregister(entry); because 'entry' may have already been freed by another CPU. I will fix it as follows. if (!entry-scanning) { efivar_entry_iter_end(); efivar_unregister(entry); } else efivar_entry_iter_end(); (efivar_unregister(entry) still runs concurrently. But, it cannot move inside spinlock because kzalloc() may run while freeing kobject.) Is it your expectation? Seiji -- To unsubscribe from this list: send the line unsubscribe linux-efi in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed
Thank you for reviewing. In my understanding, your point is that all accesses to efivar_entry should be done while holding __efivars->lock. > > @@ -88,8 +103,9 @@ static int efi_pstore_read_func(struct efivar_entry > > *entry, void *data) > > return 0; > > > > entry->var.DataSize = 1024; > > - __efivar_entry_get(entry, >var.Attributes, > > - >var.DataSize, entry->var.Data); > > + efivar_entry_get(entry, >var.Attributes, > > +>var.DataSize, entry->var.Data); > > + > > size = entry->var.DataSize; > > > > *cb_data->buf = kmemdup(entry->var.Data, size, GFP_KERNEL); > > This isn't safe to do without holding the __efivars->lock, because > there's the potential for someone else to update entry->var.Data and > entry->var.DataSize while you're in the middle of copying the data in > kmemdup(). This could leak to an information leak, though I think you're > safe from an out-of-bounds access because DataSize is never > 1024. > I see... Bu, kmemdup() cannot be called while holding the spinlock. So, for protecting efivar_entry, I will call kzalloc() before holding the lock in efi_pstore_read(). and use memcpy() in efi_pstore_read_func(). The pseudo code is as below. static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type, struct pstore_info *psi) { *data.buf = kzalloc(1024, GFP_KERNEL); if (!*data.buf) return -ENOMEM; efivar_entry_iter_begin(); size = efi_pstore_sysfs_entry_iter(, (struct efivar_entry **)>data); efivar_entry_iter_end(); if (size <= 0) kfree(*data.buf); return size; } static int efi_pstore_read_func(struct efivar_entry *entry, void *data) { [...] entry->var.DataSize = 1024; __efivar_entry_get(entry, >var.Attributes, >var.DataSize, entry->var.Data); size = entry->var.DataSize; memcpy(*cb_data->buf, entry->var.Data, (size_t)min_t(unsigned long, 1024, size)); return size; } > This doesn't look correct to me. You can't access 'entry' outside of the > *_iter_begin() and *_iter_end() blocks. You can't do, > > efivar_entry_iter_end(): > > if (!entry->scanning) > efivar_unregister(entry); > > because 'entry' may have already been freed by another CPU. I will fix it as follows. if (!entry->scanning) { efivar_entry_iter_end(); efivar_unregister(entry); } else efivar_entry_iter_end(); (efivar_unregister(entry) still runs concurrently. But, it cannot move inside spinlock because kzalloc() may run while freeing kobject.) Is it your expectation? Seiji -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed
Thank you for reviewing. In my understanding, your point is that all accesses to efivar_entry should be done while holding __efivars-lock. @@ -88,8 +103,9 @@ static int efi_pstore_read_func(struct efivar_entry *entry, void *data) return 0; entry-var.DataSize = 1024; - __efivar_entry_get(entry, entry-var.Attributes, - entry-var.DataSize, entry-var.Data); + efivar_entry_get(entry, entry-var.Attributes, +entry-var.DataSize, entry-var.Data); + size = entry-var.DataSize; *cb_data-buf = kmemdup(entry-var.Data, size, GFP_KERNEL); This isn't safe to do without holding the __efivars-lock, because there's the potential for someone else to update entry-var.Data and entry-var.DataSize while you're in the middle of copying the data in kmemdup(). This could leak to an information leak, though I think you're safe from an out-of-bounds access because DataSize is never 1024. I see... Bu, kmemdup() cannot be called while holding the spinlock. So, for protecting efivar_entry, I will call kzalloc() before holding the lock in efi_pstore_read(). and use memcpy() in efi_pstore_read_func(). The pseudo code is as below. static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type, struct pstore_info *psi) { *data.buf = kzalloc(1024, GFP_KERNEL); if (!*data.buf) return -ENOMEM; efivar_entry_iter_begin(); size = efi_pstore_sysfs_entry_iter(data, (struct efivar_entry **)psi-data); efivar_entry_iter_end(); if (size = 0) kfree(*data.buf); return size; } static int efi_pstore_read_func(struct efivar_entry *entry, void *data) { [...] entry-var.DataSize = 1024; __efivar_entry_get(entry, entry-var.Attributes, entry-var.DataSize, entry-var.Data); size = entry-var.DataSize; memcpy(*cb_data-buf, entry-var.Data, (size_t)min_t(unsigned long, 1024, size)); return size; } This doesn't look correct to me. You can't access 'entry' outside of the *_iter_begin() and *_iter_end() blocks. You can't do, efivar_entry_iter_end(): if (!entry-scanning) efivar_unregister(entry); because 'entry' may have already been freed by another CPU. I will fix it as follows. if (!entry-scanning) { efivar_entry_iter_end(); efivar_unregister(entry); } else efivar_entry_iter_end(); (efivar_unregister(entry) still runs concurrently. But, it cannot move inside spinlock because kzalloc() may run while freeing kobject.) Is it your expectation? Seiji -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed
On Fri, 27 Sep, at 04:23:52PM, Seiji Aguchi wrote: > Change form v1 > - Rebase to 3.12-rc2 > > Currently, when mounting pstore file system, a read callback of efi_pstore > driver runs mutiple times as below. > > - In the first read callback, scan efivar_sysfs_list from head and pass > a kmsg buffer of a entry to an upper pstore layer. > - In the second read callback, rescan efivar_sysfs_list from the entry and > pass > another kmsg buffer to it. > - Repeat the scan and pass until the end of efivar_sysfs_list. > > In this process, an entry is read across the multiple read function calls. > To avoid race between the read and erasion, the whole process above is > protected by a spinlock, holding in open() and releasing in close(). > > At the same time, kmemdup() is called to pass the buffer to pstore filesystem > during it. > And then, it causes a following lockdep warning. > > To make the read callback runnable without taking spinlok, > holding off a deletion of sysfs entry if it happens while scanning it > via efi_pstore, and deleting it after the scan is completed. > > To implement it, this patch introduces two flags, scanning and deleting, > to efivar_entry. > Also, __efivar_entry_get() is removed because it was used in efi_pstore only. [...] > @@ -88,8 +103,9 @@ static int efi_pstore_read_func(struct efivar_entry > *entry, void *data) > return 0; > > entry->var.DataSize = 1024; > - __efivar_entry_get(entry, >var.Attributes, > ->var.DataSize, entry->var.Data); > + efivar_entry_get(entry, >var.Attributes, > + >var.DataSize, entry->var.Data); > + > size = entry->var.DataSize; > > *cb_data->buf = kmemdup(entry->var.Data, size, GFP_KERNEL); This isn't safe to do without holding the __efivars->lock, because there's the potential for someone else to update entry->var.Data and entry->var.DataSize while you're in the middle of copying the data in kmemdup(). This could leak to an information leak, though I think you're safe from an out-of-bounds access because DataSize is never > 1024. > +/** > + * __efi_pstore_scan_sysfs_exit > + * @entry: deleting entry > + * @turn_off_scanning: Check if a scanning flag should be turned off > + */ > +static inline void __efi_pstore_scan_sysfs_exit(struct efivar_entry *entry, > + bool turn_off_scanning) > +{ > + if (entry->deleting) { > + list_del(>list); > + efivar_entry_iter_end(); > + efivar_unregister(entry); > + efivar_entry_iter_begin(); > + } else if (turn_off_scanning) > + entry->scanning = false; > +} [...] > @@ -184,9 +305,17 @@ static int efi_pstore_erase_func(struct efivar_entry > *entry, void *data) > return 0; > } > > + if (entry->scanning) { > + /* > + * Skip deletion because this entry will be deleted > + * after scanning is completed. > + */ > + entry->deleting = true; > + } else > + list_del(>list); > + > /* found */ > __efivar_entry_delete(entry); > - list_del(>list); > > return 1; > } > @@ -216,7 +345,7 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 > id, int count, > found = __efivar_entry_iter(efi_pstore_erase_func, _sysfs_list, > , ); > efivar_entry_iter_end(); > > - if (found) > + if (found && !entry->scanning) > efivar_unregister(entry); > > return 0; > diff --git a/drivers/firmware/efi/efivars.c b/drivers/firmware/efi/efivars.c > index 8a7432a..831bc5c 100644 > --- a/drivers/firmware/efi/efivars.c > +++ b/drivers/firmware/efi/efivars.c > @@ -388,7 +388,8 @@ static ssize_t efivar_delete(struct file *filp, struct > kobject *kobj, > if (err) > return err; > > - efivar_unregister(entry); > + if (!entry->scanning) > + efivar_unregister(entry); > > /* It's dead Jim */ > return count; > diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c > index 391c67b..573ed92 100644 > --- a/drivers/firmware/efi/vars.c > +++ b/drivers/firmware/efi/vars.c > @@ -683,8 +683,16 @@ struct efivar_entry *efivar_entry_find(efi_char16_t > *name, efi_guid_t guid, > if (!found) > return NULL; > > - if (remove) > - list_del(>list); > + if (remove) { > + if (entry->scanning) { > + /* > + * The entry will be deleted > + * after scanning is completed. > + */ > + entry->deleting = true; > + } else > + list_del(>list); > + } > > return entry; > } This doesn't look correct to me. You can't access 'entry' outside of the *_iter_begin() and *_iter_end() blocks. You can't do,
Re: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed
On Fri, 27 Sep, at 04:23:52PM, Seiji Aguchi wrote: Change form v1 - Rebase to 3.12-rc2 Currently, when mounting pstore file system, a read callback of efi_pstore driver runs mutiple times as below. - In the first read callback, scan efivar_sysfs_list from head and pass a kmsg buffer of a entry to an upper pstore layer. - In the second read callback, rescan efivar_sysfs_list from the entry and pass another kmsg buffer to it. - Repeat the scan and pass until the end of efivar_sysfs_list. In this process, an entry is read across the multiple read function calls. To avoid race between the read and erasion, the whole process above is protected by a spinlock, holding in open() and releasing in close(). At the same time, kmemdup() is called to pass the buffer to pstore filesystem during it. And then, it causes a following lockdep warning. To make the read callback runnable without taking spinlok, holding off a deletion of sysfs entry if it happens while scanning it via efi_pstore, and deleting it after the scan is completed. To implement it, this patch introduces two flags, scanning and deleting, to efivar_entry. Also, __efivar_entry_get() is removed because it was used in efi_pstore only. [...] @@ -88,8 +103,9 @@ static int efi_pstore_read_func(struct efivar_entry *entry, void *data) return 0; entry-var.DataSize = 1024; - __efivar_entry_get(entry, entry-var.Attributes, -entry-var.DataSize, entry-var.Data); + efivar_entry_get(entry, entry-var.Attributes, + entry-var.DataSize, entry-var.Data); + size = entry-var.DataSize; *cb_data-buf = kmemdup(entry-var.Data, size, GFP_KERNEL); This isn't safe to do without holding the __efivars-lock, because there's the potential for someone else to update entry-var.Data and entry-var.DataSize while you're in the middle of copying the data in kmemdup(). This could leak to an information leak, though I think you're safe from an out-of-bounds access because DataSize is never 1024. +/** + * __efi_pstore_scan_sysfs_exit + * @entry: deleting entry + * @turn_off_scanning: Check if a scanning flag should be turned off + */ +static inline void __efi_pstore_scan_sysfs_exit(struct efivar_entry *entry, + bool turn_off_scanning) +{ + if (entry-deleting) { + list_del(entry-list); + efivar_entry_iter_end(); + efivar_unregister(entry); + efivar_entry_iter_begin(); + } else if (turn_off_scanning) + entry-scanning = false; +} [...] @@ -184,9 +305,17 @@ static int efi_pstore_erase_func(struct efivar_entry *entry, void *data) return 0; } + if (entry-scanning) { + /* + * Skip deletion because this entry will be deleted + * after scanning is completed. + */ + entry-deleting = true; + } else + list_del(entry-list); + /* found */ __efivar_entry_delete(entry); - list_del(entry-list); return 1; } @@ -216,7 +345,7 @@ static int efi_pstore_erase(enum pstore_type_id type, u64 id, int count, found = __efivar_entry_iter(efi_pstore_erase_func, efivar_sysfs_list, edata, entry); efivar_entry_iter_end(); - if (found) + if (found !entry-scanning) efivar_unregister(entry); return 0; diff --git a/drivers/firmware/efi/efivars.c b/drivers/firmware/efi/efivars.c index 8a7432a..831bc5c 100644 --- a/drivers/firmware/efi/efivars.c +++ b/drivers/firmware/efi/efivars.c @@ -388,7 +388,8 @@ static ssize_t efivar_delete(struct file *filp, struct kobject *kobj, if (err) return err; - efivar_unregister(entry); + if (!entry-scanning) + efivar_unregister(entry); /* It's dead Jim */ return count; diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c index 391c67b..573ed92 100644 --- a/drivers/firmware/efi/vars.c +++ b/drivers/firmware/efi/vars.c @@ -683,8 +683,16 @@ struct efivar_entry *efivar_entry_find(efi_char16_t *name, efi_guid_t guid, if (!found) return NULL; - if (remove) - list_del(entry-list); + if (remove) { + if (entry-scanning) { + /* + * The entry will be deleted + * after scanning is completed. + */ + entry-deleting = true; + } else + list_del(entry-list); + } return entry; } This doesn't look correct to me. You can't access 'entry' outside of the *_iter_begin() and *_iter_end() blocks. You can't do, efivar_entry_iter_end(): if (!entry-scanning) efivar_unregister(entry); because
Re: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed
On 4 October 2013 16:46, Seiji Aguchi wrote: > Are there anyone who can review this bugfix? Sorry I haven't got to it yet (or the previous version). It is on my TODO list. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
RE: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed
Are there anyone who can review this bugfix? Seiji > -Original Message- > From: linux-kernel-ow...@vger.kernel.org > [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Seiji Aguchi > Sent: Friday, September 27, 2013 4:24 PM > To: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org; > tony.l...@intel.com; matt.flem...@intel.com > Cc: dle-deve...@lists.sourceforge.net; Tomoki Sekiyama > Subject: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry > until the scan is completed > > Change form v1 > - Rebase to 3.12-rc2 > > Currently, when mounting pstore file system, a read callback of efi_pstore > driver runs mutiple times as below. > > - In the first read callback, scan efivar_sysfs_list from head and pass > a kmsg buffer of a entry to an upper pstore layer. > - In the second read callback, rescan efivar_sysfs_list from the entry and > pass > another kmsg buffer to it. > - Repeat the scan and pass until the end of efivar_sysfs_list. > > In this process, an entry is read across the multiple read function calls. > To avoid race between the read and erasion, the whole process above is > protected by a spinlock, holding in open() and releasing in close(). > > At the same time, kmemdup() is called to pass the buffer to pstore filesystem > during it. > And then, it causes a following lockdep warning. > > To make the read callback runnable without taking spinlok, > holding off a deletion of sysfs entry if it happens while scanning it > via efi_pstore, and deleting it after the scan is completed. > > To implement it, this patch introduces two flags, scanning and deleting, > to efivar_entry. > Also, __efivar_entry_get() is removed because it was used in efi_pstore only. > > [1.143710] [ cut here ] > [1.144058] WARNING: CPU: 1 PID: 1 at kernel/lockdep.c:2740 > lockdep_trace_alloc+0x104/0x110() > [1.144058] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)) > [1.144058] Modules linked in: > > [1.144058] CPU: 1 PID: 1 Comm: systemd Not tainted 3.11.0-rc5 #2 > [1.144058] 0009 8800797e9ae0 816614a5 > 8800797e9b28 > [1.144058] 8800797e9b18 8105510d 0080 > 0046 > [1.144058] 00d0 03af 81ccd0c0 > 8800797e9b78 > [1.144058] Call Trace: > [1.144058] [] dump_stack+0x54/0x74 > [1.144058] [] warn_slowpath_common+0x7d/0xa0 > [1.144058] [] warn_slowpath_fmt+0x4c/0x50 > [1.144058] [] ? vsscanf+0x57f/0x7b0 > [1.144058] [] lockdep_trace_alloc+0x104/0x110 > [1.144058] [] __kmalloc_track_caller+0x50/0x280 > [1.144058] [] ? > efi_pstore_read_func.part.1+0x12b/0x170 > [1.144058] [] kmemdup+0x20/0x50 > [1.144058] [] efi_pstore_read_func.part.1+0x12b/0x170 > [1.144058] [] ? > efi_pstore_read_func.part.1+0x170/0x170 > [1.144058] [] efi_pstore_read_func+0xb4/0xe0 > [1.144058] [] __efivar_entry_iter+0xfb/0x120 > [1.144058] [] efi_pstore_read+0x3f/0x50 > [1.144058] [] pstore_get_records+0x9a/0x150 > [1.158207] [] ? selinux_d_instantiate+0x1c/0x20 > [1.158207] [] ? parse_options+0x80/0x80 > [1.158207] [] pstore_fill_super+0xa5/0xc0 > [1.158207] [] mount_single+0xa2/0xd0 > [1.158207] [] pstore_mount+0x18/0x20 > [1.158207] [] mount_fs+0x39/0x1b0 > [1.158207] [] ? __alloc_percpu+0x10/0x20 > [1.158207] [] vfs_kern_mount+0x63/0xf0 > [1.158207] [] do_mount+0x23e/0xa20 > [1.158207] [] ? strndup_user+0x4b/0xf0 > [1.158207] [] SyS_mount+0x83/0xc0 > [1.158207] [] system_call_fastpath+0x16/0x1b > [1.158207] ---[ end trace 61981bc62de9f6f4 ]--- > > Signed-off-by: Seiji Aguchi > --- > drivers/firmware/efi/efi-pstore.c | 145 > +++--- > drivers/firmware/efi/efivars.c| 3 +- > drivers/firmware/efi/vars.c | 39 +++--- > include/linux/efi.h | 4 +- > 4 files changed, 151 insertions(+), 40 deletions(-) > > diff --git a/drivers/firmware/efi/efi-pstore.c > b/drivers/firmware/efi/efi-pstore.c > index 5002d50..53001a5 100644 > --- a/drivers/firmware/efi/efi-pstore.c > +++ b/drivers/firmware/efi/efi-pstore.c > @@ -18,14 +18,12 @@ module_param_named(pstore_disable, > efivars_pstore_disable, bool, 0644); > > static int efi_pstore_open(struct pstore_info *psi) > { > - efivar_entry_iter_begin(); > psi->data = NULL; > return 0; > } > > static int efi_pstore_close(struct pstore_info *psi) > { > - efivar_entry_iter_end(); > psi->data = NULL; > return 0; > } > @@ -39,6 +37,23 @@ struct pstore_read_data { > char **buf; > }; > > +/** > + * efi_pstore_read_func > + * @entry: reading entry > + * @data: data of the entry > + * > + * This function runs in non-atomic context. > + * > + * Also, it returns a size of NVRAM entry logged via efi_pstore_write(). > + * pstore in accordance with the returned value as below. > + *
RE: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed
Are there anyone who can review this bugfix? Seiji -Original Message- From: linux-kernel-ow...@vger.kernel.org [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Seiji Aguchi Sent: Friday, September 27, 2013 4:24 PM To: linux-kernel@vger.kernel.org; linux-...@vger.kernel.org; tony.l...@intel.com; matt.flem...@intel.com Cc: dle-deve...@lists.sourceforge.net; Tomoki Sekiyama Subject: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed Change form v1 - Rebase to 3.12-rc2 Currently, when mounting pstore file system, a read callback of efi_pstore driver runs mutiple times as below. - In the first read callback, scan efivar_sysfs_list from head and pass a kmsg buffer of a entry to an upper pstore layer. - In the second read callback, rescan efivar_sysfs_list from the entry and pass another kmsg buffer to it. - Repeat the scan and pass until the end of efivar_sysfs_list. In this process, an entry is read across the multiple read function calls. To avoid race between the read and erasion, the whole process above is protected by a spinlock, holding in open() and releasing in close(). At the same time, kmemdup() is called to pass the buffer to pstore filesystem during it. And then, it causes a following lockdep warning. To make the read callback runnable without taking spinlok, holding off a deletion of sysfs entry if it happens while scanning it via efi_pstore, and deleting it after the scan is completed. To implement it, this patch introduces two flags, scanning and deleting, to efivar_entry. Also, __efivar_entry_get() is removed because it was used in efi_pstore only. [1.143710] [ cut here ] [1.144058] WARNING: CPU: 1 PID: 1 at kernel/lockdep.c:2740 lockdep_trace_alloc+0x104/0x110() [1.144058] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)) [1.144058] Modules linked in: [1.144058] CPU: 1 PID: 1 Comm: systemd Not tainted 3.11.0-rc5 #2 [1.144058] 0009 8800797e9ae0 816614a5 8800797e9b28 [1.144058] 8800797e9b18 8105510d 0080 0046 [1.144058] 00d0 03af 81ccd0c0 8800797e9b78 [1.144058] Call Trace: [1.144058] [816614a5] dump_stack+0x54/0x74 [1.144058] [8105510d] warn_slowpath_common+0x7d/0xa0 [1.144058] [8105517c] warn_slowpath_fmt+0x4c/0x50 [1.144058] [8131290f] ? vsscanf+0x57f/0x7b0 [1.144058] [810bbd74] lockdep_trace_alloc+0x104/0x110 [1.144058] [81192da0] __kmalloc_track_caller+0x50/0x280 [1.144058] [815147bb] ? efi_pstore_read_func.part.1+0x12b/0x170 [1.144058] [8115b260] kmemdup+0x20/0x50 [1.144058] [815147bb] efi_pstore_read_func.part.1+0x12b/0x170 [1.144058] [81514800] ? efi_pstore_read_func.part.1+0x170/0x170 [1.144058] [815148b4] efi_pstore_read_func+0xb4/0xe0 [1.144058] [81512b7b] __efivar_entry_iter+0xfb/0x120 [1.144058] [8151428f] efi_pstore_read+0x3f/0x50 [1.144058] [8128d7ba] pstore_get_records+0x9a/0x150 [1.158207] [812af25c] ? selinux_d_instantiate+0x1c/0x20 [1.158207] [8128ce30] ? parse_options+0x80/0x80 [1.158207] [8128ced5] pstore_fill_super+0xa5/0xc0 [1.158207] [811ae7d2] mount_single+0xa2/0xd0 [1.158207] [8128ccf8] pstore_mount+0x18/0x20 [1.158207] [811ae8b9] mount_fs+0x39/0x1b0 [1.158207] [81160550] ? __alloc_percpu+0x10/0x20 [1.158207] [811c9493] vfs_kern_mount+0x63/0xf0 [1.158207] [811cbb0e] do_mount+0x23e/0xa20 [1.158207] [8115b51b] ? strndup_user+0x4b/0xf0 [1.158207] [811cc373] SyS_mount+0x83/0xc0 [1.158207] [81673cc2] system_call_fastpath+0x16/0x1b [1.158207] ---[ end trace 61981bc62de9f6f4 ]--- Signed-off-by: Seiji Aguchi seiji.agu...@hds.com --- drivers/firmware/efi/efi-pstore.c | 145 +++--- drivers/firmware/efi/efivars.c| 3 +- drivers/firmware/efi/vars.c | 39 +++--- include/linux/efi.h | 4 +- 4 files changed, 151 insertions(+), 40 deletions(-) diff --git a/drivers/firmware/efi/efi-pstore.c b/drivers/firmware/efi/efi-pstore.c index 5002d50..53001a5 100644 --- a/drivers/firmware/efi/efi-pstore.c +++ b/drivers/firmware/efi/efi-pstore.c @@ -18,14 +18,12 @@ module_param_named(pstore_disable, efivars_pstore_disable, bool, 0644); static int efi_pstore_open(struct pstore_info *psi) { - efivar_entry_iter_begin(); psi-data = NULL; return 0; } static int efi_pstore_close(struct pstore_info *psi) { - efivar_entry_iter_end(); psi-data = NULL; return 0; } @@ -39,6 +37,23 @@ struct pstore_read_data {
Re: [RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed
On 4 October 2013 16:46, Seiji Aguchi seiji.agu...@hds.com wrote: Are there anyone who can review this bugfix? Sorry I haven't got to it yet (or the previous version). It is on my TODO list. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
[RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed
Change form v1 - Rebase to 3.12-rc2 Currently, when mounting pstore file system, a read callback of efi_pstore driver runs mutiple times as below. - In the first read callback, scan efivar_sysfs_list from head and pass a kmsg buffer of a entry to an upper pstore layer. - In the second read callback, rescan efivar_sysfs_list from the entry and pass another kmsg buffer to it. - Repeat the scan and pass until the end of efivar_sysfs_list. In this process, an entry is read across the multiple read function calls. To avoid race between the read and erasion, the whole process above is protected by a spinlock, holding in open() and releasing in close(). At the same time, kmemdup() is called to pass the buffer to pstore filesystem during it. And then, it causes a following lockdep warning. To make the read callback runnable without taking spinlok, holding off a deletion of sysfs entry if it happens while scanning it via efi_pstore, and deleting it after the scan is completed. To implement it, this patch introduces two flags, scanning and deleting, to efivar_entry. Also, __efivar_entry_get() is removed because it was used in efi_pstore only. [1.143710] [ cut here ] [1.144058] WARNING: CPU: 1 PID: 1 at kernel/lockdep.c:2740 lockdep_trace_alloc+0x104/0x110() [1.144058] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)) [1.144058] Modules linked in: [1.144058] CPU: 1 PID: 1 Comm: systemd Not tainted 3.11.0-rc5 #2 [1.144058] 0009 8800797e9ae0 816614a5 8800797e9b28 [1.144058] 8800797e9b18 8105510d 0080 0046 [1.144058] 00d0 03af 81ccd0c0 8800797e9b78 [1.144058] Call Trace: [1.144058] [] dump_stack+0x54/0x74 [1.144058] [] warn_slowpath_common+0x7d/0xa0 [1.144058] [] warn_slowpath_fmt+0x4c/0x50 [1.144058] [] ? vsscanf+0x57f/0x7b0 [1.144058] [] lockdep_trace_alloc+0x104/0x110 [1.144058] [] __kmalloc_track_caller+0x50/0x280 [1.144058] [] ? efi_pstore_read_func.part.1+0x12b/0x170 [1.144058] [] kmemdup+0x20/0x50 [1.144058] [] efi_pstore_read_func.part.1+0x12b/0x170 [1.144058] [] ? efi_pstore_read_func.part.1+0x170/0x170 [1.144058] [] efi_pstore_read_func+0xb4/0xe0 [1.144058] [] __efivar_entry_iter+0xfb/0x120 [1.144058] [] efi_pstore_read+0x3f/0x50 [1.144058] [] pstore_get_records+0x9a/0x150 [1.158207] [] ? selinux_d_instantiate+0x1c/0x20 [1.158207] [] ? parse_options+0x80/0x80 [1.158207] [] pstore_fill_super+0xa5/0xc0 [1.158207] [] mount_single+0xa2/0xd0 [1.158207] [] pstore_mount+0x18/0x20 [1.158207] [] mount_fs+0x39/0x1b0 [1.158207] [] ? __alloc_percpu+0x10/0x20 [1.158207] [] vfs_kern_mount+0x63/0xf0 [1.158207] [] do_mount+0x23e/0xa20 [1.158207] [] ? strndup_user+0x4b/0xf0 [1.158207] [] SyS_mount+0x83/0xc0 [1.158207] [] system_call_fastpath+0x16/0x1b [1.158207] ---[ end trace 61981bc62de9f6f4 ]--- Signed-off-by: Seiji Aguchi --- drivers/firmware/efi/efi-pstore.c | 145 +++--- drivers/firmware/efi/efivars.c| 3 +- drivers/firmware/efi/vars.c | 39 +++--- include/linux/efi.h | 4 +- 4 files changed, 151 insertions(+), 40 deletions(-) diff --git a/drivers/firmware/efi/efi-pstore.c b/drivers/firmware/efi/efi-pstore.c index 5002d50..53001a5 100644 --- a/drivers/firmware/efi/efi-pstore.c +++ b/drivers/firmware/efi/efi-pstore.c @@ -18,14 +18,12 @@ module_param_named(pstore_disable, efivars_pstore_disable, bool, 0644); static int efi_pstore_open(struct pstore_info *psi) { - efivar_entry_iter_begin(); psi->data = NULL; return 0; } static int efi_pstore_close(struct pstore_info *psi) { - efivar_entry_iter_end(); psi->data = NULL; return 0; } @@ -39,6 +37,23 @@ struct pstore_read_data { char **buf; }; +/** + * efi_pstore_read_func + * @entry: reading entry + * @data: data of the entry + * + * This function runs in non-atomic context. + * + * Also, it returns a size of NVRAM entry logged via efi_pstore_write(). + * pstore in accordance with the returned value as below. + * + * size > 0: Got data of an entry logged via efi_pstore_write() successfully, + * and pstore filesystem will continue reading subsequent entries. + * size == 0: Entry was not logged via efi_pstore_write(), + *and efi_pstore driver will continue reading subsequent entries. + * size < 0: Failed to get data of entry logging via efi_pstore_write(), + * and pstore will stop reading entry. + */ static int efi_pstore_read_func(struct efivar_entry *entry, void *data) { efi_guid_t vendor = LINUX_EFI_CRASH_GUID; @@ -88,8 +103,9 @@ static int efi_pstore_read_func(struct efivar_entry *entry, void *data) return 0; entry->var.DataSize = 1024; - __efivar_entry_get(entry,
[RFC][PATCH v2] efivars,efi-pstore: Hold off deletion of sysfs entry until the scan is completed
Change form v1 - Rebase to 3.12-rc2 Currently, when mounting pstore file system, a read callback of efi_pstore driver runs mutiple times as below. - In the first read callback, scan efivar_sysfs_list from head and pass a kmsg buffer of a entry to an upper pstore layer. - In the second read callback, rescan efivar_sysfs_list from the entry and pass another kmsg buffer to it. - Repeat the scan and pass until the end of efivar_sysfs_list. In this process, an entry is read across the multiple read function calls. To avoid race between the read and erasion, the whole process above is protected by a spinlock, holding in open() and releasing in close(). At the same time, kmemdup() is called to pass the buffer to pstore filesystem during it. And then, it causes a following lockdep warning. To make the read callback runnable without taking spinlok, holding off a deletion of sysfs entry if it happens while scanning it via efi_pstore, and deleting it after the scan is completed. To implement it, this patch introduces two flags, scanning and deleting, to efivar_entry. Also, __efivar_entry_get() is removed because it was used in efi_pstore only. [1.143710] [ cut here ] [1.144058] WARNING: CPU: 1 PID: 1 at kernel/lockdep.c:2740 lockdep_trace_alloc+0x104/0x110() [1.144058] DEBUG_LOCKS_WARN_ON(irqs_disabled_flags(flags)) [1.144058] Modules linked in: [1.144058] CPU: 1 PID: 1 Comm: systemd Not tainted 3.11.0-rc5 #2 [1.144058] 0009 8800797e9ae0 816614a5 8800797e9b28 [1.144058] 8800797e9b18 8105510d 0080 0046 [1.144058] 00d0 03af 81ccd0c0 8800797e9b78 [1.144058] Call Trace: [1.144058] [816614a5] dump_stack+0x54/0x74 [1.144058] [8105510d] warn_slowpath_common+0x7d/0xa0 [1.144058] [8105517c] warn_slowpath_fmt+0x4c/0x50 [1.144058] [8131290f] ? vsscanf+0x57f/0x7b0 [1.144058] [810bbd74] lockdep_trace_alloc+0x104/0x110 [1.144058] [81192da0] __kmalloc_track_caller+0x50/0x280 [1.144058] [815147bb] ? efi_pstore_read_func.part.1+0x12b/0x170 [1.144058] [8115b260] kmemdup+0x20/0x50 [1.144058] [815147bb] efi_pstore_read_func.part.1+0x12b/0x170 [1.144058] [81514800] ? efi_pstore_read_func.part.1+0x170/0x170 [1.144058] [815148b4] efi_pstore_read_func+0xb4/0xe0 [1.144058] [81512b7b] __efivar_entry_iter+0xfb/0x120 [1.144058] [8151428f] efi_pstore_read+0x3f/0x50 [1.144058] [8128d7ba] pstore_get_records+0x9a/0x150 [1.158207] [812af25c] ? selinux_d_instantiate+0x1c/0x20 [1.158207] [8128ce30] ? parse_options+0x80/0x80 [1.158207] [8128ced5] pstore_fill_super+0xa5/0xc0 [1.158207] [811ae7d2] mount_single+0xa2/0xd0 [1.158207] [8128ccf8] pstore_mount+0x18/0x20 [1.158207] [811ae8b9] mount_fs+0x39/0x1b0 [1.158207] [81160550] ? __alloc_percpu+0x10/0x20 [1.158207] [811c9493] vfs_kern_mount+0x63/0xf0 [1.158207] [811cbb0e] do_mount+0x23e/0xa20 [1.158207] [8115b51b] ? strndup_user+0x4b/0xf0 [1.158207] [811cc373] SyS_mount+0x83/0xc0 [1.158207] [81673cc2] system_call_fastpath+0x16/0x1b [1.158207] ---[ end trace 61981bc62de9f6f4 ]--- Signed-off-by: Seiji Aguchi seiji.agu...@hds.com --- drivers/firmware/efi/efi-pstore.c | 145 +++--- drivers/firmware/efi/efivars.c| 3 +- drivers/firmware/efi/vars.c | 39 +++--- include/linux/efi.h | 4 +- 4 files changed, 151 insertions(+), 40 deletions(-) diff --git a/drivers/firmware/efi/efi-pstore.c b/drivers/firmware/efi/efi-pstore.c index 5002d50..53001a5 100644 --- a/drivers/firmware/efi/efi-pstore.c +++ b/drivers/firmware/efi/efi-pstore.c @@ -18,14 +18,12 @@ module_param_named(pstore_disable, efivars_pstore_disable, bool, 0644); static int efi_pstore_open(struct pstore_info *psi) { - efivar_entry_iter_begin(); psi-data = NULL; return 0; } static int efi_pstore_close(struct pstore_info *psi) { - efivar_entry_iter_end(); psi-data = NULL; return 0; } @@ -39,6 +37,23 @@ struct pstore_read_data { char **buf; }; +/** + * efi_pstore_read_func + * @entry: reading entry + * @data: data of the entry + * + * This function runs in non-atomic context. + * + * Also, it returns a size of NVRAM entry logged via efi_pstore_write(). + * pstore in accordance with the returned value as below. + * + * size 0: Got data of an entry logged via efi_pstore_write() successfully, + * and pstore filesystem will continue reading subsequent entries. + * size == 0: Entry was not logged via efi_pstore_write(), + *and efi_pstore driver will continue reading subsequent entries. + * size 0: