Re: [RFC] DMA mapping error check analysis
On Sep 10 Clemens Ladisch wrote:
> fw_iso_buffer_map_dma() maps as many pages as it can, and saves in
> ->page_count_mapped how many pages need unmapping.
>
> When fw_iso_buffer_map_dma() fails, ioctl_create_iso_context() does _not_
> call fw_iso_buffer_destroy() but takes care to not change the cdev's
> state in any other way.  So ioctl_create_iso_context() can be called
> again and will then call fw_iso_buffer_map_dma(), which will happily
> map the pages a second time, overwriting the previous mapped addresses.

Indeed; thank you.  I'll make a note to fix this when I get some time.
-- 
Stefan Richter
-=-===-- =--= -=-=-
http://arcgraph.de/sr/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
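The retry problem described above can be illustrated with a small user-space sketch. It is not the firewire code and not the eventual fix: struct sketch_buffer, sketch_map_one(), sketch_unmap_all() and sketch_map_buffer() are hypothetical names, and the "mappings" are only a counter. The only idea borrowed from the thread is that a map routine which records ->page_count_mapped can release the leftovers of a failed earlier attempt before mapping again, instead of overwriting the recorded addresses.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_PAGES 8

/* Hypothetical, simplified stand-in for an iso buffer; not the firewire struct. */
struct sketch_buffer {
	unsigned long addr[SKETCH_MAX_PAGES];	/* mocked bus addresses */
	size_t page_count;			/* pages in the buffer */
	size_t page_count_mapped;		/* pages that currently need unmapping */
};

static int sketch_live_mappings;	/* mocked mappings currently outstanding */
static int sketch_fail_budget;		/* successful maps left before a failure; < 0 = unlimited */

/* Mocked single-page mapping: fails once the budget is used up. */
static int sketch_map_one(unsigned long *addr, size_t i)
{
	if (sketch_fail_budget == 0)
		return -1;
	if (sketch_fail_budget > 0)
		sketch_fail_budget--;
	*addr = 0x1000ul * (i + 1);
	sketch_live_mappings++;
	return 0;
}

/* Release every mapping the buffer still records. */
static void sketch_unmap_all(struct sketch_buffer *b)
{
	while (b->page_count_mapped > 0) {
		b->page_count_mapped--;
		sketch_live_mappings--;
	}
}

/*
 * Map as many pages as possible and record how many need unmapping.
 * The guard at the top drops mappings left behind by a failed earlier
 * call, so a retry cannot overwrite (and thereby leak) them.
 */
static int sketch_map_buffer(struct sketch_buffer *b)
{
	size_t i;

	assert(b != NULL && b->page_count <= SKETCH_MAX_PAGES);
	sketch_unmap_all(b);

	for (i = 0; i < b->page_count; i++) {
		if (sketch_map_one(&b->addr[i], i) != 0)
			break;
		b->page_count_mapped = i + 1;
	}
	return i == b->page_count ? 0 : -1;
}
```

With a four-page buffer and sketch_fail_budget set to 2, the first call fails with two pages mapped; a retry with an unlimited budget then ends with four outstanding mappings. Without the guard the same sequence would leave six, two of them no longer recorded anywhere.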
Re: [RFC] DMA mapping error check analysis
Stefan Richter wrote:
> On Sep 10 Shuah Khan wrote:
>>>> http://linuxdriverproject.org/mediawiki/index.php/DMA_Mapping_Error_Analysis
>>>>
>>>> File Name                    # of calls  Status
>>>> drivers/firewire/core-iso.c  1           Unmap Broken
>>>> drivers/firewire/ohci.c      1           Unmap Broken
>>>
>>> In ohci.c, ar_context_release() takes care of cleanup.
>>>
>>> In core-iso.c, on failure, the callers are responsible to call
>>> fw_iso_buffer_destroy() eventually.  (ioctl_create_iso_context()
>>> doesn't do this correctly if it's called multiple times.)
>>
>> Thanks. I updated the page with your comments. I moved ohci.c to Good
>> status and left core-iso.c in Unmap Broken in case
>> ioctl_create_iso_context() case is worth fixing.
>
> I don't see what could go wrong if ioctl_create_iso_context() is called
> multiple times.

fw_iso_buffer_map_dma() maps as many pages as it can, and saves in
->page_count_mapped how many pages need unmapping.

When fw_iso_buffer_map_dma() fails, ioctl_create_iso_context() does _not_
call fw_iso_buffer_destroy() but takes care to not change the cdev's
state in any other way.  So ioctl_create_iso_context() can be called
again and will then call fw_iso_buffer_map_dma(), which will happily
map the pages a second time, overwriting the previous mapped addresses.

Regards,
Clemens
Re: [RFC] DMA mapping error check analysis
On Sep 10 Shuah Khan wrote:
> > > http://linuxdriverproject.org/mediawiki/index.php/DMA_Mapping_Error_Analysis
> > >
> > > File Name                    # of calls  Status
> > > drivers/firewire/core-iso.c  1           Unmap Broken
> > > drivers/firewire/ohci.c      1           Unmap Broken
> >
> > In ohci.c, ar_context_release() takes care of cleanup.
> >
> > In core-iso.c, on failure, the callers are responsible to call
> > fw_iso_buffer_destroy() eventually.  (ioctl_create_iso_context()
> > doesn't do this correctly if it's called multiple times.)
>
> Thanks. I updated the page with your comments. I moved ohci.c to Good
> status and left core-iso.c in Unmap Broken in case
> ioctl_create_iso_context() case is worth fixing.

I don't see what could go wrong if ioctl_create_iso_context() is called
multiple times.  But I wrote the current (= v3.5-rc1) serialization code
in it, hence am blind to mistakes which are my own.  So anybody who spots
an actual problem please describe it, or even better send a patch.

(Hmm, fw_device_op_mmap()'s fail: path is executed outside the
client->lock protected section.  That might be a problem.  I need to look
further into it.)
-- 
Stefan Richter
-=-===-- =--= -=-=-
http://arcgraph.de/sr/
Re: [RFC] DMA mapping error check analysis
> > http://linuxdriverproject.org/mediawiki/index.php/DMA_Mapping_Error_Analysis
> >
> > File Name                    # of calls  Status
> > drivers/firewire/core-iso.c  1           Unmap Broken
> > drivers/firewire/ohci.c      1           Unmap Broken
>
> In ohci.c, ar_context_release() takes care of cleanup.
>
> In core-iso.c, on failure, the callers are responsible to call
> fw_iso_buffer_destroy() eventually.  (ioctl_create_iso_context()
> doesn't do this correctly if it's called multiple times.)

Thanks. I updated the page with your comments. I moved ohci.c to Good
status and left core-iso.c in Unmap Broken in case
ioctl_create_iso_context() case is worth fixing.

-- Shuah
Re: [RFC] DMA mapping error check analysis
Shuah Khan wrote:
> I analyzed all calls to dma_map_single() and dma_map_page() in the
> kernel, to see if callers check for mapping errors, before using the
> returned address.
>
> The goal of this analysis is to find drivers that currently do not
> check dma mapping errors, and fix them.
>
> I documented the results of this analysis:
>
> http://linuxdriverproject.org/mediawiki/index.php/DMA_Mapping_Error_Analysis

> File Name                    # of calls  Status
> drivers/firewire/core-iso.c  1           Unmap Broken
> drivers/firewire/ohci.c      1           Unmap Broken

In ohci.c, ar_context_release() takes care of cleanup.

In core-iso.c, on failure, the callers are responsible to call
fw_iso_buffer_destroy() eventually.  (ioctl_create_iso_context()
doesn't do this correctly if it's called multiple times.)

Regards,
Clemens
Re: [RFC] DMA mapping error check analysis
On Fri, 2012-09-07 at 12:20 -0400, Alan Stern wrote:
> On Fri, 7 Sep 2012, Shuah Khan wrote:
>
> > I analyzed all calls to dma_map_single() and dma_map_page() in the
> > kernel, to see if callers check for mapping errors, before using the
> > returned address.
> >
> > The goal of this analysis is to find drivers that currently do not
> > check dma mapping errors, and fix them.
> >
> > I documented the results of this analysis:
> >
> > http://linuxdriverproject.org/mediawiki/index.php/DMA_Mapping_Error_Analysis
> >
> > Please review and give me feedback on the analysis and the proposed
> > next steps.
>
> Your first table (dma_map_single) lists drivers/usb/core/usb.c and
> marks it as Bad.  This is a mistake because the code is #ifdef'ed out.
> It hasn't been used in many years; it should be removed.

Thanks for catching it. I did note that in my research notes and that
was left out by mistake when I put the table together. Table is updated
now with your comment and marked it a Cleanup item.

-- Shuah
Re: [RFC] DMA mapping error check analysis
On Fri, 7 Sep 2012, Shuah Khan wrote:

> I analyzed all calls to dma_map_single() and dma_map_page() in the
> kernel, to see if callers check for mapping errors, before using the
> returned address.
>
> The goal of this analysis is to find drivers that currently do not
> check dma mapping errors, and fix them.
>
> I documented the results of this analysis:
>
> http://linuxdriverproject.org/mediawiki/index.php/DMA_Mapping_Error_Analysis
>
> Please review and give me feedback on the analysis and the proposed
> next steps.

Your first table (dma_map_single) lists drivers/usb/core/usb.c and
marks it as Bad.  This is a mistake because the code is #ifdef'ed out.
It hasn't been used in many years; it should be removed.

Alan Stern
[RFC] DMA mapping error check analysis
I analyzed all calls to dma_map_single() and dma_map_page() in the
kernel, to see if callers check for mapping errors, before using the
returned address.

The goal of this analysis is to find drivers that currently do not
check dma mapping errors, and fix them.

I documented the results of this analysis:

http://linuxdriverproject.org/mediawiki/index.php/DMA_Mapping_Error_Analysis

Please review and give me feedback on the analysis and the proposed
next steps.

Thanks,
-- Shuah
Re: dma mapping error check analysis
On Fri, 2012-08-17 at 10:11 -0400, Konrad Rzeszutek Wilk wrote:
> On Fri, Aug 10, 2012 at 04:46:42PM -0600, Shuah Khan wrote:
> > I analyzed current calls to dma_map_single() and dma_map_page() in the
> > kernel to see if dma mapping errors are checked after mapping routines
> > return.
> >
> > Reference linux-next August 6 2012.
> >
> > This analysis stemmed from the discussion on my patch that disables
> > swiotlb overflow as a first step towards removing the support
> > altogether. Please refer to thread below:
> >
> > https://lkml.org/lkml/2012/7/24/391
> >
> > The goal of this analysis is to find drivers that don't currently check
> > dma mapping errors and fix them. I did a grep for dma_map_single() and
> > dma_map_page() and looked at the code that calls these routines. I
> > classified the results of dma mapping error check status as follows:
> >
> > Broken:
> > 1. No error checks
> > 2. Partial checks - In that source file, not all calls are followed by
> >    checks.
> > 3. Checks dma mapping errors, doesn't unmap already mapped pages when
> >    mapping error occurs in the middle of a multiple mapping attempt.
> >
> > The first two categories are classified as broken and need fixing.
> >
> > The third one needs fixing, since it leaves dangling mapped pages, and
> > holds on to them which is equivalent to a memory leak. Some drivers
> > release all mapped pages when the device closes, but others don't. Not
> > doing unmap might be harmless on some architectures going by the
> > comments I found in some source files.
> >
> > Good:
> > 1. Checks dma mapping errors and unmaps already mapped pages when
> >    mapping error occurs in the middle of a multiple mapping attempt.
> > 2. Checks dma mapping errors without unlikely()
> > 3. Checks dma mapping errors with unlikely()
> >
> > I lumped the above three cases as good cases. Using unlikely() is icing
> > on the cake, and not something we need to be concerned about compared
> > to other problems in this area.
> >
> > - dma_map_single() - results
> >   No error checks - 195 (46%)
> >   Partial checks - 46 (11%)
> >   Doesn't unmap: 26 (6%)
> >   Good: 147 (35%)
> >
> > - dma_map_page() - results
> >   No error checks: 61 (59%)
> >   Partial checks: 7 (7%)
> >   Doesn't unmap: 15 (14.5%)
> >   Good: 20 (19%)
> >
> > In summary a large % of the cases (> 50%) go unchecked. That raises the
> > following questions:
> >
> > When do mapping errors get detected?
> > How often do these errors occur?
> > Why don't we see failures related to missing dma mapping error checks?
> > Are they silent failures?
> >
> > Based on what I found, I am not too eager to remove swiotlb overflow
> > support which would increase the probability of returning dma mapping
> > errors.
> >
> > However I propose the following to gather more information:
> >
> > - Change swiotlb to log (pr_info or pr_debug) cases where overflow
> >   buffer is triggered. (This is a delta on the disable swiotlb patch I
> >   sent a few weeks ago - References in this posting).
>
> As opposed to printk(KERN_ERR ? Why?

printk(KERN_ERR) is just fine.

> > - Change dma_map_single() and dma_map_page() to track how many times
> >   they return errors before attempting to fix all the places that don't
> >   do dma mapping error checks. (Maybe a counter that keeps track, pr_*
> >   is not an option).
>
> Perhaps this should be done in the DMA debug API instead?

Yes that is a good idea. Will explore that.

-- Shuah
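The "counter that keeps track" idea can be sketched in user space as follows. This is not the DMA debug API and says nothing about how such accounting was or would be implemented in the kernel: sketch_account_mapping(), sketch_error_count() and SKETCH_MAPPING_ERROR are hypothetical names, and C11 atomics stand in for whatever counter type kernel code would use. It only shows the shape of the proposal: count every mapping attempt and every error at the point where the address is returned, without logging each event.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical error cookie; a stand-in for whatever the mapping call returns on failure. */
#define SKETCH_MAPPING_ERROR UINT64_MAX

static atomic_ulong sketch_map_calls;	/* mapping attempts seen so far */
static atomic_ulong sketch_map_errors;	/* attempts that returned the error cookie */

/* Record the outcome of one mapping attempt and pass the address through. */
static uint64_t sketch_account_mapping(uint64_t addr)
{
	atomic_fetch_add(&sketch_map_calls, 1);
	if (addr == SKETCH_MAPPING_ERROR)
		atomic_fetch_add(&sketch_map_errors, 1);
	return addr;
}

/* Number of failed attempts; can never exceed the number of attempts. */
static unsigned long sketch_error_count(void)
{
	unsigned long errors = atomic_load(&sketch_map_errors);

	assert(errors <= atomic_load(&sketch_map_calls));
	return errors;
}
```

Wrapping the return value of a mapping routine in sketch_account_mapping() leaves callers unchanged, which is the reason a central place such as a debug layer is attractive for this kind of accounting.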
Re: dma mapping error check analysis
On Fri, Aug 10, 2012 at 04:46:42PM -0600, Shuah Khan wrote:
> I analyzed current calls to dma_map_single() and dma_map_page() in the
> kernel to see if dma mapping errors are checked after mapping routines
> return.
>
> Reference linux-next August 6 2012.
>
> This analysis stemmed from the discussion on my patch that disables
> swiotlb overflow as a first step towards removing the support altogether.
> Please refer to thread below:
>
> https://lkml.org/lkml/2012/7/24/391
>
> The goal of this analysis is to find drivers that don't currently check
> dma mapping errors and fix them. I did a grep for dma_map_single() and
> dma_map_page() and looked at the code that calls these routines. I
> classified the results of dma mapping error check status as follows:
>
> Broken:
> 1. No error checks
> 2. Partial checks - In that source file, not all calls are followed by
>    checks.
> 3. Checks dma mapping errors, doesn't unmap already mapped pages when
>    mapping error occurs in the middle of a multiple mapping attempt.
>
> The first two categories are classified as broken and need fixing.
>
> The third one needs fixing, since it leaves dangling mapped pages, and
> holds on to them which is equivalent to a memory leak. Some drivers
> release all mapped pages when the device closes, but others don't. Not
> doing unmap might be harmless on some architectures going by the comments
> I found in some source files.
>
> Good:
> 1. Checks dma mapping errors and unmaps already mapped pages when mapping
>    error occurs in the middle of a multiple mapping attempt.
> 2. Checks dma mapping errors without unlikely()
> 3. Checks dma mapping errors with unlikely()
>
> I lumped the above three cases as good cases. Using unlikely() is icing
> on the cake, and not something we need to be concerned about compared to
> other problems in this area.
>
> - dma_map_single() - results
>   No error checks - 195 (46%)
>   Partial checks - 46 (11%)
>   Doesn't unmap: 26 (6%)
>   Good: 147 (35%)
>
> - dma_map_page() - results
>   No error checks: 61 (59%)
>   Partial checks: 7 (7%)
>   Doesn't unmap: 15 (14.5%)
>   Good: 20 (19%)
>
> In summary a large % of the cases (> 50%) go unchecked. That raises the
> following questions:
>
> When do mapping errors get detected?
> How often do these errors occur?
> Why don't we see failures related to missing dma mapping error checks?
> Are they silent failures?
>
> Based on what I found, I am not too eager to remove swiotlb overflow
> support which would increase the probability of returning dma mapping
> errors.
>
> However I propose the following to gather more information:
>
> - Change swiotlb to log (pr_info or pr_debug) cases where overflow buffer
>   is triggered. (This is a delta on the disable swiotlb patch I sent a
>   few weeks ago - References in this posting).

As opposed to printk(KERN_ERR ? Why?

> - Change dma_map_single() and dma_map_page() to track how many times they
>   return errors before attempting to fix all the places that don't do dma
>   mapping error checks. (Maybe a counter that keeps track, pr_* is not an
>   option).

Perhaps this should be done in the DMA debug API instead?

>
> Comments, thoughts on the analysis and proposal are welcome.
>
> -- Shuah
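The shares in the results lists can be rechecked with a few lines of integer arithmetic. The helper below is only an illustration; share_permille() is a hypothetical name, and the totals (414 for dma_map_single(), 103 for dma_map_page()) are an assumption obtained by adding up the four listed categories, since the message itself does not state them.

```c
#include <assert.h>

/*
 * Share of `part` in `total`, in tenths of a percent, rounded to nearest.
 * Example: 592 means 59.2%.
 */
static unsigned int share_permille(unsigned int part, unsigned int total)
{
	assert(total > 0);
	return (part * 1000u + total / 2u) / total;
}
```

Under that assumption, share_permille(61, 103) gives 592 (59.2%) and share_permille(7, 103) gives 68, so the 7 files with partial checks are about 6.8% of the dma_map_page() total; share_permille(195, 414) gives 471 (47.1%).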
dma mapping error check analysis
I analyzed current calls to dma_map_single() and dma_map_page() in the kernel to see if dma mapping errors are checked after mapping routines return. Reference linux-next August 6 2012. This analysis stemmed from the discussion on my patch that disables swiotlb overflow as a first step towards removing the support all together. Please refer to thread below: https://lkml.org/lkml/2012/7/24/391 The goal of this analysis is to find drivers that don't currently check dma mapping errors and fix them. I did a grep for dma_map_single() and dma_map_page() and looked at the code that calls these routines. I classified the results of dma mapping error check status as follows: Broken: 1. No error checks 2. Partial checks - In that source file, not all calls are followed by checks. 3. Checks dma mapping errors, doesn't unmap already mapped pages when mapping error occurs in the middle of a multiple mapping attempt. The first two categories are classified as broken and need fixing. The third one needs fixing, since it leaves dangling mapped pages, and holds on to them which is equivalent to memory leak. Some drivers release all mapped pages when the device closes, but others don't. Not doing unmap might be harmless on some architectures going by the comments I found in some source files. Good: 1. Checks dma mapping errors and unmaps already mapped pages when mapping error occurs in the middle of a multiple mapping attempt. 2. Checks dma mapping errors without unlikely() 3. Checks dma mapping errors with unlikely() I lumped the above three cases as good cases. Using unlikely() is icing on the cake, and something we need to be concerned about compared to other problems in this area. - dmap_map_single() - results No error checks - 195 (46%) Partial checks - 46 (11%) Doesn't unmap: 26 (6%) Good: 147 (35%) - dma_map_page() - results No error checks: 61 (59%) Partial checks: 7 (.06%) Doesn't unmap: 15 (14.5%) Good: 20 (19%) In summary a large % of the cases (> 50%) go unchecked. 
That raises the following questions:

- When do mapping errors get detected?
- How often do these errors occur?
- Why don't we see failures related to missing DMA mapping error checks?
- Are they silent failures?

Based on what I found, I am not too eager to remove swiotlb overflow support, which would increase the probability of returning DMA mapping errors. However, I propose the following to gather more information:

- Change swiotlb to log (pr_info or pr_debug) cases where the overflow buffer is triggered. (This is a delta on the disable swiotlb patch I sent a few weeks ago - references in this posting.)
- Change dma_map_single() and dma_map_page() to track how many times they return errors, before attempting to fix all the places that don't do DMA mapping error checks. (Maybe a counter that keeps track; pr_* is not an option.)

Comments and thoughts on the analysis and proposal are welcome.

-- Shuah
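The counter idea from the proposal could look roughly like the following user-space sketch. All sketch_* names are hypothetical, the failure condition is invented for illustration, and C11 atomics merely stand in for whatever counter type the kernel would actually use. It only shows a mapping wrapper that counts its error returns rather than logging them.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* All names here are hypothetical; this is not the kernel DMA API. */
typedef uint64_t sketch_dma_addr_t;
#define SKETCH_DMA_ERROR ((sketch_dma_addr_t)~0ULL)

/* Number of times the mapping wrapper has returned an error. */
static atomic_ulong sketch_map_error_count;

/* Stand-in for the real mapping work; fails for zero-sized requests. */
static sketch_dma_addr_t sketch_do_map(const void *buf, size_t size)
{
	if (size == 0)
		return SKETCH_DMA_ERROR;
	return (sketch_dma_addr_t)(uintptr_t)buf;
}

/* Wrapper that counts error returns instead of printing a message. */
static sketch_dma_addr_t sketch_map_single(const void *buf, size_t size)
{
	sketch_dma_addr_t addr = sketch_do_map(buf, size);

	if (addr == SKETCH_DMA_ERROR)
		atomic_fetch_add_explicit(&sketch_map_error_count, 1,
					  memory_order_relaxed);
	return addr;
}

/* Read the counter, e.g. to export it through a statistics interface. */
static unsigned long sketch_map_errors(void)
{
	return atomic_load_explicit(&sketch_map_error_count,
				    memory_order_relaxed);
}
```

Counting in the wrapper keeps the hot path cheap and avoids flooding the log, which matches the remark that pr_* is not an option there.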