Re: [PATCH 08/29] drivers, md: convert mddev.active from atomic_t to refcount_t

2017-03-14 Thread James Bottomley
On Tue, 2017-03-14 at 12:29 +, Reshetova, Elena wrote:
> > Elena Reshetova  writes:
> > 
> > > refcount_t type and corresponding API should be
> > > used instead of atomic_t when the variable is used as
> > > a reference counter. This allows to avoid accidental
> > > refcounter overflows that might lead to use-after-free
> > > situations.
> > > 
> > > Signed-off-by: Elena Reshetova 
> > > Signed-off-by: Hans Liljestrand 
> > > Signed-off-by: Kees Cook 
> > > Signed-off-by: David Windsor 
> > > ---
> > >  drivers/md/md.c | 6 +++---
> > >  drivers/md/md.h | 3 ++-
> > >  2 files changed, 5 insertions(+), 4 deletions(-)
> > 
> > When booting linux-next (specifically 5be4921c9958ec) I'm seeing
> > the
> > backtrace below. I suspect this patch is just exposing an existing
> > issue?
> 
> Yes, we have actually been following this issue in the another
> thread. 
> It looks like the object is re-used somehow, but I can't quite
> understand how just by reading the code. 
> This was what I put into the previous thread:
> 
> "The log below indicates that you are using your refcounter in a bit
> weird way in mddev_find(). 
> However, I can't find the place (just by reading the code) where you
> would increment refcounter from zero (vs. setting it to one).
> It looks like you either iterate over existing nodes (and increment
> their counters, which should be >= 1 at the time of increment) or
> create a new node, but then mddev_init() sets the counter to 1. "
> 
> If you can help to understand what is going on with the object
> creation/destruction, would be appreciated!
> 
> Also Shaohua Li stopped this patch coming from his tree since the 
> issue was caught at that time, so we are not going to merge this 
> until we figure it out. 

Asking on the correct list (dm-devel) would have got you the easy
answer:  The refcount behind mddev->active is a genuine atomic.  It has
refcount properties but only if the array fails to initialise (in that
case, final put kills it).  Once it's added to the system as a gendisk,
it cannot be freed until md_free().  Thus its ->active count can go to
zero (when it becomes inactive; usually because of an unmount). On a
simple allocation regardless of outcome, the last executed statement in
md_alloc is mddev_put(): that destroys the device if we didn't manage
to create it or returns 0 and adds an inactive device to the system
which the user can get with mddev_find().

James




Re: [Ksummit-discuss] Including images on Sphinx documents

2016-11-21 Thread James Bottomley
On Mon, 2016-11-21 at 12:06 -0200, Mauro Carvalho Chehab wrote:
> Em Mon, 21 Nov 2016 11:39:41 +0100
> Johannes Berg  escreveu:
> > On Sat, 2016-11-19 at 10:15 -0700, Jonathan Corbet wrote:
> > 
> > > Rather than beating our heads against the wall trying to convert
> > > between various image formats, maybe we need to take a step
> > > back.  We're trying to build better documentation, and there is
> > > certainly a place for diagrams and such in that
> > > documentation.  Johannes was asking about it for the 802.11 docs, 
> > > and I know Paul has run into these issues with the RCU docs as
> > > well.  Might there be a tool or an extension out there that would
> > > allow us to express these diagrams in a text-friendly, editable
> > > form?
> > > 
> > > With some effort, I bet we could get rid of a number of the 
> > > images, and perhaps end up with something that makes sense when 
> > > read in the .rst source files as an extra benefit.   
> > 
> > I tend to agree, and I think that having this readable in the text
> > would be good.
> > 
> > You had pointed me to this plugin before
> > https://pythonhosted.org/sphinxcontrib-aafig/
> > 
> > but I don't think it can actually represent any of the pictures.
> 
> No, but there are some ascii art images inside some txt/rst files
> and inside some kernel-doc comments. We could either use the above
> extension for them or to convert into some image. The ascii art
> images I saw seem to be diagrams, so Graphviz would allow replacing
> most of them, if not all.

Please don't replace ASCII art that effectively conveys conceptual
diagrams.  If you do, we'll wind up in situations where someone hasn't
built the docs and doesn't possess the tools to see a diagram that was
previously shown by every text editor (or can't be bothered to dig out
the now separate file).  In the name of creating "prettier" diagrams
(and final doc), we'll have damaged capacity to understand stuff by
just reading the source if this diagram is in kernel doc comments.  I
think this is a good application of "if it ain't broke, don't fix it".

James

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Ksummit-discuss] Including images on Sphinx documents

2016-11-17 Thread James Bottomley
On Thu, 2016-11-17 at 13:16 -0200, Mauro Carvalho Chehab wrote:
> Hi Ted,
> 
> Em Thu, 17 Nov 2016 09:52:44 -0500
> Theodore Ts'o  escreveu:
> 
> > On Thu, Nov 17, 2016 at 12:07:15PM +0100, Arnd Bergmann wrote:
> > > [adding Linus for clarification]
> > > 
> > > I understood the concern as being about binary files that you
> > > cannot
> > > modify with classic 'patch', which is a separate issue.  
> > 
> > I think the other complaint is that the image files aren't "source"
> > in
> > the proper term, since they are *not* the preferred form for
> > modification --- that's the svg files.  Beyond the license
> > compliance
> > issues (which are satisified because the .svg files are included in
> > the git tree), there is the SCM cleaniless argument of not
> > including
> > generated files in the distribution, since this increases the
> > opportunites for the "real" source file and the generated source
> > file
> > to get out of sync.  (As just one example, if the patch can't
> > represent the change to binary file.)
> > 
> > I do check in generated files on occasion --- usually because I
> > don't
> > trust autoconf to be a stable in terms of generating a correct
> > configure file from a configure.in across different versions of
> > autoconf and different macro libraries that might be installed on
> > the
> > system.  So this isn't a hard and fast rule by any means (although
> > Linus may be more strict than I on that issue).
> > 
> > I don't understand why it's so terrible to have generate the image
> > file from the .svg file in a Makefile rule, and then copy it
> > somewhere
> > else if Sphinx is too dumb to fetch it from the normal location?
> 
> The images whose source are in .svg are now generated via Makefile
> for the PDF output (after my patches, already applied to the docs
> -next
> tree).
> 
> So, the problem that remains is for those images whose source
> is a bitmap. If we want to stick with the Sphinx supported formats,
> we have only two options for bitmaps: jpg or png. We could eventually
> use uuencode or base64 to make sure that the patches won't use
> git binary diff extension, or, as Arnd proposed, use a portable
> bitmap format, in ascii, converting via Makefile, but losing
> the alpha channel with makes the background transparent.

If it can use svg, why not use that?  SVG files can be a simple xml
wrapper around a wide variety of graphic image formats which are
embedded in the svg using the data-uri format, you know ...

Anything that handles SVGs should be able to handle all the embeddable
image formats, which should give you a way around image restrictions
whatever it is would otherwise have.

James

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: IR remote stopped working in kernels 4.5 and 4.6

2016-07-04 Thread James Bottomley
On Mon, 2016-07-04 at 23:08 +0200, Heiner Kallweit wrote:
> Am 04.07.2016 um 22:36 schrieb James Bottomley:
> > This looks to be a problem with the rc subsystem.  The IR 
> > controller in question is part of a cx8800 atsc card. In the 4.4 
> > kernel, where it works, this is what ir-keytable says:
> > 
> > Found /sys/class/rc/rc0/ (/dev/input/event12) with:
> > Driver cx88xx, table rc-hauppauge
> > Supported protocols: other lirc rc-5 jvc sony nec sanyo mce-kbd
> > rc-6 sharp xmp 
> > Enabled protocols: lirc nec 
> > Name: cx88 IR (pcHDTV HD3000 HDTV)
> > bus: 1, vendor/product: 7063:3000, version: 0x0001
> > Repeat delay = 500 ms, repeat period = 125 ms
> > 
> > And in 4.6, where it doesn't work:
> > 
> > Found /sys/class/rc/rc0/ (/dev/input/event12) with:
> > Driver cx88xx, table rc-hauppauge
> > Supported protocols: lirc 
> > Enabled protocols: lirc 
> > Name: cx88 IR (pcHDTV HD3000 HDTV)
> > bus: 1, vendor/product: 7063:3000, version: 0x0001
> > Repeat delay = 500 ms, repeat period = 125 ms
> > 
> > The particular remote in question seems to require the nec protocol 
> > to work and the failure in 4.5 and 4.6 is having any supported
> > protocols at all.  I can get the remote to start working again by 
> > adding the nec protocol:
> > 
> > echo nec > /sys/class/rc/rc0/protocols
> > 
> > But it would be nice to have this happen by default rather than 
> > having to add yet another work around init script.
> > 
> Meanwhile decoder modules are loaded on demand only. This can be done
> automatically w/o the need for additional init scripts.
> 
> If /etc/rc_maps.cfg includes a keymap with type NEC then the nec 
> decoder module is loaded automatically.
> 
> My rc_maps.cfg looks like this (and causes the SONY decoder module to 
> be loaded automatically):
> 
> #driver tablefile
> *   *sony-rm-sx800
> 
> And the keymap:
> 
> # table sony-rm-sx800, type SONY
> 0x110030KEY_PREVIOUS
> 0x110031KEY_NEXT
> 0x110033KEY_BACK
> ..
> ..


Well, to work in the 4.4 kernel, the rc_maps.cfg names a table with the
nec controller, so it seems that whatever is supposed to trigger
autoloading isn't.

Is ir-keytable supposed to do this?  The current debian testing one is

Package: ir-keytable
Source: v4l-utils
Version: 1.10.1-1

Which seems to be the most current one.

James

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


IR remote stopped working in kernels 4.5 and 4.6

2016-07-04 Thread James Bottomley
This looks to be a problem with the rc subsystem.  The IR controller in
question is part of a cx8800 atsc card. In the 4.4 kernel, where it
works, this is what ir-keytable says:

Found /sys/class/rc/rc0/ (/dev/input/event12) with:
Driver cx88xx, table rc-hauppauge
Supported protocols: other lirc rc-5 jvc sony nec sanyo mce-kbd rc-6 
sharp xmp 
Enabled protocols: lirc nec 
Name: cx88 IR (pcHDTV HD3000 HDTV)
bus: 1, vendor/product: 7063:3000, version: 0x0001
Repeat delay = 500 ms, repeat period = 125 ms

And in 4.6, where it doesn't work:

Found /sys/class/rc/rc0/ (/dev/input/event12) with:
Driver cx88xx, table rc-hauppauge
Supported protocols: lirc 
Enabled protocols: lirc 
Name: cx88 IR (pcHDTV HD3000 HDTV)
bus: 1, vendor/product: 7063:3000, version: 0x0001
Repeat delay = 500 ms, repeat period = 125 ms

The particular remote in question seems to require the nec protocol to
work and the failure in 4.5 and 4.6 is having any supported protocols
at all.  I can get the remote to start working again by adding the nec
protocol:

echo nec > /sys/class/rc/rc0/protocols

But it would be nice to have this happen by default rather than having
to add yet another work around init script.

James

--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 29/31] parisc: handle page-less SG entries

2015-08-13 Thread James Bottomley
On Thu, 2015-08-13 at 20:30 -0700, Dan Williams wrote:
 On Thu, Aug 13, 2015 at 7:31 AM, Christoph Hellwig h...@lst.de wrote:
  On Wed, Aug 12, 2015 at 09:01:02AM -0700, Linus Torvalds wrote:
  I'm assuming that anybody who wants to use the page-less
  scatter-gather lists always does so on memory that isn't actually
  virtually mapped at all, or only does so on sane architectures that
  are cache coherent at a physical level, but I'd like that assumption
  *documented* somewhere.
 
  It's temporarily mapped by kmap-like helpers.  That code isn't in
  this series. The most recent version of it is here:
 
  https://git.kernel.org/cgit/linux/kernel/git/djbw/nvdimm.git/commit/?h=pfnid=de8237c99fdb4352be2193f3a7610e902b9bb2f0
 
  note that it's not doing the cache flushing it would have to do yet, but
  it's also only enabled for x86 at the moment.
 
 For virtually tagged caches I assume we would temporarily map with
 kmap_atomic_pfn_t(), similar to how drm_clflush_pages() implements
 powerpc support.  However with DAX we could end up with multiple
 virtual aliases for a page-less pfn.

At least on some PA architectures, you have to be very careful.
Improperly managed, multiple aliases will cause the system to crash
(actually a machine check in the cache chequerboard). For the most
temperamental systems, we need the cache line flushed and the alias
mapping ejected from the TLB cache before we access the same page at an
inequivalent alias.

James


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC: prepare for struct scatterlist entries without page backing

2015-08-12 Thread James Bottomley
On Wed, 2015-08-12 at 09:05 +0200, Christoph Hellwig wrote:
 Dan Williams started to look into addressing I/O to and from
 Persistent Memory in his series from June:
 
   http://thread.gmane.org/gmane.linux.kernel.cross-arch/27944
 
 I've started looking into DMA mapping of these SGLs specifically instead
 of the map_pfn method in there.  In addition to supporting NVDIMM backed
 I/O I also suspect this would be highly useful for media drivers that
 go through nasty hoops to be able to DMA from/to their ioremapped regions,
 with vb2_dc_get_userptr in drivers/media/v4l2-core/videobuf2-dma-contig.c
 being a prime example for the unsafe hacks currently used.
 
 It turns out most DMA mapping implementation can handle SGLs without
 page structures with some fairly simple mechanical work.  Most of it
 is just about consistently using sg_phys.  For implementations that
 need to flush caches we need a new helper that skips these cache
 flushes if a entry doesn't have a kernel virtual address.
 
 However the ccio (parisc) and sba_iommu (parisc  ia64) IOMMUs seem
 to be operate mostly on virtual addresses.  It's a fairly odd concept
 that I don't fully grasp, so I'll need some help with those if we want
 to bring this forward.

I can explain that.  I think this doesn't apply to ia64 because it's
cache is PIPT, but on parisc, we have a VIPT cache.

On normal physically indexed architectures, when the iommu sees a DMA
transfer to/from physical memory, it also notifies the CPU to flush the
internal CPU caches of those lines.  This is usually an interlocking
step of the transfer to make sure the page is coherent before transfer
to/from the device (it's why the ia32 for instance is a coherent
architecture).  Because the system is physically indexed, there's no
need to worry about aliases.

On Virtually Indexed systems, like parisc, there is an aliasing problem.
The CCIO iommu unit (and all other iommu systems on parisc) have what's
called a local coherence index (LCI).  You program it as part of the
IOMMU page table and it tells the system which Virtual line in the cache
to flush as part of the IO transaction, thus still ensuring cache
coherence.  That's why we have to know the virtual as well as physical
addresses for the page.  The problem we have in Linux is that we have
two virtual addresses, which are often incoherent aliases: the user
virtual address and a kernel virtual address but we can only make the
page coherent with a single alias (only one LCI).  The way I/O on Linux
currently works is that get_user_pages actually flushes the user virtual
address, so that's expected to be coherent, so the address we program
into the VCI is the kernel virtual address.  Usually nothing in the
kernel has ever touched the page, so there's nothing to flush, but we do
it just in case.

In theory, for these non kernel page backed SG entries, we can make the
process more efficient by not flushing in gup and instead programming
the user virtual address into the local coherence index.  However,
simply zeroing the LCI will also work (except that poor VI zero line
will get flushed repeatedly, so it's probably best to pick a known
untouched line in the kernel).

James


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Xen-devel] RIP MTRR - status update for upcoming v4.2

2015-06-12 Thread James Bottomley
On Fri, 2015-06-12 at 16:15 -0700, Andy Lutomirski wrote:
 On Jun 12, 2015 12:59 AM, Jan Beulich jbeul...@suse.com wrote:
 
   On 12.06.15 at 01:23, toshi.k...@hp.com wrote:
   There are two usages on MTRRs:
1) MTRR entries set by firmware
2) MTRR entries set by OS drivers
  
   We can obsolete 2), but we have no control over 1).  As UEFI firmwares
   also set this up, this usage will continue to stay.  So, we should not
   get rid of the MTRR code that looks up the MTRR entries, while we have
   no need to modify them.
  
   Such MTRR entries provide safe guard to /dev/mem, which allows
   privileged user to access a range that may require UC mapping while
   the /dev/mem driver blindly maps it with WB.  MTRRs converts WB to UC in
   such a case.
 
  But it wouldn't be impossible to simply read the MTRRs upon boot,
  store the information, disable MTRRs, and correctly use PAT to
  achieve the same effect (i.e. the blindly maps part of course
  would need fixing).
 
 This may crash and burn badly when we call a UEFI function or an SMI
 happens.  I think we should just leave the MTRRs alone.

Wholeheartedly agree: PAT only works when the given memory map is in
operation but MTRRs function everywhere.  Anything that goes into real
mode or installs its own memory map won't see the Linux page attributes.

James


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [REPOST PATCH 1/8] fence: dma-buf cross-device synchronization (v17)

2014-06-19 Thread James Bottomley
On Thu, 2014-06-19 at 11:19 -0700, Greg KH wrote:
 On Thu, Jun 19, 2014 at 01:45:30PM -0400, Rob Clark wrote:
  On Thu, Jun 19, 2014 at 1:00 PM, Greg KH gre...@linuxfoundation.org wrote:
   On Thu, Jun 19, 2014 at 10:00:18AM -0400, Rob Clark wrote:
   On Wed, Jun 18, 2014 at 9:13 PM, Greg KH gre...@linuxfoundation.org 
   wrote:
On Wed, Jun 18, 2014 at 12:36:54PM +0200, Maarten Lankhorst wrote:
+#define CREATE_TRACE_POINTS
+#include trace/events/fence.h
+
+EXPORT_TRACEPOINT_SYMBOL(fence_annotate_wait_on);
+EXPORT_TRACEPOINT_SYMBOL(fence_emit);
   
Are you really willing to live with these as tracepoints for forever?
What is the use of them in debugging?  Was it just for debugging the
fence code, or for something else?
   
+/**
+ * fence_context_alloc - allocate an array of fence contexts
+ * @num: [in]amount of contexts to allocate
+ *
+ * This function will return the first index of the number of fences 
allocated.
+ * The fence context is used for setting fence-context to a unique 
number.
+ */
+unsigned fence_context_alloc(unsigned num)
+{
+ BUG_ON(!num);
+ return atomic_add_return(num, fence_context_counter) - num;
+}
+EXPORT_SYMBOL(fence_context_alloc);
   
EXPORT_SYMBOL_GPL()?  Same goes for all of the exports in here.
Traditionally all of the driver core exports have been with this
marking, any objection to making that change here as well?
  
   tbh, I prefer EXPORT_SYMBOL()..  well, I'd prefer even more if there
   wasn't even a need for EXPORT_SYMBOL_GPL(), but sadly it is a fact of
   life.  We already went through this debate once with dma-buf.  We
   aren't going to change $evil_vendor's mind about non-gpl modules.  The
   only result will be a more flugly convoluted solution (ie. use syncpt
   EXPORT_SYMBOL() on top of fence EXPORT_SYMBOL_GPL()) just as a
   workaround, with the result that no-one benefits.
  
   It has been proven that using _GPL() exports have caused companies to
   release their code properly over the years, so as these really are
   Linux-only apis, please change them to be marked this way, it helps
   everyone out in the end.
  
  Well, maybe that is the true in some cases.  But it certainly didn't
  work out that way for dma-buf.  And I think the end result is worse.
  
  I don't really like coming down on the side of EXPORT_SYMBOL() instead
  of EXPORT_SYMBOL_GPL(), but if we do use EXPORT_SYMBOL_GPL() then the
  result will only be creative workarounds using the _GPL symbols
  indirectly by whatever is available via EXPORT_SYMBOL().  I don't
  really see how that will be better.
 
 You are saying that you _know_ companies will violate our license, so
 you should just give up?  And how do you know people aren't working on
 preventing those indirect usages as well?  :)
 
 Sorry, I'm not going to give up here, again, it has proven to work in
 the past in changing the ways of _very_ large companies, why stop now?

When you try to train a dog, you have to be consistent about it.  We're
fantastically inconsistent in symbol exports.

For instance, the mutex primitives are all EXPORT_SYMBOL(), so we're
telling proprietary modules they can use them.  However, when the kernel
is built with CONFIG_DEBUG_MUTEX, they all become
EXPORT_SYMBOL_GPL() ... what type of crazy message is that supposed to
send?  It's OK to use mutexes but it's potentially a GPL violation to
debug them?

We either need to decide that we have a defined and consistent part of
our API that's GPL only or make the bold statement that we don't have
any part of our API that's usable by non-GPL modules.  Right at the
minute we do neither and it confuses people no end about what is and
isn't allowed.

James


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [REPOST PATCH 1/8] fence: dma-buf cross-device synchronization (v17)

2014-06-19 Thread James Bottomley
On Thu, 2014-06-19 at 15:39 -0700, H. Peter Anvin wrote:
 On 06/19/2014 01:01 PM, Greg KH wrote:
  On Thu, Jun 19, 2014 at 09:15:36PM +0200, Daniel Vetter wrote:
  On Thu, Jun 19, 2014 at 7:00 PM, Greg KH gre...@linuxfoundation.org 
  wrote:
  + BUG_ON(f1-context != f2-context);
 
  Nice, you just crashed the kernel, making it impossible to debug or
  recover :(
 
  agreed, that should probably be 'if (WARN_ON(...)) return NULL;'
 
  (but at least I wouldn't expect to hit that under console_lock so you
  should at least see the last N lines of the backtrace on the screen
  ;-))
 
  Lots of devices don't have console screens :)
 
  Aside: This is a pet peeve of mine and recently I've switched to
  rejecting all patch that have a BUG_ON, period.
  
  Please do, I have been for a few years now as well for the same reasons
  you cite.
  
 
 I'm actually concerned about this trend.  Downgrading things to WARN_ON
 can allow a security bug in the kernel to continue to exist, for
 example, or make the error message disappear.

Me too.  We use BUG_ON in the I/O subsystem where we're forced to
violate a guarantee.  When the choice is corrupt something or panic the
system, I prefer the latter every time.

 I am wondering if the right thing here isn't to have a user (command
 line?) settable policy as to how to proceed on an assert violation,
 instead of hardcoding it at compile time.

I'd say it depends on the consequence of the assertion violation.  We
have assertions that are largely theoretical, ones that govern process
internal state (so killing the process mostly sanitizes the system) and
a few that imply data loss or data corruption.

James


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 6/8] drivers: add Contiguous Memory Allocator

2011-08-03 Thread James Bottomley
[cc to ks-discuss added, since this may be a relevant topic]

On Tue, 2011-07-05 at 14:27 +0200, Arnd Bergmann wrote:
 On Tuesday 05 July 2011, Russell King - ARM Linux wrote:
  On Tue, Jul 05, 2011 at 09:41:48AM +0200, Marek Szyprowski wrote:
   The Contiguous Memory Allocator is a set of helper functions for DMA
   mapping framework that improves allocations of contiguous memory chunks.
   
   CMA grabs memory on system boot, marks it with CMA_MIGRATE_TYPE and
   gives back to the system. Kernel is allowed to allocate movable pages
   within CMA's managed memory so that it can be used for example for page
   cache when DMA mapping do not use it. On dma_alloc_from_contiguous()
   request such pages are migrated out of CMA area to free required
   contiguous block and fulfill the request. This allows to allocate large
   contiguous chunks of memory at any time assuming that there is enough
   free memory available in the system.
   
   This code is heavily based on earlier works by Michal Nazarewicz.
  
  And how are you addressing the technical concerns about aliasing of
  cache attributes which I keep bringing up with this and you keep
  ignoring and telling me that I'm standing in your way.

Just to chime in here, parisc has an identical issue.  If the CPU ever
sees an alias with different attributes for the same page, it will HPMC
the box (that's basically the bios will kill the system as being
architecturally inconsistent), so an architecture neutral solution on
this point is essential to us as well.

 This is of course an important issue, and it's the one item listed as
 TODO in the introductory mail that sent.
 
 It's also a preexisting problem as far as I can tell, and it needs
 to be solved in __dma_alloc for both cases, dma_alloc_from_contiguous
 and __alloc_system_pages as introduced in patch 7.
 
 We've discussed this back and forth, and it always comes down to
 one of two ugly solutions:
 
 1. Put all of the MIGRATE_CMA and pages into highmem and change
 __alloc_system_pages so it also allocates only from highmem pages.
 The consequences of this are that we always need to build kernels
 with highmem enabled and that we have less lowmem on systems that
 are already small, both of which can be fairly expensive unless
 you have lots of highmem already.

So this would require that systems using the API have a highmem? (parisc
doesn't today).

 2. Add logic to unmap pages from the linear mapping, which is
 very expensive because it forces the use of small pages in the
 linear mapping (or in parts of it), and possibly means walking
 all page tables to remove the PTEs on alloc and put them back
 in on free.
 
 I believe that Chunsang Jeong from Linaro is planning to
 implement both variants and post them for review, so we can
 decide which one to merge, or even to merge both and make
 it a configuration option. See also
 https://blueprints.launchpad.net/linaro-mm-sig/+spec/engr-mm-dma-mapping-2011.07
 
 I don't think we need to make merging the CMA patches depending on
 the other patches, it's clear that both need to be solved, and
 they are independent enough.

I assume from the above that ARM has a hardware page walker?

The way I'd fix this on parisc, because we have a software based TLB, is
to rely on the fact that a page may only be used either for DMA or for
Page Cache, so the aliases should never be interleaved.  Since you know
the point at which the page flips from DMA to Cache (and vice versa),
I'd purge the TLB entry and flush the page at that point and rely on the
usage guarantees to ensure that the alias TLB entry doesn't reappear.
This isn't inexpensive but the majority of the cost is the cache flush
which is a requirement to clean the aliases anyway (a TLB entry purge is
pretty cheap).

Would this work for the ARM hardware walker as well?  It would require
you to have a TLB entry purge instruction as well as some architectural
guarantees about not speculating the TLB.

James


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/6] get rid of on-stack dma buffers

2011-03-22 Thread James Bottomley
On Mon, 2011-03-21 at 22:03 +0100, Florian Mickler wrote:
 On Mon, 21 Mar 2011 15:26:43 -0400
 Andy Walls awa...@md.metrocast.net wrote:
 
  Florian Mickler flor...@mickler.org wrote:
  
  Hi all!
  
  These patches get rid of on-stack dma buffers for some of the dvb-usb
  drivers. 
  I do not own the hardware, so these are only compile tested. I would 
  appreciate testing and review.
  They were previously sent to the list, but some error on my side
  prevented (some of?) them from beeing delivered to all parties (the
  lists).
  
  These changes are motivated by 
  https://bugzilla.kernel.org/show_bug.cgi?id=15977 .
  
  The patches which got tested already were submitted to Mauro (and
  lkml/linux-media) yesterday seperately. Those fix this same issue for
  ec168,
  ce6230, au6610 and lmedm04. 
  
  A fix for vp702x has been submitted seperately for review on the list.
  I have
  similiar fixes like the vp702x-fix for dib0700 (overlooked some
  on-stack
  buffers in there in my original submission as well) and gp8psk, but I
  am
  holding them back 'till I got time to recheck those and getting some
  feedback
  on vp702x.
  
  Please review and test.
  
  Regards,
  Flo
  
  Florian Mickler (6):
[media] a800: get rid of on-stack dma buffers
[media v2] vp7045: get rid of on-stack dma buffers
[media] friio: get rid of on-stack dma buffers
[media] dw2102: get rid of on-stack dma buffer
[media] m920x: get rid of on-stack dma buffers
[media] opera1: get rid of on-stack dma buffer
  
   drivers/media/dvb/dvb-usb/a800.c   |   17 ++---
   drivers/media/dvb/dvb-usb/dw2102.c |   10 ++-
   drivers/media/dvb/dvb-usb/friio.c  |   23 ++---
   drivers/media/dvb/dvb-usb/m920x.c  |   33 
   drivers/media/dvb/dvb-usb/opera1.c |   31 +++
  drivers/media/dvb/dvb-usb/vp7045.c |   47
  ++--
   6 files changed, 116 insertions(+), 45 deletions(-)
  
  -- 
  1.7.4.1
  
  --
  To unsubscribe from this list: send the line unsubscribe linux-media
  in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
  
 
  Florian,
  
  For all of these, what happens when the USB call times out and you kfree() 
  the buffer?  Can the USB DMA actually complete after this kfree(), possibly 
  corrupting space that has been reallocated off the heap, since the kfree()?
  
  This is the scenario for which I assume allocating off the stack is bad.  
  
  Do these changes simply make corruption less noticable since heap gets 
  corrupted vs stack?
  
  Regards,
  Andy
 
 To be blunt, I'm not shure I fully understand the requirements myself.
 But as far as I grasped it, the main problem is that we need memory
 which the processor can see as soon as the device has scribbled upon
 it. (think caches and the like)
 
 Somewhere down the line, the buffer to usb_control_msg get's to be
 a parameter to dma_map_single which is described as part of
 the DMA API in Documentation/DMA-API.txt 
 
 The main point I filter out from that is that the memory has to begin
 exactly at a cache line boundary... 

The API will round up so that the correct region covers the API.
However, if you have other structures packed into the space (as very
often happens on stack), you get cache line interference in the CPU if
they get accessed:  The act of accessing an adjacent object pulls in
cache above your object and destroys DMA coherence.  This is the
principle reason why DMA to stack is a bad idea.

 I guess (not verified), that the dma api takes sufficient precautions
 to abort the dma transfer if a timeout happens.  So freeing _should_
 not be an issue. (At least, I would expect big fat warnings everywhere
 if that were the case)

No, it doesn't take any precautions like this.  the DMA API is just
mapping (possibly via an IOMMU).  If the transfer times out, that's done
in the DMA engine of the card, and must be cleaned up by the driver and
unmapped.

The general rule though is never DMA to stack.  On some processors, the
way stack is allocated can actually make this not work.

James


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Security] [PATCH 00/20] world-writable files in sysfs and debugfs

2011-03-15 Thread James Bottomley
On Mon, 2011-03-14 at 20:09 -0700, Greg KH wrote:
 On Mon, Mar 14, 2011 at 10:26:05PM -0400, James Bottomley wrote:
  On Sat, 2011-03-12 at 23:23 +0300, Vasiliy Kulikov wrote:
Vasiliy Kulikov (20):
 mach-ux500: mbox-db5500: world-writable sysfs fifo file
 leds: lp5521: world-writable sysfs engine* files
 leds: lp5523: world-writable engine* sysfs files
 misc: ep93xx_pwm: world-writable sysfs files
 rtc: rtc-ds1511: world-writable sysfs nvram file
 scsi: aic94xx: world-writable sysfs update_bios file
 scsi: iscsi: world-writable sysfs priv_sess file
   
   These are still not merged :(
  
  OK, so I've not been tracking where we are in the dizzying ride on
  security systems.  However, I thought we landed up in the privilege
  separation arena using capabilities.  That means that world writeable
  files aren't necessarily a problem as long as the correct capabilities
  checks are in place, right?
 
 There are no capability checks on sysfs files right now, so these all
 need to be fixed.

That statement is true but irrelevant, isn't it?  There can't be
capabilities within sysfs files because the system that does them has no
idea what the capabilities would be.  If there were capabilities checks,
they'd have to be in the implementing routines.

I think the questions are twofold:

 1. Did anyone actually check for capabilities before assuming world
writeable files were wrong?
 2. Even if there aren't any capabilities checks in the implementing
routines, should there be (are we going the separated
capabilities route vs the monolithic root route)?

James


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Security] [PATCH 00/20] world-writable files in sysfs and debugfs

2011-03-15 Thread James Bottomley
On Tue, 2011-03-15 at 07:18 -0700, Greg KH wrote:
 On Tue, Mar 15, 2011 at 07:50:28AM -0400, James Bottomley wrote:
  On Mon, 2011-03-14 at 20:09 -0700, Greg KH wrote:
   There are no capability checks on sysfs files right now, so these all
   need to be fixed.
  
  That statement is true but irrelevant, isn't it?  There can't be
  capabilities within sysfs files because the system that does them has no
  idea what the capabilities would be.  If there were capabilities checks,
  they'd have to be in the implementing routines.
 
 Ah, you are correct, sorry for the misunderstanding.
 
  I think the questions are twofold:
  
   1. Did anyone actually check for capabilities before assuming world
  writeable files were wrong?
 
 I do not think so as the majority (i.e. all the ones that I looked at)
 did no such checks.

OK, as long as someone checked, I'm happy.

   2. Even if there aren't any capabilities checks in the implementing
  routines, should there be (are we going the separated
  capabilities route vs the monolithic root route)?
 
 I think the general consensus is that we go the monolithic root route
 for sysfs files in that we do not allow them to be world writable.
 
 Do you have any exceptions that you know of that do these checks?

Heh, I didn't call our security vacillations a dizzying ride for
nothing.  I know the goal once was to try to run a distro without root
daemons (which is what required the capabilities stuff).  I'm actually
trying to avoid the issue ... I just want to make sure that people who
care aren't all moving in different directions.

James


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Security] [PATCH 00/20] world-writable files in sysfs and debugfs

2011-03-15 Thread James Bottomley
On Tue, 2011-03-15 at 19:08 +0300, Vasiliy Kulikov wrote:
 On Tue, Mar 15, 2011 at 07:50 -0400, James Bottomley wrote:
   1. Did anyone actually check for capabilities before assuming world
  writeable files were wrong?
 
 I didn't check all these files as I haven't got these hardware :-)

You don't need the hardware to check ... the question becomes is a
capabilities test sitting in the implementation or not.

   But
 as I can chmod a+w all sysfs files on my machine and they all become
 sensible to nonroot writes, I suppose there is nothing preventing
 nonroot users from writing to these buggy sysfs files.  As you can see,
 there are no capable() checks in these drivers in open() or write().
 
   2. Even if there aren't any capabilities checks in the implementing
  routines, should there be (are we going the separated
  capabilities route vs the monolithic root route)?
 
 IMO, In any case old good DAC security model must not be obsoleted just
 because someone thinks that MAC or anything else is more convenient for
 him.  If sysfs is implemented via filesystem then it must support POSIX
 permissions semantic.  MAC is very good in _some_ cases, but not instead
 of DAC.

Um, I'm not sure that's even an issue.  capabilities have CAP_ADMIN
which is precisely the same check as owner == root.  We use this a lot
because ioctls ignore the standard unix DAC model.

James



--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/20] world-writable files in sysfs and debugfs

2011-03-14 Thread James Bottomley
On Sat, 2011-03-12 at 23:23 +0300, Vasiliy Kulikov wrote:
  Vasiliy Kulikov (20):
   mach-ux500: mbox-db5500: world-writable sysfs fifo file
   leds: lp5521: world-writable sysfs engine* files
   leds: lp5523: world-writable engine* sysfs files
   misc: ep93xx_pwm: world-writable sysfs files
   rtc: rtc-ds1511: world-writable sysfs nvram file
   scsi: aic94xx: world-writable sysfs update_bios file
   scsi: iscsi: world-writable sysfs priv_sess file
 
 These are still not merged :(

OK, so I've not been tracking where we are in the dizzying ride on
security systems.  However, I thought we landed up in the privilege
separation arena using capabilities.  That means that world writeable
files aren't necessarily a problem as long as the correct capabilities
checks are in place, right?

James


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/15]drivers:spi:dw_spi.c Typo change diable to disable.

2010-12-31 Thread James Bottomley
On Fri, 2010-12-31 at 06:17 -0800, Justin P. Mattock wrote:
 On 12/31/2010 01:11 AM, Dan Carpenter wrote:
  On Thu, Dec 30, 2010 at 10:52:30PM -0800, Justin P. Mattock wrote:
  On 12/30/2010 10:45 PM, Grant Likely wrote:
  On Thu, Dec 30, 2010 at 03:07:51PM -0800, Justin P. Mattock wrote:
  The below patch fixes a typo diable to disable. Please let me know 
  if this
  is correct or not.
 
  Signed-off-by: Justin P. Mattockjustinmatt...@gmail.com
 
  applied, thanks.
 
  g.
 
  ahh.. thanks.. just cleared up the left out diabled that I had
  thought I forgotten(ended up separating comments and code and
  forgot)
 
  This is really just defensiveness and random grumbling and grumpiness on
  my part, but one reason I may have missed the first patch is because
  your subject lines are crap.
 
  Wrong:  [PATCH 02/15]drivers:spi:dw_spi.c Typo change diable to disable.
 
  Right:  [PATCH 02/15] spi/dw_spi: Typo change diable to disable
 
  regards,
  dan carpenter
 
 
 alright.. so having the backlash is alright for the subject
 
 Thanks for the pointer on this..

There is actually no specific form.  Most of us edit this part of the
subject line anyway to conform to whatever (nonuniform) conventions we
use.  I just use component: with no scsi or drivers prefix because the
git tree is tagged [SCSI]; others are different.

James


--
To unsubscribe from this list: send the line unsubscribe linux-media in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html