Diego,
Thank you very much for your detailed analysis and suggestions; we very
much appreciate them.
I haven't been involved in the CMEM world for very long and therefore
don't have much insight into its original spec or development, and I'm
still learning the internals of Linux, but what you suggest sounds quite
reasonable. I understand that the general feeling in the Linux
community within and outside TI is that CMEM is "not ready for
primetime", i.e., incorporation into the Linux mainline.
Your analysis of the features supported by CMEM is accurate. One thing
you didn't mention is the requirement that the user be able to obtain
the physical address of CMEM buffers, for the purpose of granting them
to DSP codecs, where the DSP doesn't have an MMU, and for granting to HW
(DSP or otherwise) that doesn't sit behind an MMU (such as DMA).
Another point worth mentioning is that CMEM has a history of being used
for buffers that it did not allocate. I'm speaking of the get_phys
functionality: users of CMEM also call CMEM's get_phys on buffers that
were not obtained from CMEM. This is not a supported use case, but we
are almost forced to support it due to established legacy usage.
Regards,
- Rob Tivy
Texas Instruments, Santa Barbara
________________________________
From: Diego Dompe [mailto:[EMAIL PROTECTED]
Sent: Friday, October 26, 2007 1:19 PM
To: Tivy, Robert
Cc: [email protected]
Subject: Re: Translating kernel virtual address to physical
address
Robert,
I have tried to use CMEM in recent kernels and found the
problems you described, and I was wondering whether TI plans to fix it
to make it work, or would be interested in redesigning the driver to
make it portable and simpler. I think it is unlikely that the main tree
will accept the driver in its current state.
As far as I understand from the available CMEM driver, its
functional objective is to create object pools to be used by Codec
Engine. To do so, the current implementation does the following:
- Requires the kernel to be run with a reserved memory area
(leaving memory out of kernel control with the mem=XX param). The
location of the reserved memory is passed to the driver as a parameter.
- Creates several object pools of different sizes in the memory
area (specified as module parameters). Those objects must be physically
contiguous memory areas so they can be accessed by the EDMA engine or
the DSP.
- The objects are requested using ioctls, and the return value
(the object handle) is the virtual address of the allocated object
within the memory area.
- The mmap function is used along with the object handle to map
the address into the user-space address space (requiring the get_phys
function to re-obtain the physical address that needs to be remapped).
The mapping mode (cached or uncached) is controlled through ioctls, as
are the special functions that implement cache coherency.
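To make the pool scheme in the list above concrete, here is a minimal
user-space model of fixed-size pools carved back to back out of one
contiguous region. It is only a sketch and not CMEM's code: the names
(pool_t, pool_init, pool_alloc, pool_free) and the POOL_MAX_BUFS limit
are hypothetical, and the addresses are plain integers rather than real
mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define POOL_MAX_BUFS 8   /* hypothetical limit, for illustration only */

/* One pool: 'count' buffers of 'size' bytes laid out back to back. */
typedef struct {
    uintptr_t base;                  /* start of this pool in the reserved block */
    size_t    size;                  /* size of each buffer in bytes */
    size_t    count;                 /* number of buffers in the pool */
    int       in_use[POOL_MAX_BUFS]; /* 1 if the buffer is handed out */
} pool_t;

/* Carve a pool out of the reserved block starting at 'base'.
 * Returns the first address past the pool, so pools can be chained. */
uintptr_t pool_init(pool_t *p, uintptr_t base, size_t size, size_t count)
{
    assert(size > 0 && count <= POOL_MAX_BUFS);
    p->base = base;
    p->size = size;
    p->count = count;
    for (size_t i = 0; i < count; i++)
        p->in_use[i] = 0;
    return base + size * count;
}

/* Hand out the first free buffer; returns 0 when the pool is exhausted. */
uintptr_t pool_alloc(pool_t *p)
{
    for (size_t i = 0; i < p->count; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = 1;
            return p->base + i * p->size;
        }
    }
    return 0;
}

/* Release a buffer; returns 0 on success, -1 if 'addr' is not a live buffer. */
int pool_free(pool_t *p, uintptr_t addr)
{
    if (addr < p->base || (addr - p->base) % p->size != 0)
        return -1;
    size_t i = (addr - p->base) / p->size;
    if (i >= p->count || !p->in_use[i])
        return -1;
    p->in_use[i] = 0;
    return 0;
}
```

Because every buffer sits at a fixed offset from the start of the
region, each one is contiguous by construction, which is the property
the EDMA engine and the DSP need.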
I think there are several things that could be done in a
different way to make the driver simpler.
- Unless I'm missing something, the requirement for contiguous
physical memory pages could be satisfied by the __get_free_pages()
function (just as the FB driver does, since the VPBE hardware also
requires contiguous memory). I'm not sure why it is required to take a
complete block of memory out of kernel control.
- The current driver implements its own object pool handlers;
the Linux kernel already provides this functionality with the slab
allocator interface (kmem_cache_create). Using the flags
SLAB_NO_REAP | SLAB_PANIC and allocating the objects early in the boot
process (or forcing the driver to be installed early during board
boot-up) should be enough of a guarantee that the driver can allocate
contiguous memory objects without taking the memory out of kernel
control.
- The question then becomes: how do we provide these memory
objects to user space without a hack like the current mmap + reverse
translation of the virtual pointers used as object handles?
A simple and clean way could be to create a pseudo file system
(mounted under /dev/ticmem) where a file is created for every object
(/dev/ticmem/buffer4MBX, buffer2MBX, etc). Exclusive open operations
could then be used for allocation (if you can open it, it's yours),
getting rid of the ioctls for reservation.
A way to get rid of the ioctls for selecting cached or
non-cached mapping is to follow the /dev/mem convention: if you mmap a
device opened with the O_SYNC flag, the mapping is not cached.
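From the user-space side, the proposal in the last two paragraphs could
look roughly like the sketch below. Everything here is an assumption
for illustration: no /dev/ticmem file system exists, the helper names
ticmem_map and ticmem_unmap are hypothetical, and the exclusive-open
and O_SYNC behavior would have to be implemented by the proposed driver.
Only open(), mmap(), munmap() and close() are real calls.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: claim and map a buffer file such as
 * /dev/ticmem/buffer4MBX. A nonzero 'uncached' adds O_SYNC, which the
 * proposed driver would treat as a request for an uncached mapping.
 * Returns the mapping and stores the descriptor in *fd_out, or returns
 * NULL if the buffer cannot be opened or mapped. */
void *ticmem_map(const char *path, size_t len, int uncached, int *fd_out)
{
    assert(path != NULL && fd_out != NULL && len > 0);

    int flags = O_RDWR | (uncached ? O_SYNC : 0);
    int fd = open(path, flags);
    if (fd < 0)
        return NULL;  /* e.g. already claimed, if the driver enforces exclusive open */

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return NULL;
    }
    *fd_out = fd;
    return p;
}

/* Hypothetical helper: unmap the buffer and release it by closing the file. */
void ticmem_unmap(void *p, size_t len, int fd)
{
    munmap(p, len);
    close(fd);
}
```

With this shape, allocation, release and cache mode all go through
open(), close() and mmap(), so no driver-specific ioctl is needed for
those three operations.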
This is just my 2 cents on how I would like to see the driver
implemented, and I think it would make the driver a bit more likely to
be accepted into the main tree. Please feel free to comment or question
it.
Diego
RidgeRun Engineering
On Oct 25, 2007, at 3:30 PM, Tivy, Robert wrote:
Engineers at TI are looking to get the TI CMEM module
into the Linux kernel mainline in the near future. In order to progress
to this goal we need to address some items. One of the items that needs
addressing is the CMEM driver get_phys() function. Currently it just
decodes page mappings at a low level, using pgd/pmd/pte low-level
functions/macros. Kevin Hilman provided an improved version, one that
is much more portable and correct. However, there's a problem with this
new version in that it doesn't translate some kernel virtual addresses
correctly, so I'm asking the list for help.
CMEM is responsible for allocating contiguous memory
blocks. It is granted a block of physical memory during module
insertion time. Typically this block is one not managed by the kernel,
i.e., the kernel has been "told" to not touch this memory, by virtue of
the "mem=" boot command line arg. In order to manage this memory block,
CMEM performs an ioremap() (either ioremap_nocache() or
ioremap_cached()), getting a kernel virtual pointer as a result. At
some later time (as a result of a user request) CMEM calls get_phys() on
a kernel virtual address within this block (get_phys() also is called
for user mappings to the same block). The "new" get_phys() doesn't
handle this kernel translation correctly - the returned phys addr is
always some constant offset from the correct one.
The ioremap functions obtain a virtual address block
large enough for the physical block through the kernel get_vm_area()
function. This function traverses the list 'vmlist' for an area that is
large enough to satisfy the request. The chosen virtual address is then
mapped to the requested physical address. The problem with get_phys()
is that it uses the 'virt_to_phys()' API which assumes a fixed, constant
relationship between a kernel virt addr and the corresponding phys addr,
but in this case the virt addr is some arbitrary address found to be
free somewhere between VMALLOC_START and VMALLOC_END. From what I can
see, the only way to do this translation is to traverse the 'vmlist' and
find the vm_struct for the virt addr, and return the phys_addr element
of the vm_struct. However, vmlist is not exported, and I would expect
that there is a helper function somewhere for traversing vmlist, but I
can't find anything.
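The lookup described above can be modeled in user space as follows.
This is only a sketch of the idea, not kernel code: struct
vm_area_model is a simplified stand-in for the kernel's vm_struct, and
model_virt_to_phys is a hypothetical name. The real vmlist is not
exported, which is exactly the problem being asked about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's vm_struct; only the fields used here. */
struct vm_area_model {
    uintptr_t addr;       /* start of the kernel virtual range */
    size_t    size;       /* length of the range in bytes */
    uintptr_t phys_addr;  /* physical address the range was mapped to */
    const struct vm_area_model *next;
};

/* Walk the list, find the area that contains 'virt', and translate using
 * the offset of 'virt' within that area.
 * Returns 0 on success, -1 if no area covers 'virt'. */
int model_virt_to_phys(const struct vm_area_model *list,
                       uintptr_t virt, uintptr_t *phys)
{
    assert(phys != NULL);
    for (const struct vm_area_model *a = list; a != NULL; a = a->next) {
        if (virt >= a->addr && virt - a->addr < a->size) {
            *phys = a->phys_addr + (virt - a->addr);
            return 0;
        }
    }
    return -1;
}
```

The translation depends on which area the address falls in, so no
single constant offset can reproduce it.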
The following code will return the wrong physp:
    block_virtp = (unsigned long) ioremap_nocache(start, length);
    ...
    physp = get_phys(block_virtp);
And here is the code in get_phys() that is returning the
wrong value:
    /* For kernel direct-mapped memory, take the easy way */
    if (virtp >= PAGE_OFFSET) {
        physp = virt_to_phys((void *)virtp);
    }
So, for example, if the physical block starts at
0x87800000 and we do
    block_virtp = (unsigned long) ioremap_nocache(0x87800000, 0x800000);
we get block_virtp = 0xc8080000, and the call
    physp = get_phys(0xc8080000);
returns a physp of 0x88080000, but it should be
0x87800000 (and all phys addrs it returns are the same amount too high -
0x00880000).
Moving to a different chip w/ a different Linux, we get
block_virtp = 0xce000000
and
get_phys(0xce000000)
returns 0x8e000000. Clearly both these systems have a
virt_to_phys() macro/function that simply subtracts 0x40000000, but for
virt addrs that were allocated by ioremap() there is no such fixed
relationship and the vmlist must be consulted, but the CMEM module can't
get to the unexported symbol 'vmlist'.
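The arithmetic behind both examples can be checked with a small
user-space sketch. The function names are hypothetical, and the
0x40000000 delta is simply the value inferred from the two observations
above, not something read out of either kernel's headers.

```c
#include <assert.h>
#include <stdint.h>

/* Offset inferred from the two observations above (an assumption). */
#define MODEL_DIRECT_MAP_DELTA 0x40000000u

/* What a constant-offset virt_to_phys() computes, regardless of how the
 * virtual address was obtained. */
uint32_t model_linear_virt_to_phys(uint32_t virt)
{
    assert(virt >= MODEL_DIRECT_MAP_DELTA);
    return virt - MODEL_DIRECT_MAP_DELTA;
}

/* How far off the linear translation is for an ioremap()'d address whose
 * true physical address is known. */
uint32_t model_translation_error(uint32_t virt, uint32_t true_phys)
{
    return model_linear_virt_to_phys(virt) - true_phys;
}
```

For the first system, model_linear_virt_to_phys(0xc8080000) gives
0x88080000, and the error against the true address 0x87800000 is
0x00880000, matching the numbers reported above.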
Please advise on how to successfully translate these
types of virtual addresses.
Thanks in advance for any "pointers" you can provide :)
Regards,
- Rob Tivy
Texas Instruments, Santa Barbara
_______________________________________________
Davinci-linux-open-source mailing list
[email protected]
http://linux.davincidsp.com/mailman/listinfo/davinci-linux-open-source