Re: [PATCH/RFC v2.1 0/2] Mem-to-mem device framework

2010-02-03 Thread Mauro Carvalho Chehab
Hiremath, Vaibhav wrote:

 [Hiremath, Vaibhav] I was on vacation and only resumed today. I will go
 through this patch series this weekend and get back to you.
 
 I just had a cursory look, and I would say it should be a really good starting
 point for us to support mem-to-mem devices.

Hmm... it seems to me that those patches are still under discussion/analysis.
I'll mark them as RFC in Patchwork.

Please let me know once you SoC guys reach a consensus on it. Then please
send me the final version.

Cheers,
Mauro


RE: [PATCH/RFC v2.1 0/2] Mem-to-mem device framework

2009-12-30 Thread Hiremath, Vaibhav
 -Original Message-
 From: Pawel Osciak [mailto:p.osc...@samsung.com]
 Sent: Monday, December 28, 2009 8:19 PM
 To: 'Hans Verkuil'
 Cc: linux-media@vger.kernel.org; linux-samsung-...@vger.kernel.org;
 linux-arm-ker...@lists.infradead.org; Marek Szyprowski;
 kyungmin.p...@samsung.com; Hiremath, Vaibhav; Karicheri,
 Muralidharan; 'Guru Raj'; 'Xiaolin Zhang'; 'Magnus Damm'; 'Sakari
 Ailus'
 Subject: RE: [PATCH/RFC v2.1 0/2] Mem-to-mem device framework
 
 Hello Hans,
 
 
 On Wednesday 23 December 2009 16:06:18 Hans Verkuil wrote:
  Thank you for working on this! It's much appreciated. Now I've noticed that
  patches regarding memory-to-memory and memory pool tend to get very few
  comments.
  I suspect that the main reason is that these are SoC-specific features that
  do not occur in consumer-type products. So most v4l developers do not have
  the interest and motivation (and time!) to look into this.
 
 Thank you very much for your response. We were a bit surprised with the lack
 of responses as there seemed to be a good number of people interested in this
 area.
 
 I'm hoping that everybody interested would take a look at the test device
 posted along with the patches. It's virtual, no specific hardware required,
 but it demonstrates the concepts behind the framework, including transactions.
 
[Hiremath, Vaibhav] I was on vacation and only resumed today. I will go
through this patch series this weekend and get back to you.

I just had a cursory look, and I would say it should be a really good starting
point for us to support mem-to-mem devices.

Thanks,
Vaibhav


RE: [PATCH/RFC v2.1 0/2] Mem-to-mem device framework

2009-12-28 Thread Pawel Osciak
Hello Hans,


On Wednesday 23 December 2009 16:06:18 Hans Verkuil wrote:
 Thank you for working on this! It's much appreciated. Now I've noticed that
 patches regarding memory-to-memory and memory pool tend to get very few 
 comments.
 I suspect that the main reason is that these are SoC-specific features that do
 not occur in consumer-type products. So most v4l developers do not have the
 interest and motivation (and time!) to look into this.

Thank you very much for your response. We were a bit surprised with the lack of
responses as there seemed to be a good number of people interested in this area.

I'm hoping that everybody interested would take a look at the test device posted
along with the patches. It's virtual, no specific hardware required, but it
demonstrates the concepts behind the framework, including transactions.

 One thing that I am missing is a high-level overview of what we want.
 Currently there are patches/RFCs floating around for memory-to-memory support,
 multiplanar support and memory-pool support.

 What I would like to see is an RFC that ties this all together from the point
 of view of the public API. I.e. what are the requirements? Possible solutions?
 Open questions? Forget about how to implement it for the moment, that will
 follow from the chosen solutions.

Yes, that's true, sorry about that. We've been so into it after the memory pool
discussion and the V4L2 mini-summit that I neglected describing the big picture
behind this.

So to give a more high-level description, from the point of view of applications
and the V4L2 API:

---
Requirements:
---
(Some of the following were first posted by Laurent in:
http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/10204).

1. Support for devices that take input data in a source buffer, take a separate
destination buffer, process the source data and put it in the destination 
buffer.

2. Allow sharing buffers between devices, effectively chaining them to form
video pipelines. An example of this could be a video decoder, fed with video
stream which returns raw frames, which then have to be postprocessed by another
device and displayed. This is the main scenario we need to have for our S3C/S5P
series SoCs. Of course, we'd like zero-copy.

3. Allow the device to use more than one buffer at the same time. This is not
supported by videobuf (e.g. we have to choose which buffer we'd like to sleep
on, and we do not always know that). This is not really a requirement from the
V4L2 API point of view, but it has a direct influence on how poll() and
blocking I/O work.

4. Multiplanar buffers. Our devices require them (see the RFC for more details:
http://article.gmane.org/gmane.linux.drivers.video-input-infrastructure/11212).

5. Solve problems with cache coherency on non-x86 architectures, especially in
videobuf for OUTPUT buffers. We need to flush the cache before starting the
transaction.

6. Reduce buffer queuing latency, e.g. move operations such as locking out
of qbuf. Applications would like to queue a buffer and be able to fire up the
device as fast as possible.

7. Large buffer allocations, buffer preallocation, etc.


---
Solutions:
---
1. After a detailed discussion, we agreed in:
http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/10668,
that we'd like the application to be able to queue/dequeue both OUTPUT (as
source) and CAPTURE (as destination) buffers on one video node. Activating the
device (after streamon) would take effect only if there are both types of
buffers available. The application would put source data into OUTPUT buffers
and expect to find it processed in dequeued CAPTURE buffers. Addressed by
mem2mem framework.

2. I don't see anything to do here from the API's point of view. The application
would open two video nodes, e.g. video decoder and video postprocessor, and
queue buffers dequeued from the decoder on the postprocessor. To get the best
performance, this requires the buffers to be marked as non-cached somehow to
avoid unneeded cache syncs.

3. Mem2mem addresses this partially by adding a transaction concept. It's
not bullet-proof though, as it assumes the buffers will be returned in the same
order as passed. Some videobuf limitations will have to be addressed here.

4. See my RFC. Patches in progress. 

5. We have narrowed it down to an additional sync() before the operation
(i.e. in qbuf), but more issues may exist here. I have already added sync()
support for qbuf with minimal changes to videobuf and will be posting the
proposal soon. This also requires identifying the direction of the sync, but
we have found a way to do this without adding anything new (videobuf flags
are enough).

6. Later. We haven't done anything in this field.

7. We use our own allocator (see
http://thread.gmane.org/gmane.linux.ports.arm.kernel/56879), but we have a new
concept for that which we'd like to discuss separately later. 
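
As a rough illustration of solution 1 (a userspace sketch, not code from the
patches): the device is activated only when streaming is on and both an OUTPUT
and a CAPTURE buffer are available. The struct, its fields and the function
names below are hypothetical, and a single streaming flag is a simplifying
assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one mem-to-mem instance; names are illustrative
 * and are not taken from the patches. */
struct m2m_model {
    bool   streaming;   /* assumption: one flag set by streamon */
    size_t out_ready;   /* queued OUTPUT (source) buffers */
    size_t cap_ready;   /* queued CAPTURE (destination) buffers */
};

/* The device may run only when streaming is on and at least one buffer
 * of each type is available. */
static bool m2m_model_can_run(const struct m2m_model *m)
{
    assert(m != NULL);
    return m->streaming && m->out_ready > 0 && m->cap_ready > 0;
}

/* Run one operation: consume one source and one destination buffer.
 * Returns false and changes nothing if the instance is not ready. */
static bool m2m_model_run_once(struct m2m_model *m)
{
    if (!m2m_model_can_run(m))
        return false;
    m->out_ready--;
    m->cap_ready--;
    return true;
}
```

With one OUTPUT buffer queued and no CAPTURE buffer, m2m_model_run_once()
returns false; it succeeds once a CAPTURE buffer is queued as well.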


 Note 

Re: [PATCH/RFC v2.1 0/2] Mem-to-mem device framework

2009-12-23 Thread Hans Verkuil
On Wednesday 23 December 2009 14:17:32 Pawel Osciak wrote:
 Hello,
 
 this is the second version of the proposed implementation for mem-to-mem
 memory device framework. Your comments are very welcome.

Hi Pawel,

Thank you for working on this! It's much appreciated. Now I've noticed that
patches regarding memory-to-memory and memory pool tend to get very few 
comments.
I suspect that the main reason is that these are SoC-specific features that do
not occur in consumer-type products. So most v4l developers do not have the
interest and motivation (and time!) to look into this.

I'm CC-ing this reply to developers from Intel, TI, Nokia and Renesas in the
hope that they will find some time to review and think about this since this
will affect all of them.

One thing that I am missing is a high-level overview of what we want. Currently
there are patches/RFCs floating around for memory-to-memory support, multiplanar
support and memory-pool support.

What I would like to see is an RFC that ties this all together from the point of
view of the public API. I.e. what are the requirements? Possible solutions? Open
questions? Forget about how to implement it for the moment, that will follow
from the chosen solutions.

Note that I would suggest though that the memory-pool part is split into two
parts: how to actually allocate the memory is pretty much separate from how v4l
will use it. The actual allocation part is probably quite complex and might
even be hardware dependent and should be discussed separately. But how to use
it is something that can be discussed without needing to know how it was
allocated.

The lack of discussion in this area does worry me a bit. IMHO this is a very
important area that needs a lot more work. The initiative should be with the
SoC companies and right now it seems only Samsung is active.

BTW, what is the status of the multiplanar RFC? I later realized that that RFC
might be very useful for adding meta-data to buffers. There are several cases
where that is useful: sensors that provide meta-data when capturing a frame and
image pipelines (particularly in memory-to-memory cases) that want to have all
parameters as part of the meta-data associated with the image. There may well
be more of those.

Regards,

Hans

 
 In v2.1:
 I am very sorry for the resend, but somehow an orphaned endif found its way to
 Kconfig during the rebase.
 
 Changes since v1:
 - v4l2_m2m_buf_queue() now requires m2m_ctx as its argument
 - video_queue private data stores driver private data
 - a new submenu in kconfig for mem-to-mem devices
 - minor rebase leftovers cleanup
 
 A second patch series followed v2 with a new driver for a real device -
 Samsung S3C/S5P image rotator, utilizing this framework.
 
 
 This series contains:
 
 [PATCH v2.1 1/2] V4L: Add memory-to-memory device helper framework for V4L2.
 [PATCH v2.1 2/2] V4L: Add a mem-to-mem V4L2 framework test device.
 [EXAMPLE v2] Mem-to-mem userspace test application.
 
 
 Previous discussion and RFC on this topic:
 http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/10668
 
 
 A mem-to-mem device is a device that uses memory buffers passed by
 userspace applications for both source and destination. This is
 different from existing drivers that use memory buffers for only one
 of those at once.
 In terms of V4L2 such a device would be both of OUTPUT and CAPTURE type.
 Although no such devices are present in the V4L2 framework, a demand for such
 a model exists, e.g. for 'resizer devices'.
 
 
 ---
 Mem-to-mem devices
 ---
 In the previous discussion we concluded that we should use one video node with
 two queues, an output (V4L2_BUF_TYPE_VIDEO_OUTPUT) queue for source buffers
 and a capture queue (V4L2_BUF_TYPE_VIDEO_CAPTURE) for destination buffers.
 
 
 Each instance has its own set of queues: 2 videobuf_queues, each with a ready
 buffer queue, managed by the framework. Everything is encapsulated in the
 queue context struct:
 
 struct v4l2_m2m_queue_ctx {
         struct videobuf_queue   q;
         /* ... */
         /* Queue for buffers ready to be processed as soon as this
          * instance receives access to the device */
         struct list_head        rdy_queue;
         /* ... */
 };
 
 struct v4l2_m2m_ctx {
         /* ... */
         /* Capture (output to memory) queue context */
         struct v4l2_m2m_queue_ctx   cap_q_ctx;
 
         /* Output (input from memory) queue context */
         struct v4l2_m2m_queue_ctx   out_q_ctx;
         /* ... */
 };
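
To make the layout above more concrete, here is a simplified userspace model of
the two per-instance ready queues. It is only a sketch: the model_* names are
hypothetical, a hand-rolled FIFO stands in for the kernel's struct list_head,
and the videobuf_queue member is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a queued buffer. */
struct model_buf {
    int id;
    struct model_buf *next;
};

/* Simplified counterpart of v4l2_m2m_queue_ctx: only the ready queue,
 * kept as a FIFO. */
struct model_queue_ctx {
    struct model_buf *head;
    struct model_buf *tail;
};

/* Simplified counterpart of v4l2_m2m_ctx: one queue context per direction. */
struct model_m2m_ctx {
    struct model_queue_ctx cap_q_ctx;   /* destination buffers */
    struct model_queue_ctx out_q_ctx;   /* source buffers */
};

/* Append a buffer to the tail of a ready queue. */
static void model_rdy_add(struct model_queue_ctx *q, struct model_buf *b)
{
    assert(q != NULL && b != NULL);
    b->next = NULL;
    if (q->tail)
        q->tail->next = b;
    else
        q->head = b;
    q->tail = b;
}

/* Remove and return the oldest ready buffer, or NULL if the queue is empty. */
static struct model_buf *model_rdy_remove(struct model_queue_ctx *q)
{
    struct model_buf *b;

    assert(q != NULL);
    b = q->head;
    if (!b)
        return NULL;
    q->head = b->next;
    if (!q->head)
        q->tail = NULL;
    b->next = NULL;
    return b;
}
```

Buffers come back in the order they were added, and the capture and output
queues of one instance are independent of each other.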
 
 Streamon can be called for all instances and will not sleep if another
 instance is streaming.
 
 vidioc_querycap() should report
 V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_VIDEO_OUTPUT.
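
From the application side, the capability check could look roughly like the
sketch below. The EX_* macros and the function name are hypothetical; the macro
values mirror V4L2_CAP_VIDEO_CAPTURE and V4L2_CAP_VIDEO_OUTPUT from
<linux/videodev2.h> and are redefined here only so the example builds without
kernel headers. A real application would take the bitmask from the capabilities
field of struct v4l2_capability as filled in by VIDIOC_QUERYCAP.

```c
#include <stdbool.h>
#include <stdint.h>

/* Example-only copies of the V4L2 capability bits (see lead-in). */
#define EX_CAP_VIDEO_CAPTURE 0x00000001u
#define EX_CAP_VIDEO_OUTPUT  0x00000002u

/* True if the node reports both capture and output capability, which is
 * what a mem-to-mem node is expected to report under this proposal. */
static bool ex_caps_look_mem2mem(uint32_t capabilities)
{
    const uint32_t need = EX_CAP_VIDEO_CAPTURE | EX_CAP_VIDEO_OUTPUT;

    return (capabilities & need) == need;
}
```

Note that both bits being set is a necessary hint only; under this proposal
there is no separate flag that identifies a node as mem-to-mem.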
 
 ---
 Queuing and dequeuing buffers