RE: Mem2Mem V4L2 devices [RFC]
Hello,

On Thursday, October 08, 2009 11:26 PM Karicheri, Muralidharan wrote:

Why not? In a typical camera scenario, the application can feed one frame and get two output frames (one for storing and another, at a lower resolution, for sending over email). I just gave an example.

You gave an example of the Y-type pipeline which starts in a real streaming device (camera), which is a completely different thing. A Y-type CAPTURE pipeline is quite a common thing, which can simply be mapped to 2 different capture video nodes. In my previous mail I asked about a Y-type pipeline which starts in memory. I don't think there is any common use case for such a thing.

Marek, you can't say that. This feature is currently supported in our internal release which is being used by our customers, so for feature parity it is required to be supported, as we can't determine how many customers are using this feature. Besides, in the scenario that I have mentioned above, the following happens:

sensor -> CCDC -> Memory (video node)
Memory -> Previewer -> Resizer1 -> Memory
                    |-> Resizer2 -> Memory

Typically the application captures a full resolution frame (Bayer RGB) to memory and then uses the Previewer and Resizer in memory-to-memory mode to do the conversion to UYVY format. But the application uses the second resizer to get a lower resolution frame simultaneously. We would like to expose this hardware capability to user applications through this memory-to-memory device.

Ok. I understand that your current custom API exports such functionality. I thought a bit about this issue and found a way it could be implemented with the one video node approach. It would require an additional custom ioctl, but imho there is no other way. An application can open the /dev/videoX node 2 times. Then it can 'link' them with this special ioctl, so the driver would know which instances are 'linked' together. Then the application queues the source buffer to both instances, sets the destination format/size/colorspace/etc, and queues output buffers. Then it calls stream on on both instances. The driver can detect that the 2 instances have been linked together and, if the source buffer is the same in both of them, it will use this special feature of your hardware and run 2 resizers simultaneously. This sounds a bit complicated (especially because the driver would need to play a bit with synchronization and possible races...), but currently I see no other possibility to implement it on top of the one-video-node approach.

Since only one capture queue per IO instance is possible in this model (matched by buf type), I don't think we can scale it for the 2 outputs case. Or is it possible to queue 2 output buffers of two different sizes to the same queue? This can be hacked by introducing yet another 'type' (for example SECOND_CAPTURE), but I don't like such a solution. Anyway - would we really need a Y-type mem2mem device?

Yes. No hacking please! We should be able to do S_FMT for the second Resizer output and dequeue the frame. Not sure how we can handle this in this model.

Currently I see no clean way of adding support for more than one output in the one video node approach.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
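For illustration, the 'linked instances' idea above might look roughly like this from user space. This is only a sketch: VIDIOC_M2M_LINK is a hypothetical driver-private ioctl invented here (no such ioctl exists in V4L2), and the node name and resolutions are placeholders. Error handling is omitted.

#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

/* HYPOTHETICAL private ioctl: pass the fd of the peer instance. */
#define VIDIOC_M2M_LINK _IOW('V', BASE_VIDIOC_PRIVATE, int)

static void set_capture_size(int fd, int w, int h)
{
	struct v4l2_format f;

	memset(&f, 0, sizeof(f));
	f.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;	/* destination side */
	f.fmt.pix.width = w;
	f.fmt.pix.height = h;
	f.fmt.pix.pixelformat = V4L2_PIX_FMT_UYVY;
	ioctl(fd, VIDIOC_S_FMT, &f);
}

int main(void)
{
	int fd1 = open("/dev/video0", O_RDWR);
	int fd2 = open("/dev/video0", O_RDWR);

	ioctl(fd2, VIDIOC_M2M_LINK, &fd1);	/* tell the driver the instances share one source */

	set_capture_size(fd1, 1280, 720);	/* resizer1 output */
	set_capture_size(fd2, 320, 240);	/* resizer2 output (thumbnail) */

	/*
	 * Both instances would then queue the SAME source buffer (OUTPUT
	 * type) and their own destination buffers (CAPTURE type), followed
	 * by STREAMON on both fds. A driver that finds the instances linked
	 * and the source identical may run both resizers in one pass.
	 */
	close(fd2);
	close(fd1);
	return 0;
}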
RE: Mem2Mem V4L2 devices [RFC] - Can we enhance the V4L2 API?
Hello, On Wednesday, October 07, 2009 3:39 PM Karicheri, Muralidharan wrote: As we have seen in the discussion, this is not a streaming device, rather a transaction/conversion device which operate on a given frame to get a desired output frame. Each transaction may have it's own set of configuration context which will be applied to the hardware before starting the operation. This is unlike a streaming device, where most of the configuration is done prior to starting the streaming. From the application point of view an instance of such a device still is a streaming device. The application should not even know if any other apps are using the device or not (well, it may only notice the lower throughput or higher device latency, but this cannot be avoided). Application can queue input and output buffers, stream on and wait for the result. In a typical capture or display side streaming, AFAIK, there is only one device io instance. While streaming is ON, if another application tries to do IO, driver returns -EBUSY. I believe this is true for all drivers (Correct me if this is not true).When you say the memory to memory device is able to allow multiple application to call STREAMON, this model is broken(Assuming what I said above is true). May be I am missing something here. Is the following true? I think in your model, each application gets a device instance that has it's own scaling factors and other parameters. So streaming status is maintained for each IO instance. Each IO instance has it's own buffer queues. If this is true then you are right. Streaming model is not broken. This is exactly what I mean. Typical capture or display devices are single instance from the definition (I cannot imagine more than one application streaming _directly_ from the camera interface). However, a multi-instance support for mem2mem device perfectly makes sense and heavily improves the usability of it. So following scenario holds good concurrently (api call sequence). App1 - open() - S_FMT - STREAMON-QBUF/DQBUF(n times)-STREAMOFF-close() App2 - open() - S_FMT - STREAMON-QBUF/DQBUF(n times)-STREAMOFF-close() App3 - open() - S_FMT - STREAMON-QBUF/DQBUF(n times)-STREAMOFF-close() Exactly. So internal to driver, if there are multiple concurrent streamon requests, and hardware is busy, subsequent requests waits until the first one is complete and driver schedules requests from multiple IO queues. So this is essentially what we have in our internal implementation (discussed during the linux plumbers mini summit) converted to v4l2 model. Right, this is what we also have in our custom v4l2-incompatible drivers. Best regards -- Marek Szyprowski Samsung Poland RD Center -- To unsubscribe from this list: send the line unsubscribe linux-media in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: Mem2Mem V4L2 devices [RFC]
Hello,

On Wednesday, October 07, 2009 4:03 PM Karicheri, Muralidharan wrote:

How is the hardware actually designed? I see two possibilities:

1. [input buffer] --[dma engine]--> [resizer1] --[dma]--> [mem output buffer1]
                                \-> [resizer2] --[dma]--> [mem output buffer2]

This is the case.

2. [input buffer] --[dma engine1]--> [resizer1] --[dma]--> [mem output buffer1]
                 \--[dma engine2]--> [resizer2] --[dma]--> [mem output buffer2]

In the first case we would really have problems mapping it properly to video nodes. But we should think about whether there are any use cases of such a design (in terms of a mem2mem device).

Why not? In a typical camera scenario, the application can feed one frame and get two output frames (one for storing and another, at a lower resolution, for sending over email). I just gave an example.

You gave an example of the Y-type pipeline which starts in a real streaming device (camera), which is a completely different thing. A Y-type CAPTURE pipeline is quite a common thing, which can simply be mapped to 2 different capture video nodes. In my previous mail I asked about a Y-type pipeline which starts in memory. I don't think there is any common use case for such a thing. I know that this Y-type design makes sense as a part of the pipeline from a sensor or decoder device. But I cannot find any useful use case for the mem2mem version of it.

The second case is much more trivial. One can just create two separate resizer devices (with their own nodes) or one resizer driver with two hardware resizers underneath it. In both cases the application would simply queue the input buffer 2 times for both transactions.

I am assuming we are using the one node implementation model suggested by Ivan. At the hardware level, streaming happens at the same time (only one bit in a register). So if we have a second node for the same, then the driver needs to match the IO instance of the second device with the corresponding request on the first node, and this takes us to the same complication as with the 2 video nodes implementation.

Right. Since only one capture queue per IO instance is possible in this model (matched by buf type), I don't think we can scale it for the 2 outputs case. Or is it possible to queue 2 output buffers of two different sizes to the same queue? This can be hacked by introducing yet another 'type' (for example SECOND_CAPTURE), but I don't like such a solution. Anyway - would we really need a Y-type mem2mem device?

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
RE: Mem2Mem V4L2 devices [RFC]
Hello,

On Tuesday, October 06, 2009 6:12 PM Hiremath, Vaibhav wrote:

On Monday, October 05, 2009 8:27 PM Hiremath, Vaibhav wrote:

[Hiremath, Vaibhav] IMO, this implementation is not a streaming model, we are trying to fit mem-to-mem forcefully into streaming.

Why does this not fit streaming? I see no problems with streaming over a mem2mem device with only one video node. You just queue input and output buffers (they are distinguished by the 'type' parameter) on the same video node.

[Hiremath, Vaibhav] Do we create separate queues of buffers based on type? I think we don't.

Why not? I really see no problems implementing such a driver, especially if this heavily increases the number of use cases where such a device can be used.

[Hiremath, Vaibhav] I thought about it and you are correct, it should be possible. I was kind of biased and thinking in only one direction. Now I don't see any reason why we should go for the 2 device node approach. Earlier I was thinking of 2 device nodes for the 2 queues; if it is possible with one device node then I think we should align to the single device node approach. Do you see any issues with it?

Currently, it looks like all issues are resolved. However, something might arise during the implementation. If so, I will post it here of course.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
RE: Mem2Mem V4L2 devices [RFC] - Can we enhance the V4L2 API?
Marek,

As we have seen in the discussion, this is not a streaming device, rather a transaction/conversion device which operates on a given frame to get a desired output frame. Each transaction may have its own configuration context which will be applied to the hardware before starting the operation. This is unlike a streaming device, where most of the configuration is done prior to starting the streaming.

From the application point of view an instance of such a device is still a streaming device. The application should not even know if any other apps are using the device or not (well, it may only notice the lower throughput or higher device latency, but this cannot be avoided). The application can queue input and output buffers, stream on and wait for the result.

In a typical capture or display side streaming, AFAIK, there is only one device io instance. While streaming is ON, if another application tries to do IO, the driver returns -EBUSY. I believe this is true for all drivers (correct me if this is not true). When you say the memory-to-memory device is able to allow multiple applications to call STREAMON, this model is broken (assuming what I said above is true). Maybe I am missing something here. Is the following true? I think in your model, each application gets a device instance that has its own scaling factors and other parameters. So streaming status is maintained for each IO instance, and each IO instance has its own buffer queues. If this is true then you are right, the streaming model is not broken.

So the following scenario holds good concurrently (api call sequence):

App1 - open() - S_FMT - STREAMON - QBUF/DQBUF (n times) - STREAMOFF - close()
App2 - open() - S_FMT - STREAMON - QBUF/DQBUF (n times) - STREAMOFF - close()
App3 - open() - S_FMT - STREAMON - QBUF/DQBUF (n times) - STREAMOFF - close()

So internal to the driver, if there are multiple concurrent streamon requests and the hardware is busy, subsequent requests wait until the first one is complete, and the driver schedules requests from the multiple IO queues. So this is essentially what we have in our internal implementation (discussed during the linux plumbers mini summit) converted to the v4l2 model.

The changes done during streaming are controls like brightness, contrast, gain etc. The frames received by the application are either synchronized to an input source timing, or the application outputs frames based on a display timing. Also a single IO instance is usually maintained at the driver, whereas in the case of a memory-to-memory device the hardware needs to switch contexts between operations. So we might need a different approach than for a capture/output device.

All this is internal to the device driver, which can hide it from the application.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
RE: Mem2Mem V4L2 devices [RFC]
Marek,

How is the hardware actually designed? I see two possibilities:

1. [input buffer] --[dma engine]--> [resizer1] --[dma]--> [mem output buffer1]
                                \-> [resizer2] --[dma]--> [mem output buffer2]

This is the case.

2. [input buffer] --[dma engine1]--> [resizer1] --[dma]--> [mem output buffer1]
                 \--[dma engine2]--> [resizer2] --[dma]--> [mem output buffer2]

In the first case we would really have problems mapping it properly to video nodes. But we should think about whether there are any use cases of such a design (in terms of a mem2mem device).

Why not? In a typical camera scenario, the application can feed one frame and get two output frames (one for storing and another, at a lower resolution, for sending over email). I just gave an example. You would say that this can be done in two steps, but when the hardware is capable of doing this in parallel, why shouldn't the driver provide the support?

I know that this Y-type design makes sense as a part of the pipeline from a sensor or decoder device. But I cannot find any useful use case for the mem2mem version of it.

The second case is much more trivial. One can just create two separate resizer devices (with their own nodes) or one resizer driver with two hardware resizers underneath it. In both cases the application would simply queue the input buffer 2 times for both transactions.

I am assuming we are using the one node implementation model suggested by Ivan. At the hardware level, streaming happens at the same time (only one bit in a register). So if we have a second node for the same, then the driver needs to match the IO instance of the second device with the corresponding request on the first node, and this takes us to the same complication as with the 2 video nodes implementation.

Since only one capture queue per IO instance is possible in this model (matched by buf type), I don't think we can scale it for the 2 outputs case. Or is it possible to queue 2 output buffers of two different sizes to the same queue?

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
RE: Mem2Mem V4L2 devices [RFC]
Hello,

On Monday, October 05, 2009 8:07 PM Hiremath, Vaibhav wrote:

On Monday, October 05, 2009 7:26 PM Marek Szyprowski wrote:

On Monday, October 05, 2009 7:43 AM Hiremath, Vaibhav wrote:

In terms of the V4L2 framework such a device would be both a video sink and source at the same time. The main problem is how the video nodes (/dev/videoX) should be assigned to such a device. The simplest way of implementing a mem2mem device in the v4l2 framework would use two video nodes (one for input and one for output). Such an idea has already been suggested at the V4L2 mini-summit.

[Hiremath, Vaibhav] We discussed 2 options during the summit:
1) Only one video device node, configuring parameters using V4L2_BUF_TYPE_VIDEO_CAPTURE for the input parameters and V4L2_BUF_TYPE_VIDEO_OUTPUT for the output parameters.
2) 2 separate video device nodes, one with V4L2_BUF_TYPE_VIDEO_CAPTURE and another with V4L2_BUF_TYPE_VIDEO_OUTPUT, as mentioned by you.
The obvious and preferred option would be 2, because with option 1 we would not be able to achieve real streaming. And again we would have to put a constraint on the application of a fixed input buffer index.

What do you mean by real streaming?

[Hiremath, Vaibhav] I meant, after streamon, there will be just a sequence of queuing and de-queuing of buffers. With a single node of operation, how are we deciding which is the input buffer and which one is the output?

By the buffer-type parameter. The only difference is that you will queue both buffers into the same video node.

We have to assume or put a constraint on the application that the 0th index will always be the input, irrespective of the number of buffers requested.

No. The input buffers will be distinguished by the type parameter.

[Hiremath, Vaibhav] Please note that we must put one limitation on the application: the buffers in both the video nodes are mapped one-to-one. This means that,

Video0 (input)    Video1 (output)
Index-0      ==   index-0
Index-1      ==   index-1
Index-2      ==   index-2

Do you see any other option to this? I think this constraint is obvious from the application point of view during streaming.

This is correct. Every application should queue a corresponding output buffer for each queued input buffer. NOTE that this whole discussion is about how to make it possible to have 2 different applications running at the same time, each of them queuing their own input and output buffers. It will look somehow like this:

Video0 (input)       Video1 (output)
App1, Index-0   ==   App1, index-0
App2, Index-0   ==   App2, index-0
App1, Index-1   ==   App1, index-1
App2, Index-1   ==   App2, index-1
App1, Index-2   ==   App1, index-2
App2, Index-2   ==   App2, index-2

Note that the absolute order of the queue/dequeue might be different, but each application should get the right output buffer, the one which corresponds to its queued input buffer.

[Hiremath, Vaibhav] We would have to create separate queues for every device open call. It would be difficult/complex for the driver to maintain a special queue for requests from a number of applications.

I know that this would be complex for every driver to maintain its special queues.
But imho such a use case (multiple instance support) is so important (especially for embedded applications) that it is worth properly designing an additional framework for mem2mem v4l2 devices, so all the buffer handling will be hidden from the actual drivers.

[Hiremath, Vaibhav] Initially I thought of having a separate queue in the driver which tries to make maximum usage of the underlying hardware. The application just queues the buffers and calls streamon; the driver internally queues them in its own queue, issues a resize operation (in this case) for all the queued buffers, and releases them one-by-one to the application. We have a similar implementation internally, but not with the standard V4L2 framework; it uses custom IOCTLs for everything.

This is similar to what we have currently, however we want to move all our custom drivers into the generic kernel frameworks.

But when we decided to provide a user space library with the media controller, I thought of moving this burden to the application layer. The application library will create an interface and queue, and call streamon for all the buffers queued.
RE: Mem2Mem V4L2 devices [RFC] - Can we enhance the V4L2 API?
Hello,

On October 06, 2009 12:31 AM Karicheri, Muralidharan wrote:

Are we constrained to use the QBUF/DQBUF/STREAMON/STREAMOFF model for this specific device (memory to memory)? What about adding new IOCTLs that can be used for this specific device type and that could possibly simplify the implementation?

Don't forget about the simplest V4L2 io model, based on read() and write() calls. This io model fits very well into a transaction/conversion-like device. There is an issue with blocking calls, as applications would need to use threads in order to do simple image conversion, but this can be easily avoided with non-blocking io and poll().

As we have seen in the discussion, this is not a streaming device, rather a transaction/conversion device which operates on a given frame to get a desired output frame. Each transaction may have its own configuration context which will be applied to the hardware before starting the operation. This is unlike a streaming device, where most of the configuration is done prior to starting the streaming.

From the application point of view an instance of such a device is still a streaming device. The application should not even know if any other apps are using the device or not (well, it may only notice the lower throughput or higher device latency, but this cannot be avoided). The application can queue input and output buffers, stream on and wait for the result.

The changes done during streaming are controls like brightness, contrast, gain etc. The frames received by the application are either synchronized to an input source timing, or the application outputs frames based on a display timing. Also a single IO instance is usually maintained at the driver, whereas in the case of a memory-to-memory device the hardware needs to switch contexts between operations. So we might need a different approach than for a capture/output device.

All this is internal to the device driver, which can hide it from the application.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
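The read()/write() model mentioned above is compact enough to sketch in a few lines. This is only an illustration under assumptions: the node name is a placeholder, the driver is assumed to implement read/write for one-shot conversions, and buffer sizes must match the formats configured beforehand.

#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

int convert_frame(const void *src, size_t src_size, void *dst, size_t dst_size)
{
	int fd = open("/dev/video0", O_RDWR | O_NONBLOCK);
	struct pollfd pfd = { .fd = fd, .events = POLLIN };

	write(fd, src, src_size);	/* hand the source image to the driver */
	poll(&pfd, 1, -1);		/* sleep until the transaction finishes */
	read(fd, dst, dst_size);	/* collect the converted image */

	close(fd);
	return 0;
}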
RE: Mem2Mem V4L2 devices [RFC]
Hello,

On Monday, October 05, 2009 8:27 PM Hiremath, Vaibhav wrote:

[Hiremath, Vaibhav] IMO, this implementation is not a streaming model, we are trying to fit mem-to-mem forcefully into streaming.

Why does this not fit streaming? I see no problems with streaming over a mem2mem device with only one video node. You just queue input and output buffers (they are distinguished by the 'type' parameter) on the same video node.

[Hiremath, Vaibhav] Do we create separate queues of buffers based on type? I think we don't.

Why not? I really see no problems implementing such a driver, especially if this heavily increases the number of use cases where such a device can be used.

App1   App2   App3   ...   AppN
  |      |      |            |
  ----------------------------
                |
           /dev/video0
                |
         Resizer Driver

Everyone will be doing streamon, and in the normal use case every application must be getting buffers from another module (another driver, codecs, DSP, etc...) in multiple streams: 0, 1, 2, 3, 4 ... N.

Right.

Every application will start streaming with a (mostly) fixed scaling factor which mostly never changes.

Right. The driver can store the scaling factors and other parameters in the private data of each opened instance of the /dev/video0 device.

This one video node approach is possible only with the constraint that the application will always queue only 2 buffers, one with CAPTURE and one with OUTPUT type. It has to wait till the first/second gets finished; you can't queue multiple buffers (input and output) simultaneously.

Why do you think you cannot queue multiple buffers? IMHO one can perfectly queue more than one input buffer, then queue the same number of output buffers, and then the device will process all the buffers.

I do agree with you here that we need to investigate whether we really have such a use-case. Does it make sense to put such a constraint on the application?

What constraint? What is the impact?

Again, in the case of down-scaling, the application may want to use the same buffer as input, which is easily possible with the single node approach.

Right. But take into account that down-scaling is the one special case in which the operation can be performed in-place. Usually all other types of operations (like color space conversion or rotation) require 2 buffers. Please note that having only one video node would not mean that all operations must be done in-place. As Ivan stated, you can perfectly queue 2 separate input and output buffers into the one video node and the driver can handle this correctly.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
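Queuing several buffers of both types on one node, as argued above, would look like the following sketch (REQBUFS/mmap setup and error handling omitted; one source and one destination buffer per transaction, matched by index here purely for illustration).

#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

void queue_batch(int fd, int count)
{
	struct v4l2_buffer b;
	int i;

	for (i = 0; i < count; i++) {
		memset(&b, 0, sizeof(b));
		b.memory = V4L2_MEMORY_MMAP;
		b.index = i;

		b.type = V4L2_BUF_TYPE_VIDEO_OUTPUT;	/* source frame i */
		ioctl(fd, VIDIOC_QBUF, &b);

		b.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;	/* result frame i */
		ioctl(fd, VIDIOC_QBUF, &b);
	}
}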
RE: Mem2Mem V4L2 devices [RFC]
Hello,

On Monday, October 05, 2009 10:02 PM Karicheri, Muralidharan wrote:

There is another use case where two resizer hardware blocks work on the same input frame and give two output frames of different resolutions. How do we handle this using the one video device approach you just described here?

How is the hardware actually designed? I see two possibilities:

1. [input buffer] --[dma engine]--> [resizer1] --[dma]--> [mem output buffer1]
                                \-> [resizer2] --[dma]--> [mem output buffer2]

2. [input buffer] --[dma engine1]--> [resizer1] --[dma]--> [mem output buffer1]
                 \--[dma engine2]--> [resizer2] --[dma]--> [mem output buffer2]

In the first case we would really have problems mapping it properly to video nodes. But we should think about whether there are any use cases of such a design (in terms of a mem2mem device). I know that this Y-type design makes sense as a part of the pipeline from a sensor or decoder device. But I cannot find any useful use case for the mem2mem version of it.

The second case is much more trivial. One can just create two separate resizer devices (with their own nodes) or one resizer driver with two hardware resizers underneath it. In both cases the application would simply queue the input buffer 2 times for both transactions.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
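For the second case, "queue the input buffer 2 times" could look like this user-space sketch: the same user memory is handed to two independent resizer nodes via USERPTR. The node names are placeholders, USERPTR support is an assumption about the driver, and REQBUFS setup, destination buffers, STREAMON and error handling are omitted.

#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

static void queue_src(int fd, void *frame, size_t size)
{
	struct v4l2_buffer b;

	memset(&b, 0, sizeof(b));
	b.type = V4L2_BUF_TYPE_VIDEO_OUTPUT;	/* source frame */
	b.memory = V4L2_MEMORY_USERPTR;
	b.m.userptr = (unsigned long)frame;
	b.length = size;
	ioctl(fd, VIDIOC_QBUF, &b);
}

void resize_twice(void *frame, size_t size)
{
	int full = open("/dev/video1", O_RDWR);		/* resizer1 node */
	int thumb = open("/dev/video2", O_RDWR);	/* resizer2 node */

	queue_src(full, frame, size);	/* one transaction per node, */
	queue_src(thumb, frame, size);	/* same source memory for both */
}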
RE: Mem2Mem V4L2 devices [RFC]
Hi,

On Tue, 2009-10-06 at 08:23 +0200, Marek Szyprowski wrote:

Hello,

On Monday, October 05, 2009 8:27 PM Hiremath, Vaibhav wrote:

[Hiremath, Vaibhav] IMO, this implementation is not a streaming model, we are trying to fit mem-to-mem forcefully into streaming.

Why does this not fit streaming? I see no problems with streaming over a mem2mem device with only one video node. You just queue input and output buffers (they are distinguished by the 'type' parameter) on the same video node.

[Hiremath, Vaibhav] Do we create separate queues of buffers based on type? I think we don't.

Why not? I really see no problems implementing such a driver, especially if this heavily increases the number of use cases where such a device can be used.

App1   App2   App3   ...   AppN
  |      |      |            |
  ----------------------------
                |
           /dev/video0
                |
         Resizer Driver

Everyone will be doing streamon, and in the normal use case every application must be getting buffers from another module (another driver, codecs, DSP, etc...) in multiple streams: 0, 1, 2, 3, 4 ... N.

Right.

Every application will start streaming with a (mostly) fixed scaling factor which mostly never changes.

Right. The driver can store the scaling factors and other parameters in the private data of each opened instance of the /dev/video0 device.

This one video node approach is possible only with the constraint that the application will always queue only 2 buffers, one with CAPTURE and one with OUTPUT type. It has to wait till the first/second gets finished; you can't queue multiple buffers (input and output) simultaneously.

Why do you think you cannot queue multiple buffers? IMHO one can perfectly queue more than one input buffer, then queue the same number of output buffers, and then the device will process all the buffers.

I do agree with you here that we need to investigate whether we really have such a use-case. Does it make sense to put such a constraint on the application?

What constraint? What is the impact?

Again, in the case of down-scaling, the application may want to use the same buffer as input, which is easily possible with the single node approach.

Right. But take into account that down-scaling is the one special case in which the operation can be performed in-place. Usually all other types of operations (like color space conversion or rotation) require 2 buffers. Please note that having only one video node would not mean that all operations must be done in-place. As Ivan stated, you can perfectly queue 2 separate input and output buffers into the one video node and the driver can handle this correctly.

I agree with you, Marek. May I make one suggestion? As we all know, some hardware can do in-place processing. I think it would not be too bad if the user put the same buffer as input and output, or with some spare space between the start addresses of input and output. From the driver's point of view there is no difference, it will see 2 different buffers. In this case we can also save the time spent mapping virtual to physical addresses. But in general, I think separate input and output buffers (even overlapped) and a single device node will simplify the design and implementation of such drivers. Also this will be clearer and more easily manageable from the user space point of view.
iivanov
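Ivan's in-place suggestion, seen from user space, might look like the sketch below: the same user memory is queued as both source and destination on one node. This is speculative - whether a driver accepts identical or overlapping buffers is entirely hardware-specific, and the USERPTR setup (REQBUFS, formats) is omitted.

#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

void queue_inplace(int fd, void *frame, size_t size)
{
	struct v4l2_buffer b;

	memset(&b, 0, sizeof(b));
	b.memory = V4L2_MEMORY_USERPTR;
	b.m.userptr = (unsigned long)frame;
	b.length = size;

	b.type = V4L2_BUF_TYPE_VIDEO_OUTPUT;	/* full-size source */
	ioctl(fd, VIDIOC_QBUF, &b);

	b.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;	/* down-scaled result, same memory */
	ioctl(fd, VIDIOC_QBUF, &b);
}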
RE: Mem2Mem V4L2 devices [RFC]
On Tuesday, October 06, 2009 11:53 AM Marek Szyprowski wrote:

Hello,

On Monday, October 05, 2009 8:27 PM Hiremath, Vaibhav wrote:

[Hiremath, Vaibhav] IMO, this implementation is not a streaming model, we are trying to fit mem-to-mem forcefully into streaming.

Why does this not fit streaming? I see no problems with streaming over a mem2mem device with only one video node. You just queue input and output buffers (they are distinguished by the 'type' parameter) on the same video node.

[Hiremath, Vaibhav] Do we create separate queues of buffers based on type? I think we don't.

Why not? I really see no problems implementing such a driver, especially if this heavily increases the number of use cases where such a device can be used.

[Hiremath, Vaibhav] I thought about it and you are correct, it should be possible. I was kind of biased and thinking in only one direction. Now I don't see any reason why we should go for the 2 device node approach. Earlier I was thinking of 2 device nodes for the 2 queues; if it is possible with one device node then I think we should align to the single device node approach. Do you see any issues with it?

Thanks,
Vaibhav

App1   App2   App3   ...   AppN
  |      |      |            |
  ----------------------------
                |
           /dev/video0
                |
         Resizer Driver

Everyone will be doing streamon, and in the normal use case every application must be getting buffers from another module (another driver, codecs, DSP, etc...) in multiple streams: 0, 1, 2, 3, 4 ... N.

[snip]

...the case in which the operation can be performed in-place. Usually all other types of operations (like color space conversion or rotation) require 2 buffers. Please note that having only one video node would not mean that all operations must be done in-place. As Ivan stated, you can perfectly queue 2 separate input and output buffers into the one video node and the driver can handle this correctly.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
RE: Mem2Mem V4L2 devices [RFC]
On Friday, October 02, 2009 9:55 PM Ivan T. Ivanov wrote:

Hi Marek,

On Fri, 2009-10-02 at 13:45 +0200, Marek Szyprowski wrote:

Hello,

[snip]

...image format and size, while the existing v4l2 ioctls would only refer to the output buffer. Frankly speaking, we don't like this idea.

I think it is not unusual for one video device to declare that it can support input and output operation at the same time. Let's take a resizer device as an example. It is always possible for it to inform the user space application that:

struct v4l2_capability.capabilities ==
	(V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_VIDEO_OUTPUT)

The user can issue an S_FMT ioctl supplying:

struct v4l2_format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE
	.pix = width x height

which will instruct this device to prepare its output for this resolution. After that the user can issue an S_FMT ioctl supplying:

struct v4l2_format.type = V4L2_BUF_TYPE_VIDEO_OUTPUT
	.pix = width x height

Using only these ioctls should be enough for the device driver to know the required down/up scale factor. Regarding color space, struct v4l2_pix_format has the field 'pixelformat', which can be used to define the content of the input and output buffers. So using only existing ioctls the user can have a working resizer device. Also please note that there is VIDIOC_S_CROP, which can add the additional flexibility of cropping on input or output.

[Hiremath, Vaibhav] I think this makes more sense in a capture pipeline, for example:

Sensor/decoder -> previewer -> resizer -> /dev/videoX

The last thing which should be done is to QBUF 2 buffers and call STREAMON.

[Hiremath, Vaibhav] IMO, this implementation is not a streaming model, we are trying to fit mem-to-mem forcefully into streaming. We would have to put some constraints:
- the driver will treat index 0 as input always, irrespective of the number of buffers queued, or
- the application should not queue more than 2 buffers.
Multi-channel use-case: I think we have to have 2 device nodes which are capable of streaming multiple buffers, with both queuing the buffers. The constraint would be that the buffers must be mapped one-to-one. A user layer library would be important here and would play a major role in supporting the multi-channel feature. I think we need to do some more investigation on this.

Thanks,
Vaibhav

I think this will simplify buffer synchronization a lot.

iivanov

2. Input and output in the same video node would not be compatible with the upcoming media controller, with which we will get an ability to arrange devices into a custom pipeline. Piping together two separate input-output nodes to create a new mem2mem device would be difficult and unintuitive. And that is not even considering multi-output devices. My idea is to get back to the 2 video nodes per device approach and introduce a new ioctl for matching input and output instances of the same device. When such an ioctl could be called is another question. I like the idea of restricting such a call to be issued after opening the video nodes and before using them. Using this ioctl, a user application would be able to match an output instance to an input one, by matching their corresponding file descriptors. What do you think of such a solution?
Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
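Ivan's configuration scheme above, spelled out as code, would be two S_FMT calls (and optionally an S_CROP) on the same node. A minimal sketch; the resolutions, pixel format and crop rectangle are examples only, and error handling is omitted.

#include <string.h>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

void configure_resizer(int fd)
{
	struct v4l2_format f;
	struct v4l2_crop c;

	memset(&f, 0, sizeof(f));
	f.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;	/* result side */
	f.fmt.pix.width = 640;
	f.fmt.pix.height = 360;
	f.fmt.pix.pixelformat = V4L2_PIX_FMT_UYVY;
	ioctl(fd, VIDIOC_S_FMT, &f);

	f.type = V4L2_BUF_TYPE_VIDEO_OUTPUT;	/* source side: 2x downscale follows */
	f.fmt.pix.width = 1280;
	f.fmt.pix.height = 720;
	ioctl(fd, VIDIOC_S_FMT, &f);

	/* optional cropping on the source, per Ivan's VIDIOC_S_CROP remark */
	memset(&c, 0, sizeof(c));
	c.type = V4L2_BUF_TYPE_VIDEO_OUTPUT;
	c.c.left = 0;
	c.c.top = 0;
	c.c.width = 1280;
	c.c.height = 640;
	ioctl(fd, VIDIOC_S_CROP, &c);
}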
RE: Mem2Mem V4L2 devices [RFC]
Hello,

On Friday, October 02, 2009 6:25 PM Ivan T. Ivanov wrote:

On Fri, 2009-10-02 at 13:45 +0200, Marek Szyprowski wrote:

Hello,

During the V4L2 mini-summit and the Media Controller RFC discussion at the Linux Plumbers 2009 Conference, a mem2mem video device was mentioned a few times (usually in the context of a 'resizer device', which might be a part of a camera interface pipeline or work as a standalone device). We are researching how our custom video/multimedia drivers can fit into the V4L2 framework. Most of our multimedia devices work in mem2mem mode.

I did some quick research and found that currently in the V4L2 framework there is no device that processes video data in a memory-to-memory model. In terms of the V4L2 framework such a device would be both a video sink and source at the same time. The main problem is how the video nodes (/dev/videoX) should be assigned to such a device.

The simplest way of implementing a mem2mem device in the v4l2 framework would use two video nodes (one for input and one for output). Such an idea has already been suggested at the V4L2 mini-summit. Each DMA engine (either input or output) that is available in the hardware should get its own video node. In this approach an application can write() a source image to, for example, /dev/video0 and then read the processed output from, for example, /dev/video1. Source and destination format/params/other custom settings can also be easily set for either the source or destination node. Besides single images, user applications can also process video streams by calling stream_on(), qbuf() + dqbuf(), stream_off() simultaneously on both video nodes.

This approach has a limitation however. As user applications would have to open 2 different file descriptors to perform the processing of a single image, the v4l2 driver would need to match read() calls done on one file descriptor with write() calls from the other. The same thing would happen with buffers enqueued with qbuf(). In practice, this would result in a driver that allows only one opened instance of /dev/video0 as well as of /dev/video1. Otherwise, it would not be possible to track which opened /dev/video0 instance matches which /dev/video1 one. The real limitation of this approach is the fact that it is hardly possible to implement multi-instance support and application multiplexing on such a video device. In a typical embedded system, in contrast to most video-source-only or video-sink-only devices, a mem2mem device is very often used by more than one application at a time, be it either simple one-shot single video frame processing or stream processing. Just consider that the 'resizer' module alone might be used by many applications for scaling bitmaps (xserver video subsystem, gstreamer, jpeglib, etc).

At first glance one might think that implementing multi-instance support should be done in a userspace daemon instead of the mem2mem drivers. However I have run into problems designing such a user space daemon. Usually, video buffers are passed to a v4l2 device as user pointers or are mmaped directly from the device. The main issue that cannot be easily resolved is passing video buffers from the client application to the daemon. The daemon would queue a request on the device and return the results back to the client application after a transaction is finished. Passing userspace pointers between an application and the daemon cannot be done, as they are two different processes.
Mmap-type buffers are similar in this aspect - at least 2 buffer copy operations are required (from the client application to the device input buffers mmaped in the daemon's memory, and then from the device output buffers to the client application). Buffer copying and process context switches add both latency and additional cpu workload.

In our custom drivers for mem2mem multimedia devices we implemented a queue shared between all instances of an opened mem2mem device. Each instance is assigned to an open device file descriptor. The queue is serviced in the device context, thus maximizing the device throughput. This is achieved by scheduling the next transaction in the driver (kernel) context. This may not even require a context switch at all. Do you have any ideas how this solution would fit into the current v4l2 design?

Another solution that came into my mind that would not suffer from this limitation is to use the same video node for both writing input buffers and reading output buffers (or queuing both input and output buffers). Such a design causes more problems with the current v4l2 design however:

1. How to set a different color space or size for each of the input and output buffers? It could be solved by adding a set of ioctls to get/set the source image format and size, while the existing v4l2 ioctls would only refer to the output buffer. Frankly speaking, we don't like this idea.

I think it is not unusual for one video device to declare that it can support input and output operation at the same time.
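For reference, the two-video-node model from the beginning of the RFC, driven with plain write() and read(), might look like this from user space. A sketch only: the node names are examples, the driver is assumed to pair the two calls into one transaction, and error handling is omitted.

#include <fcntl.h>
#include <unistd.h>

int process_one_frame(const void *src, size_t n_src, void *dst, size_t n_dst)
{
	int in = open("/dev/video0", O_WRONLY);		/* device input node */
	int out = open("/dev/video1", O_RDONLY);	/* device output node */

	write(in, src, n_src);	/* source image in */
	read(out, dst, n_dst);	/* processed image out, blocks until done */

	close(out);
	close(in);
	return 0;
}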
RE: Mem2Mem V4L2 devices [RFC]
Hello,

On Monday, October 05, 2009 7:43 AM Hiremath, Vaibhav wrote:

In terms of the V4L2 framework such a device would be both a video sink and source at the same time. The main problem is how the video nodes (/dev/videoX) should be assigned to such a device. The simplest way of implementing a mem2mem device in the v4l2 framework would use two video nodes (one for input and one for output). Such an idea has already been suggested at the V4L2 mini-summit.

[Hiremath, Vaibhav] We discussed 2 options during the summit:
1) Only one video device node, configuring parameters using V4L2_BUF_TYPE_VIDEO_CAPTURE for the input parameters and V4L2_BUF_TYPE_VIDEO_OUTPUT for the output parameters.
2) 2 separate video device nodes, one with V4L2_BUF_TYPE_VIDEO_CAPTURE and another with V4L2_BUF_TYPE_VIDEO_OUTPUT, as mentioned by you.
The obvious and preferred option would be 2, because with option 1 we would not be able to achieve real streaming. And again we would have to put a constraint on the application of a fixed input buffer index.

What do you mean by real streaming?

This approach has a limitation however. As user applications would have to open 2 different file descriptors to perform the processing of a single image, the v4l2 driver would need to match read() calls done on one file descriptor with write() calls from the other. The same thing would happen with buffers enqueued with qbuf(). In practice, this would result in a driver that allows only one opened instance of /dev/video0 as well as of /dev/video1. Otherwise, it would not be possible to track which opened /dev/video0 instance matches which /dev/video1 one.

[Hiremath, Vaibhav] Please note that we must put one limitation on the application: the buffers in both the video nodes are mapped one-to-one. This means that,

Video0 (input)    Video1 (output)
Index-0      ==   index-0
Index-1      ==   index-1
Index-2      ==   index-2

Do you see any other option to this? I think this constraint is obvious from the application point of view during streaming.

This is correct. Every application should queue a corresponding output buffer for each queued input buffer. NOTE that this whole discussion is about how to make it possible to have 2 different applications running at the same time, each of them queuing their own input and output buffers. It will look somehow like this:

Video0 (input)       Video1 (output)
App1, Index-0   ==   App1, index-0
App2, Index-0   ==   App2, index-0
App1, Index-1   ==   App1, index-1
App2, Index-1   ==   App2, index-1
App1, Index-2   ==   App1, index-2
App2, Index-2   ==   App2, index-2

Note that the absolute order of the queue/dequeue might be different, but each application should get the right output buffer, the one which corresponds to its queued input buffer.

[Hiremath, Vaibhav] Initially I thought of having a separate queue in the driver which tries to make maximum usage of the underlying hardware. The application just queues the buffers and calls streamon; the driver internally queues them in its own queue, issues a resize operation (in this case) for all the queued buffers, and releases them one-by-one to the application. We have a similar implementation internally, but not with the standard V4L2 framework; it uses custom IOCTLs for everything.

This is similar to what we have currently, however we want to move all our custom drivers into the generic kernel frameworks.

But when we decided to provide a user space library with the media controller, I thought of moving this burden to the application layer. The application library will create an interface and queue, and call streamon for all the buffers queued. Do you see any loopholes here?
Am I missing any use-case scenario?

How do you want to pass buffers from your client applications through the user space library to the video nodes?

Such a design causes more problems with the current v4l2 design however:

1. How to set a different color space or size for each of the input and output buffers? It could be solved by adding a set of ioctls to get/set the source image format and size, while the existing v4l2 ioctls would only refer to the output buffer. Frankly speaking, we don't like this idea.

2. Input and output in the same video node would not be compatible with the upcoming media controller, with which we will get an ability to arrange devices into a custom pipeline. Piping together two separate input-output nodes to create a new mem2mem device would be difficult and unintuitive. And that is not even considering multi-output devices.

[Hiremath, Vaibhav] Irrespective of the 2 options I mentioned before, the media controller will come into the picture, either for custom parameter configuration or for creating/deleting links. We are only discussing buffer queue/de-queue and input/output params configuration, and this has to happen
RE: Mem2Mem V4L2 devices [RFC]
Hello,

On Monday, October 05, 2009 7:59 AM Hiremath, Vaibhav wrote:

On Friday, October 02, 2009 9:55 PM Ivan T. Ivanov wrote:

Hi Marek,

On Fri, 2009-10-02 at 13:45 +0200, Marek Szyprowski wrote:

Hello,

[snip]

...image format and size, while the existing v4l2 ioctls would only refer to the output buffer. Frankly speaking, we don't like this idea.

I think it is not unusual for one video device to declare that it can support input and output operation at the same time. Let's take a resizer device as an example. It is always possible for it to inform the user space application that:

struct v4l2_capability.capabilities ==
	(V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_VIDEO_OUTPUT)

The user can issue an S_FMT ioctl supplying:

struct v4l2_format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE
	.pix = width x height

which will instruct this device to prepare its output for this resolution. After that the user can issue an S_FMT ioctl supplying:

struct v4l2_format.type = V4L2_BUF_TYPE_VIDEO_OUTPUT
	.pix = width x height

Using only these ioctls should be enough for the device driver to know the required down/up scale factor. Regarding color space, struct v4l2_pix_format has the field 'pixelformat', which can be used to define the content of the input and output buffers. So using only existing ioctls the user can have a working resizer device. Also please note that there is VIDIOC_S_CROP, which can add the additional flexibility of cropping on input or output.

[Hiremath, Vaibhav] I think this makes more sense in a capture pipeline, for example:

Sensor/decoder -> previewer -> resizer -> /dev/videoX

I don't get this. In a strictly capture pipeline we will get one video node anyway. However the question is how we should support a bit more complicated pipeline. Just consider a resizer module and the pipeline:

sensor/decoder -[bus]- previewer - [memory] - resizer - [memory]

([bus] means some kind of internal bus that is completely independent of the system memory.) Mapping this to video nodes is not so trivial. In fact this pipeline consists of 2 independent (sub)pipelines connected by the user space application:

sensor/decoder -[bus]- previewer - [memory] -[user application]- [memory] - resizer - [memory]

For further analysis it should be cut into 2 separate pipelines:

a. sensor/decoder -[bus]- previewer - [memory]
b. [memory] - resizer - [memory]

Again, mapping the first subpipeline is trivial:

sensor/decoder -[bus]- previewer - /dev/video0

But the last can be mapped either as:

/dev/video1 - resizer - /dev/video1 (one video node approach)

or

/dev/video1 - resizer - /dev/video2 (2 video nodes approach).

So at the end the pipeline would look like this:

sensor/decoder -[bus]- previewer - /dev/video0 -[user application]- /dev/video1 - resizer - /dev/video2

or

sensor/decoder -[bus]- previewer - /dev/video0 -[user application]- /dev/video1 - resizer - /dev/video1

The last thing which should be done is to QBUF 2 buffers and call STREAMON.
[Hiremath, Vaibhav] IMO, this implementation is not a streaming model, we are trying to fit mem-to-mem forcefully into streaming.

Why does this not fit streaming? I see no problems with streaming over a mem2mem device with only one video node. You just queue input and output buffers (they are distinguished by the 'type' parameter) on the same video node.

We would have to put some constraints:
- the driver will treat index 0 as input always, irrespective of the number of buffers queued, or
- the application should not queue more than 2 buffers.
Multi-channel use-case: I think we have to have 2 device nodes which are capable of streaming multiple buffers, with both queuing the buffers.

In the one video node approach there can be 2 buffer queues in one video node, for input and output respectively.

The constraint would be that the buffers must be mapped one-to-one.

Right, each queued input buffer must have a corresponding output buffer.

Best regards
--
Marek Szyprowski
Samsung Poland R&D Center
RE: Mem2Mem V4L2 devices [RFC]
Hi,

On Mon, 2009-10-05 at 15:54 +0200, Marek Szyprowski wrote:

Hello,

On Friday, October 02, 2009 6:25 PM Ivan T. Ivanov wrote:

On Fri, 2009-10-02 at 13:45 +0200, Marek Szyprowski wrote:

Hello,

During the V4L2 mini-summit and the Media Controller RFC discussion at the Linux Plumbers 2009 Conference, a mem2mem video device was mentioned a few times (usually in the context of a 'resizer device', which might be a part of a camera interface pipeline or work as a standalone device). We are researching how our custom video/multimedia drivers can fit into the V4L2 framework. Most of our multimedia devices work in mem2mem mode.

I did some quick research and found that currently in the V4L2 framework there is no device that processes video data in a memory-to-memory model. In terms of the V4L2 framework such a device would be both a video sink and source at the same time. The main problem is how the video nodes (/dev/videoX) should be assigned to such a device.

The simplest way of implementing a mem2mem device in the v4l2 framework would use two video nodes (one for input and one for output). Such an idea has already been suggested at the V4L2 mini-summit. Each DMA engine (either input or output) that is available in the hardware should get its own video node. In this approach an application can write() a source image to, for example, /dev/video0 and then read the processed output from, for example, /dev/video1. Source and destination format/params/other custom settings can also be easily set for either the source or destination node. Besides single images, user applications can also process video streams by calling stream_on(), qbuf() + dqbuf(), stream_off() simultaneously on both video nodes.

This approach has a limitation however. As user applications would have to open 2 different file descriptors to perform the processing of a single image, the v4l2 driver would need to match read() calls done on one file descriptor with write() calls from the other. The same thing would happen with buffers enqueued with qbuf(). In practice, this would result in a driver that allows only one opened instance of /dev/video0 as well as of /dev/video1. Otherwise, it would not be possible to track which opened /dev/video0 instance matches which /dev/video1 one. The real limitation of this approach is the fact that it is hardly possible to implement multi-instance support and application multiplexing on such a video device. In a typical embedded system, in contrast to most video-source-only or video-sink-only devices, a mem2mem device is very often used by more than one application at a time, be it either simple one-shot single video frame processing or stream processing. Just consider that the 'resizer' module alone might be used by many applications for scaling bitmaps (xserver video subsystem, gstreamer, jpeglib, etc).

At first glance one might think that implementing multi-instance support should be done in a userspace daemon instead of the mem2mem drivers. However I have run into problems designing such a user space daemon. Usually, video buffers are passed to a v4l2 device as user pointers or are mmaped directly from the device. The main issue that cannot be easily resolved is passing video buffers from the client application to the daemon. The daemon would queue a request on the device and return the results back to the client application after a transaction is finished. Passing userspace pointers between an application and the daemon cannot be done, as they are two different processes.
Mmap-type buffers are similar in this aspect - at least 2 buffer copy operations are required (from the client application to the device input buffers mmaped in the daemon's memory, and then from the device output buffers to the client application). Buffer copying and process context switches add both latency and additional cpu workload.

In our custom drivers for mem2mem multimedia devices we implemented a queue shared between all instances of an opened mem2mem device. Each instance is assigned to an open device file descriptor. The queue is serviced in the device context, thus maximizing the device throughput. This is achieved by scheduling the next transaction in the driver (kernel) context. This may not even require a context switch at all. Do you have any ideas how this solution would fit into the current v4l2 design?

Another solution that came into my mind that would not suffer from this limitation is to use the same video node for both writing input buffers and reading output buffers (or queuing both input and output buffers). Such a design causes more problems with the current v4l2 design however:

1. How to set a different color space or size for each of the input and output buffers? It could be solved by adding a set of ioctls to get/set the source image format and size, while the existing v4l2 ioctls would only refer to the output buffer.
RE: Mem2Mem V4L2 devices [RFC]
On Monday, October 05, 2009 7:26 PM Marek Szyprowski wrote:

Hello,

On Monday, October 05, 2009 7:43 AM Hiremath, Vaibhav wrote:

In terms of the V4L2 framework such a device would be both a video sink and source at the same time. The main problem is how the video nodes (/dev/videoX) should be assigned to such a device. The simplest way of implementing a mem2mem device in the v4l2 framework would use two video nodes (one for input and one for output). Such an idea has already been suggested at the V4L2 mini-summit.

[Hiremath, Vaibhav] We discussed 2 options during the summit:
1) Only one video device node, configuring parameters using V4L2_BUF_TYPE_VIDEO_CAPTURE for the input parameters and V4L2_BUF_TYPE_VIDEO_OUTPUT for the output parameters.
2) 2 separate video device nodes, one with V4L2_BUF_TYPE_VIDEO_CAPTURE and another with V4L2_BUF_TYPE_VIDEO_OUTPUT, as mentioned by you.
The obvious and preferred option would be 2, because with option 1 we would not be able to achieve real streaming. And again we would have to put a constraint on the application of a fixed input buffer index.

What do you mean by real streaming?

[Hiremath, Vaibhav] I meant, after streamon, there will be just a sequence of queuing and de-queuing of buffers. With a single node of operation, how are we deciding which is the input buffer and which one is the output? We have to assume or put a constraint on the application that the 0th index will always be the input, irrespective of the number of buffers requested. In a normal scenario (for example in codecs), the application will open the device once and start pumping the buffers; the driver should queue the buffers as and when they come, directly in the driver.

This approach has a limitation however. As user applications would have to open 2 different file descriptors to perform the processing of a single image, the v4l2 driver would need to match read() calls done on one file descriptor with write() calls from the other. The same thing would happen with buffers enqueued with qbuf(). In practice, this would result in a driver that allows only one opened instance of /dev/video0 as well as of /dev/video1. Otherwise, it would not be possible to track which opened /dev/video0 instance matches which /dev/video1 one.

[Hiremath, Vaibhav] Please note that we must put one limitation on the application: the buffers in both the video nodes are mapped one-to-one. This means that,

Video0 (input)    Video1 (output)
Index-0      ==   index-0
Index-1      ==   index-1
Index-2      ==   index-2

Do you see any other option to this? I think this constraint is obvious from the application point of view during streaming.

This is correct. Every application should queue a corresponding output buffer for each queued input buffer. NOTE that this whole discussion is about how to make it possible to have 2 different applications running at the same time, each of them queuing their own input and output buffers. It will look somehow like this:

Video0 (input)       Video1 (output)
App1, Index-0   ==   App1, index-0
App2, Index-0   ==   App2, index-0
App1, Index-1   ==   App1, index-1
App2, Index-1   ==   App2, index-1
App1, Index-2   ==   App1, index-2
App2, Index-2   ==   App2, index-2

Note that the absolute order of the queue/dequeue might be different, but each application should get the right output buffer, the one which corresponds to its queued input buffer.
[Hiremath, Vaibhav] We have to create separate queues for every device open() call. It would be difficult/complex for the driver to maintain a special queue for requests from a number of applications.

[Hiremath, Vaibhav] Initially I thought of having a separate queue in the driver which tries to make maximum usage of the underlying hardware. The application just queues the buffers and calls streamon; the driver internally queues them in its own queue, issues a resize operation (in this case) for all the queued buffers, and releases them one-by-one to the application. We have a similar implementation internally, but not with the standard V4L2 framework - it uses custom IOCTLs for everything.

This is similar to what we have currently, however we want to move all our custom drivers onto the generic kernel frameworks.

But when we decided to provide a user-space library with the media controller, I thought of moving this burden to the application layer. The application library will create an interface and queue, and call streamon for all the buffers queued. Do you see any loopholes here? Am I missing any use-case scenario?

How do you want to pass buffers from your client
RE: Mem2Mem V4L2 devices [RFC]
From: Marek Szyprowski, Sent: Monday, October 05, 2009 7:26 PM, Subject: RE: Mem2Mem V4L2 devices [RFC]

Hello, On Monday, October 05, 2009 7:59 AM Hiremath, Vaibhav wrote (replying to Ivan T. Ivanov's answer to Marek's RFC of Fri, 2009-10-02):

snip

image format and size, while the existing v4l2 ioctls would only refer to the output buffer. Frankly speaking, we don't like this idea.

I think it is not unusual for one video device to support input and output operation at the same time. Let's take a resizer device as an example. It can always inform the user-space application that

    struct v4l2_capability.capabilities == (V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_VIDEO_OUTPUT)

The user can issue an S_FMT ioctl supplying

    struct v4l2_format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE, .pix = width x height

which will instruct the device to prepare its output for this resolution. After that the user can issue an S_FMT ioctl supplying

    struct v4l2_format.type = V4L2_BUF_TYPE_VIDEO_OUTPUT, .pix = width x height

Using only these ioctls should be enough for the device driver to know the required down/up-scale factor. Regarding color space, struct v4l2_pix_format has a 'pixelformat' field which can be used to define the content of the input and output buffers. So using only the existing ioctls the user can have a working resizer device. Also please note that there is VIDIOC_S_CROP, which can add the additional flexibility of cropping on input or output.

[Hiremath, Vaibhav] I think this makes more sense in a capture pipeline, for example: sensor/decoder - previewer - resizer - /dev/videoX

I don't get this. In a strictly capture pipeline we will get one video node anyway. The question, however, is how we should support a bit more complicated pipeline. Just consider a resizer module and the pipeline:

    sensor/decoder -[bus]- previewer -[memory]- resizer -[memory]

([bus] means some kind of internal bus that is completely independent from the system memory.)

[Hiremath, Vaibhav] For me this is not a single pipeline; it has two separate links:
1) sensor/decoder -[bus]- previewer -[memory]
2) [memory]- resizer -[memory]

Mapping this to video nodes is not so trivial. In fact this pipeline consists of 2 independent (sub)pipelines connected by the user-space application:

    sensor/decoder -[bus]- previewer -[memory]-[user application]-[memory]- resizer -[memory]

For further analysis it should be cut into 2 separate pipelines:
a. sensor/decoder -[bus]- previewer -[memory]
b. [memory]- resizer -[memory]

[Hiremath, Vaibhav] Correct, though I wouldn't call them sub-pipelines. The application is linking them, so from the driver point of view they are completely separate.
Again, mapping the first subpipeline is trivial:

    sensor/decoder -[bus]- previewer - /dev/video0

[Hiremath, Vaibhav] Correct, it is a separate streaming device.

But the last one can be mapped either as:

    /dev/video1 - resizer - /dev/video1 (one video node approach)

[Hiremath, Vaibhav] Please go through my last response, where I mentioned the buffer queuing constraints with this approach.

or

    /dev/video1 - resizer - /dev/video2 (2 video nodes approach)

So at the end the pipeline would look like this:

    sensor/decoder -[bus]- previewer - /dev/video0 -[user application]- /dev/video1 - resizer - /dev/video2

or

    sensor/decoder -[bus]- previewer - /dev/video0 -[user application]- /dev/video1 - resizer - /dev/video1

The last thing which should be done is to QBUF 2 buffers and call STREAMON.

[Hiremath, Vaibhav] IMO this implementation is not a streaming model; we are trying to fit mem-to-mem forcefully into streaming.

Why does this not fit streaming? I see no problems with streaming over a mem2mem device with only one video node. You just queue input and output buffers (they are distinguished by the 'type' parameter) on the same video node.
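To make the "distinguished by the 'type' parameter" point concrete, a hypothetical fragment for the one-node model follows. It assumes fd is an open mem2mem video node with buffers already requested and mmapped; error handling is omitted.

    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/videodev2.h>

    /* Queue one source and one destination buffer on the SAME node and
     * start streaming both queues; only buf.type tells them apart. */
    static void one_node_process(int fd)
    {
        struct v4l2_buffer buf;
        int type;

        memset(&buf, 0, sizeof(buf));
        buf.index = 0;
        buf.memory = V4L2_MEMORY_MMAP;

        buf.type = V4L2_BUF_TYPE_VIDEO_OUTPUT;   /* the frame to process */
        ioctl(fd, VIDIOC_QBUF, &buf);

        buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;  /* where the result goes */
        ioctl(fd, VIDIOC_QBUF, &buf);

        type = V4L2_BUF_TYPE_VIDEO_OUTPUT;
        ioctl(fd, VIDIOC_STREAMON, &type);
        type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        ioctl(fd, VIDIOC_STREAMON, &type);

        /* DQBUF with type = CAPTURE then returns the processed frame. */
    }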
RE: Mem2Mem V4L2 devices [RFC]
From: Ivan T. Ivanov, Sent: Tuesday, October 06, 2009 12:27 AM, Subject: RE: Mem2Mem V4L2 devices [RFC]

snip

The last thing which should be done is to QBUF 2 buffers and call STREAMON.

[Hiremath, Vaibhav] IMO this implementation is not a streaming model; we are trying to fit mem-to-mem forcefully into streaming.

Why does this not fit streaming? I see no problems with streaming over a mem2mem device with only one video node. You just queue input and output buffers (they are distinguished by the 'type' parameter) on the same video node.

[Hiremath, Vaibhav] Do we create a separate queue of buffers based on type? I think we don't.

    App1   App2   App3   ...   AppN
      |      |      |           |
      +------+------+-----------+
                  |
             /dev/video0
                  |
           Resizer Driver

Why not? They can be per-file-handle input/output queues, and we can do time-shared use of the resizer driver like Marek suggests.

[Hiremath, Vaibhav] Ivan, a file-handle-based queue and a buffer-type-based queue are two different terms. Yes, definitely we have to create separate queues for each file handle to support multiple channels. But my question was about the buffer types, CAPTURE and OUTPUT.

Thanks, Vaibhav

Everyone will be doing streamon, and in the normal use case every application must be getting buffers from another module (another driver, codecs, DSP, etc...) in multiple streams: 0, 1, 2, 3, ... N. Every application will start streaming with a (mostly) fixed scaling factor which mostly never changes. This one video node approach is possible only with the constraint that the application will always queue only 2 buffers, one with CAPTURE and one with OUTPUT type.

I don't see how the 2 device node approach can help with this case. Even with a normal video capture device you should stop streaming when you change buffer sizes.

The application has to wait till the first/second buffer gets finished; you can't queue multiple buffers (input and output) simultaneously.

Actually, this should be possible.

iivanov

I do agree here with you that we need to investigate whether we really have such a use-case. Does it make sense to put such a constraint on the application? What is the impact? Again, in the case of down-scaling the application may want to use the same buffer as input, which is easily possible with the single node approach.

Thanks, Vaibhav

We have to put some constraints:
- The driver will treat index 0 as input always, irrespective of the number of buffers queued.
- Or, the application should not queue more than 2 buffers.
- For the multi-channel use-case I think we have to have 2 device nodes which are capable of streaming multiple buffers, both queuing the buffers.

In the one video node approach there can be 2 buffer queues in one video node, for input and output respectively. The constraint would be that the buffers are mapped one-to-one.

Right, each queued input buffer must have a corresponding output buffer.

Best regards -- Marek Szyprowski, Samsung Poland RD Center
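A sketch of what "separate queues per file handle, selected by buffer type" could look like on the driver side; all names here (m2m_instance, buf_queue, select_queue) are invented for illustration.

    #include <linux/videodev2.h>   /* for enum v4l2_buf_type */

    struct buf_queue {
        /* list of queued buffers, state, ... (elided) */
        int num_queued;
    };

    /* One of these per open() - so one per application channel. */
    struct m2m_instance {
        struct buf_queue out_q;  /* V4L2_BUF_TYPE_VIDEO_OUTPUT: sources  */
        struct buf_queue cap_q;  /* V4L2_BUF_TYPE_VIDEO_CAPTURE: results */
    };

    /* QBUF/DQBUF dispatch: the buffer type picks the queue, so CAPTURE
     * and OUTPUT buffers queued on the same node never mix. */
    static struct buf_queue *select_queue(struct m2m_instance *inst,
                                          enum v4l2_buf_type type)
    {
        return (type == V4L2_BUF_TYPE_VIDEO_OUTPUT) ? &inst->out_q
                                                    : &inst->cap_q;
    }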
RE: Mem2Mem V4L2 devices [RFC]
On Tue, 2009-10-06 at 00:31 +0530, Hiremath, Vaibhav wrote:

snip

[Hiremath, Vaibhav] Ivan, a file-handle-based queue and a buffer-type-based queue are two different terms.

Really? ;)

Yes, definitely we have to create separate queues for each file handle to support multiple channels. But my question was about the buffer types, CAPTURE and OUTPUT.

Let me see. Your concern is that for very big frames (1X Mpix) managing separate buffers for input and output is a waste of space for operations like down-scaling. I know that such operations can be done in-place ;). But what about up-scaling? It should also be possible, but only with some very dirty hacks.

iivanov

snip
RE: Mem2Mem V4L2 devices [RFC]
On Tuesday, October 06, 2009 12:39 AM, Ivan T. Ivanov wrote:

snip

Let me see. Your concern is that for very big frames (1X Mpix) managing separate buffers for input and output is a waste of space for operations like down-scaling. I know that such operations can be done in-place ;). But what about up-scaling? It should also be possible, but only with some very dirty hacks.

[Hiremath, Vaibhav] Dirty hacks??? I think for up-scaling we have to have 2 separate buffers; I do not see any other option here.

Thanks, Vaibhav

snip
RE: Mem2Mem V4L2 devices [RFC]
1. How to set a different color space or size for the input and output buffers? It could be solved by adding a set of ioctls to get/set the source image format and size, while the existing v4l2 ioctls would only refer to the output buffer. Frankly speaking, we don't like this idea.

I think it is not unusual for one video device to support input and output operation at the same time. Let's take a resizer device as an example. It can always inform the user-space application that

    struct v4l2_capability.capabilities == (V4L2_CAP_VIDEO_CAPTURE | V4L2_CAP_VIDEO_OUTPUT)

The user can issue an S_FMT ioctl supplying

    struct v4l2_format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE, .pix = width x height

which will instruct the device to prepare its output for this resolution. After that the user can issue an S_FMT ioctl supplying

    struct v4l2_format.type = V4L2_BUF_TYPE_VIDEO_OUTPUT, .pix = width x height

Using only these ioctls should be enough for the device driver to know the required down/up-scale factor. Regarding color space, struct v4l2_pix_format has a 'pixelformat' field which can be used to define the content of the input and output buffers. So using only the existing ioctls the user can have a working resizer device. Also please note that there is VIDIOC_S_CROP, which can add the additional flexibility of cropping on input or output. The last thing which should be done is to QBUF 2 buffers and call STREAMON. I think this will simplify buffer synchronization a lot.

Ivan,

There is another use case where two resizer hardware units work on the same input frame and give two different output frames of different resolutions. How do we handle this using the one video device approach you just described here?

Murali
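Spelled out as a hypothetical snippet, Ivan's configuration sequence could look like this; the 640x480 to 320x240 downscale and the UYVY pixel format are made-up example values, and error handling is omitted.

    #include <string.h>
    #include <sys/ioctl.h>
    #include <linux/videodev2.h>

    static void configure_resizer(int fd)
    {
        struct v4l2_format fmt;

        /* Source frames the application will feed in. */
        memset(&fmt, 0, sizeof(fmt));
        fmt.type = V4L2_BUF_TYPE_VIDEO_OUTPUT;
        fmt.fmt.pix.width = 640;
        fmt.fmt.pix.height = 480;
        fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_UYVY;
        ioctl(fd, VIDIOC_S_FMT, &fmt);

        /* Desired result: the width/height ratio between the two S_FMT
         * calls implies the down/up-scale factor. */
        memset(&fmt, 0, sizeof(fmt));
        fmt.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
        fmt.fmt.pix.width = 320;
        fmt.fmt.pix.height = 240;
        fmt.fmt.pix.pixelformat = V4L2_PIX_FMT_UYVY;
        ioctl(fd, VIDIOC_S_FMT, &fmt);
    }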
RE: Mem2Mem V4L2 devices [RFC]
Hi, On Mon, 2009-10-05 at 15:02 -0500, Karicheri, Muralidharan wrote:

snip

Ivan,

There is another use case where two resizer hardware units work on the same input frame and give two different output frames of different resolutions. How do we handle this using the one video device approach you just described here?

What is the difference?
- You can have only one resizer device driver, which hides the fact that there are actually 2 hardware resizers; the operations will just be faster ;).
- Or they are two device drivers (nodes) with similar characteristics.

In both cases the input buffer can be the same.

iivanov
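Ivan's first option - one driver fronting both hardware units - might schedule along these lines. This is only a sketch; all names are invented and the real register programming is elided.

    #include <stdbool.h>
    #include <stddef.h>

    /* One entry per physical resizer unit (names are illustrative). */
    struct hw_resizer {
        bool busy;
        /* registers, irq line, ... (elided) */
    };

    static struct hw_resizer units[2];

    /* Dispatch each queued transaction to whichever unit is idle;
     * userspace still sees a single /dev/videoX, it just gets roughly
     * double the throughput when both units can be kept busy. */
    static struct hw_resizer *pick_idle_resizer(void)
    {
        for (size_t i = 0; i < 2; i++)
            if (!units[i].busy)
                return &units[i];
        return NULL; /* both busy: leave the transaction on the queue */
    }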
RE: Mem2Mem V4L2 devices [RFC]
Ivan,

There is another use case where two resizer hardware units work on the same input frame and give two different output frames of different resolutions. How do we handle this using the one video device approach you just described here?

What is the difference?
- You can have only one resizer device driver, which hides the fact that there are actually 2 hardware resizers; the operations will just be faster ;).

In your implementation as mentioned above, there will be one queue for the OUTPUT buffer type and another queue for the CAPTURE buffer type, right? So if we have two resizer outputs, then we would need two queues of the CAPTURE buffer type. When the application calls QBUF on the node, which queue will be used for the buffer? So this makes me believe we need two capture nodes and one output node for this driver.

- Or they are two device drivers (nodes) with similar characteristics.

In both cases the input buffer can be the same.

iivanov

Murali
RE: Mem2Mem V4L2 devices [RFC]
From: Marek Szyprowski, Sent: Friday, October 02, 2009 5:15 PM, Subject: Mem2Mem V4L2 devices [RFC]

Hello,

During the V4L2 mini-summit and the Media Controller RFC discussion at the Linux Plumbers 2009 Conference, a mem2mem video device was mentioned a few times (usually in the context of a 'resizer device', which might be a part of a camera interface pipeline or work as a standalone device). We are doing research on how our custom video/multimedia drivers can fit into the V4L2 framework. Most of our multimedia devices work in mem2mem mode. I did some quick research and found that currently the V4L2 framework has no device that processes video data in a memory-to-memory model.

[Hiremath, Vaibhav] Yes, you are right; we do not have support readily available in the V4L2 framework.

In terms of the V4L2 framework such a device would be both a video sink and source at the same time. The main problem is how the video nodes (/dev/videoX) should be assigned to such a device. The simplest way of implementing a mem2mem device in the v4l2 framework would use two video nodes (one for input and one for output). Such an idea has already been suggested at the V4L2 mini-summit.

[Hiremath, Vaibhav] We discussed 2 options during the summit:
1) Only one video device node, configuring parameters using V4L2_BUF_TYPE_VIDEO_CAPTURE for the input parameters and V4L2_BUF_TYPE_VIDEO_OUTPUT for the output parameters.
2) 2 separate video device nodes, one with V4L2_BUF_TYPE_VIDEO_CAPTURE and another with V4L2_BUF_TYPE_VIDEO_OUTPUT, as mentioned by you.
The obvious and preferred option would be 2, because with option 1 we would not be able to achieve real streaming. And again we would have to put a constraint on the application of a fixed input buffer index.

Each DMA engine (either input or output) that is available in the hardware should get its own video node. In this approach an application can write() a source image to, for example, /dev/video0 and then read the processed output from, for example, /dev/video1. Source and destination format/params/other custom settings can also easily be set for either the source or destination node. Besides single images, user applications can also process video streams by calling stream_on(), qbuf() + dqbuf(), stream_off() simultaneously on both video nodes.

[Hiremath, Vaibhav] Correct.

This approach has a limitation, however. As user applications would have to open 2 different file descriptors to perform the processing of a single image, the v4l2 driver would need to match read() calls done on one file descriptor with write() calls from the other. The same thing would happen with buffers enqueued with qbuf(). In practice, this would result in a driver that allows only one opened instance of /dev/video0 as well as /dev/video1. Otherwise, it would not be possible to track which opened /dev/video0 instance matches which /dev/video1 one.

[Hiremath, Vaibhav] Please note that we must put one limitation on the application: the buffers in both video nodes are mapped one-to-one. This means that:

    Video0 (input)    Video1 (output)
    index 0     ==    index 0
    index 1     ==    index 1
    index 2     ==    index 2

Do you see any other option to this? I think this constraint is obvious from the application point of view during streaming.
The real limitation of this approach is the fact that it is hardly possible to implement multi-instance support and application multiplexing on such a video device. In a typical embedded system, in contrast to most video-source-only or video-sink-only devices, a mem2mem device is very often used by more than one application at a time, be it simple one-shot single-video-frame processing or stream processing. Just consider that the 'resizer' module alone might be used by many applications for scaling bitmaps (the X server video subsystem, gstreamer, jpeglib, etc).

[Hiremath, Vaibhav] Correct.

At first glance one might think that multi-instance support should be implemented in a userspace daemon instead of in the mem2mem drivers. However, I have run into problems designing such a user-space daemon. Usually, video buffers are passed to a v4l2 device as a user pointer or are mmapped directly from the device. The main issue that cannot be easily resolved is passing video buffers from the client application to the daemon. The daemon would queue a request on the device and return the results back to the client application after the transaction is finished. Passing userspace pointers between an application and the daemon cannot be done, as they are two different processes. Mmap-type buffers are similar in this aspect - at least 2 buffer copy operations are required (from the client application to the device input buffers mmapped in the daemon's memory, and then from the device output buffers back to the client application).
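The write()/read() one-shot usage described above could look like the hypothetical fragment below. The device paths are the RFC's examples; the assumption that read() blocks until the transaction completes is driver-defined behavior, not something the API guarantees, and error handling is omitted.

    #include <fcntl.h>
    #include <stddef.h>
    #include <unistd.h>

    int process_frame(const void *src, size_t src_len,
                      void *dst, size_t dst_len)
    {
        int in  = open("/dev/video0", O_WRONLY); /* device input  */
        int out = open("/dev/video1", O_RDONLY); /* device output */

        write(in, src, src_len);             /* feed the source image */
        ssize_t n = read(out, dst, dst_len); /* collect the result    */

        close(in);
        close(out);
        return n < 0 ? -1 : 0;
    }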
Re: Mem2Mem V4L2 devices [RFC]
Hi Marek,

On Fri, 2009-10-02 at 13:45 +0200, Marek Szyprowski wrote:

snip

1. How to set a different color space or size for the input and output buffers? It could be solved by adding a set of ioctls to get/set the source image format and size, while the existing v4l2 ioctls would only refer to the output buffer. Frankly speaking, we don't like this idea.

I think it is not unusual for one video device to support input and output operation at the same time.
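For contrast, the rejected idea from problem 1 would have meant driver-private ioctls along these lines; the names and numbers below are pure invention for illustration, not any proposed or existing V4L2 API.

    #include <linux/videodev2.h>

    /* Hypothetical, driver-private "source format" ioctls - NOT part of
     * V4L2. BASE_VIDIOC_PRIVATE is the base reserved for such driver
     * extensions. */
    #define VIDIOC_S_SRC_FMT _IOWR('V', BASE_VIDIOC_PRIVATE + 0, struct v4l2_format)
    #define VIDIOC_G_SRC_FMT _IOWR('V', BASE_VIDIOC_PRIVATE + 1, struct v4l2_format)

Ivan's point is that none of this is needed: the existing S_FMT ioctl with the two buffer types already expresses both sides of the transaction.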