(Retry, hopefully this is the correct dxr3-devel list address. I've 
added the dxr3 maintainer as an extra CC as well, just in case it 
bounces again).

RFC MPEG encoding and decoding V4L2 API additions
Version 0.1

This RFC adds new functionality to the V4L2 API in order to properly 
support MPEG hardware encoders and decoders. This is mostly driven by 
the work to get the ivtv driver (www.ivtvdriver.org) into the kernel, 
but it can also benefit other hardware encoders and decoders, which is 
why this RFC is cross-posted to the dxr3-devel mailing list as well.

A general note: while MPEG-1/2/4 is currently the codec family most 
often found, this RFC should also work for other compressed-stream 
formats, possibly with some later additions.

This RFC only deals with the encoding and decoding part. The cx23415 
also supports an On-Screen Display (OSD). A separate RFC for that will 
appear later; I need to do some more research before I can issue it.

This RFC is divided into several sections. The first section describes a 
few additional MPEG compression controls. It is followed by a 
description of the new Program Index functionality. Then a description 
is given of the actual MPEG encoding commands (start, stop, pause, 
resume) and the timing query ioctl.

This is followed by a description of new MPEG decompression controls and 
a description of the MPEG decoding commands and timing query ioctls.

Finally, there is a section on the rationale for some of the decisions 
taken in this RFC.


Part I: MPEG encoding
=====================

MPEG compression controls
-------------------------

V4L2_CID_MPEG_VIDEO_MUTE
Type: integer
Description: Mutes the video to a fixed color when capturing. This is 
useful for testing as it creates a fixed and reproducible video 
bitstream.

The supplied 32-bit integer has the following bit layout:

         0      '0'=video not muted
                '1'=video muted, creates frames with the YUV color
                    defined below
         1:7    Unused, set to 0.
         8:15   V chrominance information
        16:23   U chrominance information
        24:31   Y luminance information
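
As an illustration of the bit layout above, the control value could be 
packed as in the following sketch. The helper name video_mute_value() 
and the use of <stdint.h> types are assumptions made for this example; 
they are not part of the proposed API.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the RFC): packs the value for
 * V4L2_CID_MPEG_VIDEO_MUTE according to the bit layout given above. */
static uint32_t video_mute_value(int muted, uint8_t y, uint8_t u, uint8_t v)
{
        uint32_t value = muted ? 1u : 0u;       /* bit 0: mute enable */

        value |= (uint32_t)v << 8;              /* bits 8:15:  V chrominance */
        value |= (uint32_t)u << 16;             /* bits 16:23: U chrominance */
        value |= (uint32_t)y << 24;             /* bits 24:31: Y luminance */
        return value;                           /* bits 1:7 remain 0 */
}
```

For example, video_mute_value(1, 0x10, 0x80, 0x80) mutes the video to 
Y=0x10, U=0x80, V=0x80 and yields 0x10808001.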

V4L2_CID_MPEG_AUDIO_MUTE
Type: bool
Description: Mutes the audio when capturing. This is not done by muting 
the audio hardware, which can still produce a slight hiss, but in the 
encoder itself, guaranteeing a fixed and reproducible audio bitstream.

0 = unmuted, 1 = muted.
 
V4L2_CID_MPEG_CX2341X_STREAM_INSERT_NAV_PACKETS
Type: bool
Description: this control is specific to the CX23415/6. If set, then it 
enables navigation pack insertion for DVD. To be precise: it adds 0xbf 
(private stream 2) packets to the MPEG. The size of these packets is 
2048 bytes (including the 6-byte header). The payload is zeroed and it 
is up to the application to fill them in. These packets are inserted 
every four frames.
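
An application that wants to fill in these packets first has to find 
them in the captured stream. The sketch below checks whether a buffer 
starts with such a packet. It assumes the standard PES packet header 
(start code prefix 00 00 01, stream ID 0xbf, 16-bit length) and 
assumes that the encoder sets the length field to the 2042 payload 
bytes; the latter has not been verified against the firmware. The 
names used are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define NAV_PACKET_SIZE  2048   /* total size, including the header */
#define NAV_HEADER_SIZE  6      /* start code (4 bytes) + length (2 bytes) */
#define NAV_PAYLOAD_SIZE (NAV_PACKET_SIZE - NAV_HEADER_SIZE)

/* Hypothetical helper: returns 1 if 'p' points at a private stream 2
 * packet with the size described above, 0 otherwise. */
static int is_nav_packet(const uint8_t *p, size_t len)
{
        if (len < NAV_PACKET_SIZE)
                return 0;
        /* PES start code prefix 00 00 01 followed by stream ID 0xbf */
        if (p[0] != 0x00 || p[1] != 0x00 || p[2] != 0x01 || p[3] != 0xbf)
                return 0;
        /* Assumption: the big-endian length field counts the payload only */
        return (((size_t)p[4] << 8) | p[5]) == NAV_PAYLOAD_SIZE;
}
```

The zeroed payload then starts at p + NAV_HEADER_SIZE.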

0 = do not insert, 1 = insert DVD navigation packets.


MPEG Program Index
------------------

#define V4L2_PGMIDX_FRAME_P 0
#define V4L2_PGMIDX_FRAME_I 1
#define V4L2_PGMIDX_FRAME_B 2
#define V4L2_PGMIDX_FRAME_MASK 3

struct v4l2_pgmidx_entry {
        u64 offset;
        u64 pts;
        u32 length;
        u32 flags;
        u32 reserved[2];
};

#define V4L2_PGMIDX_ENTRIES (64)
struct v4l2_pgmidx {
        u32 entries;
        u32 entries_cap;
        u32 reserved[4];
        struct v4l2_pgmidx_entry entry[V4L2_PGMIDX_ENTRIES];
};
#define VIDIOC_G_ENCODER_PGMIDX        _IOR('V', 64, struct v4l2_pgmidx)

Returns the program index entries. Each entry states that a frame 
(P/I/B according to the flags) starts at the given offset, with the 
given PTS (Presentation Time Stamp) and length. The offset may never 
exceed the number of bytes actually read, i.e. the ioctl should never 
return 'future events'.

'entries' is the number of entries filled in the entry 
array. 'entries_cap' is the capacity of the index in the driver. This 
may be larger or smaller than V4L2_PGMIDX_ENTRIES. 'entries' will 
always be less than or equal to min(entries_cap, V4L2_PGMIDX_ENTRIES).

If this ioctl is called when no capture is in progress, then 'entries' 
is 0 and 'entries_cap' should be set to the capacity. This way 
applications can check beforehand how frequently the index should be 
obtained. 
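
To illustrate how an application might consume the index, the sketch 
below collects the offsets of all I-frames from one index result, for 
example to build a table of seek points. The structs are copied from 
above with <stdint.h> types standing in for u32/u64; the helper 
pgmidx_iframe_offsets() is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define V4L2_PGMIDX_FRAME_P    0
#define V4L2_PGMIDX_FRAME_I    1
#define V4L2_PGMIDX_FRAME_B    2
#define V4L2_PGMIDX_FRAME_MASK 3
#define V4L2_PGMIDX_ENTRIES    (64)

struct v4l2_pgmidx_entry {
        uint64_t offset;
        uint64_t pts;
        uint32_t length;
        uint32_t flags;
        uint32_t reserved[2];
};

struct v4l2_pgmidx {
        uint32_t entries;
        uint32_t entries_cap;
        uint32_t reserved[4];
        struct v4l2_pgmidx_entry entry[V4L2_PGMIDX_ENTRIES];
};

/* Hypothetical helper: stores the offsets of the I-frames found in
 * 'idx' into 'out' (at most 'out_len') and returns how many were stored. */
static size_t pgmidx_iframe_offsets(const struct v4l2_pgmidx *idx,
                                    uint64_t *out, size_t out_len)
{
        uint32_t count = idx->entries;
        size_t n = 0;

        /* Be defensive: never read past the end of the entry array. */
        if (count > V4L2_PGMIDX_ENTRIES)
                count = V4L2_PGMIDX_ENTRIES;

        for (uint32_t i = 0; i < count && n < out_len; i++) {
                uint32_t type = idx->entry[i].flags & V4L2_PGMIDX_FRAME_MASK;

                if (type == V4L2_PGMIDX_FRAME_I)
                        out[n++] = idx->entry[i].offset;
        }
        return n;
}
```

In a real application 'idx' would be filled in by a 
VIDIOC_G_ENCODER_PGMIDX call, repeated often enough that the driver's 
index (of size 'entries_cap') does not overflow between calls.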


MPEG Encoding commands
----------------------

#define V4L2_ENC_CMD_START      (0)
#define V4L2_ENC_CMD_STOP       (1)
#define V4L2_ENC_CMD_PAUSE      (2)
#define V4L2_ENC_CMD_RESUME     (3)

/* Flags for V4L2_ENC_CMD_STOP */
#define V4L2_ENC_CMD_STOP_AT_GOP_END    (1 << 0)

struct v4l2_encoder_cmd {
        __u32 cmd;
        __u32 flags;
        union {
                struct {
                        __u32 data[16];
                } raw;
        };
};
#define VIDIOC_ENCODER_CMD     _IOWR('V', 69, struct v4l2_encoder_cmd)
#define VIDIOC_TRY_ENCODER_CMD _IOWR('V', 69, struct v4l2_encoder_cmd)

Before calling this ioctl the unused fields of v4l2_encoder_cmd must be 
zeroed.

'cmd' is set by the user and is the command for the encoder.
'flags' is currently only used by the STOP command and contains one bit: 
If V4L2_ENC_CMD_STOP_AT_GOP_END is set, then the capture continues 
until the end of the GOP, otherwise it stops immediately.

These ioctls will check whether the command is supported (-EINVAL is 
returned if not) and modify any arguments if needed to make it a valid 
call for the available hardware. The modified arguments are returned. 
The VIDIOC_TRY_ENCODER_CMD is identical to VIDIOC_ENCODER_CMD, except 
that the TRY ioctl does not actually execute the command.

Note that a read() to a stopped encoder implies a V4L2_ENC_CMD_START. A 
close() of an encoder that is currently encoding implies an immediate 
V4L2_ENC_CMD_STOP. When the encoder has no more pending data after 
issuing a STOP the read() call will return 0 to indicate that the 
encoder has stopped. The next read will start the encoder again.
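
As a sketch of the zeroing requirement, a STOP command could be 
prepared as follows. The struct is copied from above with <stdint.h> 
types standing in for __u32, and enc_cmd_stop() is a hypothetical 
helper used for this example only.

```c
#include <stdint.h>
#include <string.h>

#define V4L2_ENC_CMD_STOP               (1)
#define V4L2_ENC_CMD_STOP_AT_GOP_END    (1 << 0)

struct v4l2_encoder_cmd {
        uint32_t cmd;
        uint32_t flags;
        union {
                struct {
                        uint32_t data[16];
                } raw;
        };
};

/* Hypothetical helper: prepares a STOP command. All unused fields are
 * zeroed first, as the RFC requires. */
static void enc_cmd_stop(struct v4l2_encoder_cmd *c, int at_gop_end)
{
        memset(c, 0, sizeof(*c));
        c->cmd = V4L2_ENC_CMD_STOP;
        if (at_gop_end)
                c->flags |= V4L2_ENC_CMD_STOP_AT_GOP_END;
}
```

The prepared struct would then be passed to VIDIOC_TRY_ENCODER_CMD to 
check for support, or to VIDIOC_ENCODER_CMD to actually stop the 
capture.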

MPEG Timing query
-----------------

struct v4l2_stream_timing {
        u32 frame;      // frame counter from start of capture/playback.
                        // starts at 1. 0 = unknown
        u64 pts;        // MPEG presentation time stamp.  33 bits, 0 = unknown
        u64 clock_ref;  // MPEG system clock reference.  42 bits, 0 = unknown.
        u32 reserved[8];
};
#define VIDIOC_G_ENCODER_TIMING   _IOR('V', 70, struct v4l2_stream_timing)

Return the timing information of the last read frame.
The unit of the PTS is 1/90000 second.
The clock_ref (also known as a SCR for an MPEG Program Stream or PCR for 
an MPEG Transport Stream) consists of two parts: bits 9-41 are the 
reference base in units of 1/90000 second. Bits 0-8 form the reference 
extension with units of 1/27000000 second. The range of the ref. 
extension is 0-299. If unknown, then the reference extension must be 
set to 0.

These units come from the MPEG standard. Room is reserved in the timing 
struct for other timing information should that be required.
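
The following sketch shows how the two parts of clock_ref combine into 
a single 27 MHz tick count, and how a PTS converts to seconds. One unit 
of the reference base (1/90000 second) equals 300 units of the 
reference extension (1/27000000 second), which is why the extension 
ranges from 0 to 299. The function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: converts a clock_ref value, laid out as
 * described above, into a count of 27 MHz clock ticks. */
static uint64_t clock_ref_to_27mhz(uint64_t clock_ref)
{
        uint64_t base = (clock_ref >> 9) & ((1ULL << 33) - 1); /* bits 9-41 */
        uint64_t ext = clock_ref & 0x1ff;                       /* bits 0-8 */

        /* 27000000 / 90000 = 300 extension ticks per base tick */
        return base * 300 + ext;
}

/* Hypothetical helper: converts a 33-bit PTS (90 kHz units) to seconds. */
static double pts_to_seconds(uint64_t pts)
{
        return (double)(pts & ((1ULL << 33) - 1)) / 90000.0;
}
```

For example, a clock_ref with base 90000 and extension 150 corresponds 
to 27000150 ticks, i.e. just over one second.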




Part II: MPEG decoding
======================

MPEG decompression controls
---------------------------

The MPEG decompression controls all belong to the MPEG decompression
class:

/* MPEG-decompression controls */
#define V4L2_CTRL_CLASS_MPEG_DEC 0x009a0000

enum v4l2_mpeg_dec_audmode {
        V4L2_DECODER_AUDMODE_STEREO = 0,
        V4L2_DECODER_AUDMODE_LEFT   = 1,
        V4L2_DECODER_AUDMODE_RIGHT  = 2,
        V4L2_DECODER_AUDMODE_MONO   = 3,
        V4L2_DECODER_AUDMODE_SWAP   = 4,
};
V4L2_CID_MPEG_DEC_AUDMODE_STEREO
Type: v4l2_mpeg_dec_audmode enum
Description: Select how an MPEG stereo audio stream should be decoded.

V4L2_CID_MPEG_DEC_AUDMODE_BILINGUAL
Type: v4l2_mpeg_dec_audmode enum
Description: Select how an MPEG bilingual audio stream should be 
decoded.

Background information: the ivtv driver detects when the capture source 
has bilingual audio and sets the MPEG stream marker that tells the 
decoder that the content of the stream contains bilingual audio. The 
decoder detects this marker as well and automatically selects the 
stereo or bilingual audio mode.

V4L2_CID_MPEG_DEC_STREAM_PID_AUDIO
Type: integer
Description: Select which audio Transport Stream Packet ID should be 
used for playback. Default = 256.

V4L2_CID_MPEG_DEC_STREAM_PID_VIDEO
Type: integer
Description: Select which video Transport Stream Packet ID should be 
used for playback. Default = 260.




MPEG Decoding commands
----------------------

#define V4L2_DEC_CMD_START              (0)
#define V4L2_DEC_CMD_STOP               (1)
#define V4L2_DEC_CMD_PAUSE              (2)
#define V4L2_DEC_CMD_RESUME             (3)
#define V4L2_DEC_CMD_SPEED              (4)
#define V4L2_DEC_CMD_REVERSE_SPEED      (5)
#define V4L2_DEC_CMD_STEP               (6)
#define V4L2_DEC_CMD_REVERSE_STEP       (7)
#define V4L2_DEC_CMD_PASSTHROUGH_START  (8)
#define V4L2_DEC_CMD_PASSTHROUGH_STOP   (9)

/* Flags for V4L2_DEC_CMD_PAUSE */
#define V4L2_DEC_CMD_PAUSE_TO_BLACK     (1 << 0)

/* Flags for V4L2_DEC_CMD_STOP */
#define V4L2_DEC_CMD_STOP_TO_BLACK      (1 << 0)
#define V4L2_DEC_CMD_STOP_WAIT_FOR_END  (1 << 1)

/* Flags for V4L2_DEC_CMD_SPEED/REVERSE_SPEED */
#define V4L2_DEC_CMD_SPEED_MUTE_AUDIO   (1 << 0)

/* Speed input formats: */

/* The decoder has no special format requirements */
#define V4L2_DEC_SPEED_FMT_NONE         (0)
/* The decoder requires full GOPs */
#define V4L2_DEC_SPEED_FMT_GOP          (1)

struct v4l2_decoder_cmd {
        __u32 cmd;
        __u32 flags;
        union {
                struct {
                        __u64 pts;
                } stop;

                struct {
                        struct v4l2_fract factor;
                        __u32 format;
                } speed;

                struct {
                        __u32 data[16];
                } raw;
        };
};
#define VIDIOC_DECODER_CMD     _IOWR('V', 69, struct v4l2_decoder_cmd)
#define VIDIOC_TRY_DECODER_CMD _IOWR('V', 69, struct v4l2_decoder_cmd)

Before calling this ioctl the unused fields of v4l2_decoder_cmd must
be zeroed.

'cmd' is set by the user and is the command for the decoder.
The PASSTHROUGH commands are probably fairly specific for the cx23415: 
if the passthrough mode is started then the video/audio input is routed 
straight to the video/audio output. This is done internally on the 
cx23415. While PASSTHROUGH is on, it is still possible to record from 
the input at the same time. It's basically live TV functionality. The 
other commands are self-explanatory.

'flags' is used by several commands:

PAUSE and STOP can either leave the last frame or clear the output to 
black at the end.

STOP can also wait for the command to finish, so the ioctl doesn't 
return until the decoder has stopped decoding. Useful for waiting until 
all buffers are decoded. -EINTR is returned if a signal interrupted 
this ioctl. It is also possible to specify a PTS to stop at. If pts == 
0, then the decoder stops accepting new data immediately.
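
A STOP command that blanks the output and waits until a given PTS has 
been reached could be prepared as in this sketch. The structs are 
copied from above with <stdint.h> types standing in for __u32/__u64; 
struct v4l2_fract is assumed to be the usual numerator/denominator 
pair, and dec_cmd_stop() is a hypothetical helper.

```c
#include <stdint.h>
#include <string.h>

#define V4L2_DEC_CMD_STOP               (1)
#define V4L2_DEC_CMD_STOP_TO_BLACK      (1 << 0)
#define V4L2_DEC_CMD_STOP_WAIT_FOR_END  (1 << 1)

struct v4l2_fract {
        uint32_t numerator;
        uint32_t denominator;
};

struct v4l2_decoder_cmd {
        uint32_t cmd;
        uint32_t flags;
        union {
                struct {
                        uint64_t pts;
                } stop;
                struct {
                        struct v4l2_fract factor;
                        uint32_t format;
                } speed;
                struct {
                        uint32_t data[16];
                } raw;
        };
};

/* Hypothetical helper: prepares a STOP command. A pts of 0 means
 * "stop accepting new data immediately". */
static void dec_cmd_stop(struct v4l2_decoder_cmd *c, uint64_t pts,
                         int to_black, int wait_for_end)
{
        memset(c, 0, sizeof(*c));       /* unused fields must be zero */
        c->cmd = V4L2_DEC_CMD_STOP;
        c->stop.pts = pts;
        if (to_black)
                c->flags |= V4L2_DEC_CMD_STOP_TO_BLACK;
        if (wait_for_end)
                c->flags |= V4L2_DEC_CMD_STOP_WAIT_FOR_END;
}
```

When V4L2_DEC_CMD_STOP_WAIT_FOR_END is set, the caller should be 
prepared to handle -EINTR from the ioctl.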

The SPEED commands can mute the audio.

The speed is set using a fraction where 1 is normal speed. The driver 
will map this fraction to the next valid speed that is supported by 
hardware.

The format is set to the input requirements of the decoder in order to 
handle the given speed. Either there are no requirements, or it 
requires that full GOPs are passed to the decoder at a time. That is 
for example how reverse playback is implemented: a full Group Of 
Pictures is passed to the decoder, followed by the previous GOP, etc. 
etc. In the future additional formats might be added, such as I-frames 
only.

If you want faster playback than is supported by the hardware, then you 
need to do so in software by skipping GOPs.
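
How a driver maps the requested fraction to a supported speed is left 
to the driver. One possible approach, shown here purely as a sketch, 
is to pick the closest entry from a table of speeds that the hardware 
supports. The table contents, the function names and the 
"closest match" policy are all assumptions made for this example; 
struct v4l2_fract is assumed to be the usual numerator/denominator 
pair.

```c
#include <stddef.h>
#include <stdint.h>

struct v4l2_fract {
        uint32_t numerator;
        uint32_t denominator;
};

/* Numerator of |a - b| over the common denominator
 * a.denominator * b.denominator. */
static uint64_t fract_dist_num(struct v4l2_fract a, struct v4l2_fract b)
{
        uint64_t x = (uint64_t)a.numerator * b.denominator;
        uint64_t y = (uint64_t)b.numerator * a.denominator;

        return x > y ? x - y : y - x;
}

/* Hypothetical driver helper: returns the entry of 'table' (n >= 1,
 * non-zero denominators) closest to 'want'. On a tie the earlier
 * entry wins. Assumes the values are small enough not to overflow. */
static struct v4l2_fract closest_speed(struct v4l2_fract want,
                                       const struct v4l2_fract *table,
                                       size_t n)
{
        struct v4l2_fract best = table[0];

        for (size_t i = 1; i < n; i++) {
                /* Cross-multiply to compare both distances without
                 * resorting to floating point. */
                uint64_t lhs = fract_dist_num(want, table[i]) * best.denominator;
                uint64_t rhs = fract_dist_num(want, best) * table[i].denominator;

                if (lhs < rhs)
                        best = table[i];
        }
        return best;
}
```

With a table of 1/2, 1/1, 3/2 and 2/1, a requested speed of 5/3 would 
be mapped to 3/2, and the modified fraction would be returned to the 
application in the 'speed.factor' field.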

STEP/REVERSE_STEP will step through the MPEG stream frame-by-frame.

These ioctls will check whether the command is supported (-EINVAL is 
returned if not) and modify any arguments if needed to make it a valid 
call for the available hardware. The modified arguments are returned. 
The VIDIOC_TRY_DECODER_CMD is identical to VIDIOC_DECODER_CMD, except 
that the TRY ioctl does not actually execute the command.

Note that a write() to a stopped decoder implies a V4L2_DEC_CMD_START. A 
close() of a decoder that is currently decoding implies an immediate 
V4L2_DEC_CMD_STOP. When the decoder stops accepting data after issuing 
a STOP the write() call will return 0 to indicate that the decoder has 
stopped and accepts no more data. The next write will start the decoder 
again.


MPEG Timing query
-----------------

#define VIDIOC_G_DECODER_TIMING   _IOR('V', 70, struct v4l2_stream_timing)

Return the timing information of the last frame that was played back.

#define VIDIOC_G_DECODER_TIMING_SYNC   _IOR('V', 70, struct v4l2_stream_timing)

Wait for next frame to be displayed and return the timing information of 
that frame.

The unit of the PTS is 1/90000 second.
The clock_ref (also known as a SCR for an MPEG Program Stream or PCR for 
an MPEG Transport Stream) consists of two parts: bits 9-41 are the 
reference base in units of 1/90000 second. Bits 0-8 form the reference 
extension with units of 1/27000000 second. The range of the ref. 
extension is 0-299. If unknown, then the reference extension must be 
set to 0.

These units come from the MPEG standard. Room is reserved in the timing 
struct for other timing information should that be required.




Part III: Rationale of Encoder/Decoder Commands
===============================================


Encoder/Decoder commands are simple commands to the encoder or decoder: 
start, stop, pause, resume, fast forward, etc. Basically the commands 
you have on a DVD/CD player.

Not all hardware supports all actions, so the programmer needs to be 
able to query somehow what is supported. Just checking whether e.g. 
PAUSE returns -EINVAL is not an option: you need to be able to check 
the presence of an action without actually executing it.

Each action can have flags and other arguments. E.g. PAUSE has a flag to 
say whether the TV-OUT should go to black or if the last frame should 
remain. STOP has an optional PTS to postpone stopping until that PTS is 
reached. The speed settings for FWD/REW are more complicated since it 
depends on the hardware what speeds are supported natively, so you need 
to be able to query whether a certain speed is supported and if not, 
what the closest matching speed is.

There are several options for implementing this:

1) Each action has its own struct and ioctl. One CAP ioctl returns
   a bitmask listing the supported actions.

+ simple
- no way to check which flags/values are supported, esp. for possible 
speed settings
- lots of ioctls
- new actions -> new ioctls + struct

2) One action ioctl, receiving a struct containing an action enum
   and a union where each action has its own struct. One CAP ioctl
   returns a bitmask listing the supported actions.

+ simple
+ easily extendable with new actions (although limited by the CAP
  bitmask width, but that's unlikely to be a problem)
- cannot check speed ratio this way

3) One action ioctl as 2) and a corresponding TRY ioctl

+ easily extendable with new actions
+ able to check/modify action arguments.
- the TRY is more complicated

4) One action ioctl as 2) and a CAP ioctl with its own struct
   containing an action enum and a union with capability settings
   for each type of action. For the speed check it allows you to
   specify a speed and it will return the closest supported one.

+ simple
+ easily extendable with new actions
+ speed can be checked
- CAP is contrived. Would need its own union to return action
  specific capabilities.

Option 3 has IMHO the best balance between extendability and ease of 
use. It also matches existing usage of 'TRY' ioctls.
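
To show what option 3 means for applications, the sketch below probes 
which commands are supported without executing any of them. The 
callback type try_fn stands in for an 
ioctl(fd, VIDIOC_TRY_DECODER_CMD, &cmd) call, and fake_try() is an 
imaginary driver that only supports START, STOP, PAUSE and RESUME. 
Both, as well as the abbreviated struct, are assumptions made for this 
example.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define V4L2_DEC_CMD_RESUME (3)

/* Abbreviated version of the struct from Part II (raw view only). */
struct v4l2_decoder_cmd {
        uint32_t cmd;
        uint32_t flags;
        union {
                struct {
                        uint32_t data[16];
                } raw;
        };
};

/* Stand-in for the TRY ioctl: returns 0 or a negative error code. */
typedef int (*try_fn)(struct v4l2_decoder_cmd *cmd);

/* Imaginary driver: supports commands 0 (START) up to 3 (RESUME). */
static int fake_try(struct v4l2_decoder_cmd *cmd)
{
        return cmd->cmd <= V4L2_DEC_CMD_RESUME ? 0 : -EINVAL;
}

/* Hypothetical helper: returns a bitmask with bit N set if command N
 * (0 <= N <= max_cmd, N < 32) passes the TRY call. Nothing is executed. */
static uint32_t probe_commands(try_fn try_cmd, uint32_t max_cmd)
{
        uint32_t mask = 0;

        for (uint32_t cmd = 0; cmd <= max_cmd && cmd < 32; cmd++) {
                struct v4l2_decoder_cmd c;

                memset(&c, 0, sizeof(c));       /* unused fields zeroed */
                c.cmd = cmd;
                if (try_cmd(&c) == 0)
                        mask |= 1u << cmd;
        }
        return mask;
}
```

This recovers the bitmask that a CAP ioctl would have returned in 
options 1 and 2, while the same TRY call can also be used to check 
flags and speed arguments.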


=============================================================

This concludes this RFC. Comments are welcome!

Regards,

        Hans Verkuil

