Hey Huy and Alex,

Thanks for your ideas; maybe that's what's going wrong. I thought the first 
frame after the PPS would be the IDR, but if there are many slices I will have 
to do some extra work. Interestingly, I tried to collect all frames from RTP and 
concatenate them into one huge frame, but that didn't work.
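
For the multi-slice case, Huy's first_mb_in_slice hint from the quoted thread
below should help to spot where a new frame starts. Rough, untested sketch of
what I have in mind (assuming live555 hands me each NAL without a start code
and that the stream uses plain frame coding):

#include <stdint.h>
#include <stddef.h>

/* nal points at the NAL header byte, i.e. no 0x00 0x00 0x00 0x01 in front */
static int nal_type(const uint8_t *nal)
{
    return nal[0] & 0x1F;  /* 1 = non-IDR slice, 5 = IDR slice, 7 = SPS, 8 = PPS */
}

/* first_mb_in_slice is the first ue(v) field of the slice header; ue(v) == 0
 * is coded as a single '1' bit, so the slice starts at macroblock 0 (a new
 * picture) iff the top bit of the byte after the NAL header is set. */
static int slice_starts_new_picture(const uint8_t *nal, size_t len)
{
    int t = nal_type(nal);
    if (len < 2 || (t != 1 && t != 5))
        return 0;
    return (nal[1] & 0x80) != 0;
}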

If I collect slices, will each of them need a DELIMITER, or only the full frame 
consisting of those slices? That could explain why the huge frame I sent did 
not work:

D       =       Delimiter 0x00,0x00,0x00,0x01
7       =       SPS
8       =       PPS
S       =       Slice of a Frame

The huge frame was possibly something like this:

[D7 D8 DS DS DS DS DS DS DS DS DS DS ....]

but it should perhaps be like this, where D(SSSSSSSSSS) = IDR:
[D7 D8 D(SSSSSSSSSS) .... ]
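
Whichever of the two layouts is right, the collecting code should look roughly
the same. Untested sketch of what I plan to try first: a start code before
every NAL, all NALs of one frame packed into one buffer, and one decode call
per buffer as Huy suggested (append_nal, decode_current_frame and
MAX_FRAME_SIZE are just my own placeholder names, not ffmpeg API):

#include <string.h>
#include <stdint.h>
#include <libavcodec/avcodec.h>

#define MAX_FRAME_SIZE (512 * 1024)   /* arbitrary; no bounds checking below */

static const uint8_t start_code[4] = { 0x00, 0x00, 0x00, 0x01 };
static uint8_t frame_buf[MAX_FRAME_SIZE];
static size_t  frame_len = 0;

/* prepend the 4-byte delimiter and append the NAL to the current frame buffer */
static void append_nal(const uint8_t *nal, size_t nal_len)
{
    memcpy(frame_buf + frame_len, start_code, 4);
    memcpy(frame_buf + frame_len + 4, nal, nal_len);
    frame_len += 4 + nal_len;
}

/* once all slices of one frame are collected, feed the whole buffer at once */
static int decode_current_frame(AVCodecContext *ctx, AVFrame *picture)
{
    int got_picture = 0;
    AVPacket pkt;
    av_init_packet(&pkt);
    pkt.data = frame_buf;
    pkt.size = (int)frame_len;
    frame_len = 0;                     /* start collecting the next frame */
    if (avcodec_decode_video2(ctx, picture, &got_picture, &pkt) < 0)
        return -1;
    return got_picture;
}

The very first buffer would then be SPS + PPS + the IDR slices, as Alex described.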

I will investigate that soon and give some feedback. If I ever get this to 
run, I will publish a howto on that :)

cheers
Sven



On 16.09.2010 at 09:39, Alex Grosu wrote:

> Hi Sven
> 
> I made a test on my side by removing the IDR. Thus, I am now sending to 
> ffmpeg the following sequence:
> [7 8] [1] [1] ... [1]
> Huy is right (I was sure about this, but just wanted to check). Without an IDR, 
> ffmpeg can't decode anything. Thus, I think you will have to find the IDR type 
> (5), bundle it along with the 7 (SPS) and 8 (PPS) types and offer this to ffmpeg. 
> After this, offer each NAL as a whole frame to ffmpeg. Thus, as Huy said 
> previously, if you are getting slices, you will have to collect them all and 
> give them to ffmpeg. If you are getting a frame per NAL, you are in luck: just 
> put the start sequence before each NAL and feed ffmpeg.
> Regards,
> Alex
> 
> Huy Tran wrote:
>> Hi, Sven,
>> 
>> On Wed, Sep 15, 2010 at 7:11 PM, Sven Wasmer <[email protected]> wrote:
>> 
>>  
>>> Hi Alex,
>>> 
>>> thank you for your help! YES, network streamed h264 decoding is a big
>>> secret on the net, but h264 is far more complex than h263.
>>> 
>>> Well, I think I can't use "av_open_input_file", because my server and
>>> client use live555 for the rtsp/rtp handling. Thus I get (non-fragmented)
>>> NALUs from each RTP frame and put them into a buffer.
>>>    
>> 
>> 
>>  
>>> The current procedure is that I get a frame (first SPS, second PPS)
>>> and hand them over to the decoder (error: no frame).
>>> All the other frames are handled the same way.
>>> 
>>> 
>>>    
>> As I said in my previous post, SPS and PPS must be followed by an IDR slice;
>> if not, you get errors.
>> 
>> 
>>> Another point is that I think "ffplay" works the same way you
>>> described in your code, but ffplay also does not play the streamed
>>> h264 correctly for me (only grey difference pictures are played; might it
>>> be missing the IDR frame?). On the other hand, it can play the "file"
>>> very well, so I think the decoder is not initialized correctly.
>>>
>> FFplay can work with streaming. Did you check this?
>> http://www.wu.ece.ufl.edu/projects/wirelessVideo/project/realTimeCoding/
>> 
>> When you pass an IDR slice to avcodec_decode_video, FFmpeg will decode that
>> frame and store it for reference.
>> FFmpeg will not output this IDR until you feed another frame to
>> avcodec_decode_video.
>> 
>> Hope it helps.
>> Huy.
>> 
>>> Maybe I need to back-buffer some of these RTP frames and let
>>> av_read_frame work through that package? That would only make sense
>>> if av_read_frame wrote some data to the AVFormatContext, right?
>>>
>>> I will try it and come back later.
>>>
>>> cheers
>>> Sven
>>> 
>>> 2010/9/15 Alex Grosu <[email protected]>:
>>>    
>>>> Hi Sven
>>>> 
>>>> AFAIK, there isn't any other possibility when decoding raw NALs is wanted.
>>>> The correct manner to handle this kind of issue is only vaguely discussed
>>>> over the net. At least, I couldn't find proper information on this.
>>>> Anyway, you said you want to decode streams which are received through the
>>>> rtsp protocol. Why don't you use:
>>>> 
>>>> if(av_open_input_file(&pFormatCtx, "rtsp_address", NULL, 0, NULL)!=0)
>>>>  return -1; // Couldn't open file
>>>> videoStream=-1;
>>>> audioStream=-1;
>>>> for(i=0; i<pFormatCtx->nb_streams; i++) {
>>>>  if(pFormatCtx->streams[i]->codec->codec_type==CODEC_TYPE_VIDEO &&
>>>>     videoStream < 0) {
>>>>    videoStream=i;
>>>>  }
>>>>  if(pFormatCtx->streams[i]->codec->codec_type==CODEC_TYPE_AUDIO &&
>>>>     audioStream < 0) {
>>>>    audioStream=i;
>>>>  }
>>>> }
>>>> pCodecCtx = pFormatCtx->streams[videoStream]->codec;
>>>> aCodecCtx = pFormatCtx->streams[audioStream]->codec;
>>>> 
>>>> After this, use av_read_frame:
>>>> 
>>>> while(av_read_frame(pFormatCtx, &packet)>=0) {
>>>>   // Is this a packet from the video stream?
>>>>   if(packet.stream_index==videoStream) {
>>>>     // Decode video frame
>>>>     avcodec_decode_video(pCodecCtx, pFrame, &frameFinished,
>>>>                          packet.data, packet.size);
>>>>     if(frameFinished) {
>>>>       // do something with the decoded data
>>>>     }
>>>>   }
>>>> }
>>>> 
>>>> The board I am using sends the streams in two ways:
>>>> 1. using rtsp protocol (for testing purpose)
>>>> 2. sending raw NALs in order when they appear (which I have to use)
>>>> Using the above code, I can properly decode the rtsp protocol.
>>>> Hope this helps
>>>> Regards, Alex
>>>> 
>>>> 
>>>> Sven Wasmer wrote:
>>>>      
>>>>> Hey,
>>>>> 
>>>>> I have the same problem. I want to stream raw h264 streams via RTSP/RTP. I
>>>>> already mentioned my problem in another post, "Re: [libav-user] Decode
>>>>> h264/rtp video -> problem with SPS/PPS and extradata".
>>>>>
>>>>> Did you get any further? I tried to concatenate multiple received
>>>>> RTP frames, always prepending 0x00,0x00,0x00,0x01, but I still get
>>>>> "-1" from avcodec_decode_video.
>>>>>
>>>>> Is there no other way to initialize the decoder correctly for h264
>>>>> using the SDP (SPS, PPS)?
>>>>> 
>>>>> Cheers and thx for your help!
>>>>> Sven
>>>>> 
>>>>> 2010/9/14 Alex Grosu <[email protected]>:
>>>>> 
>>>>>        
>>>>>> Hello Huy
>>>>>> 
>>>>>> I will check your information. Great advice, thanks a lot.
>>>>>> Regards,
>>>>>> Alex
>>>>>> 
>>>>>> Nhat Huy wrote:
>>>>>> 
>>>>>>          
>>>>>>> On Fri, Sep 10, 2010 at 5:38 PM, Alex Grosu <[email protected]> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>>            
>>>>>>>> Hello
>>>>>>>> 
>>>>>>>> The information you passed will help me later, thanks. At the moment,
>>>>>>>> my application communicates with an embedded product which sends only
>>>>>>>> one frame per NAL. Later, it will indeed send multiple slices, and we
>>>>>>>> said that we will see what we are going to do then. Is there any
>>>>>>>> possibility to find which slice is the start of a frame sequence and
>>>>>>>> which slice is the end of it? In your case, for example, is there a
>>>>>>>> possibility to find the first 5 of an I frame and the last 5 of an I
>>>>>>>> frame (or the first 1 and the last 1 of a B frame / P frame)?
>>>>>>>> Regards,
>>>>>>>> Alex
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Nhat Huy wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>              
>>>>>>>>> On Thu, Sep 9, 2010 at 9:00 PM, Alex Grosu <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>                
>>>>>>>>>> Hello Huy
>>>>>>>>>> 
>>>>>>>>>> Thanks a lot for the answer. I found this solution 2 days ago, but at
>>>>>>>>>> first I thought it was a dirty one. I wanted to check it before posting
>>>>>>>>>> here again, but you got there ahead of me. Your pointing it out shows me
>>>>>>>>>> that in fact this is not a dirty solution, so thanks a lot for the
>>>>>>>>>> support.
>>>>>>>>>> So, what I had to do (as Huy suggested) was to couple the 7, 8 and 5 NAL
>>>>>>>>>> types into only one buffer and present it to libavcodec (in the
>>>>>>>>>> avcodec_decode_video2 function).
>>>>>>>>>> Thus, using the [ and ] characters to delimit buffer boundaries, I was
>>>>>>>>>> coupling the following before sending them to libavcodec:
>>>>>>>>>> [7 8 5] [1] [1] [1] [1] [7 8 5] [1] [1] [1] and so on.
>>>>>>>>>> I don't receive errors anymore. Everything works now.
>>>>>>>>>>
>>>>>>>>>> Thank you
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Nhat Huy wrote:
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>                  
>>>>>>>>>>> On Thu, Sep 2, 2010 at 3:58 PM, Alex Grosu <[email protected]> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>                    
>>>>>>>>>>>> Hello
>>>>>>>>>>>> 
>>>>>>>>>>>> I am currently decoding h264 streams (with libavcodec) and I am stuck
>>>>>>>>>>>> with the logic for SPS (sequence parameter set) and PPS (picture
>>>>>>>>>>>> parameter set).
>>>>>>>>>>>> The board from which I am receiving the packets is sending one NAL
>>>>>>>>>>>> unit in each packet. The sequence of NAL types is:
>>>>>>>>>>>> 7(SPS) 8(PPS) 5 1 1 1 1 7 8 5 1 1 1...
>>>>>>>>>>>> I am putting the start sequence 0x000001 in front of each NAL and
>>>>>>>>>>>> sending this packet to avcodec_decode_video2. When the 7 and 8 types
>>>>>>>>>>>> come, I apply the same logic, but avcodec_decode_video2 returns -1.
>>>>>>>>>>>> After this, all the received NALs are decoded and the images are
>>>>>>>>>>>> displayed. Every time SPS and PPS come, avcodec_decode_video2 returns
>>>>>>>>>>>> -1 (fails).
>>>>>>>>>>>> I searched all over the net, and I still can't understand how to fill
>>>>>>>>>>>> up the extradata and extradata_size of the AVCodecContext used. All I
>>>>>>>>>>>> found is this link:
>>>>>>>>>>>> http://www.mail-archive.com/[email protected]/msg04939.html
>>>>>>>>>>>> As I saw from there: "To decode H.264 stream you need to have SPS and
>>>>>>>>>>>> PPS NAL units also.". Ok, I have them, but frankly I don't know how to
>>>>>>>>>>>> use them. At first, I thought that avcodec_decode_video2 would
>>>>>>>>>>>> "automatically" use them, but since it returns -1, I don't see how.
>>>>>>>>>>>> Also, if I discard the SPS and PPS instead of sending them to
>>>>>>>>>>>> libavcodec, nothing is decoded anymore.
>>>>>>>>>>>> Can you please give me a hint?
>>>>>>>>>>>> Thank you a lot!
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>> 
>>>>>>>>>>> I think you should read about the H.264 bitstream structure and use
>>>>>>>>>>> these tools to analyze the NAL units of an H.264 bitstream:
>>>>>>>>>>> http://www.codecian.com/
>>>>>>>>>>> http://tsviatko.jongov.com/index_prj_h264videoesviewer.htm
>>>>>>>>>>>
>>>>>>>>>>> When FFmpeg decodes a NAL unit, if the nal_unit_type equals 7 or 8 it
>>>>>>>>>>> will continue decoding to find the IDR NAL unit. In your case, I think
>>>>>>>>>>> that FFmpeg found the SPS and PPS but could not find an IDR, thus it
>>>>>>>>>>> returns -1.
>>>>>>>>>>> 
>>>>>>>>>>> Hope it helps.
>>>>>>>>>>> 
>>>>>>>>>>> Huy.
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>> If you coupled the H.264 slices as [7 8 5] [1] [1] [1] [1] [7 8 5] [1] [1]
>>>>>>>>> [1] and so on, it can only work well in the single-slice-per-frame case.
>>>>>>>>> If you use multiple slices, it goes wrong.
>>>>>>>>>
>>>>>>>>> For example, my sequence has 5 frames, I B B B P, and the number of slices
>>>>>>>>> is 4 per frame. The nal_unit_type sequence of the bitstream is:
>>>>>>>>> [7 8 5 5 5 5][1 1 1 1][1 1 1 1][1 1 1 1][1 1 1 1]
>>>>>>>>> I use [ and ] to mark the frame boundaries.
>>>>>>>>>
>>>>>>>>> When you feed avcodec_decode_video2, the input buffer must contain a whole
>>>>>>>>> frame's data.
>>>>>>>>> Thus, the problem is that you have to calculate the input buffer length
>>>>>>>>> exactly and feed it enough data, plus some padding if necessary, before
>>>>>>>>> passing it to avcodec_decode_video2.
>>>>>>>>> 
>>>>>>>>> Huy.
>>>>>>>
>>>>>>> Hi,
>>>>>>> 
>>>>>>> There is a syntax element named first_mb_in_slice in the slice header. It
>>>>>>> describes the position of the first macroblock in that slice. It equals 0
>>>>>>> to indicate a new frame.
>>>>>>>
>>>>>>> I am not sure whether this information helps or not.
>>>>>>>
>>>>>>> I just know that when I call av_read_frame, I receive a buffer containing
>>>>>>> a whole frame's data. But I do not know how it does this.
>>>>>>> 
>>>>>>> Huy.

_______________________________________________
libav-user mailing list
[email protected]
https://lists.mplayerhq.hu/mailman/listinfo/libav-user
