> The encoder unfortunately only supplies frames in Annex B byte stream format 
> requiring the frames to be parsed.

Are you sure about this?  (Often, hardware encoders have firmware upgrades 
available.)


> Previously I was using my own class to identify the NAL units in conjunction 
> with the H264VideoDiscreteFramer which worked fine but it's heavy on the CPU. 
> So I've been trying to use the H264VideoFramer and just pass the full frames 
> in which works ok and is faster than my solution except that I'm seeing a lot 
> of truncated frames.
> 
> Having looked into the code it appears to be caused by the behaviour of the 
> StreamParser class; specifically the ensureValidBytes1() method which calls 
> getNextFrame() on my source with maxSize = BANK_SIZE - fTotNumValidBytes. The 
> method switches banks to ensure that the larger of numBytesNeeded or the 
> input source's maxFrameSize() will fit.
> 
> I can 'fix' the problem by increasing BANK_SIZE and implementing 
> maxFrameSize() on my source but I'm not totally happy with this solution 
> because I would prefer not to modify the library source and I'm just guessing 
> for the maxFrameSize() value.
> 
> I was wondering whether it's possible to return a partial frame from my video 
> source?

Yes, but not in the way that you might think :-)  An H.264 encoder actually 
delivers “NAL units”.  “NAL units” are what actually get parsed by our code, 
and packed into RTP packets.

Often, a “NAL unit” is a complete frame.  It is possible, however, for a ‘key 
frame’ to be split up - by your encoder - into multiple ‘slice’ NAL units.  For 
datagram streaming (e.g., over RTP), it is *much* better to have your key 
frames broken up into multiple ‘slice’ NAL units, than to have the key frame be 
a single, large NAL unit - which is what you have now.  This is especially true 
if your key frames are exceptionally large: ~150000 bytes or larger, which 
appears to be the case for you, because you are hitting the BANK_SIZE limit 
(which was deliberately set to be larger than realistically needed).

Note that a 150000-byte key frame NAL unit will get transmitted as more than 
100 RTP packets (datagrams), assuming a typical payload of roughly 1400 bytes 
per packet.  (Our code automatically handles the required fragmentation.)  If 
*any* of these 100+ packets gets lost in transit, then the entire key frame 
will be undeliverable.
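
To make the arithmetic concrete, here is a minimal sketch (not part of the library) that estimates how many RTP packets a NAL unit needs, and how likely it is that at least one of them is lost.  The function names are hypothetical, the 1400-byte payload size and the per-packet loss rate are assumed values, and packet losses are assumed to be independent; per-fragment header overhead is ignored.

```cpp
#include <cmath>

// Number of RTP packets needed to carry a NAL unit of 'nalSize' bytes,
// assuming each packet carries at most 'payloadSize' bytes of NAL data.
// (Hypothetical helper; fragment header overhead is ignored.)
unsigned packetsNeeded(unsigned nalSize, unsigned payloadSize) {
  return (nalSize + payloadSize - 1) / payloadSize;  // ceiling division
}

// Probability that at least one of 'numPackets' packets is lost, assuming
// each packet is lost independently with probability 'lossRate'.
// Losing any one fragment makes the whole NAL unit undeliverable.
double frameLossProbability(unsigned numPackets, double lossRate) {
  return 1.0 - std::pow(1.0 - lossRate, static_cast<double>(numPackets));
}
```

With these assumed values, packetsNeeded(150000, 1400) returns 108, and frameLossProbability(108, 0.01) comes out at roughly 0.66: on a network with 1% packet loss, about two out of three such key frames would not arrive intact.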

If, instead, your encoder delivers each key frame as multiple ‘slice’ NAL 
units, then your streaming will be much more resilient to network packet loss.

So, your first task should be to check whether your encoder:
1/ can be reconfigured to deliver discrete frames, rather than a stream with 
each NAL unit prepended by a 0x00 0x00 0x00 0x01 ‘start code’, and
2/ can be reconfigured to deliver key frames as multiple ‘slice’ NAL units, 
rather than as a single (ridiculously large) NAL unit.
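
For reference, the ‘start code’ framing mentioned in point 1 can be illustrated with a short sketch.  This is not the library's parser; the names NalSpan and splitAnnexB are hypothetical, and the code assumes the buffer holds complete NAL units.  It scans for 0x00 0x00 0x01 (which also matches the tail of the 4-byte form) and reports where each NAL unit starts and how long it is, without the start code.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Location of one NAL unit inside an Annex B buffer: offset of its first
// byte (just past the start code) and its length in bytes.
struct NalSpan {
  std::size_t offset;
  std::size_t length;
};

// Scan an Annex B byte stream and return the spans of the NAL units in it.
// Matches the 3-byte start code 00 00 01; a 4-byte start code 00 00 00 01
// is handled because its extra leading zero is trimmed from the end of the
// preceding NAL unit (a NAL unit never ends in a 0x00 byte).
std::vector<NalSpan> splitAnnexB(const std::uint8_t* data, std::size_t size) {
  std::vector<NalSpan> nals;

  // Record the NAL unit in [start, end), trimming trailing zero bytes.
  auto emit = [&](std::size_t start, std::size_t end) {
    while (end > start && data[end - 1] == 0) --end;
    if (end > start) nals.push_back({start, end - start});
  };

  bool haveStart = false;    // true once the first start code has been seen
  std::size_t nalStart = 0;  // offset of the current NAL unit's first byte
  std::size_t i = 0;
  while (i + 3 <= size) {
    if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
      if (haveStart) emit(nalStart, i);  // close the previous NAL unit
      nalStart = i + 3;
      haveStart = true;
      i += 3;
    } else {
      ++i;
    }
  }
  if (haveStart) emit(nalStart, size);  // last NAL unit runs to end of buffer
  return nals;
}
```

The low five bits of the first byte of each span (data[offset] & 0x1F) give the NAL unit type, for example 7 for an SPS, 8 for a PPS and 5 for an IDR slice.  An encoder that delivers discrete NAL units makes this scan unnecessary.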


Ross Finlayson
Live Networks, Inc.
http://www.live555.com/

_______________________________________________
live-devel mailing list
[email protected]
http://lists.live555.com/mailman/listinfo/live-devel
