Hi Chris,

Thanks for the tremendously fast reply.  More below:

On Feb 25, 2009, at 3:52 PM, Christopher Wright wrote:

I have run into two fairly serious performance issues.
Issue 1: Performance on a PPC machine drops off a cliff. PPC machines that could play videos smoothly in the old version have unusable framerates in the new version. To the best of my knowledge, the only significant difference between the two (as far as video playback is concerned) is that the frames are now rendered with QC. Is there a known performance hit for Quartz Composer on PPC hardware? Is there some kind of optimization flag I need to set to target a build for PPC? I don't yet have PPC hardware to test on, so I can't investigate directly; I'm trying to scrounge some up now. I wasn't expecting to hear of factor-of-ten (or more!) slowdowns from the PPC crowd.

There's no severe inherent performance hit from using PPC as far as I've seen. Depending on how you're feeding images to QC, it might be doing colorspace conversions (BGRA->ARGB or whatever) or some other bookkeeping that takes a long time. What size video are you typically using? SD? HD? For smaller video, I wouldn't expect the performance hit to be too severe if it was in fact color space conversion.

People end up trying to use file sizes all over the map. HD is a common choice, though. (But then there are situations like the customer who wrote today saying he was using a 3794x1050 file meant to stretch over three screens. Yarg.)

My attempt at optimizing the rendering is reflected in the creation of the QTMovie's rendering context. It is generated as follows, depending on both the number of screens the image will ultimately be drawn to and what pixel format we should be using (error handling removed, the layer object is my own wrapper that does the rendering with a QC renderer):

        if ([_layers count] == 1) {
                // One screen: render directly into the view's OpenGL context.
                layer = [_layers objectAtIndex:0];
                err = QTOpenGLTextureContextCreate(kCFAllocatorDefault,
                                                   [[layer.videoView openGLContext] CGLContextObj],
                                                   [[layer.videoView pixelFormat] CGLPixelFormatObj],
                                                   nil,
                                                   &_qtVisualContext);
        } else {
                // Multiple screens: render to a pixel buffer instead of
                // directly to the card's VRAM.
                // See: http://developer.apple.com/qa/qa2005/qa1443.html
                CFMutableDictionaryRef pixelBufferOptions;
                CFMutableDictionaryRef visualContextOptions;

                pixelBufferOptions = CFDictionaryCreateMutable(kCFAllocatorDefault, 0,
                                                               &kCFTypeDictionaryKeyCallBacks,
                                                               &kCFTypeDictionaryValueCallBacks);
#if __BIG_ENDIAN__
                SetNumberValue(pixelBufferOptions, kCVPixelBufferPixelFormatTypeKey, k32ARGBPixelFormat);
#else
                SetNumberValue(pixelBufferOptions, kCVPixelBufferPixelFormatTypeKey, k32BGRAPixelFormat);
#endif
                SetNumberValue(pixelBufferOptions, kCVPixelBufferWidthKey, (int)_naturalSize.width);
                SetNumberValue(pixelBufferOptions, kCVPixelBufferHeightKey, (int)_naturalSize.height);
                SetNumberValue(pixelBufferOptions, kCVPixelBufferBytesPerRowAlignmentKey, 16);

                visualContextOptions = CFDictionaryCreateMutable(kCFAllocatorDefault, 0,
                                                                 &kCFTypeDictionaryKeyCallBacks,
                                                                 &kCFTypeDictionaryValueCallBacks);
                CFDictionarySetValue(visualContextOptions,
                                     kQTVisualContextPixelBufferAttributesKey,
                                     pixelBufferOptions);

                err = QTPixelBufferContextCreate(kCFAllocatorDefault,
                                                 visualContextOptions,
                                                 &_qtVisualContext);
        }

I believe this would cover the colorspace conversion correctly, but I may have messed it up...
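One way I could sanity-check that: log the pixel format of the frames the visual context actually hands back. A rough sketch (`frame` is a stand-in name for a CVImageBufferRef copied out of the visual context):

```objc
// Log the incoming frame's four-char pixel format code ('BGRA', '2vuy',
// etc.) to see what QuickTime is actually delivering, and therefore
// whether a colorspace conversion is happening somewhere.
// `frame` is a hypothetical CVImageBufferRef from the visual context.
OSType fmt = CVPixelBufferGetPixelFormatType((CVPixelBufferRef)frame);
NSLog(@"frame pixel format: %c%c%c%c",
      (char)(fmt >> 24), (char)(fmt >> 16), (char)(fmt >> 8), (char)fmt);
```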

Do you have any technical users that could use Shark to profile it, and see what the hangup is?

Yes, I have some savvy users, although the savvy and the PPC may not be overlapping sets--not sure. I'll see if I can get some data that way.

Issue 2: NSOpenGLCPSwapInterval. When I force vertical sync with NSOpenGLCPSwapInterval, QC rendering on the display link callback thread will frequently stall when the main thread is busy, such as during a big GUI redraw of the application's workspace window. If I turn off vertical sync then the stalls cease, but of course then I get tearing. Is there some way to both enforce vertical sync and also keep QC rendering at a consistent rate on the background threads?
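For reference, by "force vertical sync" I just mean setting the context's swap-interval parameter to 1, roughly like this (`videoView` is a stand-in name for the layer's NSOpenGLView):

```objc
// Sync buffer swaps to the display refresh: 1 = vsync on, 0 = off.
// `videoView` is a hypothetical stand-in for the layer's NSOpenGLView.
GLint swapInterval = 1;
[[videoView openGLContext] setValues:&swapInterval
                        forParameter:NSOpenGLCPSwapInterval];
```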

Since you're using a QCRenderer, are you using a CVDisplayLink to render it? This will render on a separate thread, which usually won't stall on GUI updates. (QCViews, and QCRenderers driven from the main thread, compete with GUI drawing and will stall frequently -- this sounds like what's happening, but it's difficult to tell from casual inspection. It appears that you are in fact using display links, but I don't know the structure of the app.)

Yes, that is correct. I'm using a CVDisplayLink, and in the display link callback I'm:

- checking if a new video frame is available, and grabbing it if it is

- passing the current frame to the QC image input port for rendering.

This may be done several times for each currently running video. After each video has been updated I glFlush() and am done with processing on the display link callback.
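In outline, the callback looks roughly like this (a sketch, not my exact code: `MyController`, `_qtVisualContext`, and `renderFrame:atTime:` are stand-in names):

```objc
// Sketch of the display-link callback described above.
static CVReturn displayLinkCallback(CVDisplayLinkRef displayLink,
                                    const CVTimeStamp *inNow,
                                    const CVTimeStamp *inOutputTime,
                                    CVOptionFlags flagsIn,
                                    CVOptionFlags *flagsOut,
                                    void *context)
{
    MyController *controller = (MyController *)context;
    NSAutoreleasePool *pool = [[NSAutoreleasePool alloc] init];

    // 1. Check whether a new video frame is available for this output time.
    if (QTVisualContextIsNewImageAvailable(controller->_qtVisualContext,
                                           inOutputTime)) {
        CVImageBufferRef frame = NULL;
        QTVisualContextCopyImageForTime(controller->_qtVisualContext,
                                        kCFAllocatorDefault,
                                        inOutputTime,
                                        &frame);
        // 2. Pass the frame to the QC image input port and render
        //    (renderFrame:atTime: is a hypothetical wrapper method).
        [controller renderFrame:frame atTime:inOutputTime];
        CVBufferRelease(frame);
    }
    QTVisualContextTask(controller->_qtVisualContext); // housekeeping

    [pool release];
    return kCVReturnSuccess;
}
```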

Or, perchance, is the GUI update/display code in the display link thread (by design or mistake)?

Not by my design--although one of the worst culprits is a piece of GUI that is rendered using CoreAnimation layers. But if those layers are being drawn on the same display link thread I create for my own use, then I have deeply misunderstood how the display link works, how CoreAnimation works, or both.

I've seen a number of weird quirks where GUI updates inadvertently happen on display link threads due to simple mistakes. Or are there any synchronization blocks or locks in place that could be in conflict?

Yes, there are a few sync blocks and locks involved. I will re-examine them before making any cavalier claims that they couldn't possibly be involved. :)

Otherwise, the composition is, as you said, very simple. Is there a reason you switched to QC from OpenGL for this? (There's no way you're going to realize a significant performance benefit from QC when you're drawing a single, mostly unfiltered quad.)


The primary benefit is to empower the user. This way it was easy to allow them to drop in a custom QC file for rendering. It was also a lot easier to code, since I didn't have to code anything. :)

Cheers,
Chris
_______________________________________________
Do not post admin requests to the list. They will be ignored.
Quartzcomposer-dev mailing list      ([email protected])
Help/Unsubscribe/Update your Subscription:
http://lists.apple.com/mailman/options/quartzcomposer-dev/archive%40mail-archive.com
