On Thu, 08 Nov 2018 08:26:49 -0800
Eric Anholt <[email protected]> wrote:

> >> > +static void vc4_plane_calc_load(struct drm_plane_state *state)
> >> > +{
> >> > +        unsigned int hvs_load_shift, vrefresh, i;
> >> > +        struct drm_framebuffer *fb = state->fb;
> >> > +        struct vc4_plane_state *vc4_state;
> >> > +        struct drm_crtc_state *crtc_state;
> >> > +        unsigned int vscale_factor;
> >> > +
> >> > +        vc4_state = to_vc4_plane_state(state);
> >> > +        crtc_state = drm_atomic_get_existing_crtc_state(state->state,
> >> > +                                                        state->crtc);
> >> > +        vrefresh = drm_mode_vrefresh(&crtc_state->adjusted_mode);
> >> > +
> >> > +        /* The HVS is able to process 2 pixels/cycle when scaling the 
> >> > source,
> >> > +         * 4 pixels/cycle otherwise.
> >> > +         * Alpha blending step seems to be pipelined and it's always 
> >> > operating
> >> > +         * at 4 pixels/cycle, so the limiting aspect here seems to be 
> >> > the
> >> > +         * scaler block.
> >> > +         * HVS load is expressed in clk-cycles/sec (AKA Hz).
> >> > +         */
> >> > +        if (vc4_state->x_scaling[0] != VC4_SCALING_NONE ||
> >> > +            vc4_state->x_scaling[1] != VC4_SCALING_NONE ||
> >> > +            vc4_state->y_scaling[0] != VC4_SCALING_NONE ||
> >> > +            vc4_state->y_scaling[1] != VC4_SCALING_NONE)
> >> > +                hvs_load_shift = 1;
> >> > +        else
> >> > +                hvs_load_shift = 2;
> >> > +
> >> > +        vc4_state->membus_load = 0;
> >> > +        vc4_state->hvs_load = 0;
> >> > +        for (i = 0; i < fb->format->num_planes; i++) {
> >> > +                unsigned long pixels_load;    
> >> 
> >> I'm scared any time I see longs.  Do you want 32 or 64 bits here?  
> >
> > I just assumed a 32bit unsigned var would be enough, so unsigned long
> > seemed just fine. I can use u32 or u64 if you prefer.  
> 
> Yes, please.  See also Maxime's recent trouble with a 64-bit kernel.

Will use u32 then.

> 
> >> > +                /* Even if the bandwidth/plane required for a single 
> >> > frame is
> >> > +                 *
> >> > +                 * vc4_state->src_w[i] * vc4_state->src_h[i] * cpp * 
> >> > vrefresh
> >> > +                 *
> >> > +                 * when downscaling, we have to read more pixels per 
> >> > line in
> >> > +                 * the time frame reserved for a single line, so the 
> >> > bandwidth
> >> > +                 * demand can be punctually higher. To account for 
> >> > that, we
> >> > +                 * calculate the down-scaling factor and multiply the 
> >> > plane
> >> > +                 * load by this number. We're likely over-estimating 
> >> > the read
> >> > +                 * demand, but that's better than under-estimating it.
> >> > +                 */
> >> > +                vscale_factor = DIV_ROUND_UP(vc4_state->src_h[i],
> >> > +                                             vc4_state->crtc_h);
> >> > +                pixels_load = vc4_state->src_w[i] * vc4_state->src_h[i] 
> >> > *
> >> > +                              vscale_factor;    
> >> 
> >> If we're upscaling (common for video, right?), aren't we under-counting
> >> the cost?  You need to scale/colorspace-convert crtc_w * crtc_h at 2
> >> pixels per cycle.  
> >
> > That's not entirely clear to me. I'm not sure what the scaler does when
> > upscaling. Are the same pixels read several times from the memory? If
> > that's the case, then the membus load should indeed be based on the
> > crtc_w,h.  
> 
> I'm going to punt on this question because that would be a *lot* of
> verilog tracing to figure out for me (and I'm not sure I'd even trust
> what I came up with).
> 
> > Also, when the spec says the HVS can process 4pixels/cycles, is it 4
> > input pixels or 4 output pixels per cycle?  
> 
> Well, it's 4 pixels per cycle when not scaling, so both :)

Sorry, I meant 2pixels/cycle :).

> 
> I think the scaling pipeline is doing two output pixels per cycle.
> Nothing else would make sense to me.

Okay, so the HVS load should be based on crtc_w/h and the membus load
should be based on src_w/h and the scaling ratio, right?
_______________________________________________
dri-devel mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Reply via email to