Hello David,

Is the code you use to profile this available?

Small pieces of code like this can be very compiler and architecture dependent.
There is a lot of potential for optimization. Even the way you convert/round the
result of the method to the integer you're assigning it to can make a big
difference; you could look at the rounding functions in the itk::Math namespace.

Also, what architecture and SSE instruction sets are you compiling for? x64?
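
To make the comparison concrete, here is a rough standalone sketch of the two accumulation strategies under discussion. It assumes a plain std::array stands in for CovariantVector<int, 3>, and the names SquaredNormAsReal and SquaredNormNoCast are hypothetical, not ITK functions. Something along these lines, timed in a loop with the same compiler flags as the real build, would show whether the difference comes from the double conversion itself or from the conversion back to an integer at the call site.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for the current behaviour: every component is
// converted to double before it is squared and accumulated, so the sum
// cannot overflow the component type.
template <typename T, std::size_t N>
double SquaredNormAsReal(const std::array<T, N>& v)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < N; ++i)
  {
    const double value = static_cast<double>(v[i]);
    sum += value * value;
  }
  return sum;
}

// Hypothetical stand-in for the proposed variant: accumulate in the
// component type itself. The caller must know the squares and their sum
// fit in T; for signed integer types, overflow is undefined behaviour.
template <typename T, std::size_t N>
T SquaredNormNoCast(const std::array<T, N>& v)
{
  T sum = T();
  for (std::size_t i = 0; i < N; ++i)
  {
    const T value = v[i];
    sum += value * value;
  }
  return sum;
}
```

For small components the two agree, e.g. both give 14 for (1, 2, 3). With a component such as 50000, the double version returns 2.5e9, which is already past the range of a 32-bit int, so the no-cast version is only safe when the value range is known in advance.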

Brad

On Aug 31, 2012, at 4:23 PM, David Doria <[email protected]> wrote:

> The current implementation of CovariantVector::GetSquaredNorm() seems to 
> assume that the component type may have a problem being multiplied and added 
> (overflow). However, when the component type IS capable of these operations, 
> a penalty is still paid in the conversion of everything to double 
> (itk::NumericTraits<every POD>::RealValueType = double).
> 
> In my program, I have images of CovariantVector<int, 3> pixels and I get 
> about a 20% speed up if I change the current implementation:
> 
> template< class T, unsigned int NVectorDimension >
> typename CovariantVector< T, NVectorDimension >::RealValueType
> CovariantVector< T, NVectorDimension >
> ::GetSquaredNorm(void) const
> {
>   RealValueType sum = NumericTraits< RealValueType >::Zero;
> 
>   for ( unsigned int i = 0; i < NVectorDimension; i++ )
>     {
>     const RealValueType value = ( *this )[i];
>     sum += value * value;
>     }
>   return sum;
> }
> 
> to
> 
>  template< class T, unsigned int NVectorDimension >
>  typename CovariantVector< T, NVectorDimension >::ValueType
>  CovariantVector< T, NVectorDimension >
>  ::GetSquaredNorm(void) const
>  {
>    ValueType sum = NumericTraits< ValueType >::Zero;
>  
>    for ( unsigned int i = 0; i < NVectorDimension; i++ )
>      {
>      const ValueType value = ( *this )[i];
>      sum += value * value;
>      }
>    return sum;
>  }
> 
> (note that the return type as well as the internal type has changed).
> 
> I can't think of a reasonable way to determine automatically if there will be 
> a problem (it is not just a signed/unsigned type of problem, but rather it 
> actually depends how large the values are relative to the size of the type). 
> Would it make sense to add a second function called something like 
> GetSquaredNormNoCast() that can be used in cases where the developer knows 
> that there will be no chance of overflow and would like to get the speed 
> savings?
> 
> Thanks,
> 
> David
> _______________________________________________
> Powered by www.kitware.com
> 
> Visit other Kitware open-source projects at
> http://www.kitware.com/opensource/opensource.html
> 
> Kitware offers ITK Training Courses, for more information visit:
> http://kitware.com/products/protraining.php
> 
> Please keep messages on-topic and check the ITK FAQ at:
> http://www.itk.org/Wiki/ITK_FAQ
> 
> Follow this link to subscribe/unsubscribe:
> http://www.itk.org/mailman/listinfo/insight-developers
