Thanks, Dmitry. That almost fixed it. The biggest remaining difference is that
msvc processes two registers per loop iteration. An sse4-i8x32 target would
take care of that, but it doesn't exist. Can I add it if I compile ispc myself?
What about avx2-i8x32? It would be nice to just tell ispc to optimize for
i8 without mentioning sse/avx.

Bruno
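
For readers following the thread, here is a minimal sketch of what "two
registers at a time" could look like when written by hand with SSE2
intrinsics. It is not the code msvc or ispc actually emits; the function name
add_i8_two_regs and the HAVE_SSE2 macro are made up for this illustration, and
the scalar fallback is only there so the sketch builds on non-x86 machines.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: SSE2 is detected via __SSE2__ (gcc/clang) or _M_X64 (msvc). */
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h> /* SSE2 intrinsics */
#define HAVE_SSE2 1
#endif

/* Hypothetical helper: element-wise int8 add, 32 elements (two 128-bit
 * registers) per loop iteration, followed by a scalar tail. */
void add_i8_two_regs(const int8_t *vin1, const int8_t *vin2,
                     int8_t *vout, int count)
{
    int i = 0;
#ifdef HAVE_SSE2
    for (; i + 32 <= count; i += 32) {
        /* Unaligned loads: two registers from each input. */
        __m128i a0 = _mm_loadu_si128((const __m128i *)(vin1 + i));
        __m128i a1 = _mm_loadu_si128((const __m128i *)(vin1 + i + 16));
        __m128i b0 = _mm_loadu_si128((const __m128i *)(vin2 + i));
        __m128i b1 = _mm_loadu_si128((const __m128i *)(vin2 + i + 16));
        /* 16 packed 8-bit adds per instruction, no widening to int32. */
        _mm_storeu_si128((__m128i *)(vout + i), _mm_add_epi8(a0, b0));
        _mm_storeu_si128((__m128i *)(vout + i + 16), _mm_add_epi8(a1, b1));
    }
#endif
    /* Scalar tail (or the whole loop when SSE2 is unavailable). */
    for (; i < count; ++i)
        vout[i] = (int8_t)(vin1[i] + vin2[i]);
}
```

The point of the sketch is only that the add stays in 8-bit lanes and the loop
body covers 32 elements, which is roughly what a 32-wide i8 target would
express.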

On Wednesday, November 1, 2017 at 2:46:23 PM UTC-3, Dmitry Babokin wrote:

> Have you tried compiling with i8 target? I.e. sse4-i8x16.
>
> Sent from my iPhone
>
> On Nov 1, 2017, at 10:19 AM, Bruno Martínez <[email protected]> wrote:
>
> Hi,
>
> ispc generates code ten times slower than msvc2013 for a simple loop add:
>
> #include <stdint.h>
>
> void simple_msvc(int8_t vin1[], int8_t vin2[], int8_t vout[], int count)
> {
>     for (int index = 0; index < count; ++index)
>     {
>         vout[index] = vin1[index] + vin2[index];
>     }
> }
> export void simple_ispc(uniform int8 vin1[], uniform int8 vin2[], uniform 
> int8 vout[], uniform int count) {
>     foreach (index = 0 ... count) {
>         vout[index] = vin1[index] + vin2[index];
>     }
> }
>
> It seems that ispc operates with int32 and generates unnecessary shuffles.
>
> Can I tweak something?
>
> Bruno
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Intel SPMD Program Compiler Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
>

