Re: How to find the index of max element in float<4> in avx2-i32x8 efficiently

2022-05-16 Thread Dmitry Babokin
Hello, Right now we don't optimize for horizontal operations on short vectors (i.e. float<4>) and there's no straightforward way to do reduce_max(float<4>). But this topic is being actively discussed; supporting that would require redesigning stdlib (so basically all width functions are available
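Until such a reduction exists, the operation has to be written out by hand. Below is a minimal scalar C++ sketch of what "index of the max element" means for four floats. `argmax4` is a hypothetical helper name, not an ISPC stdlib function, and the sketch says nothing about the code ISPC would generate for a float<4>.

```cpp
#include <cstddef>

// Hypothetical helper: index of the largest of four floats.
// Ties resolve to the lowest index because the comparison is strict.
std::size_t argmax4(const float* v) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < 4; ++i) {
        if (v[i] > v[best]) {
            best = i;
        }
    }
    return best;
}
```

For the input {1, 5, 3, 5} this returns index 1, the first of the two maxima.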

Re: For guidance on porting cuda code to ispc

2022-04-18 Thread Dmitry Babokin
Hi Rupesh, Could you post this on https://github.com/ispc/ispc/discussions ? We are deprecating this mailing list and more people will be able to see answers on Github Discussion, so let's continue this thread there. Dmitry. On Sun, Apr 17, 2022 at 11:15 AM Rupesh Kumar wrote: > > Any sort of

Re: Regarding running my first simple executable

2022-03-21 Thread Dmitry Babokin
Now you have a "simple" executable, that you can run: > ./simple PS mailing lists are deprecated, please use Github Discussions ( https://github.com/ispc/ispc/discussions) or Issues ( https://github.com/ispc/ispc/issues) On Sun, Mar 20, 2022 at 9:58 PM Rupesh Kumar wrote: > I am new to ispc

Re: SIMD array append idiom

2021-08-19 Thread Dmitry Babokin
Mike, I'm not sure that I understand your question correctly. Are you trying to pick some lanes of varying computations (i.e. based on should_append(x) selector) and serialize them in an array? If so, you probably need this: https://ispc.github.io/ispc.html#packed-load-and-store-operations
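The idea behind a packed store can be modelled in scalar code: values from the selected lanes are written contiguously, and the number of values written is returned so the caller can advance the output position. The C++ sketch below is only a model of those semantics. `packed_append` and its parameters are hypothetical names, not the ISPC API, and the real operation works on a whole gang of program instances at once.

```cpp
#include <cstddef>
#include <vector>

// Scalar model of a packed store: append the value of every selected lane
// to `out`, in lane order, and return how many values were appended.
std::size_t packed_append(const std::vector<int>& values,
                          const std::vector<bool>& selected,
                          std::vector<int>& out) {
    std::size_t appended = 0;
    for (std::size_t lane = 0; lane < values.size() && lane < selected.size(); ++lane) {
        if (selected[lane]) {
            out.push_back(values[lane]);
            ++appended;
        }
    }
    return appended;
}
```

With lane values {10, 20, 30, 40} and selector {true, false, true, true}, the output receives 10, 30, 40 and the function returns 3.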

Release v1.16.0

2021-06-11 Thread Dmitry Babokin
=== v1.16.0 === (11 June 2021) An ISPC release with language extensions for performance fine tuning, cpu definitions for AlderLake and SapphireRapids targets, support for macOS ARM targets, and massive update of Intel GPUs support. Windows and Linux binaries in this release support both CPU and

Re: Converting ISPC compilation to clang format.

2021-05-03 Thread Dmitry Babokin
ISPC switches are definitely not compatible with clang switches. Could you point me to the ccache documentation stating the requirements to compiler switches format? I think it would be easier to add ISPC support to ccache directly, rather than hacking around the issue by making switches look like clang

Re: Installing VS Code extension

2021-04-23 Thread Dmitry Babokin
Could you please open an issue on github issue tracker? https://github.com/ispc/ispc/issues Dmitry. On Fri, Apr 23, 2021 at 4:01 PM Adhitha Dias wrote: > Hi, > > I am not able to install the ispc extension to VS code. I am getting the > below error. > > *[2021-04-19 21:45:32.215] [renderer1]

Re: Retaining asserts' hints under --opt=disable-assertions?

2021-01-14 Thread Dmitry Babokin
Hi, This issue has been sitting in my personal to-do list for quite long. We can do so many performance hints with asserts, likely and unlikely hints and it would be good not to have them as runtime checks. I think right now we don't have a mechanism to have a workaround for that. Could you

Release v1.15.0

2020-12-18 Thread Dmitry Babokin
=== v1.15.0 === (18 December 2020) An ISPC release with several improvements for CPU and Beta support of Intel graphics hardware architectures. The binaries in this release include CPU versions for Windows, Linux, and macOS, and a GPU-enabled Linux binary, which supports both CPU and GPU. CPU

Re: No armv7 softfp support?

2020-09-10 Thread Dmitry Babokin
ough I can't > find an official reference for that). > On Thursday, September 10, 2020 at 3:55:44 PM UTC-7 Dmitry Babokin wrote: > >> ISPC doesn't support soft float ABI. Seems we missed that current Android >> is soft float ABI... We need to fix that. >> >> Dmitry. &

Re: No armv7 softfp support?

2020-09-10 Thread Dmitry Babokin
ISPC doesn't support soft float ABI. Seems we missed that current Android is soft float ABI... We need to fix that. Dmitry. On Thu, Sep 10, 2020 at 3:00 PM Nives Ktich wrote: > I'm attempting to link an ISPC 1.13.0 generated object into an Android > armv7 (32bit) shared object via clang but

Re: Windows ASTC compression fine, linux has errors?

2020-09-01 Thread Dmitry Babokin
Do you have your code open sourced? Are you using released binaries or building ISPC yourself? I also would encourage you to try the latest release - v1.14.1. We fixed some bugs related to handling bools and others that might affect your code. But generally speaking observing differences between

Fwd: [ispc/ispc] Release v1.14.1 - === v1.14.1 === (28 August 2020)

2020-08-28 Thread Dmitry Babokin
-- Forwarded message - From: Dmitry Babokin Date: Fri, Aug 28, 2020 at 5:11 PM Subject: [ispc/ispc] Release v1.14.1 - === v1.14.1 === (28 August 2020) To: ispc/ispc Cc: Subscribed === v1.14.1 === (28 August 2020) <https://github.com/ispc/ispc/releases/tag/v1.14.1> Repo

Re: Significant Performance Regression After Switching Kernel to ISPC

2020-08-04 Thread Dmitry Babokin
Zach, What is the implementation you are comparing against? I assume you are comparing with C++, right? What compiler are you using? The key for performance in your case is "exp()" call. If C++ compiler has SVML library, that should explain the difference. ISPC scalarizes exp() call by default.

Fwd: [ispc/ispc] Release v1.14.0 - === v1.14.0 === (30 July 2020)

2020-07-31 Thread Dmitry Babokin
-- Forwarded message - From: Dmitry Babokin Date: Fri, Jul 31, 2020 at 2:40 PM Subject: [ispc/ispc] Release v1.14.0 - === v1.14.0 === (30 July 2020) To: ispc/ispc Cc: Subscribed === v1.14.0 === (30 July 2020) <https://github.com/ispc/ispc/releases/tag/v1.14.0> Repo

Re: Data dependent lane conflicts

2020-07-21 Thread Dmitry Babokin
ve some partitions underutilized and others full). > > As for *vp2intersectd*, it seems like it ought to be useful for > _something_ but just what hasn't come to mind yet. I expect once I get the > opportunity to play with it a bit more something may resolve itself. > > Che

Re: Data dependent lane conflicts

2020-07-21 Thread Dmitry Babokin
Hi Lars, Thank you for the detailed motivating examples describing the need for conflict detection capability in the language! Currently there's no way to generate vpconflictd instruction in ISPC. I see two paths to generate it: (1) introduce a library function, which has the semantics of

Re: Why there isn't a three foreach loop?

2020-05-22 Thread Dmitry Babokin
We have it, for example here: https://github.com/ispc/ispc/blob/master/examples/aobench/ao.ispc#L216 Note that the tiled version is not much different. If you remove "_tiled" it will work as well. If you are asking why write the former version and not the latter, it yields a different code sequence.

Fwd: [ispc/ispc] Release v1.13.0 - === v1.13.0 === (23 April 2020)

2020-04-24 Thread Dmitry Babokin
ISPC v1.13.0 was released! -- Forwarded message - From: Dmitry Babokin Date: Thu, Apr 23, 2020 at 11:28 PM Subject: [ispc/ispc] Release v1.13.0 - === v1.13.0 === (23 April 2020) To: ispc/ispc Cc: Subscribed === v1.13.0 === (23 April 2020) <https://github.com/ispc/i

Re: ISPC emits vcmpleps + vblendvps instead of vminps/vmaxps

2020-04-21 Thread Dmitry Babokin
Den måndag 13 april 2020 kl. 21:47:40 UTC+2 skrev Dmitry Babokin: >> >> Thanks for reporting it! It's related to recent changes. We'll fix it. >> >> Dmitry. >> >> On Mon, Apr 13, 2020 at 11:37 AM Michael Andersson < >> andersso...@hotmail.com> wrote: >

Re: Shipping ISPC binaries with software

2020-04-20 Thread Dmitry Babokin
ork called NMODL : > https://github.com/BlueBrain/nmodl. The draft manuscript describing > overall framework is available on arXiv here: > https://arxiv.org/pdf/1905.02241.pdf > > I will be happy to provide any additional information needed. > > -Pramod > > On Friday, April 3, 20

Re: ISPC emits vcmpleps + vblendvps instead of vminps/vmaxps

2020-04-13 Thread Dmitry Babokin
11. > > Cheers, > Michael > > Den måndag 13 april 2020 kl. 18:38:44 UTC+2 skrev Dmitry Babokin: >> >> Hi Michael, >> >> I'm glad that I have an easy answer for you. Try ispc trunk, instead of >> v1.12. It has a fix. Download page has links to ispc trunk for L

Re: ISPC emits vcmpleps + vblendvps instead of vminps/vmaxps

2020-04-13 Thread Dmitry Babokin
Hi Michael, I'm glad that I have an easy answer for you. Try ispc trunk, instead of v1.12. It has a fix. Download page has links to ispc trunk for Linux and Windows. If you need Mac, I can build it for you manually. Compile Explorer also has ispc trunk. I'm curious if this fixes all code gen

Re: Shipping ISPC binaries with software

2020-04-03 Thread Dmitry Babokin
Hello, Disclaimer: I'm not a lawyer and my understanding of licensing issues is far from perfect. Our license is 3-clause BSD license, which is quite permissive. We just moved our downloads to Github release, which should be reliable download location. So you can consider downloading ISPC on

Downloads has moved to github releases

2020-04-02 Thread Dmitry Babokin
Hello, We moved ISPC downloads to github releases ( https://github.com/ispc/ispc/releases). Sourceforge location will be discontinued in about a week, unless someone really needs it and speaks up. If you are downloading ISPC using scripts or pointing to ISPC downloads in your documentation, it's

Re: How to use struct without gather/scatter performance warnings

2020-03-27 Thread Dmitry Babokin
_b); > > __m128 res_upper = _mm256_extractf128_ps(res_c, 1); > __m128 res_lower = _mm256_extractf128_ps(res_c, 0); > > __m128 res = _mm_add_ps(res_upper, res_lower); > > _mm_stream_ps([i].data[0], res); > } > } > > > > On Wednesday, March 25, 2020 at 2:48:17 PM UTC-5, Dm

Re: How to use struct without gather/scatter performance warnings

2020-03-25 Thread Dmitry Babokin
Pete Brubaker wrote: >>>>>>> >>>>>>> Hi David, >>>>>>> >>>>>>> In looking over your code, unless you can reorder the data to SoA >>>>>>> your best strategy is to use the following method. This is

Re: How to use struct without gather/scatter performance warnings

2020-03-23 Thread Dmitry Babokin
ch version but lacks YMM, so I'm guessing it could be made more >> performant. >> Thank you for your help. >> >> David >> >> On Sunday, March 22, 2020 at 3:51:36 AM UTC-5, Dmitry Babokin wrote: >>> >>> David, >>> >>> Typically

Re: How to use struct without gather/scatter performance warnings

2020-03-22 Thread Dmitry Babokin
David, Typically if you target avx2 (using --target=avx2-i32x8), the code should use ymm registers. If you use --target=avx2-i32x4, then your data is 128 bit wide (for int and float vectors), which means that xmm registers will be used. Another possibility is that you are using uniform float<4>,
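The width arithmetic behind this can be checked directly. This is a small C++ sketch, assuming the "i32x8" part of a target name means 32-bit elements times a lane count. `vector_bits` is a hypothetical helper and not part of ISPC.

```cpp
// Hypothetical helper: total vector width in bits for a given lane count
// and element size in bits.
constexpr int vector_bits(int lanes, int element_bits) {
    return lanes * element_bits;
}
```

`vector_bits(8, 32)` gives 256, the width of a ymm register, while `vector_bits(4, 32)` gives 128, the width of an xmm register.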

Re: Is ispc deterministic or is it possible to make it be?

2020-02-25 Thread Dmitry Babokin
We don't guarantee bit reproducibility for floating point results across platforms (especially across x86 / ARM) - we never specifically tested for that. But I would expect the same results in most cases. If you find a case with different results, please report it, I would be

Re: Is ispc deterministic or is it possible to make it be?

2020-02-25 Thread Dmitry Babokin
Non-determinism on this slide refers to allowing compiler to rearrange/reassociate/optimize floating point expression in the way that it may change numeric value of the result (but still being valid result in terms of satisfying the same mathematical formula in the source code). Note, compiled
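A concrete case of reassociation changing the numeric value, sketched in plain C++ rather than ISPC. The function names are hypothetical, and the example assumes IEEE double arithmetic with no fast-math style flags.

```cpp
// Evaluates (a + b) + c, grouped left to right as written.
double sum_left_to_right(double a, double b, double c) {
    return (a + b) + c;
}

// Evaluates a + (b + c), a mathematically equivalent regrouping
// of the same formula.
double sum_reassociated(double a, double b, double c) {
    return a + (b + c);
}
```

With a = 1e20, b = -1e20 and c = 1.0, the first grouping returns 1.0, while the second returns 0.0 because the 1.0 is absorbed when it is added to -1e20. Both satisfy the same formula mathematically, but they are not bit-identical.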

Re: Array initialization

2019-11-06 Thread Dmitry Babokin
Tomek, It will initialize first value to the constant that you've supplied and the rest will be zero-initialized. Compile your code to see what's going on: > ispc --emit-llvm-text t.ispc -o - Dmitry. On Tue, Nov 5, 2019 at 3:38 AM Tomek wrote: > Hi, > > I'd like to initialise like this: > >
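C and C++ have the same rule for a partially specified initializer, which makes for a quick comparison. The sketch below is C++, not ISPC, and the array size and value are made up for illustration.

```cpp
#include <array>

// Only the first element is given; the remaining elements are
// zero-initialized by the language rules for aggregates.
std::array<float, 4> partially_initialized() {
    std::array<float, 4> a = {3.0f};
    return a;
}
```

The returned array holds 3, 0, 0, 0.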

Re: A mishap or a bug in ISPC? (pointer assignment issue)

2019-11-04 Thread Dmitry Babokin
Hi Tomek, It's definitely a bug. In both cases address of static variable is a constant, so it should work as it works in C. Please file a bug, we'll fix it. Dmitry. On Mon, Nov 4, 2019 at 2:15 AM Tomek wrote: > Hi:) > > I would like to do this: > > /* a simple case */ > > static uniform float

Re: Dynamic allocation of SOA type.

2019-10-28 Thread Dmitry Babokin
Hi Alex, soa<> support is quite buggy, this is yet another problem in the collection of soa<> problems. We need either to solve them at once or deprecate soa<> support. Could you please file an issue with this problem? Dmitry. On Thu, Oct 24, 2019 at 9:08 AM Alex Yuan wrote: > Hi, > > Can

Re: returning struct by value from ispc-exported function?

2019-10-21 Thread Dmitry Babokin
Michael, This is supposed to work. Please file an issue. Dmitry. On Mon, Oct 21, 2019 at 2:04 PM Михаил Усачёв wrote: > ispc-program compiled to avx1 always gives zeros. My CPU is AMD fx-8350 > (avx1 is supported). > I can not attach VS 2017 solution (groups.google.com gives me "error >

Re: Where are the AVX-512 speedup results for the ISPC example programs ?

2019-10-04 Thread Dmitry Babokin
I could test this myself, it just isn't the easiest thing to turn > on and off on production systems. > > Cheers, > -Brian > > On Wednesday, September 25, 2019 at 6:41:24 PM UTC-7, Dmitry Babokin wrote: >> >> I'm attaching perf measurements on SKX machine for avx2

Re: Where are the AVX-512 speedup results for the ISPC example programs ?

2019-09-25 Thread Dmitry Babokin
h work for that many cores. Raw speedup geomean (against clang-8 compiler) is: avx2-i32x8: 6.85 avx2-i32x16: 7.33 avx512skx-i32x8: 7.13 avx512skx-i32x16: 9.18 Dmitry. On Wed, Sep 25, 2019 at 3:56 PM Dmitry Babokin wrote: > Hi, > > We haven't updated performance numbers for a while, t

Re: Where are the AVX-512 speedup results for the ISPC example programs ?

2019-09-25 Thread Dmitry Babokin
Hi, We haven't updated performance numbers for a while, thanks for pointing this out. I'll make measurements on the machine that I have and will post the results here. And we'll update the "official" numbers a bit later. AVX512 is indeed an ideal target for ISPC. Though a few factors need to be

Re: Gather warnings in a code with plain memory layout

2019-07-31 Thread Dmitry Babokin
t instructions? See the mini > project, attached (type "make" or "make release" inside). It's just about > cluttering the console when we develop - to remove these type of warnings > we need to use many pragmas:) > > Tomek > > > On Friday, July 19, 2019 at 1

Re: Passing vectors from C++

2019-05-31 Thread Dmitry Babokin
Short vectors can't be passed by value in extern functions. You can pass them by pointer. Also note, that short vectors definition as a struct in interface header has alignment attribute, so it's aligned at least to 16 bytes. On Thu, May 30, 2019 at 7:49 AM Edward Catchpole wrote: > Is there a
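This is a sketch of what the C++ side of such an interface might look like. `Float4` and `sum4` are hypothetical stand-ins. The real struct definition comes from the header that ispc generates and may differ, and this one only mirrors the 16-byte alignment mentioned above.

```cpp
#include <cstddef>

// Hypothetical stand-in for a short-vector type as seen from C++.
struct alignas(16) Float4 {
    float v[4];
};

// Takes the short vector by pointer rather than by value.
float sum4(const Float4* p) {
    float s = 0.0f;
    for (std::size_t i = 0; i < 4; ++i) {
        s += p->v[i];
    }
    return s;
}
```

A caller would declare a `Float4` on its side and pass its address, for example `sum4(&x)`.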

Re: varying pointers to varying types

2019-04-30 Thread Dmitry Babokin
Sorry for the late response. I agree with Matt that it looks like a bug. I was checking our test suite and I found a test, which does exactly what you've described with the struct wrapper. So we are probably missing the very simple test. I don't think that something changed in the latest

Re: Storing interleaved values without scatter

2019-04-22 Thread Dmitry Babokin
(data + 2 * i + 0) = data0; > *(data + 2 * i + 1) = data1; > } > } > > Just switch between 1.10 and 1.11 and see the magic :) > > > On Tue, Apr 9, 2019 at 7:57 PM Dmitry Babokin wrote: > >> There are two options - (1) file a bug and wait until we fix it. It's >> gen

Re: how to compile w/ispc for multiple avx targets

2019-04-11 Thread Dmitry Babokin
Hi Shachar, When you use multiple targets in the same compilation, it triggers enabling of auto-dispatch code. The problem is that auto-dispatch needs to make the dispatch decision solely on CPUID. Hence two targets for the same ISA, but with different width cause ambiguity. Who are you

Re: Storing interleaved values without scatter

2019-04-09 Thread Dmitry Babokin
There are two options - (1) file a bug and wait until we fix it. It's generally the compiler's responsibility to do this optimization. It was actually implemented at some point, but it doesn't work in most of the cases and we are planning to fix it. So one more test case and a reminder in the bug

Re: Disable specific code paths similar to __builtin_unreachable in GCC and LLVM

2019-03-15 Thread Dmitry Babokin
on > where to start looking. > > Cheers, > Christoph > > On Friday, March 15, 2019 at 5:53:19 PM UTC+1, Dmitry Babokin wrote: >> >> No, we don't have mechanisms like __builtin_unreachable(). I'm not sure >> if extracting some data flow facts from "if ( none ( S &

Re: Disable specific code paths similar to __builtin_unreachable in GCC and LLVM

2019-03-15 Thread Dmitry Babokin
No, we don't have mechanisms like __builtin_unreachable(). I'm not sure if extracting some data flow facts from "if ( none ( S > 25 ) ) flag_unreachable;" to apply to "if (S > 25)" is a good idea, but we should definitely think about mechanisms to fine tune CFG. Would be good if you file an issue

Re: [ispc 1.10.0] Broken -MMM dependencies on Linux

2019-02-05 Thread Dmitry Babokin
Hi Jean, We've reproduced the issue. Looks like the problem was introduced when we migrated to CMake-based build. We are working on the fix. You are welcome to create an issue on github. Thanks for reporting the problem! Dmitry. On Mon, Feb 4, 2019 at 12:04 PM Jean Ben wrote: > > Hi, > > I've

ISPC 1.10.0 is released

2019-01-18 Thread Dmitry Babokin
Download page: http://ispc.github.io/downloads.html === v1.10.0 === (18 January 2019) An ISPC update, which brings several new features, has a bunch of stability and performance bug fixes, and infrastructure improvements for those who are interested in participating in hacking on the ISPC trunk.

Re: Confused about foreach semantics

2018-09-21 Thread Dmitry Babokin
ry, > > On Thursday, September 20, 2018 at 5:05:34 PM UTC-6, Dmitry Babokin wrote: >> >> First of all, I don't see how foreach may increase parallelism in this >> case, as the swap happen for varying values, not for scalars. I.e. the >> following code fragment >&

Re: Confused about foreach semantics

2018-09-20 Thread Dmitry Babokin
Scott, First of all, I don't see how foreach may increase parallelism in this case, as the swap happen for varying values, not for scalars. I.e. the following code fragment float aux[2]; aux[0] = c[2*x + 0]; aux[1] = c[2*x + 1]; c[2*x + 0] = c[2*y + 0];

Re: How to Use the Box Blur Example

2018-08-21 Thread Dmitry Babokin
create a C wrapper which works on chunks of 4-8 pixels. > It gives the chunk to ISPC function which updated a buffer chunk which is > given as well. > Would that be more efficient or better have 2 levels of ISPC? > > > On Tue, Aug 21, 2018 at 2:29 AM Dmitry Babokin wrote: > >>

Re: How to Use the Box Blur Example

2018-08-20 Thread Dmitry Babokin
ation it will > handle pixels (0,0), (1,0), (2,0), (3,0), i.e. jj is (0,1,2,3), ii is > uniform int 0, which is casted to varying int (0,0,0,0). > } > } > > > Which seems to require the whole image as input. > > > On Friday, August 17, 2018 at 10:01:26 PM UTC+

Re: ISPC 1.9.2 is released

2018-08-17 Thread Dmitry Babokin
ovember 11, 2017 at 6:12:41 AM UTC+2, Dmitry Babokin wrote: >> >> Download page: http://ispc.github.io/downloads.html >> >> === v1.9.2 === (10 November 2017) >> >> An ISPC update, which brings out-of-the-box debug support on Windows, >> better performanc

Re: new compiler error in 1.9.2

2018-01-23 Thread Dmitry Babokin
This looks like correct code to me. The code to check this kind of things was changed in 1.9.2, but reported behavior is not intended. Please file an issue, I'll have a look. On Tue, Jan 23, 2018 at 8:14 PM, Brian Green wrote: > When upgrading to 1.9.2 from 1.9.1

Re: Arm Neon Support Status of ISPC

2017-12-27 Thread Dmitry Babokin
I'm not aware of active users of ARM Neon port. It should be functional, if not it should be easy to bring it back to life. Dmitry On Fri, Dec 22, 2017 at 9:30 PM, B.Stastny wrote: > Does anyone know the status of ARM Neon support for ISPC? I see there are > some updates

ISPC 1.9.2 is released

2017-11-10 Thread Dmitry Babokin
Download page: http://ispc.github.io/downloads.html === v1.9.2 === (10 November 2017) An ISPC update, which brings out-of-the-box debug support on Windows, better performance of most of the targets and a bunch of stability and performance bug fixes. The release is based on patched LLVM 5.0

Re: crash -O0 and generic-x1

2017-11-05 Thread Dmitry Babokin
Generic targets are not well maintained and there are multiple known failures. And I don't think that's going to change. On Thu, Nov 2, 2017 at 5:12 PM, Brian Green wrote: > I am encountering a crash when combining the generic-x1 target at the -O0 > flag. For

Re: ispc 1.9.2rc1

2017-10-24 Thread Dmitry Babokin
Hope to get it out in a couple of weeks. On Tue, Oct 24, 2017 at 3:29 AM, Ali Nakipoğlu <ali.nakipo...@fiction.io> wrote: > Hi Dmitry, > > Any ETA for 1.9.2? > > Kind Regards, > > Ali > > > On Wednesday, May 17, 2017 at 8:33:59 AM UTC+1, Dmitry Babokin wr

Re: ispc 1.9.2rc1

2017-08-16 Thread Dmitry Babokin
warnings from valid code with 1.9.1). > > If the release is not happening in the near future, would you happen to > have a MSVC 2013 -compatible set of binaries for the release candidate? > > On Friday, June 16, 2017 at 10:52:21 AM UTC+3, Dmitry Babokin wrote: >> >> Yes, t

Re: Subtraction error

2017-06-22 Thread Dmitry Babokin
ticles_fx[]. On Fri, Jun 16, 2017 at 2:05 AM, Dmitry Babokin <babo...@gmail.com> wrote: > I can reproduce and I'm looking at this issue. Though I'm currently on > vacation, so it may take a little longer than usual. > > On Tue, Jun 13, 2017 at 4:28 AM, <peterh...@hotmail.

Re: ispc 1.9.2rc1

2017-06-16 Thread Dmitry Babokin
ok 51.172697 second(s) (1000 > iters, 4096x4096) > > ISPC 1.9.2rc1 > No scatter / gather performance warnings and much better performance: > > Nexus::TestWaterSimPerformance: water sim took 13.282045 second(s) (1000 > iters, 4096x4096) > > > On Monday, June 5, 2017 at 7:0

Re: ispc 1.9.2rc1

2017-06-05 Thread Dmitry Babokin
, 13.938580 Mp, 1.880708 second(s) > > > Thanks for your effort in improving ISPC compiler ! > > > > On Wednesday, May 17, 2017 at 9:33:58 AM UTC+2, Dmitry Babokin wrote: >> >> Hello, >> >> We are going to release ispc 1.9.2 soon and have prepared a rele

Re: question about avx512

2017-05-30 Thread Dmitry Babokin
. > > Jeff > > On Friday, October 21, 2016 at 6:50:02 AM UTC-7, Dmitry Babokin wrote: >> >> You basically ask the world to be simpler. I wish it would be simpler, >> but it's not :) >> >> AVX512 is umbrella name for the set of ISA extensions, which work wit

ispc 1.9.2rc1

2017-05-17 Thread Dmitry Babokin
Hello, We are going to release ispc 1.9.2 soon and have prepared a release candidate for those of you who prefer to use pre-built binaries. Please give it a try and let us know if you see any problems with your code. Windows (VS2015): https://drive.google.com/open?id=0Bxh4sVF04yhxYnRRczhlNzZOblU

Re: Using ISPC in production

2017-05-05 Thread Dmitry Babokin
. On Fri, May 5, 2017 at 9:44 AM, Ali Nakipoğlu <ali.nakipo...@fiction.io> wrote: > Hi Dmitry, > > Thank you very much for your kind response. > > I will start experimenting. Is there any release schedule? > > Ali > > On Thursday, May 4, 2017 at 11:55:38 PM UTC+1

Re: Using ISPC in production

2017-05-04 Thread Dmitry Babokin
Ali, We have customers who use ISPC in production. Though we do have stability issues, but we prefer to release without known regressions. If you have something which blocks you, let us know and we'll give it higher priority. Dmitry. On Thu, May 4, 2017 at 7:01 AM, Ali Nakipoğlu

Re: ISPC Profiling program

2017-01-05 Thread Dmitry Babokin
Any profiling tool on your platform should work with ispc binaries. For example, Intel VTune. Dmitry. On Thu, Jan 5, 2017 at 12:07 AM, Dženana Kapetanović < dzenanakapetanovi...@gmail.com> wrote: > Can somebody tell me any profiling tool for ISPC program? > > -- > You received this message

Re: how to serialize in ispc into a uniform list execution

2016-09-25 Thread Dmitry Babokin
I guess you need "foreach_unique": http://ispc.github.io/ispc.html#iteration-over-unique-elements-foreach-unique On Sun, Sep 25, 2016 at 5:54 PM, Morten Mikkelsen wrote: > So I have a scenario where I am tracing through a tree structure and the > odds of hitting the same
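The idea behind iterating over unique elements can be modelled in scalar code: the lanes are grouped by the value they hold, so that a body can run once per distinct value with a mask of the lanes that share it. The C++ sketch below is a model only. `lanes_by_unique_value` is a hypothetical name, and the visiting order here (sorted by value) is a property of `std::map`, not something the ISPC construct is claimed to guarantee.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Scalar model: map each distinct lane value to a bitmask of the lanes
// holding it. Bit i of the mask is set when lane i has that value.
// Lanes beyond the first 32 are ignored to keep the mask in 32 bits.
std::map<int, std::uint32_t> lanes_by_unique_value(const std::vector<int>& lanes) {
    std::map<int, std::uint32_t> groups;
    for (std::size_t i = 0; i < lanes.size() && i < 32; ++i) {
        groups[lanes[i]] |= (std::uint32_t{1} << i);
    }
    return groups;
}
```

For lane values {7, 3, 7, 3, 9} there are three groups: value 7 with mask 0b00101, value 3 with mask 0b01010, and value 9 with mask 0b10000.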

Re: Surprising code being generated by ARM NEON backend

2016-09-12 Thread Dmitry Babokin
, September 8, 2016 at 7:30:40 PM UTC+1, Dmitry Babokin wrote: >> >> You should have really parallelization friendly code to get close to >> theoretical scaling on all vector units. >> >> For parallelization approaches, intrinsics are obviously not good enough, >> a

Re: compiling ispc on ubuntu

2016-07-08 Thread Dmitry Babokin
Hi, Alloy.py build should work, but not sure about other options. Just to make sure - have you checked that you correctly added clang built by alloy.py to your path? I.e, "which clang" points to newly built clang. Dmitry. On Fri, Jul 8, 2016 at 7:29 AM, Steve Heistand