Certainly, I appreciate your interest.

I had to abandon the single-threaded solution because it would block the main thread for minutes.

Step 1: Using JS workers (emscripten_create_worker(wname))
This worked very well. A bonus was having 2 GB of RAM (if available) in each worker.
Caveats: lots of copying back and forth, and no messaging between workers.

Step 2: Decision for Wasm Workers (emscripten_malloc_wasm_worker)
The need to move to a different kind of worker came with the use of SharedArrayBuffers. I could allocate my data in the main thread and then send off parts of it to a list of workers for processing, without having to copy anything around.
Not being familiar with either pthreads or Wasm Workers, I followed the recommendation on this page:
https://emscripten.org/docs/api_reference/wasm_workers.html?highlight=wasm%20worker
*"If an application is only developed to target WebAssembly, and portability is not a concern, then using Wasm Workers can provide great benefits in the form of simpler compiled output, less complexity, smaller code size and possibly better performance."*
(see the section "Pthreads vs Wasm Workers: Which One to Use?")

Other than that I had no particular reason to choose Wasm Workers, although I liked the idea of just a couple of bytes on disk for the Wasm Workers.

Cheers,
Dieter

s...@google.com wrote on Friday, May 26, 2023 at 22:47:12 UTC+2:

> Can I ask why you chose not to use pthreads to start with? I'd like to
> understand better why folks would choose wasm workers over pthreads.
>
> On Fri, May 26, 2023 at 3:25 AM 'Dieter Weidenbrück' via
> emscripten-discuss <emscripte...@googlegroups.com> wrote:
>
>> Hi Sam,
>> IIRC, when I started with Emscripten a while ago the program would abort
>> in case of a memory error. As my app is comparable to a desktop app, this
>> was not acceptable, so I set ABORTING_MALLOC to 0. I understand that this
>> flag has a different meaning today. Here is how all my allocation calls
>> work:
>>
>> Error_T allocMemPtr(MemPtr_T *p,uint32_T size,boolean_T clear) {
>>     _MemPtr_T mp;
>>
>>     // allocate the payload plus a header that stores its size
>>     if (clear)
>>         mp = (_MemPtr_T)calloc(1,size + sizeof(_Mem_T));
>>     else
>>         mp = (_MemPtr_T)malloc(size + sizeof(_Mem_T));
>>     if (mp) {
>>         mp->size = size;
>>         // hand out the address just past the header
>>         *p = (MemPtr_T)((char_T*)mp + sizeof(_Mem_T));
>>         return kErr_NoErr;
>>     }
>>     return kErr_MemErr;
>> }
>> Error_T setMemPtrSize(MemPtr_T *p,uint32_T size){
>>     _MemPtr_T m = _MP(*p);   // presumably maps the payload pointer back to its header
>>     MemPtr_T newPtr;
>>
>>     newPtr = realloc(m,size + sizeof(_Mem_T));
>>     if (newPtr) {
>>         m = (_MemPtr_T)newPtr;
>>         m->size = size;
>>         *p = (MemPtr_T)((char_T*)m + sizeof(_Mem_T));
>>         return kErr_NoErr;
>>     }
>>     return kErr_MemErr;   // on failure the old block is left untouched
>> }
>>
>> So I should catch all errors. However, errors (i.e. return value == 0)
>> are not reported by malloc or calloc during the problems I am experiencing.
>> I added debug lines, but not a single failure was recorded.
>> Removing ABORTING_MALLOC did not change the error outcome.
>>
>> I see two different behaviors now:
>> - setting up workers and checking that they run by
>> static void startUpWorker(void) {
>> #ifdef __EMSCRIPTEN__
>>     int32_T w = emscripten_wasm_worker_self_id();
>>     // log an error if this code is not running inside a wasm worker
>>     if (! emscripten_current_thread_is_wasm_worker()){
>>         EM_ASM_({
>>             console.log("Error: No worker: " + $0);
>>         },w);
>>     }
>> #endif //__EMSCRIPTEN__
>> }
>> - then I do my stuff and receive about 10 of the "Uncaught RuntimeError:
>> memory access out of bounds" errors.
>> - no failures of malloc/calloc recognized
>>
>> The second behavior is
>> - in main() I call this routine:
>> static void memtest(void) {
>> #define NUM_CHUNKS 15
>>     const int CHUNK_SIZE = 100 * 1024 * 1024;
>>     // zero-initialized so the cleanup loop never reads an unset slot
>>     void* p[NUM_CHUNKS] = {0};
>>     Error_T err = kErr_NoErr;
>>
>>     for (int i = 0; i < NUM_CHUNKS; i++) {
>>         err = allocMemPtr(&p[i],CHUNK_SIZE,FALSE); //see function above
>>         if (err != kErr_NoErr || p[i] == NULLPTR) {
>>             printf("Error chunk %d\n",i);
>>             break;
>>         }
>>     }
>>     for (int i = 0; i < NUM_CHUNKS; i++) {
>>         if (p[i] == NULLPTR)
>>             break;
>>         disposeMemPtr(p[i]);
>>     }
>> }
>> - then I start up the workers as described above
>> - then I do my stuff
>> - sometimes this results in error-free behavior, but not always. If an
>> error occurs, I only get one "Uncaught RuntimeError" message.
>>
>> I am pretty confident that I handle memory allocation correctly, because
>> my background is in development of desktop apps in C for 30+ years, and
>> there you better not have any leaks and keep the app running whenever
>> possible. So I must be doing something wrong when dealing with multiple
>> threads.
>> I will try out pthreads next, because I have no idea anymore what the
>> cause could be here.
>>
>> Cheers,
>> Dieter
>> s...@google.com wrote on Thursday, May 25, 2023 at 23:20:33 UTC+2:
>>
>>> Is there some reason you added `-sABORTING_MALLOC=0`? That looks a
>>> little suspicious, since it means the program can continue after malloc
>>> fails, which means that any call site that doesn't check the return value of
>>> malloc can lead to segfaults. If you remove that setting, does the
>>> behaviour change?
>>>
>>>
>>>
>>> On Thu, May 25, 2023 at 1:27 PM 'Dieter Weidenbrück' via
>>> emscripten-discuss <emscripte...@googlegroups.com> wrote:
>>>
>>>> Hi Sam,
>>>>
>>>> I can run the code in a single thread without problems, and I have done
>>>> that for a while. So I assume that the code is stable.
>>>>
>>>> Here is the command line I use in a .bat file:
>>>> emcc ./src/main.c ^
>>>> ...
>>>> ./src/w_com.c ^
>>>> -I ./include/ ^
>>>> -g3 ^
>>>> --source-map-base ./ ^
>>>> -gsource-map ^
>>>> -s ALLOW_MEMORY_GROWTH=1 ^
>>>> -s ENVIRONMENT=web,worker ^
>>>> --shell-file ./index_template.html ^
>>>> -s SUPPORT_ERRNO=0 ^
>>>> -s MODULARIZE=1 ^
>>>> -s ABORTING_MALLOC=0 ^
>>>> -sWASM_WORKERS ^
>>>> -s "EXPORT_NAME='wasmMod'" ^
>>>> -s EXPORTED_FUNCTIONS="['_malloc','_free','_main']" ^
>>>> -s EXPORTED_RUNTIME_METHODS="['cwrap','UTF16ToString','UTF8ToString','stringToUTF8','allocateUTF8']" ^
>>>> -o index.html
>>>>
>>>> I will start familiarizing myself with pthreads to test whether that
>>>> would work better.
>>>>
>>>> BTW, as an old C programmer I am fascinated by emscripten and its
>>>> possibilities. Excellent job!
>>>>
>>>> Cheers,
>>>> Dieter
>>>>
>>>> s...@google.com wrote on Thursday, May 25, 2023 at 20:29:58 UTC+2:
>>>>
>>>>> This looks like some kind of memory corruption, most likely due to the
>>>>> use of multithreading/wasm_workers. Are you able to build a single
>>>>> threaded version of your program, or one that uses normal pthreads rather
>>>>> than wasm workers?
>>>>>
>>>>> Also, can you share the full link command you are using?
>>>>>
>>>>> cheers,
>>>>> sam
>>>>>
>>>>> On Thu, May 25, 2023 at 9:20 AM 'Dieter Weidenbrück' via
>>>>> emscripten-discuss <emscripte...@googlegroups.com> wrote:
>>>>>
>>>>>> This is a memory snapshot when using SAFE_HEAP. So here I am well
>>>>>> below the browser limits, yet the segfault still occurs in different places.
>>>>>> Ignore the first console line, it results from Norton Utilities I
>>>>>> think.
>>>>>>
>>>>>> [image: error2.png]
>>>>>>
>>>>>> Dieter Weidenbrück wrote on Thursday, May 25, 2023 at 18:06:27 UTC+2:
>>>>>>
>>>>>>> Hi Sam,
>>>>>>> I noticed already that I am bumping against browser limits,
>>>>>>> especially with the sanitizer switched on, so I reduced the pre-allocation
>>>>>>> calls.
>>>>>>> It turns out that ASan uses so much memory that I can't use it to
>>>>>>> analyze this case.
>>>>>>>
>>>>>>> I use
>>>>>>> -s ALLOW_MEMORY_GROWTH=1
>>>>>>> but don't specify any MAXIMUM_MEMORY.
>>>>>>>
>>>>>>> No pthreads version so far. I might try this next.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Dieter
>>>>>>>
>>>>>>> s...@google.com wrote on Thursday, May 25, 2023 at 17:55:41 UTC+2:
>>>>>>>
>>>>>>>> Firstly, if you are allocating 1.8 GB you are likely pushing up
>>>>>>>> against browser limits. Are you specifying a MAXIMUM_MEMORY larger
>>>>>>>> than 2 GB?
>>>>>>>>
>>>>>>>> Secondly, it looks like you are using wasm workers, which are still
>>>>>>>> relatively new. Do you have a version of your code that uses pthreads
>>>>>>>> instead? It might tell us if the issue is related to wasm workers.
>>>>>>>>
>>>>>>>> cheers,
>>>>>>>> sam
>>>>>>>>
>>>>>>>> On Thu, May 25, 2023 at 8:06 AM 'Dieter Weidenbrück' via
>>>>>>>> emscripten-discuss <emscripte...@googlegroups.com> wrote:
>>>>>>>>
>>>>>>>>> The joy was premature: even with a pre-allocated heap, segfaults
>>>>>>>>> occur. :(
>>>>>>>>>
>>>>>>>>> Dieter Weidenbrück wrote on Thursday, May 25, 2023 at 16:28:37 UTC+2:
>>>>>>>>>
>>>>>>>>>> All,
>>>>>>>>>> I am experiencing segmentation faults when using wasm workers.
>>>>>>>>>> Overview:
>>>>>>>>>> I am working on a project with considerable 3D data sets. The
>>>>>>>>>> code has been stable for a while when running in the main thread
>>>>>>>>>> alone.
>>>>>>>>>> Then I started using JS workers (no shared memory), and again all
>>>>>>>>>> was well.
>>>>>>>>>> Now I've switched to SharedArrayBuffers and wasm workers, and I
>>>>>>>>>> keep running into random problems.
>>>>>>>>>> I have prepared the code such that I can run with 0 workers up to
>>>>>>>>>> navigator.hardwareConcurrency workers. All is well with 0 workers, but as
>>>>>>>>>> soon as I use one or more workers, I keep getting segfaults because of
>>>>>>>>>> invalid pointers, out-of-bounds accesses and similar.
>>>>>>>>>>
>>>>>>>>>> What happens in the main thread and what in the wasm workers:
>>>>>>>>>> I allocate all objects in the main thread when importing the 3D
>>>>>>>>>> file. Then I fire off a function for each object that will do some
>>>>>>>>>> serious calculations on the data, including allocating and disposing of
>>>>>>>>>> memory. The workers allocate approx. 300 to 400 MB in addition to the
>>>>>>>>>> main thread. All this happens in the same SharedArrayBuffer, of course.
>>>>>>>>>>
>>>>>>>>>> Here is what I've tried so far:
>>>>>>>>>> - compiling with SAFE_HEAP=1
>>>>>>>>>> not a lot of helpful information,
>>>>>>>>>> - compiling with -fsanitize=address
>>>>>>>>>> everything works without problems here!
>>>>>>>>>> - compiling with ASSERTIONS=2
>>>>>>>>>> gave me this information:
>>>>>>>>>> [image: error.png]
>>>>>>>>>>
>>>>>>>>>> To me it looks like another resize call is executed while other
>>>>>>>>>> workers keep working on the buffer, and then something gets into
>>>>>>>>>> conflict.
>>>>>>>>>> To test this, I allocated 1.8 GB right after startup in the main
>>>>>>>>>> thread and disposed the mem blocks again just to trigger heap
>>>>>>>>>> resize. After that everything works like a charm.
>>>>>>>>>>
>>>>>>>>>> Is there anything I am doing wrong?
>>>>>>>>>> Sorry for not providing a sample, but there is a lot of code
>>>>>>>>>> involved, and it is not easy to simulate this behavior. Happy to
>>>>>>>>>> answer questions.
>>>>>>>>>>
>>>>>>>>>> All comments are appreciated.
>>>>>>>>>> Thanks,
>>>>>>>>>> Dieter
>>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>>> Groups "emscripten-discuss" group.
>>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>>> send an email to emscripten-disc...@googlegroups.com.
>>>>>>>>> To view this discussion on the web visit
>>>>>>>>> https://groups.google.com/d/msgid/emscripten-discuss/80d56314-59d8-4332-bb2e-ebe00fe52ea3n%40googlegroups.com
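The allocation helpers quoted in the thread (allocMemPtr, setMemPtrSize) rely on types and macros that are never shown (`_Mem_T`, `MemPtr_T`, `_MP`). The following is only a minimal sketch of the same size-prefixed pattern in plain C, not the original code. Every name in it (`MemHeader`, `alloc_mem`, `resize_mem`, `mem_size`, `free_mem`) is hypothetical. It also makes two assumptions the thread does not confirm: the header is padded to `max_align_t` so the payload keeps malloc's alignment, and the total size is checked for wrap-around, which can matter on wasm32 where `size_t` is 32 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header; the real _Mem_T layout is not shown in the thread. */
typedef union {
    uint32_t size;       /* payload size in bytes */
    max_align_t align_;  /* pads the header so the payload stays aligned */
} MemHeader;

typedef enum { ERR_NONE = 0, ERR_MEM = 1 } Error;

/* Map a payload pointer back to the header stored in front of it. */
static MemHeader *header_of(void *payload) {
    assert(payload != NULL);
    return (MemHeader *)((char *)payload - sizeof(MemHeader));
}

/* Allocate `size` payload bytes; *out is set to NULL on failure. */
static Error alloc_mem(void **out, uint32_t size, int clear) {
    *out = NULL;
    size_t total = (size_t)size + sizeof(MemHeader);
    if (total < size) return ERR_MEM;  /* wrapped around (32-bit size_t) */
    MemHeader *h = clear ? calloc(1, total) : malloc(total);
    if (!h) return ERR_MEM;
    h->size = size;
    *out = (char *)h + sizeof(MemHeader);
    return ERR_NONE;
}

/* Resize in place or move; on failure *p still points to the old block. */
static Error resize_mem(void **p, uint32_t size) {
    size_t total = (size_t)size + sizeof(MemHeader);
    if (total < size) return ERR_MEM;
    MemHeader *h = realloc(header_of(*p), total);
    if (!h) return ERR_MEM;
    h->size = size;
    *p = (char *)h + sizeof(MemHeader);
    return ERR_NONE;
}

/* Read back the payload size recorded in the header. */
static uint32_t mem_size(void *p) { return header_of(p)->size; }

/* Release a block obtained from alloc_mem; NULL is ignored. */
static void free_mem(void *p) {
    if (p) free(header_of(p));
}
```

Setting the output pointer to NULL on failure means a caller that loops over an array of blocks, as the memtest routine does, can rely on the NULL check even when an allocation fails partway through.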