Sounds like you need some sort of read-ahead: a small read would actually pull a larger portion of the file into a (size-configurable) buffer on the worker-thread side with a single round-trip to the main thread. Subsequent small reads wouldn't need a main-thread round-trip until the end of the buffer is reached, at which point the next read fills the buffer up again.
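To make the read-ahead idea concrete, here is a rough sketch in plain C. It only illustrates the buffering and has not been measured against Emscripten's proxying, so whether it helps depends on where the overhead really comes from. The names ra_reader, ra_open, ra_read and ra_close are made up for this example and are not Emscripten APIs; the refills counter exists only to count the underlying fread() calls, which is where a main-thread round-trip would happen.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical read-ahead wrapper (illustration only, not an Emscripten API). */
typedef struct {
    FILE  *file;     /* underlying stream; each fread() on it may be proxied */
    char  *buf;      /* worker-side read-ahead buffer */
    size_t cap;      /* configurable buffer size in bytes */
    size_t len;      /* number of valid bytes currently in buf */
    size_t pos;      /* index of the next unread byte in buf */
    size_t refills;  /* how many underlying fread() calls were made */
} ra_reader;

/* Opens path for reading with a read-ahead buffer of cap bytes.
   Returns 0 on success, -1 on failure. */
int ra_open(ra_reader *r, const char *path, size_t cap) {
    assert(r != NULL && path != NULL);
    memset(r, 0, sizeof *r);
    if (cap == 0) return -1;
    r->file = fopen(path, "rb");
    if (r->file == NULL) return -1;
    r->buf = malloc(cap);
    if (r->buf == NULL) {
        fclose(r->file);
        r->file = NULL;
        return -1;
    }
    r->cap = cap;
    return 0;
}

/* Copies up to n bytes into dst and returns the number of bytes copied.
   The underlying fread() is only called when the buffer is exhausted. */
size_t ra_read(ra_reader *r, void *dst, size_t n) {
    char *out = dst;
    size_t done = 0;
    while (done < n) {
        if (r->pos == r->len) {
            /* Buffer is empty: one large read refills it. */
            r->len = fread(r->buf, 1, r->cap, r->file);
            r->pos = 0;
            r->refills++;
            if (r->len == 0) break;  /* end of file or read error */
        }
        size_t chunk = r->len - r->pos;
        if (chunk > n - done) chunk = n - done;
        memcpy(out + done, r->buf + r->pos, chunk);
        r->pos += chunk;
        done += chunk;
    }
    return done;
}

/* Releases the buffer and closes the underlying stream. */
void ra_close(ra_reader *r) {
    if (r->file != NULL) fclose(r->file);
    free(r->buf);
    memset(r, 0, sizeof *r);
}
```

The worker thread would call ra_open() once with the buffer size it wants, then use ra_read() wherever it currently calls fread(), and ra_close() at the end. With a 64 KiB buffer and 32768-byte reads like in the test case, only every second read would reach the underlying fread(); smaller reads would gain more. Reads larger than the buffer still work, they just go through several refills.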
Disclaimer: I haven't looked at the code. Even if every small read does a complete, blocking thread round-trip, that doesn't explain the high blocking times you are seeing. I guess there is another problem that makes a single read block for too long (maybe it can only do at most 1 read per frame, or something similar).

Cheers,
-Floh.

On Thursday, 5 November 2015 15:09:04 UTC+1, Robert Goulet wrote:
>
> Where is the code that deals with this locking mechanics? I'd like to take
> a look at it.
>
> Also, it doesn't matter much for us how long the entire process takes.
> It's more each read time that matters. If we can improve that, would be
> great!
>
> On Wednesday, November 4, 2015 at 5:22:32 PM UTC-5, Alon Zakai wrote:
>>
>> Thanks for the testcase. I see the same results.
>>
>> It looks like reducing the number of reads helps a lot. The overhead is
>> affected mostly by number of reads. Which could make sense if the main
>> thread is busy (since it's the main browser thread, it could be busy doing
>> anything from rendering to doing some work for another tab) and the worker
>> needs to wait on it. Also, we send a message to the main thread, so any
>> general activity on the event queue could lead to the message being
>> received later.
>>
>> It's also possible that the mutex and futex stuff we do for the blocking
>> call has overhead. Jukka, do we have a way to profile that?
>>
>>
>> On Wed, Nov 4, 2015 at 2:06 PM, Robert Goulet <[email protected]>
>> wrote:
>>
>>> Here, I quickly wrote this small test case
>>> <https://autodesk.box.com/s/h12u5woqkuwyzg4uk7hw6puck8jg8orh> which
>>> reproduces the problem.
>>>
>>> I get the following output from it:
>>>
>>> Preallocating 1 workers for a pthread spawn pool.
>>> Writing test file (1048576 bytes)...
>>> Reading file from main thread...
>>> Completed in 6.730000ms (~0.205156ms per read of 32768 bytes)
>>> Reading file from another thread...
>>> Completed in 3499.585000ms (~109.355312ms per read of 32768 bytes)
>>> Done.
>>>
>>> Please let me know if there's anything we can do to fix this major
>>> difference between the two.
>>>
>>> Thanks!
>>>
>>> On Wednesday, November 4, 2015 at 3:25:58 PM UTC-5, Robert Goulet wrote:
>>>>
>>>> That is for many small/medium reads. It seems the size of the read does
>>>> not have any impact on performance though.
>>>>
>>>> Essentially, the thread we create does the following:
>>>>
>>>> while (true) {
>>>>     _queue_semaphore.wait();
>>>>     if (_exit_thread)
>>>>         break;
>>>>     process_request();
>>>> }
>>>>
>>>> in that case, the process_request() function takes 200ms+ to execute
>>>> (profiled within the process_request() function itself, so that it does
>>>> not include locking mechanics overhead).
>>>>
>>>> If we just call process_request() right away upon adding request
>>>> instead of inserting the request in the thread queue (which essentially
>>>> just bypasses the thread completely), the process_request() function
>>>> takes <0.2ms to execute. The only thing this function does is a switch
>>>> case between fopen(), fread() and fclose(). I've narrowed it down to
>>>> being these filesystem functions that are taking a much longer time to
>>>> return.
>>>>
>>>> The main thread is waiting on the thread queue to complete before
>>>> returning, so I don't see why it would block the thread from doing its
>>>> work?
>>>>
>>>> On Wednesday, November 4, 2015 at 12:53:21 PM UTC-5, Alon Zakai wrote:
>>>>>
>>>>> The filesystem itself resides in JS, which can only be accessed from
>>>>> the main thread. Workers therefore need to send messages to communicate
>>>>> with it. However, 200ms seems ridiculously high - is that for a single
>>>>> read()? Or many small reads of small amounts? If you can make a small
>>>>> standalone testcase showing the issue, that would be useful for
>>>>> benchmarking.
>>>>>
>>>>> A possibility is that the blocking is the issue, and the main thread
>>>>> is busy with something else.
>>>>>
>>>>> On Wed, Nov 4, 2015 at 7:57 AM, Robert Goulet <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> we are using the new pthread support in Emscripten, and one thing we
>>>>>> noticed is how much slower filesystem functions are when executed in a
>>>>>> thread. We saw this in the documentation:
>>>>>>
>>>>>>> *Currently several of the functions in the C runtime, such as
>>>>>>> filesystem functions like fopen(), fread(), printf(), fprintf() etc.
>>>>>>> are not multithreaded, but instead their execution is proxied over to
>>>>>>> the main application thread.*
>>>>>>
>>>>>>
>>>>>> I'm just trying to understand what we are dealing with. At this point
>>>>>> I am guessing this is the reason why it is much slower to read a file
>>>>>> using fread in a thread. We are seeing 1000x slowdowns compared to
>>>>>> running in the main thread directly. For example, running in main
>>>>>> thread, a read request can complete in 0.2ms, while in a thread it
>>>>>> takes 200ms. Most likely that's the overhead of waiting on the main
>>>>>> thread to process proxied requests?
>>>>>>
>>>>>> Are there any technical blockers preventing filesystem functions to be
>>>>>> multithreaded so that they are no longer put in the main thread proxy
>>>>>> queue?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "emscripten-discuss" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
