On May 16, 2014, at 3:33 PM, Francois Caron <[email protected]> wrote:

> Small modifications to limit calls to pthread_create.
> 
> Here’s the latest patch.
> 
> Regards,
> François
> 
> <tiles.patch>
> 
> On May 15, 2014, at 4:22 PM, Francois Caron <[email protected]> 
> wrote:
> 
>> Updated patch.
>> 
>> Changed the multithreading to have the workers sync for the next job. This 
>> makes the master thread’s code much simpler.
>> 
>> Removed the changes in analyze.c to avoid polluting the patch with 
>> unnecessary modifications.
>> 
>> Thanks,
>> François
>> 
>> <tiles.patch>
>> 
>> On May 15, 2014, at 9:06 AM, Francois Caron <[email protected]> 
>> wrote:
>> 
>>> Patch to review for tile-based multithreading.
>>> 
>>> Speedup of about 3.5X when using 8 threads and 16 tiles on a 720P sequence. 
>>> Couldn’t test with latest code as latest assembly cannot build on OS X.
>>> 
>>> Thanks,
>>> François
>>> 
>>> <tiles.patch>
>> 
> 

Hi,

Here are the latest modifications, which address the previous review. The 
implementation has also changed significantly. The master thread now simply 
wakes up the first worker thread. Each worker then wakes up a coworker, if one 
is present. This avoids a thundering herd, where all the workers are woken up 
at once and then fight for control of the mutex.
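
To illustrate the chained wake-up, here is a minimal standalone sketch. It is 
not the code from the patch: every name in it (chain_pool, chain_worker, 
chain_run, CHAIN_MAX_WORKERS) is hypothetical, and it assumes one condition 
variable per worker and that "coworker" means the worker with the next id.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define CHAIN_MAX_WORKERS 16            /* Hypothetical limit for this sketch. */

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond[CHAIN_MAX_WORKERS]; /* One per worker: a signal wakes one thread. */
    int go[CHAIN_MAX_WORKERS];              /* Predicate guarding each condition variable. */
    int wake_order[CHAIN_MAX_WORKERS];      /* Worker ids, in the order they woke up. */
    int nb_workers;
    int nb_awake;
} chain_pool;

typedef struct {
    chain_pool *pool;
    int id;
} chain_arg;

/* Wake exactly one worker. The caller must hold p->mutex. */
static void chain_wake(chain_pool *p, int id)
{
    p->go[id] = 1;
    pthread_cond_signal(&p->cond[id]);
}

static void *chain_worker(void *opaque)
{
    chain_arg *a = opaque;
    chain_pool *p = a->pool;

    pthread_mutex_lock(&p->mutex);
    while (!p->go[a->id])
        pthread_cond_wait(&p->cond[a->id], &p->mutex);
    assert(p->nb_awake < p->nb_workers);
    p->wake_order[p->nb_awake++] = a->id;

    /* Pass the wake-up along before starting on the job. */
    if (a->id + 1 < p->nb_workers)
        chain_wake(p, a->id + 1);
    pthread_mutex_unlock(&p->mutex);

    /* The actual job (e.g. encoding a tile) would run here, outside the lock. */
    return NULL;
}

/* Run one chained wake-up. Returns the number of workers that woke up, or -1
 * on error. Return values of the init calls are ignored for brevity. */
int chain_run(int nb_workers, int *wake_order)
{
    chain_pool p;
    chain_arg args[CHAIN_MAX_WORKERS];
    pthread_t threads[CHAIN_MAX_WORKERS];
    int created = 0, i;

    if (nb_workers < 1 || nb_workers > CHAIN_MAX_WORKERS)
        return -1;

    p.nb_workers = nb_workers;
    p.nb_awake = 0;
    pthread_mutex_init(&p.mutex, NULL);
    for (i = 0; i < nb_workers; i++) {
        pthread_cond_init(&p.cond[i], NULL);
        p.go[i] = 0;
    }

    for (i = 0; i < nb_workers; i++) {
        args[i].pool = &p;
        args[i].id = i;
        if (pthread_create(&threads[i], NULL, chain_worker, &args[i]))
            break;
        created++;
    }

    /* The master only wakes the first worker; the chain does the rest. */
    pthread_mutex_lock(&p.mutex);
    p.nb_workers = created;             /* End the chain at the last live thread. */
    if (created > 0)
        chain_wake(&p, 0);
    pthread_mutex_unlock(&p.mutex);

    for (i = 0; i < created; i++)
        pthread_join(threads[i], NULL);
    for (i = 0; i < created; i++)
        wake_order[i] = p.wake_order[i];

    for (i = 0; i < nb_workers; i++)
        pthread_cond_destroy(&p.cond[i]);
    pthread_mutex_destroy(&p.mutex);
    return created == nb_workers ? p.nb_awake : -1;
}
```

With this sketch, chain_run(4, order) returns 4 and fills order with 0, 1, 2, 3. 
Each pthread_cond_signal targets a condition variable with a single waiter, so 
at most one newly woken thread contends for the mutex at any time.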

The cleanup code has also been revamped to avoid destroying resources that were 
never initialized. The pthread API is unfriendly in this respect: it provides 
no way to check whether a mutex or condition variable has been initialized. 
Flags were added to keep track of the allocated resources.
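
As a rough sketch of the flag-based cleanup idea (again, not the patch itself), 
the following records each successful initialization in a bitmask and destroys 
only what the bitmask lists. The names mt_ctx, MT_MUTEX_INIT, MT_COND_INIT, 
mt_ctx_init and mt_ctx_cleanup are hypothetical, and the fail_at parameter 
exists only to simulate a failure for demonstration purposes.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum {
    MT_MUTEX_INIT = 1 << 0,     /* The mutex was successfully initialized. */
    MT_COND_INIT  = 1 << 1      /* The condition variable was successfully initialized. */
};

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    unsigned init_flags;        /* Bitmask of the resources that currently exist. */
} mt_ctx;

/* Initialize the context. fail_at simulates a failure at step 1 (mutex) or
 * step 2 (condition variable); pass 0 for no simulated failure. On error the
 * context is left partially initialized and the caller runs mt_ctx_cleanup(). */
int mt_ctx_init(mt_ctx *c, int fail_at)
{
    assert(c != NULL);
    c->init_flags = 0;

    if (fail_at == 1 || pthread_mutex_init(&c->mutex, NULL))
        return -1;
    c->init_flags |= MT_MUTEX_INIT;

    if (fail_at == 2 || pthread_cond_init(&c->cond, NULL))
        return -1;
    c->init_flags |= MT_COND_INIT;

    return 0;
}

/* Destroy only the resources recorded in init_flags, in reverse order of
 * creation. Returns the number of resources destroyed. Safe to call twice. */
int mt_ctx_cleanup(mt_ctx *c)
{
    int nb_destroyed = 0;

    assert(c != NULL);
    if (c->init_flags & MT_COND_INIT) {
        pthread_cond_destroy(&c->cond);
        nb_destroyed++;
    }
    if (c->init_flags & MT_MUTEX_INIT) {
        pthread_mutex_destroy(&c->mutex);
        nb_destroyed++;
    }
    c->init_flags = 0;
    return nb_destroyed;
}
```

If mt_ctx_init(&c, 2) fails after the mutex step, mt_ctx_cleanup(&c) destroys 
the mutex only and never touches the uninitialized condition variable. 
Clearing the flags at the end also makes a second cleanup call harmless.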

I’ve attached the patch and the previous review with my comments. I’m also 
posting a quick and dirty validation process below.

Regards,
François

# Multi-threading support validation.
# Using MT has the encoder send out multiple slices per frame. Each tile uses a
# single slice. Not using MT has the encoder send out 1 slice per frame using
# multiple entry points in the frame.
#
# For verification purposes, we can use a single thread (ST) in MT mode. On the
# command line: -p "tiles=x,y mt-mode=1 nb-workers=0,0".
# This tells the encoder to run in MT mode with a single thread. We can use this
# to make sure that when multiple threads are actually running in parallel, the
# encoder still behaves correctly. As long as we are using the same number of
# tiles and frames, we can use the MT "emulation" to validate the true MT
# results.

# RaceHorses sequence available on f265.org.
# ST baseline.
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=0,0" \
    RaceHorses_832x480_30.mp4 race_st.265

# Multiple MT runs to check for anything funky.
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=1,0" \
    RaceHorses_832x480_30.mp4 race_mt_a.265
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=1,0" \
    RaceHorses_832x480_30.mp4 race_mt_b.265
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=1,0" \
    RaceHorses_832x480_30.mp4 race_mt_c.265
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=1,0" \
    RaceHorses_832x480_30.mp4 race_mt_d.265

# Use MD5 sums to validate the output streams.
$ ls race*.265 | xargs md5 | awk '{ print $4 }' | uniq
10eacdc7208055865e6d823b3026ef23

# Speed sequence available on f265.org.
# ST baseline.
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=0,0" \
    Speed_1024x768_30.mp4 speed_st.265

# Multiple MT runs to check for anything funky.
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=1,0" \
    Speed_1024x768_30.mp4 speed_mt_a.265
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=1,0" \
    Speed_1024x768_30.mp4 speed_mt_b.265
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=1,0" \
    Speed_1024x768_30.mp4 speed_mt_c.265
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=1,0" \
    Speed_1024x768_30.mp4 speed_mt_d.265

# Use MD5 sums to validate the output streams.
$ ls speed*.265 | xargs md5 | awk '{ print $4 }' | uniq
df7b62046526ec9182a57986d76a8c12

# Square sequence available on f265.org.
# ST baseline.
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=0,0" \
    Square_416x240_60.mp4 square_st.265

# Multiple MT runs to check for anything funky.
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=1,0" \
    Square_416x240_60.mp4 square_mt_a.265
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=1,0" \
    Square_416x240_60.mp4 square_mt_b.265
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=1,0" \
    Square_416x240_60.mp4 square_mt_c.265
$ ./build/f265cli -p "quality=25 tiles=2,2 mt-mode=1 nb-workers=1,0" \
    Square_416x240_60.mp4 square_mt_d.265

# Use MD5 sums to validate the output streams.
$ ls square*.265 | xargs md5 | awk '{ print $4 }' | uniq
47940d3c9d92afed22eb3a60d5867747

