Re: [Oiio-dev] OpenEXR 2.0 tiled read speeds

2016-01-15 Thread Larry Gritz
I'm sorry for the long delay here; I got sidetracked for quite a while trying
to unravel a site-specific problem. In the process of trying to benchmark
different OpenEXR versions, I found out that I was getting vastly different
speeds even on the same exr version depending on whether I built libIlmImf
myself or used the system libraries. It seems to have boiled down to compiler
releases (gcc 4.4 vs gcc 4.8 vs clang -- the latter two make much faster code
for some reason), so when doing these kinds of benchmarks it's important to be
certain that you used the same toolchain for each option you're benchmarking.

Anyway, the long and short of it is that I'm unable to replicate Peter's 
results. For me, OpenEXR 2.2 is not any slower than 1.7 in my benchmarks. If 
anything, 2.2 is slightly faster. The identical benchmark using tiled, 
MIP-mapped TIFF files is still about 15% faster than OpenEXR, even when I use 
the compiler versions that give the best exr results.

So I'm still very eager to get suggestions for what to try next, and to hear
from anybody more familiar with OpenEXR internals who is interested in taking a
deeper look at why performance may not be what we hope.

I don't think it's solely the thread issue that has been mentioned previously 
(and for which I have submitted a patch for OpenEXR) -- my results hold for the 
single-threaded case.

-- lg



Re: [Oiio-dev] OpenEXR 2.0 tiled read speeds

2016-01-15 Thread Peter Pearson
Replies inline...

On 16 January 2016 at 07:52, Larry Gritz wrote:

> I'm sorry for the long delay here, I got sidetracked for quite a while
> trying to unravel a site-specific problem -- in the process of trying to
> benchmark different OpenEXR versions, I found out that I was getting vastly
> different speeds even on the same exr version depending on whether I built
> libIlmImf myself or used the system libraries. It seems to have boiled down
> to compiler releases  (gcc 4.4 vs gcc 4.8 vs clang -- the latter two make
> much faster code for some reason) so it's important to do these kinds of
> benchmarks certain that you used the same toolchain for each option you're
> benchmarking.
>


I was using GCC 4.7.2 for both tests, and building everything (IlmBase as
well as OpenEXR) in both cases, not using system libs (the system hasn't
got them installed). I've just done a *very* rough single-threaded test of
just opening a tiled image once, and the speeds between 1.7 and 2.0 are
close to identical, so maybe the discrepancy I can see (I tested it again
within the renderer) is to do with the usage profile there, with multiple
threads interacting with other renderer stuff...



> Anyway, the long and short of it is that I'm unable to replicate Peter's
> results. For me, OpenEXR 2.2 is not any slower than 1.7 in my benchmarks.
> If anything, 2.2 is slightly faster. The identical benchmark using tiled,
> MIP-mapped TIFF files is still about 15% faster than OpenEXR, even when I
> use the compiler versions that give the best exr results.
>
> So I'm still very eager to get suggestions for what to try next, and if
> anybody more familiar with OpenEXR internals is interested in taking
> a deeper look at why performance may not be what we hope.
>


I've got no evidence this *is* the issue for EXR reading, but in terms of
performance, I've long suspected that the use of IOStreams within OpenEXR
might account for some performance penalty compared to raw fread()s.
Streams in C++ are generally slower, and getting the buffering right for
high-performance stuff is tricky, especially cross-platform.

Also, reading and writing of values in OpenEXR goes through ImfXdr.h's
conversion routines, which do bit-shifting for (I assume) endianness
conversion. I guess the x86 port for OpenEXR had to convert this, whereas the
SGI versions didn't, and we're stuck with it now?
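
To make the kind of conversion under discussion concrete: OpenEXR files store
values little-endian, and assembling a value byte by byte with shifts gives the
same answer on any host, at the cost of more work than a direct load on a
little-endian machine. A minimal Python sketch of the idea follows; the
function name is made up for illustration and this is not the ImfXdr.h code.

```python
import struct

def read_u32_by_shifting(buf, offset=0):
    """Assemble a little-endian unsigned 32-bit value one byte at a time.

    Hypothetical helper for illustration; works the same on any host
    byte order because it never reinterprets memory directly.
    """
    b0, b1, b2, b3 = buf[offset:offset + 4]
    return b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)

# Encode a known value as little-endian bytes, then decode it both ways.
data = struct.pack("<I", 0x12345678)
via_shifts = read_u32_by_shifting(data)
via_struct = struct.unpack_from("<I", data)[0]
print(hex(via_shifts), via_shifts == via_struct)
```

Whether this per-value work matters in practice depends on how well the
compiler collapses the shifts into a single load, which would need profiling
to confirm.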

On top of that, in the multi-threading scenario, while using a LUT for
half->float conversion is faster than not using it, it causes absolute
havoc in terms of L1/L2 cache thrashing - from disk I've sometimes found
reading full float EXRs faster than half EXRs due to this, but that's
probably only when the OS disk cache has them, so in general it's not a
huge issue given the IO saving that'll happen in most real-world usage for
big facilities...
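
For a sense of scale on the cache point: a table indexed by every possible
16-bit half pattern has 65536 entries, so with 4-byte floats it is far larger
than an L1 data cache of a few tens of KiB. The Python sketch below builds such
a table purely as an illustration; the names are hypothetical and this is not
how the half library generates its own table.

```python
import struct
from array import array

def half_bits_to_float(bits):
    """Decode one 16-bit IEEE half pattern using struct's 'e' format."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]

# One single-precision entry for every possible 16-bit pattern.
half_to_float = array("f", (half_bits_to_float(b) for b in range(1 << 16)))

# Total footprint of the table in bytes (entries times bytes per entry).
table_bytes = len(half_to_float) * half_to_float.itemsize
print(len(half_to_float), "entries,", table_bytes // 1024, "KiB")

# A conversion is then a single indexed load; 0x3C00 is 1.0 in half.
print(half_to_float[0x3C00])
```

With incoherent texture lookups the indices are effectively random, so several
threads doing this at once can keep evicting each other's lines from shared
cache levels.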

Cheers,
Peter
___
Oiio-dev mailing list
Oiio-dev@lists.openimageio.org
http://lists.openimageio.org/listinfo.cgi/oiio-dev-openimageio.org


Re: [Oiio-dev] OpenEXR 2.0 tiled read speeds

2016-01-15 Thread Shane Ambler

On 16/01/2016 05:35, Karl Rasche wrote:

>> I found out that I was getting vastly different speeds even on the same
>> exr version depending on whether I built libIlmImf myself or used the
>> system libraries. It seems to have boiled down to compiler releases (gcc
>> 4.4 vs gcc 4.8 vs clang -- the latter two make much faster code for some
>> reason)
>
> I've seen similar things, at least with gcc 4.1 vs 4.8, with the latter
> being significantly faster in certain decode situations (like 10-15%
> faster).


Well, gcc 4.2 knows up to SSE3, gcc 4.4 knows AVX, and gcc 4.7 knows AVX2 and
F16C.

Just taking advantage of the newer SIMD sets would help; add to that the
better optimisation that comes from experience with them...
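
Compiler support is only half of it; the machine running the benchmark also has
to report the instruction set. On Linux one way to check is the flags line of
/proc/cpuinfo. A small Python sketch, assuming the usual Linux layout of that
file (the helper name is hypothetical, and other systems expose this
differently):

```python
def simd_flags(cpuinfo_text, wanted=("pni", "avx", "avx2", "f16c")):
    """Report which of the wanted CPU flags appear in /proc/cpuinfo text.

    Note: Linux lists SSE3 under the name "pni", not "sse3".
    """
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the colon is a space-separated flag list.
            present = set(line.split(":", 1)[1].split())
            return {name: name in present for name in wanted}
    # No flags line found (e.g. not a Linux x86 cpuinfo dump).
    return {name: False for name in wanted}

# Made-up sample text standing in for a real /proc/cpuinfo.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 pni ssse3 avx f16c\n"
print(simd_flags(sample))
```

On a real Linux box the text would come from open("/proc/cpuinfo").read().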


--
FreeBSD - the place to B...Software Developing

Shane Ambler



Re: [Oiio-dev] OpenEXR 2.0 tiled read speeds

2016-01-15 Thread Kevin Wheatley


Sent on the go...

> On 15 Jan 2016, at 21:09, Peter Pearson wrote:
> 
> Replies inline...
> 
> Also, reading and writing of values in OpenEXR goes through ImfXdr.h's 
> conversion routines doing bitshifting for I assume endianness conversion? - I 
> guess the x86 port for OpenEXR had to convert this, whereas the SGI versions 
> didn't, and we're stuck with it now?

There are certainly some cases where even the non-Xdr paths are potentially
slower than needed: sometimes the code calls stdlib memory routines, other
times it is implemented as a basic loop.
> 
> On top of that, in the multi-threading scenario, while using a LUT for 
> half->float conversion is faster than not using it, it causes absolute havoc 
> in terms of L1/L2 cache thrashing - from disk I've sometimes found reading 
> full float EXRs faster than half EXRs due to this, but that's probably only 
> when the OS disk cache has them, so in general it's not a huge issue given 
> the IO saving that'll happen in most real-world usage for big facilities...

It would be nice if the copypixels and similar calls supported CPU-specific
implementations and there was an f16c implementation for the half conversions.
Once I patched the avx detection code in configure for gcc4.1, I got the big
win mentioned in the dwa white paper: at least in Nuke, the conversion function
inside the DWA parts of the library totally dropped to the bottom of the
profiler hot spots and the performance jumped.

Karl, would that explain some of your differences with compiler versions?

I did think about rewriting the code myself, but my assembler experience dates
from around the 80386 time frame (or ARM2/3, PDP8), so I put that near the
bottom of my pile.

Kevin


Re: [Oiio-dev] OpenEXR 2.0 tiled read speeds

2016-01-15 Thread Karl Rasche
>> Also, reading and writing of values in OpenEXR goes through ImfXdr.h's
>> conversion routines doing bitshifting for I assume endianness conversion? -
>> I guess the x86 port for OpenEXR had to convert this, whereas the SGI
>> versions didn't, and we're stuck with it now?
>
> There are certainly some cases where even the non xdr paths are potentially
> slower than needed, sometimes it calls stdlib memory routines, other times
> it is implemented as a basic loop.
>

For scanline files, we at least have optimized transfer/reformatting paths
and don't have to dip into copyIntoFrameBuffer() too much (which is
generally not the fastest thing in the world). But for tiled files the
optimized transfer/reformat paths aren't used.

copyIntoFrameBuffer() *might* be the cause of Larry's slowness vs
tiff-half, but I'm not sure (as I can't see much in the way of performance
difference between zip-half and tiff-half here).


>> On top of that, in the multi-threading scenario, while using a LUT for
>> half->float conversion is faster than not using it, it causes absolute
>> havoc in terms of L1/L2 cache thrashing - from disk I've sometimes found
>> reading full float EXRs faster than half EXRs due to this, but that's
>> probably only when the OS disk cache has them, so in general it's not a
>> huge issue given the IO saving that'll happen in most real-world usage for
>> big facilities...
>
> It would be nice if the copypixels and similar calls supported cpu
> specific implementations and there was an f16c implementation for the half
> conversions,


Agreed, that would be a nice improvement for cases where you want to read
float pixels from a half file, and don't want to convert outside of IlmImf.
That code can suck up tons of time.



> once I patched the avx detection code in configure for gcc4.1, I got the
> big win mentioned in the dwa white paper, at least in Nuke the conversion
> function inside the DWA parts of the library totally dropped to the bottom
> of the profiler hot spots and the performance jumped.


> Karl, would that explain some of your differences with compiler versions?
>

For whatever reason with gcc4.1 here, I didn't need the patch you're
talking about (could be that it's an assembler (as) thing, not a gcc thing) to
enable the f16c code for decoding DWA files (which does help massively).

I've seen the gcc4.1 vs 4.8 difference when decoding DWA files on a
non-f16c system, or when decoding PIZ files too. I have seen 4.1 generate
some rather brain-dead asm from the sse intrinsics in ImfDwaCompressor, so
that could be the root of it. I haven't spent too much time worrying about
it -- just build with 4.8 :)



> I did think about rewriting the code myself but my assembler experience is
> somewhere around 80386 time frame (or ARM2/3, PDP8) so I put that near the
> bottom of my pile
>

Oh, c'mon; it's fun :P

I lost track of how much of my life I lost trying to write
fromHalfZigZag_f16c().

Karl


Re: [Oiio-dev] OpenEXR 2.0 tiled read speeds

2016-01-04 Thread Karl Rasche
> I'm assuming you reverted IlmBase back to 1.x as well (not sure if this
> matters, but...)?
>

Yeah, I was using IlmBase 2.2.0 and 1.0.3 with OpenEXR 2.2.0 and 1.7.1
respectively.


Re: [Oiio-dev] OpenEXR 2.0 tiled read speeds

2016-01-04 Thread Karl Rasche
To throw in another data point; I tried to repro the tiff/exr discrepancy
using the recipe that Larry posted earlier.

Using local files, 1 thread, gcc 4.8.1, oiio master, and the
system-provided libtiff (3.9.4), I'm *not* able to trip the problem
(provided I'm looking at the right numbers in the testtex output -- I'm
using the first time that is listed next to the thread efficiency). I also
don't see any perf regression (at least with the 1 thread case, which may
be expected) between 1.7.1 and 2.2.


 Type (# thr)        OpenEXR 2.2    OpenEXR 1.7.1
 exr-half-zip (1)       77.1 s          77.7 s
 exr-half-piz (1)       79.6 s         110.1 s
 tif-half-zip (1)       75.3 s          75.8 s


The PIZ difference between 1.7.1 and 2.2 is expected, since part of the 2.2
changes included an optimization of the static Huffman decoder used by PIZ.
For tiled reads in 2.2, PIZ @ 64x64 tiles is still a bit slower than ZIP in
other tests I've done, mostly due to the overhead of decoder initialization
(which is negligible for scanline reads, but totally dominates for tiled
reads). Since we're talking performance, I suspect tiled PIZ could go a
fair bit faster with some TLC. But that's another thread.
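
Putting the table into relative terms is simple arithmetic on the numbers
above; a Python sketch follows (the dictionary layout is only for
illustration, the timings are the ones in the table).

```python
# Seconds per run: (built against OpenEXR 2.2, built against OpenEXR 1.7.1).
timings = {
    "exr-half-zip": (77.1, 77.7),
    "exr-half-piz": (79.6, 110.1),
    "tif-half-zip": (75.3, 75.8),
}

# Ratio of the 1.7.1 time to the 2.2 time for each file type.
for name, (t22, t171) in timings.items():
    print(f"{name}: 1.7.1 / 2.2 = {t171 / t22:.2f}x")

# Gap between ZIP EXR and ZIP TIFF in the 2.2 build, as a percentage.
exr = timings["exr-half-zip"][0]
tif = timings["tif-half-zip"][0]
gap = (exr - tif) / tif * 100
print(f"exr-half-zip vs tif-half-zip (2.2 build): {gap:.1f}% slower")
```

So the tiled PIZ read is roughly 1.38x faster in 2.2, while ZIP EXR and ZIP
TIFF are within about 2.4% of each other in this single-threaded run.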

Karl


On Mon, Jan 4, 2016 at 9:20 AM, Piotr Stanczyk 
wrote:

> Thanks for the suggestion. I'll take some time to look at what could have
> caused a regression / change
>

Re: [Oiio-dev] OpenEXR 2.0 tiled read speeds

2016-01-04 Thread Peter Pearson
I was using 32x32 tile size (not that I'd expect it to make *that* much
difference), and my tests between EXR and TIFF were at full float only
(couldn't be bothered to get half working for TIFF).

But I noticed the discrepancy between OpenEXR 1.7 and 2.0 in terms of speed
with both half and full float EXRs (zip, 3-channel, 32x32 tile size).

I'm assuming you reverted IlmBase back to 1.x as well (not sure if this
matters, but...)?

Peter


Re: [Oiio-dev] OpenEXR 2.0 tiled read speeds

2016-01-02 Thread Larry Gritz
That's a fascinating clue!

I'm not using "the latest", but I'm certainly using 2.x (2.1 I think?) because
of our need for deep files. It never occurred to me to test with 1.x, but it
will be easy to go back to the OIIO-based synthetic benchmark I used for my
original findings and rerun it after building against OpenEXR 1.7.

This is very hopeful! If it used to be fast to read OpenEXR and now is slow, 
that means that the speed problem may not be inherent in the file layout itself 
or the basic architecture of the library, but could just be a performance 
regression in the library that went undetected and could be fixed. That would 
really be great for everybody.

I'll look into this as soon as I'm back at work on Monday and will report back. 

Thanks, Peter.

-- lg


> On Jan 2, 2016, at 10:06 PM, Peter Pearson wrote:
> 
> Hi,
> 
> This is really a continuation of Larry's deviation (but a very useful and 
> timely one!) in the '.exr vs .tx' thread - I've only just subscribed to this 
> list, so can't respond directly to that.
> 
> I've seen pretty much the same issue regarding read speeds, with TIFF being 
> significantly faster (around 1.5-1.7x) in my tests within a renderer - 
> incoherent access for varying mipmap levels (or subimages in TIFF's case), on 
> a local disk on Linux.
> 
> There's also this bug:
> https://github.com/openexr/openexr/pull/170 
> 
> 
> that in my case severely affects concurrent access when using worker threads 
> to do the reading work - without Larry's patch, 3/4 threads is the best I can 
> hope for, after that there's so much contention that CPU usage just dwindles 
> over time - with the patch, I can push 24 threads easily, and wall-clock time 
> goes down by orders of magnitude.
> 
> Ignoring the above patch for a moment though, I've previously (2/3 years ago) 
> found that at least within PRMan, OpenEXR reading was noticeably faster than 
> TIFF reading, even when reading half for EXR and 8-bit for TIFF, so TIFF 
> technically has an unfair advantage. This was over NFS on a saturated network 
> - on a local disk, TIFF could sometimes win. At the time, profiling (at the 
> NFS level) seemed to indicate that the stat() call done within 
> TIFFSetDirectory() to change mipmap levels caused disproportionate slowdowns 
> for TIFF, as the raw reading and decompression was generally faster than EXR 
> (although EXR compresses more in general, and things like tile size and 
> planar config will obviously affect the comparison one way or another). 
> However, as the studio I was with at the time used PRMan, we were using 
> Pixar's format which was faster and more compact than EXR, so just used 
> Pixar's format instead of investigating further.
> 
> I've always since assumed based on this that OpenEXR was the better format
> than TIFF for rendering, but I've recently been doing some work on texture
> reading for a renderer, and as well as noticing the scalability issue Larry
> found, have also found that TIFF is faster than OpenEXR when using zip
> compression - which as far as I'm aware (ignoring DWA type for the moment) is
> the fastest for tiled?
> 
> I've done a quick test today with IlmBase 1.0.2 and OpenEXR 1.7.0 just as a 
> "I'm sure it didn't used to be *this* bad" sanity-check, and indeed - as well 
> as the IlmBase ThreadPool bug above not seeming to exist - reading of the 
> same EXRs with 1.7.0 vs 2.0.1 is almost twice as fast, for the wall-clock 
> time of reading them in a renderer.
> 
> Given that I've tested this in a renderer, it's a bit difficult to give exact 
> timings, but based on wall-clock texture read times, I'm roughly seeing 
> OpenEXR 1.7 read times be 50-70% of the OpenEXR 2.0.1 ones.
> 
> So it seems (possibly pending more investigation and timings) that there has 
> been a regression in read speeds since OpenEXR 2.0 (and currently, as I'm 
> assuming Larry was testing with the latest OpenEXR version?)
> 
> Cheers,
> Peter

--
Larry Gritz
l...@larrygritz.com

