On 2017-03-29 16:41, Matthew Woehlke wrote:
On 2017-03-29 07:26, Marc Mutz wrote:
That brings us straight back to the fundamental question: Why can the C++ world at large cope with containers that are not CoW and Qt cannot? The only answer I have is "because Qt never tried". And that's the end of it. I have pointed to Herb's string measurements from a decade or two ago. I have shown that copying a std::vector of up to 1K ints is faster than copying a QVector when hammered by at least two threads.
4 KiB of memory is not very much. What happens if you have larger
objects (say, 100 objects with 96 bytes each)?
The same. QVector effectively has a hardware mutex around the ref counting: only one core can have write access to any given cache line, so the rate at which you can update the ref count is limited by the rate at which a single core can update it (in memory), divided by a factor that accounts for cache-line ping-pong. That factor can be as high as 2.
Deep-copying does not write to the source object, and any number of
cores can share read access for a given cache line, each with its own
copy, so deep-copying scales linearly with the number of cores.
Therefore, for any given element size and count there exists a thread
count where deep-copying becomes faster than CoW. Yes, even for 1K
objects of 1K size each.
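A minimal, Qt-free benchmark sketch of that argument (the shared atomic below merely stands in for a CoW ref count; thread and iteration counts are arbitrary): variant A has every thread bump one shared counter, as a shallow CoW copy would, while variant B performs real deep copies that only read the shared source.

// Sketch only: contended ref count vs. independent deep copies.
#include <atomic>
#include <chrono>
#include <cstdio>
#include <thread>
#include <vector>

int main()
{
    constexpr int kThreads = 4;
    constexpr int kIterations = 1'000'000;
    const std::vector<int> source(1024, 42);             // ~4 KiB payload

    // Run `copyOnce` kIterations times on each of kThreads threads,
    // returning the elapsed wall-clock time in seconds.
    auto run = [&](auto copyOnce) {
        std::vector<std::thread> workers;
        const auto start = std::chrono::steady_clock::now();
        for (int t = 0; t < kThreads; ++t)
            workers.emplace_back([&] {
                for (int i = 0; i < kIterations; ++i)
                    copyOnce();
            });
        for (auto &w : workers)
            w.join();
        return std::chrono::duration<double>(
                   std::chrono::steady_clock::now() - start).count();
    };

    // A: "CoW copy" -- every thread writes to the same cache line.
    std::atomic<int> refCount{1};
    const double shallow = run([&] {
        refCount.fetch_add(1, std::memory_order_relaxed);
        refCount.fetch_sub(1, std::memory_order_relaxed);
    });

    // B: deep copy -- threads share the source read-only and scale freely.
    const double deep = run([&] {
        std::vector<int> copy = source;                   // allocate + memcpy
        (void)copy;
    });

    std::printf("contended refcount: %.2fs, deep copies: %.2fs\n",
                shallow, deep);
}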
What if you have an API that needs value semantics (keep in mind one benefit of CoW is implicit shared lifetime management), but you tend not to actually modify the "copied" list?
std::vector has value semantics. OTOH, QVector's CoW leaks reference semantics: if you take an iterator into a container, copy the container, then write through the iterator, you write to both copies.
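A minimal sketch of that leak (the values are made up; the point is that non-const begin() detaches before the copy is taken, so the later copy shares the very buffer the iterator points at):

#include <QVector>
#include <QDebug>

int main()
{
    QVector<int> a{1, 2, 3};
    auto it = a.begin();    // non-const begin() detaches; a is unshared here
    QVector<int> b = a;     // shallow copy: b shares a's buffer again
    *it = 42;               // writes into the buffer that a *and* b point at
    qDebug() << a.at(0) << b.at(0);   // 42 42 -- the "copy" changed too
}

With std::vector the same sequence modifies only a, because b really is an independent copy.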
What benchmarks have been done on *real applications*? What were the
results?
What benchmarks have *you* done? The world outside Qt is happily working
with CoWless containers. It is the proponents of CoW who need to show that CoW is a global optimisation, not just an optimisation for copying certain element counts and sizes.
(I just had to review _another_ pimpl'ed class that contained
nothing but two enums)
...and what happens if at some point in the future that class needs
three enums? Or some other member?
When you start with the class, you pack the two values into a bit-field and add reserved space up to a certain size, 4 or 8 bytes. When you run out, you make a V2 and add an overload taking V2. That is perfectly ok, since old code can't use new API. This doesn't mean you should never use pimpl, but it does mean you shouldn't use it just because you can.
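A sketch of that kind of layout (class and enum names are made up for illustration): the two values live in a bit-field, and the spare bits are the reserved space that gets consumed before a V2 becomes necessary.

#include <QtGlobal>

// Hypothetical value class holding two small enums without a pimpl.
// Total size stays at 4 bytes, with 28 reserved bits for later growth.
class Options
{
public:
    enum class Mode : quint32   { Fast, Accurate };
    enum class Policy : quint32 { Strict, Lenient };

    Options(Mode m, Policy p)
        : m_mode(quint32(m)), m_policy(quint32(p)), m_reserved(0) {}

    Mode mode() const     { return Mode(m_mode); }
    Policy policy() const { return Policy(m_policy); }

private:
    quint32 m_mode     : 2;
    quint32 m_policy   : 2;
    quint32 m_reserved : 28;   // spare room before an OptionsV2 is needed
};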
What, exactly, do you find objectionable about PIMPL in "modern C++"? It can't be that it's inefficient, because performance was never a goal of PIMPL.
Performance is always a goal in C++. Even in Qt. Otherwise QRect would
be pimpled, too.
so I can't pass it by value into slots.
Why would you want to? No-one does that. People use cref, like for all large types. Qt makes sure that a copy is taken only when needed, i.e. when the slot is in a different thread from the emitter. That is very rare, and people can be expected to pass a shared_ptr<vector> instead in these situations.
This (passing lists across thread boundaries in signals/slots) happens
quite a bit in https://github.com/kitware/vivia/. Doing so is a
fundamental part of the data processing architecture of at least two of
the applications there.
Qt supports thousands of applications. We shouldn't optimize for
corner-cases.
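For the rare cases where lists really do cross thread boundaries, a sketch of the shared_ptr approach mentioned above (class, signal, and type names are made up; for queued connections the payload type would additionally need qRegisterMetaType()):

#include <QObject>
#include <memory>
#include <vector>

// Shared, immutable payload: cheap to hand across threads, race-free to read.
using Results = std::shared_ptr<const std::vector<int>>;

class Producer : public QObject
{
    Q_OBJECT
signals:
    // A queued connection copies only the shared_ptr, never the vector.
    void resultsReady(const Results &results);
};

A slot that genuinely needs to mutate the data makes its one explicit copy of the vector; all other receivers just read through the handle.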
Also, explicit sharing borders on premature pessimization. If my slot
needs to modify the data, I have to go out of my way to avoid making an
unnecessary copy. (This argument would be more compelling if C++ had a
cow_ptr.)
You got that the wrong way around.
Thanks,
Marc