> On Oct. 21, 2013, 6:26 p.m., Jan Kundrát wrote:
> > Have you tried a naive implementation with a
> > std::vector<std::pair<Key,Value>>? You say that a typical use case has
> > eight entries; that's a very small number where a well-tuned vector could
> > easily beat the O(1) of QHash or the O(log n) of QMap by its O(n)
> > semantics. Linear scan might very well turn out to be faster due to its
> > better cache locality and the associated memory streaming benefits. The STL
> > class has a lower overhead than an implicitly shared QVector (and you do
> > not need implicit sharing of the actual entries, do you?).
> >
> > Anyway, the point is, if the number is small enough, the big-O notation
> > does not necessarily matter.
>
> Milian Wolff wrote:
> This is very true. Regarding std::vector not sharing, this is OK since
> UDSEntryPrivate _is_ sharing. Also, QString in Field is also shared (and with
> this patch actually makes use of this). I'd be interested in the performance
> of a (std::)vector/pair approach.
>
> Frank Reininghaus wrote:
> I fully agree that big-O is not always the most important thing. However,
> kioslaves can in principle add many more than 8 fields to a UDSEntry, and I
> had thought that there is a reason why it has always used a QHash.
>
> About the "std::vector<std::pair<Key,Value>>" idea: sounds interesting. I
> would assume that this approach uses more memory than my current patch though
> (if Value == Field). Note that my patch only adds one pointer to the UDSEntry
> to keep track of the uint keys (assuming that the QHash is shared with many
> UDSEntries), i.e., 8 bytes on a 64 bit system.
>
> In a std::pair<uint, Field>, you would add a uint to every field of the
> entry, and the compiler might actually add some padding to preserve the
> alignment of the members of the "Field", such that the uint effectively needs
> 8 bytes for every entry.
>
> Another approach would be to store a QVector<Field>/std::vector<Field>
> and a separate vector containing the uint keys. A linear scan of the keys
> would be much faster if they are all stored next to each other, right? In a
> std::vector<std::pair<uint, Field>>, the different keys would be many bytes
> apart.
>
> Jan Kundrát wrote:
> Regarding the "reason why it has always used a QHash" -- this might be
> true, or perhaps a programmer used the first thing which came across their
> mind. I do this all the time.
>
> You are right on the analysis of the memory consumption -- storing the
> keys in the vector (or in the QHash) does have an overhead. The advantage is
> that it will save you at least one pointer indirection compared to an
> implicitly shared QMap/QHash/... of keys. That might not be a good choice in
> this context.
>
> I do not have any numbers comparing performance of a linear scan over a
> vector<pair> on one hand and a scan of a vector<int>. Your idea might or
> might not be correct; the memory prefetch in CPUs is known to be an
> impressive piece of silicon. It's entirely possible that the speed
> improvement of a single vector<int> would get neutralized by a missing
> prefetch of the target vector<Field>.
>
> How does your current patch work when you replace a QVector with a
> std::vector?
>
> Milian Wolff wrote:
> I just thought some more about it and have a potentially even more
> performant approach, albeit more complicated code-wise. Food for thought
> though:
>
> Just use a std::vector<Field>, but add the uint (uds field) to Field.
> Then encapsulate m_str and m_long in a union and add a bool m_isStr or
> similar. I.e.:
>
> struct Field {
> // ctors ...
> union { // should be sizeof 8
> QString m_str;
> long long m_long;
> };
> uint m_uds; // sizeof 4 but would incur a 4byte padding
> bool m_isStr; // partially fill padding
> };
>
> If I'm not mistaken, this has the same size as the current Field struct.
> Then you'd fill this in a std::vector which gets sorted after it is filled by
> m_uds. In the lookup functions you can then use binary searches (though at an
> estimated size of 8, a simple linear search might perform better).
>
> /me would be interested in the performance of this approach. Note that
> you should still keep the QString cache to use implicit sharing there. The
> benefit here is that you have _no_ overhead as far as I can see.
>
> Cheers
>
> Frank Reininghaus wrote:
> Unfortunately, you cannot put a QString (or any other type with
> non-trivial constructor/destructor) into a union. It's easy to see why: when
> destructing such an object, how can the compiler know if the 8-byte object
> that is stored in the union is the d-pointer of a QString or a long long?
>
> Milian Wolff wrote:
> Doh! Of course you are right. Too bad ;-)
>
> Though I do wonder what would happen if one did a
>
> ~Field() {
> if (m_isStr) {
> m_str.~QString();
> }
> }
>
> Ugly as hell though, so definitely not worth it.
That would be fine in C++11 according to
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf
It is supported since Clang 3.1 and GCC 4.6, but not in MSVC, so I guess that
won't be possible.
- Alexander
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/113355/#review42115
-----------------------------------------------------------
On Oct. 21, 2013, 8:23 p.m., Frank Reininghaus wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/113355/
> -----------------------------------------------------------
>
> (Updated Oct. 21, 2013, 8:23 p.m.)
>
>
> Review request for kdelibs.
>
>
> Repository: kdelibs
>
>
> Description
> -------
>
> This patch is based on some discussions on the kfm-devel mailing list, see
> http://lists.kde.org/?t=137935784800003&r=1&w=2
>
> Mark found out that KIO's UDSEntry class is one of the major consumers of
> memory in applications which use KIO to list directories with a large number
> of files, and I found a way use implicit sharing to drastically reduce the
> amount of memory it needs. Many thanks to Milian for his great blog post
> http://milianw.de/blog/katekdevelop-sprint-vienna-2012-take-1, without which
> I would probably not have had such ideas.
>
>
> 1. The problem
>
> The UDSEntry keeps all sorts of information about files which can be stored
> in a string (name, user, group, etc.) or in a long long (file size,
> modification time, etc.). All these data can be accessed with a uint key, and
> UDSEntry returns the corresponding QString or long long in O(1) time.
> Internally, it stores the data in a QHash<uint, Field>, where Field is a
> struct that has a QString and a long long member.
>
> The problem is that QHash needs quite a lot of memory to provide the O(1)
> access, see http://doc.qt.digia.com/qq/qq19-containers.html for details, and
> that the minimum capacity of a QHash seems to be 17, even though the number
> of entries in a UDSEntry is often 8 in the rather typical standard kio_file
> use case.
>
>
> 2. Proposed solution
>
> (a) We can store the "Fields" in a QVector<Field>, which needs only very
> little overhead compared to the memory that the actual "Fields" need. When
> loading a UDSEntry from a QDataStream, we just append all "Fields" to this
> QVector one by one. Moreover, we need a QHash<uint, int>, which maps each key
> to the index of its Field in the QVector. This restructuring alone does not
> reduce the memory usage, of course.
>
> (b) The key observation is that the QDataStream, which KIO::ListJob reads the
> UDSEntries from, typically provides the different "Fields" in exactly the
> same order. This means that the QHash<uint, int> is usually exactly the same
> for every UDSEntry, and we can make use of implicit sharing to store only one
> copy of this QHash. I've modified
>
> void UDSEntryPrivate::load(QDataStream &s, UDSEntry &a)
>
> such that it remembers the most recent QHash<uint, int> and just adds an
> implicitly shared copy of it to "a" if the order of the Fields has not
> changed.
>
> (c) Moreover, some of the QString Fields in the UDSEntries in one directory
> are often the same, like, e.g., the user and the group. The "load" function
> also remembers the most recently read values for each Field in a static
> QVector<QString> and just puts an implicitly shared copy into the UDSEntry if
> possible.
>
>
> 3. Possible disadvantages
>
> (a) When using the "remove" member, the new version of UDSEntry does not
> remove the Field from the QVector<Field>. This means that removing and adding
> a "Field" repeatedly would let the memory usage grow indefinitely. According
> to David (http://lists.kde.org/?l=kfm-devel&m=138052519927973&w=2), this
> doesn't matter though because no known user of UDSEntry uses its remove()
> member. Maybe we should remove remove (pun stolen grom David) in the
> frameworks branch then?
>
> (b) In principle, the old version of UDSEntryPrivate::load(QDataStream&,
> UDSEntry&) was reentrant. This is not the case for my changed version.
> Reentrancy could be restored rather easily by protecting the access to the
> static data with a mutex, but given that most of KIO is not supposed to be
> used from outside the main thread AFAIK, I don't know if this is necessary.
>
>
> 4. Changes since the first version of the patch which I posted in
> http://lists.kde.org/?l=kfm-devel&m=137962995805432&w=2
>
> (a) Implemented the minor changes suggested by David in
> http://lists.kde.org/?l=kfm-devel&m=137975442807965&w=2
>
> (b) Added a unit test to verify that storing and loading UDSEntries from a
> stream works even if the order of the fields is permuted, and some fields are
> removed or added in between.
>
> (c) Fixed a bug which was uncovered by the test:
> cachedUdsFields.erase(cachedUdsFields.begin() + i, cachedUdsFields.end())
> instead of cachedUdsFields.erase(cachedUdsFields.begin() + i)
>
> (d) Use QVector::reserve to reserve the appropriate size for the
> QVector<Field>. Saves some time when loading the UDSEntry, and reduces the
> memory usage further.
>
> (e) Changed the type of the loop variable from quint32 to int to fix some
> compiler warnings.
>
>
> Diffs
> -----
>
> kio/kio/udsentry.cpp 1e1f503
> kio/tests/CMakeLists.txt 1019312
> kio/tests/udsentrybenchmark.cpp PRE-CREATION
> kio/tests/udsentrytest.h PRE-CREATION
> kio/tests/udsentrytest.cpp PRE-CREATION
>
> Diff: http://git.reviewboard.kde.org/r/113355/diff/
>
>
> Testing
> -------
>
> Old and new unit tests pass. The memory usage of Dolphin when loading a
> directory with 100,000 files in Icons View is reduced from 165.4 MB to 113.3
> MB. Any application that uses a file dialog, a KDirLister, or anything else
> that uses a KIO::ListJob to list directory contents should benefit from
> similar savings.
>
>
> Thanks,
>
> Frank Reininghaus
>
>