> On Oct. 21, 2013, 7:36 p.m., Mark Gaiser wrote:
> > Oh wow! When reading through the comments i was quite surprised to see a 
> > question for me :)
> > A very nice discussion btw!
> > 
> > As for the benchmarking. It's actually already available in a testcase if 
> > you want that: 
> > http://quickgit.kde.org/?p=kdelibs.git&a=blob&h=b5dcdb8f07a5e5dea72e80bcf8237995420cca69&hb=a959675fb3ca4d936cd6f8aeff437a1a444b7c38&f=staging%2Fkio%2Fautotests%2Flistdirtest.cpp
> > 
> > The testcase that i usually run is: 
> > https://gitorious.org/kdirchainrebuild/master/source/1594b3e43e7c75ff5dd3d80d9d06fcfb13de0bbc:
> >  but it isn't self contained and you probably don't have much use for it as 
> > a result of that.
> > 
> > So here is the stuff that is actually useful for you.
> > 
> > BIG FAT WARNING: not tested and written out the top of my head. I've been 
> > playing with KIO too much :)
> > -------------------------------------------------------------------
> > #include <kio/job.h>
> > #include <QDebug>
> > 
> > KIO::ListJob* job = 
> > KIO::listDir(QUrl("file:///path/to/your/massive/folder"), 
> > KIO::HideProgressInfo);
> > job->setUiDelegate(0);
> > job->addMetaData("details", "2"); // Default is 2 which means all details. 
> > 0 means just a few essential fields (KIO::UDSEntry::UDS_NAME, 
> > KIO::UDSEntry::UDS_FILE_TYPE and KIO::UDSEntry::UDS_LINK_DEST if it is a 
> > symbolic link. Not provided otherwise.
> > 
> > KIO::UDSEntryList totalEntries;
> > 
> > connect(job, &KIO::ListJob::entries, [&](KIO::Job *, const 
> > KIO::UDSEntryList &entries){
> >   totalEntries += entries;
> >   qDebug() << "Entries received:" << entries.count();
> > });
> > 
> > connect(job, &KIO::ListJob::result, [&](KJob *){
> >   qDebug() << "Total entries:" << totalEntries.count();
> > });
> > -------------------------------------------------------------------
> > 
> > You probably need to adjust it slightly, but it should work very easily 
> > with KDE Frameworks 5. You might need to change the signal/slot to the old 
> > syntax though, most certainly for KDE 4.x. It's not much but it will show 
> > you the memory usage of those entries. Here storing them is obviously 
> > useless, but in reality - in dolphin - this is done as well only 
> > encapsulated in KFileItem objects.
> > 
> > Good luck!
> > If you need more, just ask.

Thanks! I'll see if I can make a test case that works with both kdelibs master 
and frameworks out of this. The simplicity of your test case makes me look 
forward to the day when we can use C++11 for everything ;-)


- Frank


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/113355/#review42128
-----------------------------------------------------------


On Oct. 21, 2013, 6:23 p.m., Frank Reininghaus wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/113355/
> -----------------------------------------------------------
> 
> (Updated Oct. 21, 2013, 6:23 p.m.)
> 
> 
> Review request for kdelibs.
> 
> 
> Repository: kdelibs
> 
> 
> Description
> -------
> 
> This patch is based on some discussions on the kfm-devel mailing list, see 
> http://lists.kde.org/?t=137935784800003&r=1&w=2
> 
> Mark found out that KIO's UDSEntry class is one of the major consumers of 
> memory in applications which use KIO to list directories with a large number 
> of files, and I found a way use implicit sharing to drastically reduce the 
> amount of memory it needs. Many thanks to Milian for his great blog post 
> http://milianw.de/blog/katekdevelop-sprint-vienna-2012-take-1, without which 
> I would probably not have had such ideas.
> 
> 
> 1. The problem
> 
> The UDSEntry keeps all sorts of information about files which can be stored 
> in a string (name, user, group, etc.) or in a long long (file size, 
> modification time, etc.). All these data can be accessed with a uint key, and 
> UDSEntry returns the corresponding QString or long long in O(1) time. 
> Internally, it stores the data in a QHash<uint, Field>, where Field is a 
> struct that has a QString and a long long member.
> 
> The problem is that QHash needs quite a lot of memory to provide the O(1) 
> access, see http://doc.qt.digia.com/qq/qq19-containers.html for details, and 
> that the minimum capacity of a QHash seems to be 17, even though the number 
> of entries in a UDSEntry is often 8 in the rather typical standard kio_file 
> use case.
> 
> 
> 2. Proposed solution
> 
> (a) We can store the "Fields" in a QVector<Field>, which needs only very 
> little overhead compared to the memory that the actual "Fields" need. When 
> loading a UDSEntry from a QDataStream, we just append all "Fields" to this 
> QVector one by one. Moreover, we need a QHash<uint, int>, which maps each key 
> to the index of its Field in the QVector. This restructuring alone does not 
> reduce the memory usage, of course.
> 
> (b) The key observation is that the QDataStream, which KIO::ListJob reads the 
> UDSEntries from, typically provides the different "Fields" in exactly the 
> same order. This means that the QHash<uint, int> is usually exactly the same 
> for every UDSEntry, and we can make use of implicit sharing to store only one 
> copy of this QHash. I've modified
> 
> void UDSEntryPrivate::load(QDataStream &s, UDSEntry &a)
> 
> such that it remembers the most recent QHash<uint, int> and just adds an 
> implicitly shared copy of it to "a" if the order of the Fields has not 
> changed.
> 
> (c) Moreover, some of the QString Fields in the UDSEntries in one directory 
> are often the same, like, e.g., the user and the group. The "load" function 
> also remembers the most recently read values for each Field in a static 
> QVector<QString> and just puts an implicitly shared copy into the UDSEntry if 
> possible.
> 
> 
> 3. Possible disadvantages
> 
> (a) When using the "remove" member, the new version of UDSEntry does not 
> remove the Field from the QVector<Field>. This means that removing and adding 
> a "Field" repeatedly would let the memory usage grow indefinitely. According 
> to David (http://lists.kde.org/?l=kfm-devel&m=138052519927973&w=2), this 
> doesn't matter though because no known user of UDSEntry uses its remove() 
> member. Maybe we should remove remove (pun stolen grom David) in the 
> frameworks branch then?
> 
> (b) In principle, the old version of UDSEntryPrivate::load(QDataStream&, 
> UDSEntry&) was reentrant. This is not the case for my changed version. 
> Reentrancy could be restored rather easily by protecting the access to the 
> static data with a mutex, but given that most of KIO is not supposed to be 
> used from outside the main thread AFAIK, I don't know if this is necessary.
> 
> 
> 4. Changes since the first version of the patch which I posted in 
> http://lists.kde.org/?l=kfm-devel&m=137962995805432&w=2
> 
> (a) Implemented the minor changes suggested by David in 
> http://lists.kde.org/?l=kfm-devel&m=137975442807965&w=2
> 
> (b) Added a unit test to verify that storing and loading UDSEntries from a 
> stream works even if the order of the fields is permuted, and some fields are 
> removed or added in between.
> 
> (c) Fixed a bug which was uncovered by the test: 
> cachedUdsFields.erase(cachedUdsFields.begin() + i, cachedUdsFields.end()) 
> instead of cachedUdsFields.erase(cachedUdsFields.begin() + i)
> 
> (d) Use QVector::reserve to reserve the appropriate size for the 
> QVector<Field>. Saves some time when loading the UDSEntry, and reduces the 
> memory usage further.
> 
> (e) Changed the type of the loop variable from quint32 to int to fix some 
> compiler warnings.
> 
> 
> Diffs
> -----
> 
>   kio/kio/udsentry.cpp 1e1f503 
>   kio/tests/CMakeLists.txt 1019312 
>   kio/tests/udsentrybenchmark.cpp PRE-CREATION 
>   kio/tests/udsentrytest.h PRE-CREATION 
>   kio/tests/udsentrytest.cpp PRE-CREATION 
> 
> Diff: http://git.reviewboard.kde.org/r/113355/diff/
> 
> 
> Testing
> -------
> 
> Old and new unit tests pass. The memory usage of Dolphin when loading a 
> directory with 100,000 files in Icons View is reduced from 165.4 MB to 113.3 
> MB. Any application that uses a file dialog, a KDirLister, or anything else 
> that uses a KIO::ListJob to list directory contents should benefit from 
> similar savings.
> 
> 
> Thanks,
> 
> Frank Reininghaus
> 
>

Reply via email to