-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/
-----------------------------------------------------------

(Updated Dez. 9, 2014, 10:44 nachm.)


Review request for KDE Frameworks and David Faure.


Changes
-------

Sorry for the extremely long delay!

I did run some bechmarks now on my system (Opensuse 13.2, GCC 4.8.3, Qt 5.3.2 
and Frameworks built in release mode, Core i7-4700MQ processor). I compared

a) the current state of the master branch,
a) rev. 1 of this RR, which still applies to current master, and
b) rev. 2, which offers the possibility to reserve memory for the internal 
QVectors in advance, and makes use of this in the benchmarks and in the file 
kioslave.

Funnily enough, the file kioslave already had the reserve(8) call, it was just 
commented out :-)

I also modified David's comparison (autotests/udsentry_benchmark) - unlike rev. 
1, the current version of this RR is faster than all competitors on the slave 
side.

Moreover, I ran the following benchmarks, took the best result of 10 runs, and 
attached an image which summarizes the results:

a) The benchmarks in tests/udsentrybenchmark, and

b) tests/listjobtest, which is a (hopefully) realistic simulation of what 
happens if an app asks KIO to list a directory. To reduce I/O-related effects, 
I did it this way:

mkdir /dev/shm/test
cd /dev/shm/test/
touch  {1..100000}.txt

and then, in the KIO build directory, ran

tests/listjobtest /dev/shm/test/

As you can see, memory consumption is reduced by 50%. Note that I measured with 
KSysGuard, rather than Valgrind or Milians new (and very cool!) "heaptrack" 
tool, because I think that the latter only measure the net heap memory 
consumption, and exclude any overhead that the memory allocator adds (please 
correct me if I'm wrong), such that they disregard the considerable overhead 
that many small hash nodes, which are used by the master version of UDSEntry, 
add.

Concerning the speed, rev. 2 of this RR is always either just as fast as 
master, or a bit faster. The hopefully realistic listjobtest runs in 8% less 
time now.

Sorry again for not continuing to work on this earlier, and thanks to everyone 
who provided feedback and help here!


Repository: kio


Description (updated)
-------

I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
attempts to make UDSEntry more efficient memory and CPU-wise, into independent 
parts. This is the third step after 
https://git.reviewboard.kde.org/r/113591/ and 
https://git.reviewboard.kde.org/r/115739/ .

The present patch modifies the internal data storage of UDSEntry. UDSEntry 
contains a mapping from unsigned int keys to "Field" values, where Field is a 
struct that contains a QString and a long long (some keys correspond to numeric 
values, like size, date, etc, whereas others, like user and group, correspond 
to a QString).

Currently, UDSEntry stores all data in a QHash<uint, Field> internally. This 
ensures that everything can be accessed in O(1) time, but is not very efficient 
memory-wise because a separate memory allocation is done for each hash node.

I propose to change this and store both the uint keys and the Field values in a 
QVector each. This means that accessing a value becomes a O(n) operation, since 
the entire QVector of keys may need to be scanned in order to find a value, but 
because the number n of values in a UDSEntry is usually quite small and can 
currently not exceed a number ~100, this should not matter in practice.

Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:

(a) The QVectors which store the keys (which are usually the same for all items 
in a directory) are not shared yet. Changing this would reduce the memory usage 
further, but I decided to keep this change separate in order to keep the 
current patch small and easy to understand. Moreover, this makes it easier to 
benchmark other similar approaches (e.g., replace QVector by std::vector, or 
store keys and values together in a std::vector<std::pair<uint,Field>>).

(b) No space is reserved in the vectors when key/value pairs are inserted one 
by one. Implementing this would make UDSEntry faster on the slave side (since 
repeated re-allocations would not be necessary any more), but this can be done 
in a later patch. Moreover, it might not be needed any more if UDSEntry is not 
used directly any more on the slave side, as suggested on the frameworks 
mailing list by Aaron (good idea BTW!). 


Diffs (updated)
-----

  autotests/udsentry_benchmark.cpp b5fa8d6 
  src/core/udsentry.h 98a7035 
  src/core/udsentry.cpp b806e0e 
  src/ioslaves/file/file.cpp 1a2a767 
  tests/udsentrybenchmark.cpp d9a118c 

Diff: https://git.reviewboard.kde.org/r/118452/diff/


Testing (updated)
-------

Unit tests still pass.

The memory usage of listjobtest with a directory with 100,000 files is reduced 
from 71344 K to 35392 K according to KSysGuard. I see similar savings when 
opening the directory in Dolphin.

I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
cannot present any benchmark results.


File Attachments (updated)
----------------

Benchmark results
  
https://git.reviewboard.kde.org/media/uploaded/files/2014/12/09/038e443c-78eb-443b-b33a-b451b28d10ea__UDSEntry-benchmarks.png


Thanks,

Frank Reininghaus

_______________________________________________
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel

Reply via email to