On Dec. 10, 2014, 7:40 p.m., Frank Reininghaus wrote: > > As for all the appends that you do on a QVector, you might be able to speed > > them up significantly if you know the size (which you do). > > The reason for that potential speedup is append doing some stuff to > > determine if it can append or if it needs to grow the vector first (amongst > > things). That works very well when you don't know the size, but you don't > > need that "overhead" if you do know the size in advance. > > > > My benchmark code is here: https://paste.kde.org/pxly3rryg (1 year lifetime) > > > > The results i get (on - cough - windows, but with mingw. My qt was broken > > under linux): > > "Append took: 278 ms" > > "More complicated append took: 92 ms" > > > > The above timing are just the insertion. The reserve() and resize() call > > are not in them. However, if i run the same benchmark with those calls > > included then the timings are as follows: > > "Append took: 278 ms" > > "More complicated append took: 101 ms" > > > > The regular append(...) call took 278 ms for 10 million items. > > The array assignment operator took merely 92 till 101 ms (~3x faster then > > append!). That is including a bounds check and using append if you go over > > your bounds. > > I think you can only win speed with this in the avarage usecase. > > > > Note: for this optimization you have to use "resize" on the QVector, not > > "reserve". > > > > Disclaimer: i know this is most deffinately a micro optimization, but it's > > fun to play with options and see if you can pull out more. > > Milian Wolff wrote: > your benchmark is flawed: your second loop also contains branches, which > the former does not. Both iterate over the same size, no? So you just want to > compare resize + operator[] vs. reserve + append? Yes, the former will be > faster, but afaik only slightly. Please write a proper benchmark with > QBENCHMARK and use `-perf -perfcounter instr:k`, instead of a bogus QTimer. > Also, you sure this was compiled with optimizations enabled? > > Furthermore, this is also a micro-optimization as the cost of append vs. > operator[] is negleglible compared to the cost of the allocations and the > rest of whats going on here. I really don't think this is worth it. > > Mark Gaiser wrote: > My second loop intentionally contains branches since that's how it will > be used in reality. The reason for that is the udsentry being different in > size depending on the settings you pass to it. 8 fields is the common > usecase. The first loop doesn't contain branches - directly - but does do so > indirectly within QVector. > > "So you just want to compare resize + operator[] vs. reserve + append?" > Yes. > > Perf... :) I know. I disagree that this timer is bogus. QElapsedTimer is > good enough to get a rough view if something is faster then something else. > Perf helps when fine tuning it. Yes, this was compiled with optimizations in > release mode. At least my project was. I don't know for the Qt version > (5.4.0). That's just the package with the mingw installer from qt.io. > > I'm actually not sure if it's worth it either. From a readability point > of view it's still quite readable. But i don't know how much difference this > makes in real world 100.000 items sized folder (the benchmark for this RR). > Not much i think.
https://paste.kde.org/pbdsf39z5 - Milian ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://git.reviewboard.kde.org/r/118452/#review71736 ----------------------------------------------------------- On Dec. 9, 2014, 10:44 p.m., Frank Reininghaus wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://git.reviewboard.kde.org/r/118452/ > ----------------------------------------------------------- > > (Updated Dec. 9, 2014, 10:44 p.m.) > > > Review request for KDE Frameworks and David Faure. > > > Repository: kio > > > Description > ------- > > I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which > attempts to make UDSEntry more efficient memory and CPU-wise, into > independent parts. This is the third step after > https://git.reviewboard.kde.org/r/113591/ and > https://git.reviewboard.kde.org/r/115739/ . > > The present patch modifies the internal data storage of UDSEntry. UDSEntry > contains a mapping from unsigned int keys to "Field" values, where Field is a > struct that contains a QString and a long long (some keys correspond to > numeric values, like size, date, etc, whereas others, like user and group, > correspond to a QString). > > Currently, UDSEntry stores all data in a QHash<uint, Field> internally. This > ensures that everything can be accessed in O(1) time, but is not very > efficient memory-wise because a separate memory allocation is done for each > hash node. > > I propose to change this and store both the uint keys and the Field values in > a QVector each. This means that accessing a value becomes a O(n) operation, > since the entire QVector of keys may need to be scanned in order to find a > value, but because the number n of values in a UDSEntry is usually quite > small and can currently not exceed a number ~100, this should not matter in > practice. > > Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing: > > (a) The QVectors which store the keys (which are usually the same for all > items in a directory) are not shared yet. Changing this would reduce the > memory usage further, but I decided to keep this change separate in order to > keep the current patch small and easy to understand. Moreover, this makes it > easier to benchmark other similar approaches (e.g., replace QVector by > std::vector, or store keys and values together in a > std::vector<std::pair<uint,Field>>). > > (b) No space is reserved in the vectors when key/value pairs are inserted one > by one. Implementing this would make UDSEntry faster on the slave side (since > repeated re-allocations would not be necessary any more), but this can be > done in a later patch. Moreover, it might not be needed any more if UDSEntry > is not used directly any more on the slave side, as suggested on the > frameworks mailing list by Aaron (good idea BTW!). > > > Diffs > ----- > > autotests/udsentry_benchmark.cpp b5fa8d6 > src/core/udsentry.h 98a7035 > src/core/udsentry.cpp b806e0e > src/ioslaves/file/file.cpp 1a2a767 > tests/udsentrybenchmark.cpp d9a118c > > Diff: https://git.reviewboard.kde.org/r/118452/diff/ > > > Testing > ------- > > Unit tests still pass. > > The memory usage of listjobtest with a directory with 100,000 files is > reduced from 71344 K to 35392 K according to KSysGuard. I see similar savings > when opening the directory in Dolphin. > > I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I > cannot present any benchmark results. > > > File Attachments > ---------------- > > Benchmark results > > https://git.reviewboard.kde.org/media/uploaded/files/2014/12/09/038e443c-78eb-443b-b33a-b451b28d10ea__UDSEntry-benchmarks.png > > > Thanks, > > Frank Reininghaus > >
_______________________________________________ Kde-frameworks-devel mailing list Kde-frameworks-devel@kde.org https://mail.kde.org/mailman/listinfo/kde-frameworks-devel