Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-28 Thread Frank Reininghaus

---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/
---

(Updated Dec. 28, 2014, 10:22 p.m.)


Status
--

This change has been marked as submitted.


Review request for KDE Frameworks and David Faure.


Repository: kio


Description
---

I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
attempts to make UDSEntry more efficient memory and CPU-wise, into independent 
parts. This is the third step after 
https://git.reviewboard.kde.org/r/113591/ and 
https://git.reviewboard.kde.org/r/115739/ .

The present patch modifies the internal data storage of UDSEntry. UDSEntry 
contains a mapping from unsigned int keys to "Field" values, where Field is a 
struct that contains a QString and a long long (some keys correspond to numeric 
values, like size, date, etc, whereas others, like user and group, correspond 
to a QString).

Currently, UDSEntry stores all data in a QHash internally. This 
ensures that everything can be accessed in O(1) time, but is not very efficient 
memory-wise because a separate memory allocation is done for each hash node.

I propose to change this and store both the uint keys and the Field values in a 
QVector each. This means that accessing a value becomes a O(n) operation, since 
the entire QVector of keys may need to be scanned in order to find a value, but 
because the number n of values in a UDSEntry is usually quite small and can 
currently not exceed a number ~100, this should not matter in practice.

Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:

(a) The QVectors which store the keys (which are usually the same for all items 
in a directory) are not shared yet. Changing this would reduce the memory usage 
further, but I decided to keep this change separate in order to keep the 
current patch small and easy to understand. Moreover, this makes it easier to 
benchmark other similar approaches (e.g., replace QVector by std::vector, or 
store keys and values together in a std::vector>).

(b) No space is reserved in the vectors when key/value pairs are inserted one 
by one. Implementing this would make UDSEntry faster on the slave side (since 
repeated re-allocations would not be necessary any more), but this can be done 
in a later patch. Moreover, it might not be needed any more if UDSEntry is not 
used directly any more on the slave side, as suggested on the frameworks 
mailing list by Aaron (good idea BTW!). 


Diffs
-

  tests/udsentrybenchmark.cpp 9bedb7b 
  autotests/udsentry_benchmark.cpp b5fa8d6 
  src/core/udsentry.h 98a7035 
  src/core/udsentry.cpp b806e0e 
  src/ioslaves/file/file.cpp 1a2a767 

Diff: https://git.reviewboard.kde.org/r/118452/diff/


Testing
---

Unit tests still pass.

The memory usage of listjobtest with a directory with 100,000 files is reduced 
from 71344 K to 35392 K according to KSysGuard. I see similar savings when 
opening the directory in Dolphin.

I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
cannot present any benchmark results.


File Attachments


Benchmark results
  
https://git.reviewboard.kde.org/media/uploaded/files/2014/12/09/038e443c-78eb-443b-b33a-b451b28d10ea__UDSEntry-benchmarks.png


Thanks,

Frank Reininghaus

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel


Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-28 Thread Frank Reininghaus

---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/
---

(Updated Dez. 28, 2014, 10:22 nachm.)


Review request for KDE Frameworks and David Faure.


Repository: kio


Description
---

I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
attempts to make UDSEntry more efficient memory and CPU-wise, into independent 
parts. This is the third step after 
https://git.reviewboard.kde.org/r/113591/ and 
https://git.reviewboard.kde.org/r/115739/ .

The present patch modifies the internal data storage of UDSEntry. UDSEntry 
contains a mapping from unsigned int keys to "Field" values, where Field is a 
struct that contains a QString and a long long (some keys correspond to numeric 
values, like size, date, etc, whereas others, like user and group, correspond 
to a QString).

Currently, UDSEntry stores all data in a QHash internally. This 
ensures that everything can be accessed in O(1) time, but is not very efficient 
memory-wise because a separate memory allocation is done for each hash node.

I propose to change this and store both the uint keys and the Field values in a 
QVector each. This means that accessing a value becomes a O(n) operation, since 
the entire QVector of keys may need to be scanned in order to find a value, but 
because the number n of values in a UDSEntry is usually quite small and can 
currently not exceed a number ~100, this should not matter in practice.

Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:

(a) The QVectors which store the keys (which are usually the same for all items 
in a directory) are not shared yet. Changing this would reduce the memory usage 
further, but I decided to keep this change separate in order to keep the 
current patch small and easy to understand. Moreover, this makes it easier to 
benchmark other similar approaches (e.g., replace QVector by std::vector, or 
store keys and values together in a std::vector>).

(b) No space is reserved in the vectors when key/value pairs are inserted one 
by one. Implementing this would make UDSEntry faster on the slave side (since 
repeated re-allocations would not be necessary any more), but this can be done 
in a later patch. Moreover, it might not be needed any more if UDSEntry is not 
used directly any more on the slave side, as suggested on the frameworks 
mailing list by Aaron (good idea BTW!). 


Diffs (updated)
-

  tests/udsentrybenchmark.cpp 9bedb7b 
  autotests/udsentry_benchmark.cpp b5fa8d6 
  src/core/udsentry.h 98a7035 
  src/core/udsentry.cpp b806e0e 
  src/ioslaves/file/file.cpp 1a2a767 

Diff: https://git.reviewboard.kde.org/r/118452/diff/


Testing
---

Unit tests still pass.

The memory usage of listjobtest with a directory with 100,000 files is reduced 
from 71344 K to 35392 K according to KSysGuard. I see similar savings when 
opening the directory in Dolphin.

I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
cannot present any benchmark results.


File Attachments


Benchmark results
  
https://git.reviewboard.kde.org/media/uploaded/files/2014/12/09/038e443c-78eb-443b-b33a-b451b28d10ea__UDSEntry-benchmarks.png


Thanks,

Frank Reininghaus

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel


Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-28 Thread Frank Reininghaus


> On Dez. 20, 2014, 11:46 vorm., David Faure wrote:
> > src/core/udsentry.cpp, line 51
> > 
> >
> > Can you add an example, to ease the understanding of the data structure?
> > 
> > Do I understand it correctly that this is how it works?
> > udsIndexes=[UDS_NAME, UDS_FILE_SIZE, ...]
> > and fields=[Field("filename"), Field(1234), ...]
> > 
> > and therefore the two vectors will always have the same size?

Yes indeed, thanks for bringing that issue up! I'll add the example and then 
push the commit in a minute.


- Frank


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review72334
---


On Dez. 11, 2014, 10:08 nachm., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dez. 11, 2014, 10:08 nachm.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested on the 
> frameworks mailing list by Aaron (good idea BTW!). 
> 
> 
> Diffs
> -
> 
>   autotests/udsentry_benchmark.cpp b5fa8d6 
>   src/core/udsentry.h 98a7035 
>   src/core/udsentry.cpp b806e0e 
>   src/ioslaves/file/file.cpp 1a2a767 
>   tests/udsentrybenchmark.cpp 9bedb7b 
> 
> Diff: https://git.reviewboard.kde.org/r/118452/diff/
> 
> 
> Testing
> ---
> 
> Unit tests still pass.
> 
> The memory usage of listjobtest with a directory with 100,000 files is 
> reduced from 71344 K to 35392 K according to KSysGuard. I see similar savings 
> when opening the directory in Dolphin.
> 
> I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
> cannot present any benchmark results.
> 
> 
> File Attachments
> 
> 
> Benchmark results
>   
> https://git.reviewboard.kde.org/media/uploaded/files/2014/12/09/038e443c-78eb-443b-b33a-b451b28d10ea__UDSEntry-benchmarks.png
> 
> 
> Thanks,
> 
> Frank Reininghaus
> 
>

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel


Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-20 Thread David Faure

---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review72334
---

Ship it!


Looks great to me, apart from a bit of missing documentation.


src/core/udsentry.cpp


Can you add an example, to ease the understanding of the data structure?

Do I understand it correctly that this is how it works?
udsIndexes=[UDS_NAME, UDS_FILE_SIZE, ...]
and fields=[Field("filename"), Field(1234), ...]

and therefore the two vectors will always have the same size?


- David Faure


On Dec. 11, 2014, 10:08 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dec. 11, 2014, 10:08 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested on the 
> frameworks mailing list by Aaron (good idea BTW!). 
> 
> 
> Diffs
> -
> 
>   autotests/udsentry_benchmark.cpp b5fa8d6 
>   src/core/udsentry.h 98a7035 
>   src/core/udsentry.cpp b806e0e 
>   src/ioslaves/file/file.cpp 1a2a767 
>   tests/udsentrybenchmark.cpp 9bedb7b 
> 
> Diff: https://git.reviewboard.kde.org/r/118452/diff/
> 
> 
> Testing
> ---
> 
> Unit tests still pass.
> 
> The memory usage of listjobtest with a directory with 100,000 files is 
> reduced from 71344 K to 35392 K according to KSysGuard. I see similar savings 
> when opening the directory in Dolphin.
> 
> I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
> cannot present any benchmark results.
> 
> 
> File Attachments
> 
> 
> Benchmark results
>   
> https://git.reviewboard.kde.org/media/uploaded/files/2014/12/09/038e443c-78eb-443b-b33a-b451b28d10ea__UDSEntry-benchmarks.png
> 
> 
> Thanks,
> 
> Frank Reininghaus
> 
>

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel


Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-11 Thread Frank Reininghaus

---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/
---

(Updated Dez. 11, 2014, 10:08 nachm.)


Review request for KDE Frameworks and David Faure.


Changes
---

Thanks for your comments, Milian and Mark!

Thanks also for fixing the benchmark, Milian. The point of the QCOMPARE calls 
was, as you have probably guessed, to prevent that the compiler optimizes some 
or all of the code in QBENCHMARK {...} away. But I admit that this approach was 
a bit, erm, suboptimal. Your solution is much better indeed :-)

The idea to factor out the common code from the insert(...) functions was quite 
good, thanks! It even made the code faster, maybe because this is more 
cache-friendly and makes branch prediction easier. 

For the createSmallEntries() benchmark, I get, with "tests/udsentrybenchmark 
-perf -perfcounter instr:k", the following number of instructions:

master:322801129
rev. 2 of this RR: 331020709 (+3%)
rev. 3 of this RR: 198900694 (-38% compared to master)

The QVarLengthArray idea is also quite good! I think this should be done in a 
separate patch though. I was going to do some further experiments anyway (like 
checking how much we can gain if we let the UDSEntries implicitly share the 
'udsIndexes' QVectors, or if using std::vector instead of QVector makes any 
difference - comparing that to QVarLengthArray would be interesting indeed).


Repository: kio


Description
---

I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
attempts to make UDSEntry more efficient memory and CPU-wise, into independent 
parts. This is the third step after 
https://git.reviewboard.kde.org/r/113591/ and 
https://git.reviewboard.kde.org/r/115739/ .

The present patch modifies the internal data storage of UDSEntry. UDSEntry 
contains a mapping from unsigned int keys to "Field" values, where Field is a 
struct that contains a QString and a long long (some keys correspond to numeric 
values, like size, date, etc, whereas others, like user and group, correspond 
to a QString).

Currently, UDSEntry stores all data in a QHash internally. This 
ensures that everything can be accessed in O(1) time, but is not very efficient 
memory-wise because a separate memory allocation is done for each hash node.

I propose to change this and store both the uint keys and the Field values in a 
QVector each. This means that accessing a value becomes a O(n) operation, since 
the entire QVector of keys may need to be scanned in order to find a value, but 
because the number n of values in a UDSEntry is usually quite small and can 
currently not exceed a number ~100, this should not matter in practice.

Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:

(a) The QVectors which store the keys (which are usually the same for all items 
in a directory) are not shared yet. Changing this would reduce the memory usage 
further, but I decided to keep this change separate in order to keep the 
current patch small and easy to understand. Moreover, this makes it easier to 
benchmark other similar approaches (e.g., replace QVector by std::vector, or 
store keys and values together in a std::vector>).

(b) No space is reserved in the vectors when key/value pairs are inserted one 
by one. Implementing this would make UDSEntry faster on the slave side (since 
repeated re-allocations would not be necessary any more), but this can be done 
in a later patch. Moreover, it might not be needed any more if UDSEntry is not 
used directly any more on the slave side, as suggested on the frameworks 
mailing list by Aaron (good idea BTW!). 


Diffs (updated)
-

  autotests/udsentry_benchmark.cpp b5fa8d6 
  src/core/udsentry.h 98a7035 
  src/core/udsentry.cpp b806e0e 
  src/ioslaves/file/file.cpp 1a2a767 
  tests/udsentrybenchmark.cpp 9bedb7b 

Diff: https://git.reviewboard.kde.org/r/118452/diff/


Testing
---

Unit tests still pass.

The memory usage of listjobtest with a directory with 100,000 files is reduced 
from 71344 K to 35392 K according to KSysGuard. I see similar savings when 
opening the directory in Dolphin.

I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
cannot present any benchmark results.


File Attachments


Benchmark results
  
https://git.reviewboard.kde.org/media/uploaded/files/2014/12/09/038e443c-78eb-443b-b33a-b451b28d10ea__UDSEntry-benchmarks.png


Thanks,

Frank Reininghaus

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel


Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-11 Thread Frank Reininghaus


> On Dez. 10, 2014, 11:03 nachm., Milian Wolff wrote:
> > src/core/udsentry.cpp, line 155
> > 
> >
> > is this called often? if so, prefer to use a QList for the udsIndexes, 
> > to prevent the costly conversion here.
> > 
> > QList and QVector of uint are "nearly" the same. QVector just has a 
> > much nicer API, and stuff like resize, which QList is lacking. For this 
> > use-case, I think it should be OK though.
> > 
> > Maybe add a TODO kf6 to change this to QVector?

If QList and QVector of uint are nearly the same depends on the definition of 
"nearly", I guess ;-)

AFAIK, since QList treats all data types as pointers internally, it will add 
padding to the 4-byte uints on a 64 bit system, which will waste some memory, 
and probably also affect the performance since more data needs to be loaded 
into the CPU caches.

I think the listFields() function is not used much. KIO itself uses it only in 
unit tests, and most other libraries and applications probably don't use 
UDSEntry directly, but only indirectly via KDirLister and KFileItem.

Therefore, I'm not sure if changing the return type in KF6 is desirable. It 
would improve the performance in some cases, but these cases are probably 
rather exotic (?), and it seems that the API of both Qt and Frameworks consider 
QList as the default container for everything (even if this is debatable from a 
memory usage and performance point of view).


- Frank


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71763
---


On Dez. 9, 2014, 10:44 nachm., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dez. 9, 2014, 10:44 nachm.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested on the 
> frameworks mailing list by Aaron (good idea BTW!). 
> 
> 
> Diffs
> -
> 
>   autotests/udsentry_benchmark.cpp b5fa8d6 
>   src/core/udsentry.h 98a7035 
>   src/core/udsentry.cpp b806e0e 
>   src/ioslaves/file/file.cpp 1a2a767 
>   tests/udsentrybenchmark.cpp d9a118c 
> 
> Diff: https://git.reviewboard.kde.org/r/118452/diff/
> 
> 
> Testing
> ---
> 
> Unit tests still pass.
> 
> The memory usage of listjobtest with a directory with 100,000 files is 
> reduced from 71344 K to 35392 K according to KSysGuard. I see similar savings 
> when opening the di

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-11 Thread Frank Reininghaus


> On Dez. 11, 2014, 12:43 vorm., Milian Wolff wrote:
> > src/core/udsentry.cpp, line 72
> > 
> >
> > you are missing a reserve call here

Yes, good catch! Note: I do realise that we could reserve space for only 7 on 
Windows here, but I'm not sure if that saving is worth cluttering the code with 
another #ifdef.


- Frank


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71774
---


On Dez. 9, 2014, 10:44 nachm., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dez. 9, 2014, 10:44 nachm.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested on the 
> frameworks mailing list by Aaron (good idea BTW!). 
> 
> 
> Diffs
> -
> 
>   autotests/udsentry_benchmark.cpp b5fa8d6 
>   src/core/udsentry.h 98a7035 
>   src/core/udsentry.cpp b806e0e 
>   src/ioslaves/file/file.cpp 1a2a767 
>   tests/udsentrybenchmark.cpp d9a118c 
> 
> Diff: https://git.reviewboard.kde.org/r/118452/diff/
> 
> 
> Testing
> ---
> 
> Unit tests still pass.
> 
> The memory usage of listjobtest with a directory with 100,000 files is 
> reduced from 71344 K to 35392 K according to KSysGuard. I see similar savings 
> when opening the directory in Dolphin.
> 
> I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
> cannot present any benchmark results.
> 
> 
> File Attachments
> 
> 
> Benchmark results
>   
> https://git.reviewboard.kde.org/media/uploaded/files/2014/12/09/038e443c-78eb-443b-b33a-b451b28d10ea__UDSEntry-benchmarks.png
> 
> 
> Thanks,
> 
> Frank Reininghaus
> 
>

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel


Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-10 Thread Milian Wolff

---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71774
---



src/core/udsentry.cpp


you are missing a reserve call here


- Milian Wolff


On Dec. 9, 2014, 10:44 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dec. 9, 2014, 10:44 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested on the 
> frameworks mailing list by Aaron (good idea BTW!). 
> 
> 
> Diffs
> -
> 
>   autotests/udsentry_benchmark.cpp b5fa8d6 
>   src/core/udsentry.h 98a7035 
>   src/core/udsentry.cpp b806e0e 
>   src/ioslaves/file/file.cpp 1a2a767 
>   tests/udsentrybenchmark.cpp d9a118c 
> 
> Diff: https://git.reviewboard.kde.org/r/118452/diff/
> 
> 
> Testing
> ---
> 
> Unit tests still pass.
> 
> The memory usage of listjobtest with a directory with 100,000 files is 
> reduced from 71344 K to 35392 K according to KSysGuard. I see similar savings 
> when opening the directory in Dolphin.
> 
> I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
> cannot present any benchmark results.
> 
> 
> File Attachments
> 
> 
> Benchmark results
>   
> https://git.reviewboard.kde.org/media/uploaded/files/2014/12/09/038e443c-78eb-443b-b33a-b451b28d10ea__UDSEntry-benchmarks.png
> 
> 
> Thanks,
> 
> Frank Reininghaus
> 
>

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel


Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-10 Thread Milian Wolff

---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71773
---


BTW: more food-for-thought is to use a `QVarLengthArray<*, 8>` inside 
UDSEntryPrivate, assuming that 8 is the common size we expect to see. That way, 
we get rid of the QVector allocations for up to 8 elements, at the cost of some 
overhead for UDSEntryPrivate where less fields are stored. That significantly 
improves the following two tests:

RESULT : UDSEntryBenchmark::createSmallEntries():
 158,125,498 instructions per iteration (total: 158,125,498, iterations: 1)
RESULT : UDSEntryBenchmark::loadSmallEntries():
 745,222,039 instructions per iteration (total: 745,222,039, iterations: 1)

but of course we need some special code where now we do `.toList()`. Since 
before it returned `.keys()`, and now you return `.toList()`, I assume that is 
not often called anyways, so that would be OK.

- Milian Wolff


On Dec. 9, 2014, 10:44 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dec. 9, 2014, 10:44 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested on the 
> frameworks mailing list by Aaron (good idea BTW!). 
> 
> 
> Diffs
> -
> 
>   autotests/udsentry_benchmark.cpp b5fa8d6 
>   src/core/udsentry.h 98a7035 
>   src/core/udsentry.cpp b806e0e 
>   src/ioslaves/file/file.cpp 1a2a767 
>   tests/udsentrybenchmark.cpp d9a118c 
> 
> Diff: https://git.reviewboard.kde.org/r/118452/diff/
> 
> 
> Testing
> ---
> 
> Unit tests still pass.
> 
> The memory usage of listjobtest with a directory with 100,000 files is 
> reduced from 71344 K to 35392 K according to KSysGuard. I see similar savings 
> when opening the directory in Dolphin.
> 
> I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
> cannot present any benchmark results.
> 
> 
> File Attachments
> 
> 
> Benchmark results
>   
> https://git.reviewboard.kde.org/media/uploaded/files/2014/12/09/038e443c-78eb-443b-b33a-b451b28d10ea__UDSEntry-benchmarks.png
> 
> 
> Thanks,
> 
> Frank Reininghaus
> 
>

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel


Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-10 Thread Milian Wolff


> On Dec. 10, 2014, 11:03 p.m., Milian Wolff wrote:
> > LGTM, some nitpicks. And of course we need the results of the benchmark in 
> > release mode.

NOTE: I just pushed some fixes to udsentrybenchmark, which was pretty broken in 
itself. It contained lots of unrelated stuff in its hotpath. Now, the effects 
of your patch should stand out much better. And indeed, for me I get these 
results:

before:

* Start testing of UDSEntryBenchmark *
Config: Using QtTest library 5.3.2, Qt 5.3.2
PASS   : UDSEntryBenchmark::initTestCase()
PASS   : UDSEntryBenchmark::createSmallEntries()
RESULT : UDSEntryBenchmark::createSmallEntries():
 340,701,155 instructions per iteration (total: 340,701,155, iterations: 1)
PASS   : UDSEntryBenchmark::createLargeEntries()
RESULT : UDSEntryBenchmark::createLargeEntries():
 151,036,236 instructions per iteration (total: 151,036,236, iterations: 1)
PASS   : UDSEntryBenchmark::readFieldsFromSmallEntries()
RESULT : UDSEntryBenchmark::readFieldsFromSmallEntries():
 109,056,035 instructions per iteration (total: 109,056,035, iterations: 1)
PASS   : UDSEntryBenchmark::readFieldsFromLargeEntries()
RESULT : UDSEntryBenchmark::readFieldsFromLargeEntries():
 173,396,471 instructions per iteration (total: 173,396,471, iterations: 1)
PASS   : UDSEntryBenchmark::saveSmallEntries()
RESULT : UDSEntryBenchmark::saveSmallEntries():
 482,945,427 instructions per iteration (total: 482,945,427, iterations: 1)
PASS   : UDSEntryBenchmark::saveLargeEntries()
RESULT : UDSEntryBenchmark::saveLargeEntries():
 242,482,043 instructions per iteration (total: 242,482,043, iterations: 1)
PASS   : UDSEntryBenchmark::loadSmallEntries()
RESULT : UDSEntryBenchmark::loadSmallEntries():
 976,623,140 instructions per iteration (total: 976,623,140, iterations: 1)
PASS   : UDSEntryBenchmark::loadLargeEntries()
RESULT : UDSEntryBenchmark::loadLargeEntries():
 563,162,165 instructions per iteration (total: 563,162,165, iterations: 1)
PASS   : UDSEntryBenchmark::cleanupTestCase()
Totals: 10 passed, 0 failed, 0 skipped
* Finished testing of UDSEntryBenchmark *

after:

* Start testing of UDSEntryBenchmark *
Config: Using QtTest library 5.3.2, Qt 5.3.2
PASS   : UDSEntryBenchmark::initTestCase()
PASS   : UDSEntryBenchmark::createSmallEntries()
RESULT : UDSEntryBenchmark::createSmallEntries():
 210,401,123 instructions per iteration (total: 210,401,123, iterations: 1)
PASS   : UDSEntryBenchmark::createLargeEntries()
RESULT : UDSEntryBenchmark::createLargeEntries():
 124,268,824 instructions per iteration (total: 124,268,824, iterations: 1)
PASS   : UDSEntryBenchmark::readFieldsFromSmallEntries()
RESULT : UDSEntryBenchmark::readFieldsFromSmallEntries():
 121,256,029 instructions per iteration (total: 121,256,029, iterations: 1)
PASS   : UDSEntryBenchmark::readFieldsFromLargeEntries()
RESULT : UDSEntryBenchmark::readFieldsFromLargeEntries():
 235,066,440 instructions per iteration (total: 235,066,440, iterations: 1)
PASS   : UDSEntryBenchmark::saveSmallEntries()
RESULT : UDSEntryBenchmark::saveSmallEntries():
 463,545,365 instructions per iteration (total: 463,545,365, iterations: 1)
PASS   : UDSEntryBenchmark::saveLargeEntries()
RESULT : UDSEntryBenchmark::saveLargeEntries():
 234,572,009 instructions per iteration (total: 234,572,009, iterations: 1)
PASS   : UDSEntryBenchmark::loadSmallEntries()
RESULT : UDSEntryBenchmark::loadSmallEntries():
 831,722,102 instructions per iteration (total: 831,722,102, iterations: 1)
PASS   : UDSEntryBenchmark::loadLargeEntries()
RESULT : UDSEntryBenchmark::loadLargeEntries():
 446,182,994 instructions per iteration (total: 446,182,994, iterations: 1)
PASS   : UDSEntryBenchmark::cleanupTestCase()
Totals: 10 passed, 0 failed, 0 skipped
* Finished testing of UDSEntryBenchmark *


- Milian


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71763
---


On Dec. 9, 2014, 10:44 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dec. 9, 2014, 10:44 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys 

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-10 Thread Milian Wolff


On Dec. 10, 2014, 7:40 p.m., Frank Reininghaus wrote:
> > As for all the appends that you do on a QVector, you might be able to speed 
> > them up significantly if you know the size (which you do).
> > The reason for that potential speedup is append doing some stuff to 
> > determine if it can append or if it needs to grow the vector first (amongst 
> > things). That works very well when you don't know the size, but you don't 
> > need that "overhead" if you do know the size in advance.
> > 
> > My benchmark code is here: https://paste.kde.org/pxly3rryg (1 year lifetime)
> > 
> > The results i get (on - cough - windows, but with mingw. My qt was broken 
> > under linux):
> > "Append took: 278 ms"
> > "More complicated append took: 92 ms"
> > 
> > The above timing are just the insertion. The reserve() and resize() call 
> > are not in them. However, if i run the same benchmark with those calls 
> > included then the timings are as follows:
> > "Append took: 278 ms"
> > "More complicated append took: 101 ms"
> > 
> > The regular append(...) call took 278 ms for 10 million items.
> > The array assignment operator took merely 92 till 101 ms (~3x faster then 
> > append!). That is including a bounds check and using append if you go over 
> > your bounds.
> > I think you can only win speed with this in the avarage usecase.
> > 
> > Note: for this optimization you have to use "resize" on the QVector, not 
> > "reserve".
> > 
> > Disclaimer: i know this is most deffinately a micro optimization, but it's 
> > fun to play with options and see if you can pull out more.
> 
> Milian Wolff wrote:
> your benchmark is flawed: your second loop also contains branches, which 
> the former does not. Both iterate over the same size, no? So you just want to 
> compare resize + operator[] vs. reserve + append? Yes, the former will be 
> faster, but afaik only slightly. Please write a proper benchmark with 
> QBENCHMARK and use `-perf -perfcounter instr:k`, instead of a bogus QTimer. 
> Also, you sure this was compiled with optimizations enabled?
> 
> Furthermore, this is also a micro-optimization as the cost of append vs. 
> operator[] is negleglible compared to the cost of the allocations and the 
> rest of whats going on here. I really don't think this is worth it.
> 
> Mark Gaiser wrote:
> My second loop intentionally contains branches since that's how it will 
> be used in reality. The reason for that is the udsentry being different in 
> size depending on the settings you pass to it. 8 fields is the common 
> usecase. The first loop doesn't contain branches - directly - but does do so 
> indirectly within QVector.
> 
> "So you just want to compare resize + operator[] vs. reserve + append?"
> Yes.
> 
> Perf... :) I know. I disagree that this timer is bogus. QElapsedTimer is 
> good enough to get a rough view if something is faster then something else. 
> Perf helps when fine tuning it. Yes, this was compiled with optimizations in 
> release mode. At least my project was. I don't know for the Qt version 
> (5.4.0). That's just the package with the mingw installer from qt.io.
> 
> I'm actually not sure if it's worth it either. From a readability point 
> of view it's still quite readable. But i don't know how much difference this 
> makes in real world 100.000 items sized folder (the benchmark for this RR). 
> Not much i think.
> 
> Milian Wolff wrote:
> https://paste.kde.org/pbdsf39z5
> 
> Milian Wolff wrote:
> so, tl;dr; -> don't use resize() to get speed, it's not worth it, just 
> makes the code more complicated and error-prone.
> 
> Mark Gaiser wrote:
> I'm either completely missing it, but i think your paste only shows the 
> if statement in various perf results. I don't see the else statement in a 
> perf report.
> 
> Milian Wolff wrote:
> True :) https://paste.kde.org/pqx94eqir <-- there is up to ~2x 
> difference, I still don't think its a good idea to use this always. I bet in 
> the noise of the actual work, it will drown. Making the code more complex 
> won't pay off then.

which brings me to another thing: if you really need that kind of performance, 
then don't use Qt containers, but `std::vector` e.g. It's not implicitly shared 
and can thus be much faster. https://paste.kde.org/pitgvakda

Anyhow, I still think that nothing in that regard should be changed.


- Milian


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71736
---


On Dec. 9, 2014, 10:44 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dec. 9, 2014, 10:44 p.m.)
> 
> 
> Review request for KD

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-10 Thread Milian Wolff


On Dec. 10, 2014, 7:40 p.m., Frank Reininghaus wrote:
> > As for all the appends that you do on a QVector, you might be able to speed 
> > them up significantly if you know the size (which you do).
> > The reason for that potential speedup is append doing some stuff to 
> > determine if it can append or if it needs to grow the vector first (amongst 
> > things). That works very well when you don't know the size, but you don't 
> > need that "overhead" if you do know the size in advance.
> > 
> > My benchmark code is here: https://paste.kde.org/pxly3rryg (1 year lifetime)
> > 
> > The results i get (on - cough - windows, but with mingw. My qt was broken 
> > under linux):
> > "Append took: 278 ms"
> > "More complicated append took: 92 ms"
> > 
> > The above timing are just the insertion. The reserve() and resize() call 
> > are not in them. However, if i run the same benchmark with those calls 
> > included then the timings are as follows:
> > "Append took: 278 ms"
> > "More complicated append took: 101 ms"
> > 
> > The regular append(...) call took 278 ms for 10 million items.
> > The array assignment operator took merely 92 till 101 ms (~3x faster then 
> > append!). That is including a bounds check and using append if you go over 
> > your bounds.
> > I think you can only win speed with this in the avarage usecase.
> > 
> > Note: for this optimization you have to use "resize" on the QVector, not 
> > "reserve".
> > 
> > Disclaimer: i know this is most deffinately a micro optimization, but it's 
> > fun to play with options and see if you can pull out more.
> 
> Milian Wolff wrote:
> your benchmark is flawed: your second loop also contains branches, which 
> the former does not. Both iterate over the same size, no? So you just want to 
> compare resize + operator[] vs. reserve + append? Yes, the former will be 
> faster, but afaik only slightly. Please write a proper benchmark with 
> QBENCHMARK and use `-perf -perfcounter instr:k`, instead of a bogus QTimer. 
> Also, you sure this was compiled with optimizations enabled?
> 
> Furthermore, this is also a micro-optimization as the cost of append vs. 
> operator[] is negleglible compared to the cost of the allocations and the 
> rest of whats going on here. I really don't think this is worth it.
> 
> Mark Gaiser wrote:
> My second loop intentionally contains branches since that's how it will 
> be used in reality. The reason for that is the udsentry being different in 
> size depending on the settings you pass to it. 8 fields is the common 
> usecase. The first loop doesn't contain branches - directly - but does do so 
> indirectly within QVector.
> 
> "So you just want to compare resize + operator[] vs. reserve + append?"
> Yes.
> 
> Perf... :) I know. I disagree that this timer is bogus. QElapsedTimer is 
> good enough to get a rough view if something is faster then something else. 
> Perf helps when fine tuning it. Yes, this was compiled with optimizations in 
> release mode. At least my project was. I don't know for the Qt version 
> (5.4.0). That's just the package with the mingw installer from qt.io.
> 
> I'm actually not sure if it's worth it either. From a readability point 
> of view it's still quite readable. But i don't know how much difference this 
> makes in real world 100.000 items sized folder (the benchmark for this RR). 
> Not much i think.
> 
> Milian Wolff wrote:
> https://paste.kde.org/pbdsf39z5
> 
> Milian Wolff wrote:
> so, tl;dr; -> don't use resize() to get speed, it's not worth it, just 
> makes the code more complicated and error-prone.
> 
> Mark Gaiser wrote:
> I'm either completely missing it, but i think your paste only shows the 
> if statement in various perf results. I don't see the else statement in a 
> perf report.

True :) https://paste.kde.org/pqx94eqir <-- there is up to ~2x difference, I 
still don't think its a good idea to use this always. I bet in the noise of the 
actual work, it will drown. Making the code more complex won't pay off then.


- Milian


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71736
---


On Dec. 9, 2014, 10:44 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dec. 9, 2014, 10:44 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/11359

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-10 Thread Mark Gaiser


On Dec. 10, 2014, 7:40 p.m., Frank Reininghaus wrote:
> > As for all the appends that you do on a QVector, you might be able to speed 
> > them up significantly if you know the size (which you do).
> > The reason for that potential speedup is append doing some stuff to 
> > determine if it can append or if it needs to grow the vector first (amongst 
> > things). That works very well when you don't know the size, but you don't 
> > need that "overhead" if you do know the size in advance.
> > 
> > My benchmark code is here: https://paste.kde.org/pxly3rryg (1 year lifetime)
> > 
> > The results i get (on - cough - windows, but with mingw. My qt was broken 
> > under linux):
> > "Append took: 278 ms"
> > "More complicated append took: 92 ms"
> > 
> > The above timing are just the insertion. The reserve() and resize() call 
> > are not in them. However, if i run the same benchmark with those calls 
> > included then the timings are as follows:
> > "Append took: 278 ms"
> > "More complicated append took: 101 ms"
> > 
> > The regular append(...) call took 278 ms for 10 million items.
> > The array assignment operator took merely 92 till 101 ms (~3x faster then 
> > append!). That is including a bounds check and using append if you go over 
> > your bounds.
> > I think you can only win speed with this in the avarage usecase.
> > 
> > Note: for this optimization you have to use "resize" on the QVector, not 
> > "reserve".
> > 
> > Disclaimer: i know this is most deffinately a micro optimization, but it's 
> > fun to play with options and see if you can pull out more.
> 
> Milian Wolff wrote:
> your benchmark is flawed: your second loop also contains branches, which 
> the former does not. Both iterate over the same size, no? So you just want to 
> compare resize + operator[] vs. reserve + append? Yes, the former will be 
> faster, but afaik only slightly. Please write a proper benchmark with 
> QBENCHMARK and use `-perf -perfcounter instr:k`, instead of a bogus QTimer. 
> Also, you sure this was compiled with optimizations enabled?
> 
> Furthermore, this is also a micro-optimization as the cost of append vs. 
> operator[] is negleglible compared to the cost of the allocations and the 
> rest of whats going on here. I really don't think this is worth it.
> 
> Mark Gaiser wrote:
> My second loop intentionally contains branches since that's how it will 
> be used in reality. The reason for that is the udsentry being different in 
> size depending on the settings you pass to it. 8 fields is the common 
> usecase. The first loop doesn't contain branches - directly - but does do so 
> indirectly within QVector.
> 
> "So you just want to compare resize + operator[] vs. reserve + append?"
> Yes.
> 
> Perf... :) I know. I disagree that this timer is bogus. QElapsedTimer is 
> good enough to get a rough view if something is faster then something else. 
> Perf helps when fine tuning it. Yes, this was compiled with optimizations in 
> release mode. At least my project was. I don't know for the Qt version 
> (5.4.0). That's just the package with the mingw installer from qt.io.
> 
> I'm actually not sure if it's worth it either. From a readability point 
> of view it's still quite readable. But i don't know how much difference this 
> makes in real world 100.000 items sized folder (the benchmark for this RR). 
> Not much i think.
> 
> Milian Wolff wrote:
> https://paste.kde.org/pbdsf39z5
> 
> Milian Wolff wrote:
> so, tl;dr; -> don't use resize() to get speed, it's not worth it, just 
> makes the code more complicated and error-prone.

I'm either completely missing it, but i think your paste only shows the if 
statement in various perf results. I don't see the else statement in a perf 
report.


- Mark


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71736
---


On Dec. 9, 2014, 10:44 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dec. 9, 2014, 10:44 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some k

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-10 Thread Milian Wolff


On Dec. 10, 2014, 7:40 p.m., Frank Reininghaus wrote:
> > As for all the appends that you do on a QVector, you might be able to speed 
> > them up significantly if you know the size (which you do).
> > The reason for that potential speedup is append doing some stuff to 
> > determine if it can append or if it needs to grow the vector first (amongst 
> > things). That works very well when you don't know the size, but you don't 
> > need that "overhead" if you do know the size in advance.
> > 
> > My benchmark code is here: https://paste.kde.org/pxly3rryg (1 year lifetime)
> > 
> > The results i get (on - cough - windows, but with mingw. My qt was broken 
> > under linux):
> > "Append took: 278 ms"
> > "More complicated append took: 92 ms"
> > 
> > The above timing are just the insertion. The reserve() and resize() call 
> > are not in them. However, if i run the same benchmark with those calls 
> > included then the timings are as follows:
> > "Append took: 278 ms"
> > "More complicated append took: 101 ms"
> > 
> > The regular append(...) call took 278 ms for 10 million items.
> > The array assignment operator took merely 92 till 101 ms (~3x faster then 
> > append!). That is including a bounds check and using append if you go over 
> > your bounds.
> > I think you can only win speed with this in the avarage usecase.
> > 
> > Note: for this optimization you have to use "resize" on the QVector, not 
> > "reserve".
> > 
> > Disclaimer: i know this is most deffinately a micro optimization, but it's 
> > fun to play with options and see if you can pull out more.
> 
> Milian Wolff wrote:
> your benchmark is flawed: your second loop also contains branches, which 
> the former does not. Both iterate over the same size, no? So you just want to 
> compare resize + operator[] vs. reserve + append? Yes, the former will be 
> faster, but afaik only slightly. Please write a proper benchmark with 
> QBENCHMARK and use `-perf -perfcounter instr:k`, instead of a bogus QTimer. 
> Also, you sure this was compiled with optimizations enabled?
> 
> Furthermore, this is also a micro-optimization as the cost of append vs. 
> operator[] is negleglible compared to the cost of the allocations and the 
> rest of whats going on here. I really don't think this is worth it.
> 
> Mark Gaiser wrote:
> My second loop intentionally contains branches since that's how it will 
> be used in reality. The reason for that is the udsentry being different in 
> size depending on the settings you pass to it. 8 fields is the common 
> usecase. The first loop doesn't contain branches - directly - but does do so 
> indirectly within QVector.
> 
> "So you just want to compare resize + operator[] vs. reserve + append?"
> Yes.
> 
> Perf... :) I know. I disagree that this timer is bogus. QElapsedTimer is 
> good enough to get a rough view if something is faster then something else. 
> Perf helps when fine tuning it. Yes, this was compiled with optimizations in 
> release mode. At least my project was. I don't know for the Qt version 
> (5.4.0). That's just the package with the mingw installer from qt.io.
> 
> I'm actually not sure if it's worth it either. From a readability point 
> of view it's still quite readable. But i don't know how much difference this 
> makes in real world 100.000 items sized folder (the benchmark for this RR). 
> Not much i think.
> 
> Milian Wolff wrote:
> https://paste.kde.org/pbdsf39z5

so, tl;dr; -> don't use resize() to get speed, it's not worth it, just makes 
the code more complicated and error-prone.


- Milian


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71736
---


On Dec. 9, 2014, 10:44 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dec. 9, 2014, 10:44 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. Thi

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-10 Thread Milian Wolff


On Dec. 10, 2014, 7:40 p.m., Frank Reininghaus wrote:
> > As for all the appends that you do on a QVector, you might be able to speed 
> > them up significantly if you know the size (which you do).
> > The reason for that potential speedup is append doing some stuff to 
> > determine if it can append or if it needs to grow the vector first (amongst 
> > things). That works very well when you don't know the size, but you don't 
> > need that "overhead" if you do know the size in advance.
> > 
> > My benchmark code is here: https://paste.kde.org/pxly3rryg (1 year lifetime)
> > 
> > The results i get (on - cough - windows, but with mingw. My qt was broken 
> > under linux):
> > "Append took: 278 ms"
> > "More complicated append took: 92 ms"
> > 
> > The above timing are just the insertion. The reserve() and resize() call 
> > are not in them. However, if i run the same benchmark with those calls 
> > included then the timings are as follows:
> > "Append took: 278 ms"
> > "More complicated append took: 101 ms"
> > 
> > The regular append(...) call took 278 ms for 10 million items.
> > The array assignment operator took merely 92 till 101 ms (~3x faster then 
> > append!). That is including a bounds check and using append if you go over 
> > your bounds.
> > I think you can only win speed with this in the avarage usecase.
> > 
> > Note: for this optimization you have to use "resize" on the QVector, not 
> > "reserve".
> > 
> > Disclaimer: i know this is most deffinately a micro optimization, but it's 
> > fun to play with options and see if you can pull out more.
> 
> Milian Wolff wrote:
> your benchmark is flawed: your second loop also contains branches, which 
> the former does not. Both iterate over the same size, no? So you just want to 
> compare resize + operator[] vs. reserve + append? Yes, the former will be 
> faster, but afaik only slightly. Please write a proper benchmark with 
> QBENCHMARK and use `-perf -perfcounter instr:k`, instead of a bogus QTimer. 
> Also, you sure this was compiled with optimizations enabled?
> 
> Furthermore, this is also a micro-optimization as the cost of append vs. 
> operator[] is negleglible compared to the cost of the allocations and the 
> rest of whats going on here. I really don't think this is worth it.
> 
> Mark Gaiser wrote:
> My second loop intentionally contains branches since that's how it will 
> be used in reality. The reason for that is the udsentry being different in 
> size depending on the settings you pass to it. 8 fields is the common 
> usecase. The first loop doesn't contain branches - directly - but does do so 
> indirectly within QVector.
> 
> "So you just want to compare resize + operator[] vs. reserve + append?"
> Yes.
> 
> Perf... :) I know. I disagree that this timer is bogus. QElapsedTimer is 
> good enough to get a rough view if something is faster then something else. 
> Perf helps when fine tuning it. Yes, this was compiled with optimizations in 
> release mode. At least my project was. I don't know for the Qt version 
> (5.4.0). That's just the package with the mingw installer from qt.io.
> 
> I'm actually not sure if it's worth it either. From a readability point 
> of view it's still quite readable. But i don't know how much difference this 
> makes in real world 100.000 items sized folder (the benchmark for this RR). 
> Not much i think.

https://paste.kde.org/pbdsf39z5


- Milian


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71736
---


On Dec. 9, 2014, 10:44 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dec. 9, 2014, 10:44 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for eac

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-10 Thread Milian Wolff

---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71763
---


LGTM, some nitpicks. And of course we need the results of the benchmark in 
release mode.


src/core/udsentry.cpp


share this code with above, i.e. add something like the following, which 
could be called here and above:

d->insert(field, UDSEntryPrivate::Field(value));



src/core/udsentry.cpp


is this called often? if so, prefer to use a QList for the udsIndexes, to 
prevent the costly conversion here.

QList and QVector of uint are "nearly" the same. QVector just has a much 
nicer API, and stuff like resize, which QList is lacking. For this use-case, I 
think it should be OK though.

Maybe add a TODO kf6 to change this to QVector?



src/core/udsentry.cpp


btw @Mark: if you really want to micro-optimize this, then you could use 
iterators instead. But I really don't think its worth it. `at()` and 
`operator[]` are "equal" on const containers. on non-const containers, 
`operator[]` could lead to detaching though. I'd be suprised if it is ever 
going to be faster than `at()`.


- Milian Wolff


On Dec. 9, 2014, 10:44 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dec. 9, 2014, 10:44 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested on the 
> frameworks mailing list by Aaron (good idea BTW!). 
> 
> 
> Diffs
> -
> 
>   autotests/udsentry_benchmark.cpp b5fa8d6 
>   src/core/udsentry.h 98a7035 
>   src/core/udsentry.cpp b806e0e 
>   src/ioslaves/file/file.cpp 1a2a767 
>   tests/udsentrybenchmark.cpp d9a118c 
> 
> Diff: https://git.reviewboard.kde.org/r/118452/diff/
> 
> 
> Testing
> ---
> 
> Unit tests still pass.
> 
> The memory usage of listjobtest with a directory with 100,000 files is 
> reduced from 71344 K to 35392 K according to KSysGuard. I see similar savings 
> when opening the directory in Dolphin.
> 
> I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
> cannot present any benchmark results.
> 
> 
> File Attachments
> 
> 
> Benchmark results
>   
> https://git.reviewboard.kde.org/media/uploaded/files/2014/12/09/038e443c-78eb-443b-b33a-b451b28d1

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-10 Thread Mark Gaiser


On Dec. 10, 2014, 7:40 p.m., Frank Reininghaus wrote:
> > As for all the appends that you do on a QVector, you might be able to speed 
> > them up significantly if you know the size (which you do).
> > The reason for that potential speedup is append doing some stuff to 
> > determine if it can append or if it needs to grow the vector first (amongst 
> > things). That works very well when you don't know the size, but you don't 
> > need that "overhead" if you do know the size in advance.
> > 
> > My benchmark code is here: https://paste.kde.org/pxly3rryg (1 year lifetime)
> > 
> > The results i get (on - cough - windows, but with mingw. My qt was broken 
> > under linux):
> > "Append took: 278 ms"
> > "More complicated append took: 92 ms"
> > 
> > The above timing are just the insertion. The reserve() and resize() call 
> > are not in them. However, if i run the same benchmark with those calls 
> > included then the timings are as follows:
> > "Append took: 278 ms"
> > "More complicated append took: 101 ms"
> > 
> > The regular append(...) call took 278 ms for 10 million items.
> > The array assignment operator took merely 92 till 101 ms (~3x faster then 
> > append!). That is including a bounds check and using append if you go over 
> > your bounds.
> > I think you can only win speed with this in the avarage usecase.
> > 
> > Note: for this optimization you have to use "resize" on the QVector, not 
> > "reserve".
> > 
> > Disclaimer: i know this is most deffinately a micro optimization, but it's 
> > fun to play with options and see if you can pull out more.
> 
> Milian Wolff wrote:
> your benchmark is flawed: your second loop also contains branches, which 
> the former does not. Both iterate over the same size, no? So you just want to 
> compare resize + operator[] vs. reserve + append? Yes, the former will be 
> faster, but afaik only slightly. Please write a proper benchmark with 
> QBENCHMARK and use `-perf -perfcounter instr:k`, instead of a bogus QTimer. 
> Also, you sure this was compiled with optimizations enabled?
> 
> Furthermore, this is also a micro-optimization as the cost of append vs. 
> operator[] is negleglible compared to the cost of the allocations and the 
> rest of whats going on here. I really don't think this is worth it.

My second loop intentionally contains branches since that's how it will be used 
in reality. The reason for that is the udsentry being different in size 
depending on the settings you pass to it. 8 fields is the common usecase. The 
first loop doesn't contain branches - directly - but does do so indirectly 
within QVector.

"So you just want to compare resize + operator[] vs. reserve + append?"
Yes.

Perf... :) I know. I disagree that this timer is bogus. QElapsedTimer is good 
enough to get a rough view if something is faster then something else. Perf 
helps when fine tuning it. Yes, this was compiled with optimizations in release 
mode. At least my project was. I don't know for the Qt version (5.4.0). That's 
just the package with the mingw installer from qt.io.

I'm actually not sure if it's worth it either. From a readability point of view 
it's still quite readable. But i don't know how much difference this makes in 
real world 100.000 items sized folder (the benchmark for this RR). Not much i 
think.


- Mark


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71736
---


On Dec. 9, 2014, 10:44 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dec. 9, 2014, 10:44 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-10 Thread Milian Wolff


On Dec. 10, 2014, 7:40 p.m., Frank Reininghaus wrote:
> > As for all the appends that you do on a QVector, you might be able to speed 
> > them up significantly if you know the size (which you do).
> > The reason for that potential speedup is append doing some stuff to 
> > determine if it can append or if it needs to grow the vector first (amongst 
> > things). That works very well when you don't know the size, but you don't 
> > need that "overhead" if you do know the size in advance.
> > 
> > My benchmark code is here: https://paste.kde.org/pxly3rryg (1 year lifetime)
> > 
> > The results i get (on - cough - windows, but with mingw. My qt was broken 
> > under linux):
> > "Append took: 278 ms"
> > "More complicated append took: 92 ms"
> > 
> > The above timing are just the insertion. The reserve() and resize() call 
> > are not in them. However, if i run the same benchmark with those calls 
> > included then the timings are as follows:
> > "Append took: 278 ms"
> > "More complicated append took: 101 ms"
> > 
> > The regular append(...) call took 278 ms for 10 million items.
> > The array assignment operator took merely 92 till 101 ms (~3x faster then 
> > append!). That is including a bounds check and using append if you go over 
> > your bounds.
> > I think you can only win speed with this in the avarage usecase.
> > 
> > Note: for this optimization you have to use "resize" on the QVector, not 
> > "reserve".
> > 
> > Disclaimer: i know this is most deffinately a micro optimization, but it's 
> > fun to play with options and see if you can pull out more.

your benchmark is flawed: your second loop also contains branches, which the 
former does not. Both iterate over the same size, no? So you just want to 
compare resize + operator[] vs. reserve + append? Yes, the former will be 
faster, but afaik only slightly. Please write a proper benchmark with 
QBENCHMARK and use `-perf -perfcounter instr:k`, instead of a bogus QTimer. 
Also, you sure this was compiled with optimizations enabled?

Furthermore, this is also a micro-optimization as the cost of append vs. 
operator[] is negleglible compared to the cost of the allocations and the rest 
of whats going on here. I really don't think this is worth it.


- Milian


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71736
---


On Dec. 9, 2014, 10:44 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dec. 9, 2014, 10:44 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can b

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-10 Thread Milian Wolff


> On Dec. 10, 2014, 7:40 p.m., Mark Gaiser wrote:
> > src/core/udsentry.cpp, line 99
> > 
> >
> > micro optimization, but d->fields[index] is - in my experience - 
> > slightly faster.

nope, that is not true.


> On Dec. 10, 2014, 7:40 p.m., Mark Gaiser wrote:
> > src/core/udsentry.cpp, line 109
> > 
> >
> > micro optimization, but d->fields[index] is - in my experience - 
> > slightly faster.

nope, that is not true.


- Milian


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71736
---


On Dec. 9, 2014, 10:44 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated Dec. 9, 2014, 10:44 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested on the 
> frameworks mailing list by Aaron (good idea BTW!). 
> 
> 
> Diffs
> -
> 
>   autotests/udsentry_benchmark.cpp b5fa8d6 
>   src/core/udsentry.h 98a7035 
>   src/core/udsentry.cpp b806e0e 
>   src/ioslaves/file/file.cpp 1a2a767 
>   tests/udsentrybenchmark.cpp d9a118c 
> 
> Diff: https://git.reviewboard.kde.org/r/118452/diff/
> 
> 
> Testing
> ---
> 
> Unit tests still pass.
> 
> The memory usage of listjobtest with a directory with 100,000 files is 
> reduced from 71344 K to 35392 K according to KSysGuard. I see similar savings 
> when opening the directory in Dolphin.
> 
> I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
> cannot present any benchmark results.
> 
> 
> File Attachments
> 
> 
> Benchmark results
>   
> https://git.reviewboard.kde.org/media/uploaded/files/2014/12/09/038e443c-78eb-443b-b33a-b451b28d10ea__UDSEntry-benchmarks.png
> 
> 
> Thanks,
> 
> Frank Reininghaus
> 
>

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel


Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-10 Thread Mark Gaiser

---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review71736
---



src/core/udsentry.cpp


micro optimization, but d->fields[index] is - in my experience - slightly 
faster.



src/core/udsentry.cpp


micro optimization, but d->fields[index] is - in my experience - slightly 
faster.


As for all the appends that you do on a QVector, you might be able to speed 
them up significantly if you know the size (which you do).
The reason for that potential speedup is append doing some stuff to determine 
if it can append or if it needs to grow the vector first (amongst things). That 
works very well when you don't know the size, but you don't need that 
"overhead" if you do know the size in advance.

My benchmark code is here: https://paste.kde.org/pxly3rryg (1 year lifetime)

The results i get (on - cough - windows, but with mingw. My qt was broken under 
linux):
"Append took: 278 ms"
"More complicated append took: 92 ms"

The above timing are just the insertion. The reserve() and resize() call are 
not in them. However, if i run the same benchmark with those calls included 
then the timings are as follows:
"Append took: 278 ms"
"More complicated append took: 101 ms"

The regular append(...) call took 278 ms for 10 million items.
The array assignment operator took merely 92 till 101 ms (~3x faster then 
append!). That is including a bounds check and using append if you go over your 
bounds.
I think you can only win speed with this in the avarage usecase.

Note: for this optimization you have to use "resize" on the QVector, not 
"reserve".

Disclaimer: i know this is most deffinately a micro optimization, but it's fun 
to play with options and see if you can pull out more.

- Mark Gaiser


On dec 9, 2014, 10:44 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated dec 9, 2014, 10:44 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested on the 
> frameworks mailing list by Aaron (good idea BTW!). 
> 
> 
> Diffs
> -
> 
>   autotests/udsentry_benchmark.cpp b5fa8d6 
>   src/core/udsentry.h 98a7035 
>   src/core/udsentry.cpp b806e0e 
>   src/ioslaves/file/file.cpp 1a2a767 
>   tests/udsentrybenchmark.cpp d9a118c 
> 
> Diff: https://git.reviewboard.kde.org/r/118452/diff/
> 
> 

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-12-09 Thread Frank Reininghaus

---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/
---

(Updated Dez. 9, 2014, 10:44 nachm.)


Review request for KDE Frameworks and David Faure.


Changes
---

Sorry for the extremely long delay!

I did run some bechmarks now on my system (Opensuse 13.2, GCC 4.8.3, Qt 5.3.2 
and Frameworks built in release mode, Core i7-4700MQ processor). I compared

a) the current state of the master branch,
a) rev. 1 of this RR, which still applies to current master, and
b) rev. 2, which offers the possibility to reserve memory for the internal 
QVectors in advance, and makes use of this in the benchmarks and in the file 
kioslave.

Funnily enough, the file kioslave already had the reserve(8) call, it was just 
commented out :-)

I also modified David's comparison (autotests/udsentry_benchmark) - unlike rev. 
1, the current version of this RR is faster than all competitors on the slave 
side.

Moreover, I ran the following benchmarks, took the best result of 10 runs, and 
attached an image which summarizes the results:

a) The benchmarks in tests/udsentrybenchmark, and

b) tests/listjobtest, which is a (hopefully) realistic simulation of what 
happens if an app asks KIO to list a directory. To reduce I/O-related effects, 
I did it this way:

mkdir /dev/shm/test
cd /dev/shm/test/
touch  {1..10}.txt

and then, in the KIO build directory, ran

tests/listjobtest /dev/shm/test/

As you can see, memory consumption is reduced by 50%. Note that I measured with 
KSysGuard, rather than Valgrind or Milians new (and very cool!) "heaptrack" 
tool, because I think that the latter only measure the net heap memory 
consumption, and exclude any overhead that the memory allocator adds (please 
correct me if I'm wrong), such that they disregard the considerable overhead 
that many small hash nodes, which are used by the master version of UDSEntry, 
add.

Concerning the speed, rev. 2 of this RR is always either just as fast as 
master, or a bit faster. The hopefully realistic listjobtest runs in 8% less 
time now.

Sorry again for not continuing to work on this earlier, and thanks to everyone 
who provided feedback and help here!


Repository: kio


Description (updated)
---

I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
attempts to make UDSEntry more efficient memory and CPU-wise, into independent 
parts. This is the third step after 
https://git.reviewboard.kde.org/r/113591/ and 
https://git.reviewboard.kde.org/r/115739/ .

The present patch modifies the internal data storage of UDSEntry. UDSEntry 
contains a mapping from unsigned int keys to "Field" values, where Field is a 
struct that contains a QString and a long long (some keys correspond to numeric 
values, like size, date, etc, whereas others, like user and group, correspond 
to a QString).

Currently, UDSEntry stores all data in a QHash internally. This 
ensures that everything can be accessed in O(1) time, but is not very efficient 
memory-wise because a separate memory allocation is done for each hash node.

I propose to change this and store both the uint keys and the Field values in a 
QVector each. This means that accessing a value becomes a O(n) operation, since 
the entire QVector of keys may need to be scanned in order to find a value, but 
because the number n of values in a UDSEntry is usually quite small and can 
currently not exceed a number ~100, this should not matter in practice.

Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:

(a) The QVectors which store the keys (which are usually the same for all items 
in a directory) are not shared yet. Changing this would reduce the memory usage 
further, but I decided to keep this change separate in order to keep the 
current patch small and easy to understand. Moreover, this makes it easier to 
benchmark other similar approaches (e.g., replace QVector by std::vector, or 
store keys and values together in a std::vector>).

(b) No space is reserved in the vectors when key/value pairs are inserted one 
by one. Implementing this would make UDSEntry faster on the slave side (since 
repeated re-allocations would not be necessary any more), but this can be done 
in a later patch. Moreover, it might not be needed any more if UDSEntry is not 
used directly any more on the slave side, as suggested on the frameworks 
mailing list by Aaron (good idea BTW!). 


Diffs (updated)
-

  autotests/udsentry_benchmark.cpp b5fa8d6 
  src/core/udsentry.h 98a7035 
  src/core/udsentry.cpp b806e0e 
  src/ioslaves/file/file.cpp 1a2a767 
  tests/udsentrybenchmark.cpp d9a118c 

Diff: https://git.reviewboard.kde.org/r/118452/diff/


Testing (updated)
---

Unit tests still pass.

The memory usage of listjobtest with a directory with 100,000 files is reduced 
from 71344 K to 35392 K according to KSysGuard. I s

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-06-12 Thread Frank Reininghaus


> On June 6, 2014, 10:34 p.m., David Faure wrote:
> > Interestingly, I had a benchmark to compare a number of data structures for 
> > UDSEntry (which made me turn the Qt3 QList into the Qt4 QHash). That 
> > benchmark is the commented-out "newApiPerformance()" method in jobtest.cpp.
> > 
> > I just extended it to include your suggestion: 
> > http://www.davidfaure.fr/2014/udsentry_benchmark.diff
> > 
> > Here are the results I get (in debug mode, because otherwise I'd have to 
> > make sure the compiler doesn't optimize away the useless iterations loops 
> > :))
> > 
> > Old API: slave code: 5
> > Old API: app code: 5
> > QHash+QVariant API: slave code: 9
> > QHash+QVariant API: app code: 4
> > QHash+struct API: slave code: 7
> > QHash+struct API: app code: 3
> > QMap+struct API: slave code: 8
> > QMap+struct API: app code: 4
> > Frank's API: slave code: 13
> > Frank API: app code: 6
> > 
> > So, unless I'm missing something, this is much slower for the slave (but we 
> > could decide to just not use UDSEntry in the slave, as discussed on k-f-d), 
> > but it's also slower on the app side
> > 
> > Can you check if I did something stupid when grabbing your code or writing 
> > the testcode for it?
> > Otherwise the next step is to actually run this with -O2, ensuring the 
> > loops still loop.
> 
> Milian Wolff wrote:
> I'd say a benchmark should use QBENCHMARK. Also, I'm afraid the results 
> you posted are meaningless when they come from a non-optimized, esp. 
> considering that the differences are so small. I suggest someone picks this 
> test up and makes it a proper benchmark. To prevent the optimizer from 
> kicking in, actually use the results in one way or another. E.g. calculate 
> the sum of string lengths returned and number values for the read operation. 
> And of course also compare the sum to some expected value.
> 
> David Faure wrote:
> Yeah - this code is very old, it predates Q_BENCHMARK :-)
> 
> "the differences are small" - not really, these are entire seconds, due 
> to large amount of iterations.
> 
> But yep, let me convert all this to a proper benchmark. Coming up.
> 
> David Faure wrote:
> Done, see kio/autotests/udsentry_benchmark.
> 
> Results (in release mode, with ./kiocore-udsentry_benchmark -minimumvalue 
> 1000)
> 
> KDE3:  slave=1123 app=1411
> Hash+Variant:  slave=1856 app=1605
> Hash+Struct:   slave=1051 app=1418  // the kde4 solution
> Map+Struct:slave=1418 app=1427
> TwoVectors:slave=1204 app=1390  // Frank's suggestion in this RR.
> 
> Not bad :-)
> 
> Again, please double-check the test and the results
> 
> David Faure wrote:
> Pure CPU results with -callgrind :
> 
> KDE3: slave=10284 app=14032
> Hash+Variant: slave=14446 app=14456
> Hash+Struct:  slave=9363  app=13973 // kde4
> Map+Struct:   slave=11262 app=13857
> TwoVectors:   slave=11199 app=13854 // this RR
> 
> Looks somewhat similar: this solution is good for the app side, but on 
> the slave side we can do better (especially if we use a different data 
> structure there, with no lookup capabilities).
> 
> Mark Gaiser wrote:
> So.. This sucks. I was expecting to test over and over again with 
> "RelWithDebInfo" but apparently i uncovered a bug in kdesrc-build that 
> doesn't honor that anymore and just compiles everything in debug. Oh well, 
> reported on the mailinglist and re-compiled with RelWithDebInfo :)
> 
> The numbers below are from the test results from the part: "0.002412 
> msecs per iteration (total: 1,265, iterations: 524288)" where i took the last 
> 4 numbers of the "msecs per iteration". I'm a bit surprised that i'm not 
> beating David's numbers :)
> 
> Here are my results (with RelWithDebInfo obviously):
> KDE3: slave=1746 app=2264
> Hash+Variant: slave=2794 app=2725
> Hash+Struct:  slave=1738 app=2470 // kde4
> Map+Struct:   slave=1843 app=2260
> TwoVectors:   slave=2023 app=2323 // this RR
> TwoVectorsMark:   slave=1602 app=2412 // see below. Raw numbers: 
> http://p.sc2.nl/pkbuewd2b Code: http://p.sc2.nl/pqlnisylt (1 month lifetime 
> for both)
> 
> So what did i do in "TwoVectorsMark" (internally called FrankUDSEntryFind 
> because i was first only adding std::find). Well, i was guessing the indexAt 
> to be the slowest in the current review request. And from tons of other 
> benchmark experience i know that raw C++ containers almost always beat Qt 
> containers in terms of speed (Qt usually is a bit more memory friendly when 
> it comes to containers). So i simply changed QVector to std::vector along 
> with std::find for finding the index. It seems to be doing rather well!
> 
> Regarding the timing. I'm not going to claim that it's slower or faster 
> then the current RR. The timings are different for me in every run (yes, even 
> with -minimumvalue 1000). So

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-06-07 Thread Mark Gaiser


> On June 6, 2014, 10:34 p.m., David Faure wrote:
> > Interestingly, I had a benchmark to compare a number of data structures for 
> > UDSEntry (which made me turn the Qt3 QList into the Qt4 QHash). That 
> > benchmark is the commented-out "newApiPerformance()" method in jobtest.cpp.
> > 
> > I just extended it to include your suggestion: 
> > http://www.davidfaure.fr/2014/udsentry_benchmark.diff
> > 
> > Here are the results I get (in debug mode, because otherwise I'd have to 
> > make sure the compiler doesn't optimize away the useless iterations loops 
> > :))
> > 
> > Old API: slave code: 5
> > Old API: app code: 5
> > QHash+QVariant API: slave code: 9
> > QHash+QVariant API: app code: 4
> > QHash+struct API: slave code: 7
> > QHash+struct API: app code: 3
> > QMap+struct API: slave code: 8
> > QMap+struct API: app code: 4
> > Frank's API: slave code: 13
> > Frank API: app code: 6
> > 
> > So, unless I'm missing something, this is much slower for the slave (but we 
> > could decide to just not use UDSEntry in the slave, as discussed on k-f-d), 
> > but it's also slower on the app side
> > 
> > Can you check if I did something stupid when grabbing your code or writing 
> > the testcode for it?
> > Otherwise the next step is to actually run this with -O2, ensuring the 
> > loops still loop.
> 
> Milian Wolff wrote:
> I'd say a benchmark should use QBENCHMARK. Also, I'm afraid the results 
> you posted are meaningless when they come from a non-optimized, esp. 
> considering that the differences are so small. I suggest someone picks this 
> test up and makes it a proper benchmark. To prevent the optimizer from 
> kicking in, actually use the results in one way or another. E.g. calculate 
> the sum of string lengths returned and number values for the read operation. 
> And of course also compare the sum to some expected value.
> 
> David Faure wrote:
> Yeah - this code is very old, it predates Q_BENCHMARK :-)
> 
> "the differences are small" - not really, these are entire seconds, due 
> to large amount of iterations.
> 
> But yep, let me convert all this to a proper benchmark. Coming up.
> 
> David Faure wrote:
> Done, see kio/autotests/udsentry_benchmark.
> 
> Results (in release mode, with ./kiocore-udsentry_benchmark -minimumvalue 
> 1000)
> 
> KDE3:  slave=1123 app=1411
> Hash+Variant:  slave=1856 app=1605
> Hash+Struct:   slave=1051 app=1418  // the kde4 solution
> Map+Struct:slave=1418 app=1427
> TwoVectors:slave=1204 app=1390  // Frank's suggestion in this RR.
> 
> Not bad :-)
> 
> Again, please double-check the test and the results
> 
> David Faure wrote:
> Pure CPU results with -callgrind :
> 
> KDE3: slave=10284 app=14032
> Hash+Variant: slave=14446 app=14456
> Hash+Struct:  slave=9363  app=13973 // kde4
> Map+Struct:   slave=11262 app=13857
> TwoVectors:   slave=11199 app=13854 // this RR
> 
> Looks somewhat similar: this solution is good for the app side, but on 
> the slave side we can do better (especially if we use a different data 
> structure there, with no lookup capabilities).

So.. This sucks. I was expecting to test over and over again with 
"RelWithDebInfo" but apparently i uncovered a bug in kdesrc-build that doesn't 
honor that anymore and just compiles everything in debug. Oh well, reported on 
the mailinglist and re-compiled with RelWithDebInfo :)

The numbers below are from the test results from the part: "0.002412 msecs per 
iteration (total: 1,265, iterations: 524288)" where i took the last 4 numbers 
of the "msecs per iteration". I'm a bit surprised that i'm not beating David's 
numbers :)

Here are my results (with RelWithDebInfo obviously):
KDE3: slave=1746 app=2264
Hash+Variant: slave=2794 app=2725
Hash+Struct:  slave=1738 app=2470 // kde4
Map+Struct:   slave=1843 app=2260
TwoVectors:   slave=2023 app=2323 // this RR
TwoVectorsMark:   slave=1602 app=2412 // see below. Raw numbers: 
http://p.sc2.nl/pkbuewd2b Code: http://p.sc2.nl/pqlnisylt (1 month lifetime for 
both)

So what did i do in "TwoVectorsMark" (internally called FrankUDSEntryFind 
because i was first only adding std::find). Well, i was guessing the indexAt to 
be the slowest in the current review request. And from tons of other benchmark 
experience i know that raw C++ containers almost always beat Qt containers in 
terms of speed (Qt usually is a bit more memory friendly when it comes to 
containers). So i simply changed QVector to std::vector along with std::find 
for finding the index. It seems to be doing rather well!

Regarding the timing. I'm not going to claim that it's slower or faster then 
the current RR. The timings are different for me in every run (yes, even with 
-minimumvalue 1000). Sometimes it's the fastest of the tested approaches, 
sometimes it's slightly slower then this RR. It's just different. And i have no 
clue if the 

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-06-07 Thread David Faure


> On June 6, 2014, 10:34 p.m., David Faure wrote:
> > Interestingly, I had a benchmark to compare a number of data structures for 
> > UDSEntry (which made me turn the Qt3 QList into the Qt4 QHash). That 
> > benchmark is the commented-out "newApiPerformance()" method in jobtest.cpp.
> > 
> > I just extended it to include your suggestion: 
> > http://www.davidfaure.fr/2014/udsentry_benchmark.diff
> > 
> > Here are the results I get (in debug mode, because otherwise I'd have to 
> > make sure the compiler doesn't optimize away the useless iterations loops 
> > :))
> > 
> > Old API: slave code: 5
> > Old API: app code: 5
> > QHash+QVariant API: slave code: 9
> > QHash+QVariant API: app code: 4
> > QHash+struct API: slave code: 7
> > QHash+struct API: app code: 3
> > QMap+struct API: slave code: 8
> > QMap+struct API: app code: 4
> > Frank's API: slave code: 13
> > Frank API: app code: 6
> > 
> > So, unless I'm missing something, this is much slower for the slave (but we 
> > could decide to just not use UDSEntry in the slave, as discussed on k-f-d), 
> > but it's also slower on the app side
> > 
> > Can you check if I did something stupid when grabbing your code or writing 
> > the testcode for it?
> > Otherwise the next step is to actually run this with -O2, ensuring the 
> > loops still loop.
> 
> Milian Wolff wrote:
> I'd say a benchmark should use QBENCHMARK. Also, I'm afraid the results 
> you posted are meaningless when they come from a non-optimized, esp. 
> considering that the differences are so small. I suggest someone picks this 
> test up and makes it a proper benchmark. To prevent the optimizer from 
> kicking in, actually use the results in one way or another. E.g. calculate 
> the sum of string lengths returned and number values for the read operation. 
> And of course also compare the sum to some expected value.
> 
> David Faure wrote:
> Yeah - this code is very old, it predates Q_BENCHMARK :-)
> 
> "the differences are small" - not really, these are entire seconds, due 
> to large amount of iterations.
> 
> But yep, let me convert all this to a proper benchmark. Coming up.
> 
> David Faure wrote:
> Done, see kio/autotests/udsentry_benchmark.
> 
> Results (in release mode, with ./kiocore-udsentry_benchmark -minimumvalue 
> 1000)
> 
> KDE3:  slave=1123 app=1411
> Hash+Variant:  slave=1856 app=1605
> Hash+Struct:   slave=1051 app=1418  // the kde4 solution
> Map+Struct:slave=1418 app=1427
> TwoVectors:slave=1204 app=1390  // Frank's suggestion in this RR.
> 
> Not bad :-)
> 
> Again, please double-check the test and the results

Pure CPU results with -callgrind :

KDE3: slave=10284 app=14032
Hash+Variant: slave=14446 app=14456
Hash+Struct:  slave=9363  app=13973 // kde4
Map+Struct:   slave=11262 app=13857
TwoVectors:   slave=11199 app=13854 // this RR

Looks somewhat similar: this solution is good for the app side, but on the 
slave side we can do better (especially if we use a different data structure 
there, with no lookup capabilities).


- David


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review59479
---


On June 1, 2014, 1:50 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated June 1, 2014, 1:50 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> 

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-06-07 Thread David Faure


> On June 6, 2014, 10:34 p.m., David Faure wrote:
> > Interestingly, I had a benchmark to compare a number of data structures for 
> > UDSEntry (which made me turn the Qt3 QList into the Qt4 QHash). That 
> > benchmark is the commented-out "newApiPerformance()" method in jobtest.cpp.
> > 
> > I just extended it to include your suggestion: 
> > http://www.davidfaure.fr/2014/udsentry_benchmark.diff
> > 
> > Here are the results I get (in debug mode, because otherwise I'd have to 
> > make sure the compiler doesn't optimize away the useless iterations loops 
> > :))
> > 
> > Old API: slave code: 5
> > Old API: app code: 5
> > QHash+QVariant API: slave code: 9
> > QHash+QVariant API: app code: 4
> > QHash+struct API: slave code: 7
> > QHash+struct API: app code: 3
> > QMap+struct API: slave code: 8
> > QMap+struct API: app code: 4
> > Frank's API: slave code: 13
> > Frank API: app code: 6
> > 
> > So, unless I'm missing something, this is much slower for the slave (but we 
> > could decide to just not use UDSEntry in the slave, as discussed on k-f-d), 
> > but it's also slower on the app side
> > 
> > Can you check if I did something stupid when grabbing your code or writing 
> > the testcode for it?
> > Otherwise the next step is to actually run this with -O2, ensuring the 
> > loops still loop.
> 
> Milian Wolff wrote:
> I'd say a benchmark should use QBENCHMARK. Also, I'm afraid the results 
> you posted are meaningless when they come from a non-optimized, esp. 
> considering that the differences are so small. I suggest someone picks this 
> test up and makes it a proper benchmark. To prevent the optimizer from 
> kicking in, actually use the results in one way or another. E.g. calculate 
> the sum of string lengths returned and number values for the read operation. 
> And of course also compare the sum to some expected value.
> 
> David Faure wrote:
> Yeah - this code is very old, it predates Q_BENCHMARK :-)
> 
> "the differences are small" - not really, these are entire seconds, due 
> to large amount of iterations.
> 
> But yep, let me convert all this to a proper benchmark. Coming up.

Done, see kio/autotests/udsentry_benchmark.

Results (in release mode, with ./kiocore-udsentry_benchmark -minimumvalue 1000)

KDE3:  slave=1123 app=1411
Hash+Variant:  slave=1856 app=1605
Hash+Struct:   slave=1051 app=1418  // the kde4 solution
Map+Struct:slave=1418 app=1427
TwoVectors:slave=1204 app=1390  // Frank's suggestion in this RR.

Not bad :-)

Again, please double-check the test and the results


- David


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review59479
---


On June 1, 2014, 1:50 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated June 1, 2014, 1:50 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches 

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-06-07 Thread David Faure


> On June 6, 2014, 10:34 p.m., David Faure wrote:
> > Interestingly, I had a benchmark to compare a number of data structures for 
> > UDSEntry (which made me turn the Qt3 QList into the Qt4 QHash). That 
> > benchmark is the commented-out "newApiPerformance()" method in jobtest.cpp.
> > 
> > I just extended it to include your suggestion: 
> > http://www.davidfaure.fr/2014/udsentry_benchmark.diff
> > 
> > Here are the results I get (in debug mode, because otherwise I'd have to 
> > make sure the compiler doesn't optimize away the useless iterations loops 
> > :))
> > 
> > Old API: slave code: 5
> > Old API: app code: 5
> > QHash+QVariant API: slave code: 9
> > QHash+QVariant API: app code: 4
> > QHash+struct API: slave code: 7
> > QHash+struct API: app code: 3
> > QMap+struct API: slave code: 8
> > QMap+struct API: app code: 4
> > Frank's API: slave code: 13
> > Frank API: app code: 6
> > 
> > So, unless I'm missing something, this is much slower for the slave (but we 
> > could decide to just not use UDSEntry in the slave, as discussed on k-f-d), 
> > but it's also slower on the app side
> > 
> > Can you check if I did something stupid when grabbing your code or writing 
> > the testcode for it?
> > Otherwise the next step is to actually run this with -O2, ensuring the 
> > loops still loop.
> 
> Milian Wolff wrote:
> I'd say a benchmark should use QBENCHMARK. Also, I'm afraid the results 
> you posted are meaningless when they come from a non-optimized, esp. 
> considering that the differences are so small. I suggest someone picks this 
> test up and makes it a proper benchmark. To prevent the optimizer from 
> kicking in, actually use the results in one way or another. E.g. calculate 
> the sum of string lengths returned and number values for the read operation. 
> And of course also compare the sum to some expected value.

Yeah - this code is very old, it predates Q_BENCHMARK :-)

"the differences are small" - not really, these are entire seconds, due to 
large amount of iterations.

But yep, let me convert all this to a proper benchmark. Coming up.


- David


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review59479
---


On June 1, 2014, 1:50 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated June 1, 2014, 1:50 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-06-06 Thread Milian Wolff


> On June 6, 2014, 10:34 p.m., David Faure wrote:
> > Interestingly, I had a benchmark to compare a number of data structures for 
> > UDSEntry (which made me turn the Qt3 QList into the Qt4 QHash). That 
> > benchmark is the commented-out "newApiPerformance()" method in jobtest.cpp.
> > 
> > I just extended it to include your suggestion: 
> > http://www.davidfaure.fr/2014/udsentry_benchmark.diff
> > 
> > Here are the results I get (in debug mode, because otherwise I'd have to 
> > make sure the compiler doesn't optimize away the useless iterations loops 
> > :))
> > 
> > Old API: slave code: 5
> > Old API: app code: 5
> > QHash+QVariant API: slave code: 9
> > QHash+QVariant API: app code: 4
> > QHash+struct API: slave code: 7
> > QHash+struct API: app code: 3
> > QMap+struct API: slave code: 8
> > QMap+struct API: app code: 4
> > Frank's API: slave code: 13
> > Frank API: app code: 6
> > 
> > So, unless I'm missing something, this is much slower for the slave (but we 
> > could decide to just not use UDSEntry in the slave, as discussed on k-f-d), 
> > but it's also slower on the app side
> > 
> > Can you check if I did something stupid when grabbing your code or writing 
> > the testcode for it?
> > Otherwise the next step is to actually run this with -O2, ensuring the 
> > loops still loop.

I'd say a benchmark should use QBENCHMARK. Also, I'm afraid the results you 
posted are meaningless when they come from a non-optimized, esp. considering 
that the differences are so small. I suggest someone picks this test up and 
makes it a proper benchmark. To prevent the optimizer from kicking in, actually 
use the results in one way or another. E.g. calculate the sum of string lengths 
returned and number values for the read operation. And of course also compare 
the sum to some expected value. 


- Milian


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review59479
---


On June 1, 2014, 1:50 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated June 1, 2014, 1:50 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested on the 
> frameworks mailing list by Aaron (good idea BTW!). 
> 
> 
> Diffs
> -
> 
>   src/core/udsentry.cpp c6ac21a 
> 
> Diff: https://git.reviewboard.kde.org/r/118452/diff/
> 
> 
> Testing
> ---
> 
> Unit tests still pass.
> 
> The memory usage of listjobtest 

Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-06-06 Thread David Faure

---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review59479
---


Interestingly, I had a benchmark to compare a number of data structures for 
UDSEntry (which made me turn the Qt3 QList into the Qt4 QHash). That benchmark 
is the commented-out "newApiPerformance()" method in jobtest.cpp.

I just extended it to include your suggestion: 
http://www.davidfaure.fr/2014/udsentry_benchmark.diff

Here are the results I get (in debug mode, because otherwise I'd have to make 
sure the compiler doesn't optimize away the useless iterations loops :))

Old API: slave code: 5
Old API: app code: 5
QHash+QVariant API: slave code: 9
QHash+QVariant API: app code: 4
QHash+struct API: slave code: 7
QHash+struct API: app code: 3
QMap+struct API: slave code: 8
QMap+struct API: app code: 4
Frank's API: slave code: 13
Frank API: app code: 6

So, unless I'm missing something, this is much slower for the slave (but we 
could decide to just not use UDSEntry in the slave, as discussed on k-f-d), but 
it's also slower on the app side

Can you check if I did something stupid when grabbing your code or writing the 
testcode for it?
Otherwise the next step is to actually run this with -O2, ensuring the loops 
still loop.

- David Faure


On June 1, 2014, 1:50 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated June 1, 2014, 1:50 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested on the 
> frameworks mailing list by Aaron (good idea BTW!). 
> 
> 
> Diffs
> -
> 
>   src/core/udsentry.cpp c6ac21a 
> 
> Diff: https://git.reviewboard.kde.org/r/118452/diff/
> 
> 
> Testing
> ---
> 
> Unit tests still pass.
> 
> The memory usage of listjobtest with a directory with 100,000 files is 
> reduced from 71344 K to 35392 K according to KSysGuard. I see similar savings 
> when opening the directory in Dolphin.
> 
> I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
> cannot present any benchmark results.
> 
> 
> Thanks,
> 
> Frank Reininghaus
> 
>

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel


Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-06-02 Thread Milian Wolff


> On June 1, 2014, 2:41 p.m., Mark Gaiser wrote:
> > A very big +1 from me!
> > 
> > Your change is going to - heavily - conflict with my local UDSEntry changes 
> > which has a different approach. In that change i'm basically just storing 
> > all the default fields as direct class members of UDSEntryPrivate, the hash 
> > is still there but only for custom fields. That removes the need to do any 
> > lookups in QHash (as it was) or QVector (as it is with this patch). Lookups 
> > would then only be required for custom fields.
> > 
> > I'm not sure if my changes still make sense after this change lands. I will 
> > just have to do some benchmarks and see if the added complexity is worth it.

Both of you should do before/after benchmarks and run them through Valgrind's 
massif (for memory consumption) and through perf stat or similar (for runtime 
performance).


- Milian


---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review58901
---


On June 1, 2014, 1:50 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated June 1, 2014, 1:50 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested on the 
> frameworks mailing list by Aaron (good idea BTW!). 
> 
> 
> Diffs
> -
> 
>   src/core/udsentry.cpp c6ac21a 
> 
> Diff: https://git.reviewboard.kde.org/r/118452/diff/
> 
> 
> Testing
> ---
> 
> Unit tests still pass.
> 
> The memory usage of listjobtest with a directory with 100,000 files is 
> reduced from 71344 K to 35392 K according to KSysGuard. I see similar savings 
> when opening the directory in Dolphin.
> 
> I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
> cannot present any benchmark results.
> 
> 
> Thanks,
> 
> Frank Reininghaus
> 
>

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel


Re: Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-06-01 Thread Mark Gaiser

---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/#review58901
---


A very big +1 from me!

Your change is going to - heavily - conflict with my local UDSEntry changes 
which has a different approach. In that change i'm basically just storing all 
the default fields as direct class members of UDSEntryPrivate, the hash is 
still there but only for custom fields. That removes the need to do any lookups 
in QHash (as it was) or QVector (as it is with this patch). Lookups would then 
only be required for custom fields.

I'm not sure if my changes still make sense after this change lands. I will 
just have to do some benchmarks and see if the added complexity is worth it.

- Mark Gaiser


On June 1, 2014, 1:50 p.m., Frank Reininghaus wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://git.reviewboard.kde.org/r/118452/
> ---
> 
> (Updated June 1, 2014, 1:50 p.m.)
> 
> 
> Review request for KDE Frameworks and David Faure.
> 
> 
> Repository: kio
> 
> 
> Description
> ---
> 
> I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
> attempts to make UDSEntry more efficient memory and CPU-wise, into 
> independent parts. This is the third step after 
> https://git.reviewboard.kde.org/r/113591/ and 
> https://git.reviewboard.kde.org/r/115739/ .
> 
> The present patch modifies the internal data storage of UDSEntry. UDSEntry 
> contains a mapping from unsigned int keys to "Field" values, where Field is a 
> struct that contains a QString and a long long (some keys correspond to 
> numeric values, like size, date, etc, whereas others, like user and group, 
> correspond to a QString).
> 
> Currently, UDSEntry stores all data in a QHash internally. This 
> ensures that everything can be accessed in O(1) time, but is not very 
> efficient memory-wise because a separate memory allocation is done for each 
> hash node.
> 
> I propose to change this and store both the uint keys and the Field values in 
> a QVector each. This means that accessing a value becomes a O(n) operation, 
> since the entire QVector of keys may need to be scanned in order to find a 
> value, but because the number n of values in a UDSEntry is usually quite 
> small and can currently not exceed a number ~100, this should not matter in 
> practice.
> 
> Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:
> 
> (a) The QVectors which store the keys (which are usually the same for all 
> items in a directory) are not shared yet. Changing this would reduce the 
> memory usage further, but I decided to keep this change separate in order to 
> keep the current patch small and easy to understand. Moreover, this makes it 
> easier to benchmark other similar approaches (e.g., replace QVector by 
> std::vector, or store keys and values together in a 
> std::vector>).
> 
> (b) No space is reserved in the vectors when key/value pairs are inserted one 
> by one. Implementing this would make UDSEntry faster on the slave side (since 
> repeated re-allocations would not be necessary any more), but this can be 
> done in a later patch. Moreover, it might not be needed any more if UDSEntry 
> is not used directly any more on the slave side, as suggested on the 
> frameworks mailing list by Aaron (good idea BTW!). 
> 
> 
> Diffs
> -
> 
>   src/core/udsentry.cpp c6ac21a 
> 
> Diff: https://git.reviewboard.kde.org/r/118452/diff/
> 
> 
> Testing
> ---
> 
> Unit tests still pass.
> 
> The memory usage of listjobtest with a directory with 100,000 files is 
> reduced from 71344 K to 35392 K according to KSysGuard. I see similar savings 
> when opening the directory in Dolphin.
> 
> I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
> cannot present any benchmark results.
> 
> 
> Thanks,
> 
> Frank Reininghaus
> 
>

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel


Review Request 118452: Reduce the memory usage of UDSEntry by using QVector, rather than QHash, for the internal data storage

2014-06-01 Thread Frank Reininghaus

---
This is an automatically generated e-mail. To reply, visit:
https://git.reviewboard.kde.org/r/118452/
---

Review request for KDE Frameworks and David Faure.


Repository: kio


Description
---

I am continuing to split up https://git.reviewboard.kde.org/r/113355/ , which 
attempts to make UDSEntry more efficient memory and CPU-wise, into independent 
parts. This is the third step after 
https://git.reviewboard.kde.org/r/113591/ and 
https://git.reviewboard.kde.org/r/115739/ .

The present patch modifies the internal data storage of UDSEntry. UDSEntry 
contains a mapping from unsigned int keys to "Field" values, where Field is a 
struct that contains a QString and a long long (some keys correspond to numeric 
values, like size, date, etc, whereas others, like user and group, correspond 
to a QString).

Currently, UDSEntry stores all data in a QHash internally. This 
ensures that everything can be accessed in O(1) time, but is not very efficient 
memory-wise because a separate memory allocation is done for each hash node.

I propose to change this and store both the uint keys and the Field values in a 
QVector each. This means that accessing a value becomes a O(n) operation, since 
the entire QVector of keys may need to be scanned in order to find a value, but 
because the number n of values in a UDSEntry is usually quite small and can 
currently not exceed a number ~100, this should not matter in practice.

Some parts of https://git.reviewboard.kde.org/r/113355/ are still missing:

(a) The QVectors which store the keys (which are usually the same for all items 
in a directory) are not shared yet. Changing this would reduce the memory usage 
further, but I decided to keep this change separate in order to keep the 
current patch small and easy to understand. Moreover, this makes it easier to 
benchmark other similar approaches (e.g., replace QVector by std::vector, or 
store keys and values together in a std::vector>).

(b) No space is reserved in the vectors when key/value pairs are inserted one 
by one. Implementing this would make UDSEntry faster on the slave side (since 
repeated re-allocations would not be necessary any more), but this can be done 
in a later patch. Moreover, it might not be needed any more if UDSEntry is not 
used directly any more on the slave side, as suggested on the frameworks 
mailing list by Aaron (good idea BTW!). 


Diffs
-

  src/core/udsentry.cpp c6ac21a 

Diff: https://git.reviewboard.kde.org/r/118452/diff/


Testing
---

Unit tests still pass.

The memory usage of listjobtest with a directory with 100,000 files is reduced 
from 71344 K to 35392 K according to KSysGuard. I see similar savings when 
opening the directory in Dolphin.

I still haven't set up a Qt5/KF5 build in release mode (shame on me!), so I 
cannot present any benchmark results.


Thanks,

Frank Reininghaus

___
Kde-frameworks-devel mailing list
Kde-frameworks-devel@kde.org
https://mail.kde.org/mailman/listinfo/kde-frameworks-devel