D11487: optimization of TextLineData::attribute

2018-04-03 Thread Milian Wolff
mwolff added a comment.


  Ah, I knew it :D I suspected missing compiler optimizations from the start... 
But tricky indeed for you to see!
  
  But the real question would now be: what time does it take *without* this 
patch for you? probably it won't be more than a minute anymore. Still, this 
patch is a good improvement and can/should be kept.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate, cullmann
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-04-03 Thread Jaime Torres Amate
jtamate added a comment.


  In D11487#232555 , @mwolff wrote:
  
  > WTF :D Your desktop CPU has clearly a better performance than my mobile 
CPU, no? Is AMD really so much worse here? How can that be - I don't get it :D
  >
  > Anyhow, I give up trying to understand this now - thanks a lot for your 
repeated input Jaime!
  
  
  I've found the problem for the poor performance. In .kdesrc-buildrc, I have 
also a like cxxflags # -fsanitize=address -fsanitize=signed-integer-overflow  
-fsanitize=bounds-strict -fsanitize=undefined -fsanitize-recover=address
  That line removes the -O2 from the compiler options. Added manually in the 
way of cxxflags -O2 -mtune=native # -fsanitize=address .
  I "only" get 10 seconds less, 22 seconds with the patch applied.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate, cullmann
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-23 Thread Milian Wolff
mwolff added a comment.


  WTF :D Your desktop CPU has clearly a better performance than my mobile CPU, 
no? Is AMD really so much worse here? How can that be - I don't get it :D
  
  Anyhow, I give up trying to understand this now - thanks a lot for your 
repeated input Jaime!

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate, cullmann
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-23 Thread Jaime Torres Amate
jtamate added a comment.


  In D11487#232497 , @mwolff wrote:
  
  > @jtamate I just checked, the function is called with the same parameters 
for me locally. What output do you get for this:
  >
  > For me this is the interesting bit:
  >
  >   15.097 debug: unknown[unknown:0]: [ (0, 0)  ->  (0, 358720) ] "" true 
false
  >   15.097 debug: unknown[unknown:0]: 0 1979 358720
  >   23.013 debug: unknown[unknown:0]: [ (5, 0)  ->  (5, 162682) ] "" true 
false
  >   23.013 debug: unknown[unknown:0]: 5 12 162682
  >   
  
  
  
  
  > So for line 0 I should actually get quite abysmal performance too, worst 
case 1979 * 358720, but it flies right through here... Now I wonder what kind 
of CPU you are using? I have a i7-5600U CPU @ 2.60GHz here in my laptop, which 
has:
  > 
  >   $ lstopo --of console
  >   Machine (11GB)
  > Package L#0 + L3 L#0 (4096KB)
  >   L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
  > PU L#0 (P#0)
  > PU L#1 (P#1)
  >   L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
  > PU L#2 (P#2)
  > PU L#3 (P#3)
  > 
  > 
  > The 1979 entries of attributes (each 12byte large) produce ~24KB of data, 
which fits nicely within my L1 cache. Maybe I'm just lucky and this is what 
hides this issue from me?
  
  Same debug output.
  
  AMD Phenom(tm) II X6 1100T Processor, 3.30 GHz
  Machine (16GB)
  
Package L#0 + L3 L#0 (6144KB)
  L2 L#0 (512KB) + L1d L#0 (64KB) + L1i L#0 (64KB) + Core L#0 + PU L#0 (P#0)
  L2 L#1 (512KB) + L1d L#1 (64KB) + L1i L#1 (64KB) + Core L#1 + PU L#1 (P#1)
  L2 L#2 (512KB) + L1d L#2 (64KB) + L1i L#2 (64KB) + Core L#2 + PU L#2 (P#2)
  L2 L#3 (512KB) + L1d L#3 (64KB) + L1i L#3 (64KB) + Core L#3 + PU L#3 (P#3)
  L2 L#4 (512KB) + L1d L#4 (64KB) + L1i L#4 (64KB) + Core L#4 + PU L#4 (P#4)
  L2 L#5 (512KB) + L1d L#5 (64KB) + L1i L#5 (64KB) + Core L#5 + PU L#5 (P#5)

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate, cullmann
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-23 Thread Milian Wolff
mwolff added a comment.


  @jtamate I just checked, the function is called with the same parameters for 
me locally. What output do you get for this:
  
diff --git a/src/spellcheck/spellcheck.cpp b/src/spellcheck/spellcheck.cpp
index 9e04d788..50e92885 100644
--- a/src/spellcheck/spellcheck.cpp
+++ b/src/spellcheck/spellcheck.cpp
@@ -133,6 +133,7 @@ QList > 
KateSpellCheckManager::spellCheckWrtH
 bool singleLine,
 bool returnSingleRange)
 {
+qDebug() << range << dictionary << singleLine << returnSingleRange;
 QList > toReturn;
 if (range.isEmpty()) {
 return toReturn;
@@ -172,6 +173,7 @@ QList > 
KateSpellCheckManager::spellCheckWrtH
 }
 const int start = (line == startLine) ? startColumn : 0;
 const int end = (line == endLine) ? endColumn : 
kateTextLine->length();
+qDebug() << line << kateTextLine->attributesList().size() << 
(end - start);
 for (int i = start; i < end;) { // WARNING: 'i' has to be 
incremented manually!
 int attr = kateTextLine->attribute(i);
 const KatePrefixStore  = 
highlighting->getCharacterEncodingsPrefixStore(attr);
  
  For me this is the interesting bit:
  
15.097 debug: unknown[unknown:0]: [ (0, 0)  ->  (0, 358720) ] "" true false
15.097 debug: unknown[unknown:0]: 0 1979 358720
23.013 debug: unknown[unknown:0]: [ (5, 0)  ->  (5, 162682) ] "" true false
23.013 debug: unknown[unknown:0]: 5 12 162682
  
  So for line 0 I should actually get quite abysmal performance too, worst case 
1979 * 358720, but it flies right through here... Now I wonder what kind of CPU 
you are using? I have a i7-5600U CPU @ 2.60GHz here in my laptop, which has:
  
$ lstopo --of console
Machine (11GB)
  Package L#0 + L3 L#0 (4096KB)
L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
  PU L#0 (P#0)
  PU L#1 (P#1)
L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
  PU L#2 (P#2)
  PU L#3 (P#3)
  
  The 1979 entries of attributes (each 12byte large) produce ~24KB of data, 
which fits nicely within my L1 cache. Maybe I'm just lucky and this is what 
hides this issue from me?

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate, cullmann
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-23 Thread Jaime Torres Amate
This revision was automatically updated to reflect the committed changes.
jtamate marked 2 inline comments as done.
Closed by commit R39:787318967fce: optimization of TextLineData::attribute 
(authored by jtamate).

REPOSITORY
  R39 KTextEditor

CHANGES SINCE LAST UPDATE
  https://phabricator.kde.org/D11487?vs=30331=30334

REVISION DETAIL
  https://phabricator.kde.org/D11487

AFFECTED FILES
  src/buffer/katetextline.cpp
  src/buffer/katetextline.h

To: jtamate, #frameworks, #kate, cullmann
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-23 Thread Christoph Cullmann
cullmann accepted this revision.
cullmann added a comment.
This revision is now accepted and ready to land.


  Yep, makes sense.
  Thanks for the improvement.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate, cullmann
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-23 Thread Milian Wolff
mwolff added a comment.


  +1 from my side, @cullmann, @dhaumann ?
  
  I'll continue to figure out why I don't see the performance issue here

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-23 Thread Jaime Torres Amate
jtamate updated this revision to Diff 30331.
jtamate edited the summary of this revision.
jtamate added a comment.


  Using upper_bound with <
  Changed the name of the iterator
  Using -> instead of (*x)
  
  These are the calls I get since I open the document:
  doc= KTextEditor::DocumentPrivate(0x228fe10)  range= [ (0, 0)  ->  (0, 0) ]  
dictionary= "" singleLine= true returnSingleRange= false
  doc= KTextEditor::DocumentPrivate(0x228fe10)  range= [ (1, 0)  ->  (1, 4096) 
]  dictionary= "" singleLine= true returnSingleRange= false
  doc= KTextEditor::DocumentPrivate(0x228fe10)  range= [ (0, 0)  ->  (0, 4090) 
]  dictionary= "" singleLine= true returnSingleRange= false
  sonnet.core: Missing trigrams for languages: QSet("en_CA", "en_GB", "en_AU")
  doc= KTextEditor::DocumentPrivate(0x228fe10)  range= [ (0, 0)  ->  (0, 
358720) ]  dictionary= "" singleLine= true returnSingleRange= false
  doc= KTextEditor::DocumentPrivate(0x228fe10)  range= [ (5, 0)  ->  (5, 
162682) ]  dictionary= "" singleLine= true returnSingleRange= false

REPOSITORY
  R39 KTextEditor

CHANGES SINCE LAST UPDATE
  https://phabricator.kde.org/D11487?vs=30134=30331

REVISION DETAIL
  https://phabricator.kde.org/D11487

AFFECTED FILES
  src/buffer/katetextline.cpp
  src/buffer/katetextline.h

To: jtamate, #frameworks, #kate
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-23 Thread Milian Wolff
mwolff added a comment.


  In D11487#232258 , @jtamate wrote:
  
  > > One question to that though: Why do you sort/lookup by `x.offset + 
x.length <= p`? Note how lower_bound returns the first iterator that is _not_ 
going to return true.
  >
  > Assuming there are neither overlaps nor unsorted entries.
  >  Lets call X the iterator returned by lower_bound, suppose X is not cend(), 
so X.offset + X.length > p.
  
  
  Ah right, the offset + length is larger, not the offset itself - that's the 
error in my reasoning - thanks for clearing that up!
  Still, you'd get the same with an upper_bound and `x.offset + x.length < p` 
search, no? I ask this, because to me it is somewhat odd to use a `<=` 
comparison for a lower_bound.
  
  > If other iterator Y could satisfy Y.offset + Y.length > p and Y.offset <= 
X.offset, it means there are overlaps, contradiction.
  >  Therefore,  the rest of the iterators has the following properties:
  >  Y.offset + Y.length > p and Y.offset > X.offset
  > or
  >  Y.offset + Y.length <= p and Y.offset < X.offset
  > 
  >> To me, it looks like your code cannot actually work and will always return 
0?
  > 
  > Nope. Otherwise the tests fail, more precisely kateindenttest.
  > 
  >> Personally, I'd try to use `upper_bound` with `x.offset < p` in the 
comparison. The iterator should then point to the first item that has it's 
offset larger than `p`. So decrementing the iterator once (while checking 
against `begin()`) yields the iterator that could potentially match. Thus, 
check if `p` is contained in its range and if so return it's attribute, 
otherwise return 0.
  > 
  > I've tried. The code gets uglier.
  
  See above, `upper_bound` with `offset + length < p` should return the same 
iterator as your search now, but be more natural (imo) to what one would see 
elsewhere.
  
  >> Besides this: I am still looking for an explanation why spell checking is 
so extremely slow for you. I have the same settings enabled, and spell checking 
is seemingly fast for me... Am I missing some dictionary or something other to 
reproduce this?
  > 
  > Perhaps some Ignored words? I have Amarok, KAddressBook, KDevelop, KHTML, 
KIO, 
  
  Can you maybe check how often, and for what arguments 
`spellCheckWrtHighlightingRanges` is called for you? I'll do the same, so maybe 
we can figure out what is going on then. Also, I'll try with a clean 
configuration and see if anything is different there.
  
  >> Also, what is "@mwolf solution"
  > 
  > we can return the iterator and take it as an argument again. I tried 
locally the python style, returning a QPair.
  > 
  >> Can you try that locally and see how it goes for you?
  > 
  > I only get one second less, 33 seconds. The problem is that attribute() is 
called from more places. Is it worth to have two implementations?
  
  OK - thanks for measuring. I was hoping for a bigger difference. I aggre that 
we can ignore the 3% improvement and stay with your binary search then as it's 
generic and will work without requiring code changes elsewhere.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-23 Thread Jaime Torres Amate
jtamate added a comment.


  > One question to that though: Why do you sort/lookup by `x.offset + x.length 
<= p`? Note how lower_bound returns the first iterator that is _not_ going to 
return true.
  
  Assuming there are neither overlaps nor unsorted entries.
  Lets call X the iterator returned by lower_bound, suppose X is not cend(), so 
X.offset + X.length > p. 
  If other iterator Y could satisfy Y.offset + Y.length > p and Y.offset <= 
X.offset, it means there are overlaps, contradiction.
  Therefore,  the rest of the iterators has the following properties:
  Y.offset + Y.length > p and Y.offset > X.offset
  or
  Y.offset + Y.length <= p and Y.offset < X.offset
  
  > To me, it looks like your code cannot actually work and will always return 
0?
  
  Nope. Otherwise the tests fail, more precisely kateindenttest.
  
  > Personally, I'd try to use `upper_bound` with `x.offset < p` in the 
comparison. The iterator should then point to the first item that has it's 
offset larger than `p`. So decrementing the iterator once (while checking 
against `begin()`) yields the iterator that could potentially match. Thus, 
check if `p` is contained in its range and if so return it's attribute, 
otherwise return 0.
  
  I've tried. The code gets uglier.
  
  > Besides this: I am still looking for an explanation why spell checking is 
so extremely slow for you. I have the same settings enabled, and spell checking 
is seemingly fast for me... Am I missing some dictionary or something other to 
reproduce this?
  
  Perhaps some Ignored words? I have Amarok, KAddressBook, KDevelop, KHTML, 
KIO, 
  
  > Also, what is "@mwolf solution"
  
  we can return the iterator and take it as an argument again. I tried locally 
the python style, returning a QPair.
  
  > Can you try that locally and see how it goes for you?
  
  I only get one second less, 33 seconds. The problem is that attribute() is 
called from more places. Is it worth to have two implementations?

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-23 Thread Milian Wolff
mwolff added a comment.


  Also, what is "@mwolf solution" - I didn't provide any code, until now:
  
diff --git a/src/spellcheck/spellcheck.cpp b/src/spellcheck/spellcheck.cpp
index 9e04d788..3a97e5c5 100644
--- a/src/spellcheck/spellcheck.cpp
+++ b/src/spellcheck/spellcheck.cpp
@@ -170,10 +170,32 @@ QList > 
KateSpellCheckManager::spellCheckWrtH
 if (!kateTextLine) {
 continue;// bug #303496
 }
+const auto attributes = kateTextLine->attributesList();
+auto attributes_it = attributes.begin();
+const auto attributes_end = attributes.end();
+auto find_attr = [_it, attributes_end](int pos) -> 
int {
+// bail out early if possible
+if (attributes_it == attributes_end || 
attributes_it->offset > pos) {
+return 0;
+}
+// advance until we encompass the position
+while (pos >= (attributes_it->offset + 
attributes_it->length)) {
+++attributes_it;
+if (attributes_it == attributes_end) {
+return 0;
+}
+}
+// now check if we have a match
+if (attributes_it->offset <= pos) {
+return attributes_it->attributeValue;
+} else {
+return 0;
+}
+};
 const int start = (line == startLine) ? startColumn : 0;
 const int end = (line == endLine) ? endColumn : 
kateTextLine->length();
 for (int i = start; i < end;) { // WARNING: 'i' has to be 
incremented manually!
-int attr = kateTextLine->attribute(i);
+int attr = find_attr(i);
 const KatePrefixStore  = 
highlighting->getCharacterEncodingsPrefixStore(attr);
 QString prefixFound = prefixStore.findPrefix(kateTextLine, 
i);
 if 
(!document->highlight()->attributeRequiresSpellchecking(static_cast(attr))
  
  Can you try that locally and see how it goes for you? Without a way to 
reproduce, I can't say how it performance as it's fast before and after for me 
:) With some `QElapsedTimer` directly in this method, I see that the above is 
faster than the previous method, but it drowns in the overall measurements. 
Anyhow, this should in theory be better than any binary search as it will only 
go through the list at most once per line, whereas the binary search has to 
find the start position repeatedly.
  
  But as I said, both approaches could be combined if we want to. And of course 
if we decide to follow my suggestion, then I'd need to clean it up to put the 
code directly into `TextLineData`.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-23 Thread Milian Wolff
mwolff added a comment.


  In D11487#231656 , @jtamate wrote:
  
  > In D11487#231522 , @mwolff wrote:
  >
  > > @jtamate looking at your screenshots, it represents closely what I see 
locally. Most notably, there are no red underlines in your screenshots which 
could arise due to spell checking. Thus I really wonder why you are seeing such 
a big hotspot there.
  > >
  > > Try perf, or try a poor mans profiler like GDB and regularly interrupt. 
Do you really end up in `TextLineData::attribute()`? Or, alternatively: Measure 
the time it takes for kate/kwrite to open the file and then go to the end. Then 
compare this before and after your patch. Do you see anything in the order of 
~75% reduction for the time then too? Note how callgrind only measure 
instructions, so a supposed reduction of 75% of instructions should certainly 
have an impact on time too - of course not 75% too... I simply cannot fathom 
why you are seeing such an impact but I cannot reproduce this at all!
  >
  >
  > I've done some measurements, as the times are so big, with a stopwatch 2 
times each test.
  >  With "Enable autodetection of Language" and "Automatic spell checking 
enabled by default" enabled,
  >  the test as before: since pressing "Temporarily raise limit and reload 
file", press Ctrl+end and finish the scroll to the end of the document.
  >
  > without any version of the patch:
  >  1 min 25 seconds 
  >  With @mwolf solution, used only in spellCheckWrtHighlightingRanges.
  >  38 seconds
  >  With the binary search:
  >  34 seconds
  
  
  OK, this is a huge win, so I guess one way or another we should integrate 
this patch. One question to that though: Why do you sort/lookup by `x.offset + 
x.length <= p`? Note how lower_bound returns the first iterator that is _not_ 
going to return true. To me, it looks like your code cannot actually work and 
will always return 0? Personally, I'd try to use `upper_bound` with `x.offset < 
p` in the comparison. The iterator should then point to the first item that has 
it's offset larger than `p`. So decrementing the iterator once (while checking 
against `begin()`) yields the iterator that could potentially match. Thus, 
check if `p` is contained in its range and if so return it's attribute, 
otherwise return 0.
  
  Besides this: I am still looking for an explanation why spell checking is so 
extremely slow for you. I have the same settings enabled, and spell checking is 
seemingly fast for me... Am I missing some dictionary or something other to 
reproduce this?

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-22 Thread Jaime Torres Amate
jtamate added a comment.


  In D11487#231522 , @mwolff wrote:
  
  > @jtamate looking at your screenshots, it represents closely what I see 
locally. Most notably, there are no red underlines in your screenshots which 
could arise due to spell checking. Thus I really wonder why you are seeing such 
a big hotspot there.
  >
  > Try perf, or try a poor mans profiler like GDB and regularly interrupt. Do 
you really end up in `TextLineData::attribute()`? Or, alternatively: Measure 
the time it takes for kate/kwrite to open the file and then go to the end. Then 
compare this before and after your patch. Do you see anything in the order of 
~75% reduction for the time then too? Note how callgrind only measure 
instructions, so a supposed reduction of 75% of instructions should certainly 
have an impact on time too - of course not 75% too... I simply cannot fathom 
why you are seeing such an impact but I cannot reproduce this at all!
  
  
  I've done some measurements, as the times are so big, with a stopwatch 2 
times each test.
  With "Enable autodetection of Language" and "Automatic spell checking enabled 
by default" enabled,
  the test as before: since pressing "Temporarily raise limit and reload file", 
press Ctrl+end and finish the scroll to the end of the document.
  
  without any version of the patch:
  1 min 25 seconds 
  With @mwolf solution, used only in spellCheckWrtHighlightingRanges.
  38 seconds
  With the binary search:
  34 seconds

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-22 Thread Milian Wolff
mwolff added a comment.


  @jtamate looking at your screenshots, it represents closely what I see 
locally. Most notably, there are no red underlines in your screenshots which 
could arise due to spell checking. Thus I really wonder why you are seeing such 
a big hotspot there.
  
  Try perf, or try a poor mans profiler like GDB and regularly interrupt. Do 
you really end up in `TextLineData::attribute()`? Or, alternatively: Measure 
the time it takes for kate/kwrite to open the file and then go to the end. Then 
compare this before and after your patch. Do you see anything in the order of 
~75% reduction for the time then too? Note how callgrind only measure 
instructions, so a supposed reduction of 75% of instructions should certainly 
have an impact on time too - of course not 75% too... I simply cannot fathom 
why you are seeing such an impact but I cannot reproduce this at all!

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-22 Thread Milian Wolff
mwolff added inline comments.

INLINE COMMENTS

> anthonyfieroni wrote in katetextline.cpp:214-216
> Use operator->, it's faster than operator* and operator.
> 
>   first->offset

sorry, but that's simply not true at all.

Stylistic wise I agree, but an optimizing compiler will generate the same code 
for both.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-22 Thread Jaime Torres Amate
jtamate added a comment.


  In D11487#231110 , @mwolff wrote:
  
  > Considering spell checking is involved - can you show a screenshot for how 
the file looks like for you? There shouldn't be a lot of spell checking going 
on, or so I hope...
  
  
  F5762231: fake_first.png  opening the 
file and accepting to extend the line limit.
  F5762233: fake_last.png  after pressing 
ctrl+end

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Anthony Fieroni
anthonyfieroni added inline comments.

INLINE COMMENTS

> katetextline.cpp:214-216
> +if (first != m_attributesList.cend() && (*first).offset <= pos && pos < 
> ((*first).offset + (*first).length))
> +{
> +return (*first).attributeValue;

Use operator->, it's faster than operator* and operator.

  first->offset

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: anthonyfieroni, dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, 
demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Christoph Cullmann
cullmann added a comment.


  For the sorting/overlapping question: actually the HL code should ensure 
there are neither overlaps nor unsorted entries.
  And I think if there are, already now "strange" things happen, as normally 
the first item would win.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Milian Wolff
mwolff added a comment.


  Considering spell checking is involved - can you show a screenshot for how 
the file looks like for you? There shouldn't be a lot of spell checking going 
on, or so I hope...

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Milian Wolff
mwolff added a comment.


  When I open your fake file in kwrite, then raise the line length limit and 
wait for the file to be rendered (which takes quite some time!), perf does not 
show any performance issues with attribute for me. It only corresponds to ~1.3% 
of the CPU cycles. `QTextLine::layout_helper` on the other hand consumes ~86.2% 
of the CPU cycles!
  
  So now again I wonder: why are you seeing such drastically different results 
- these orders of magnitude I cannot easily explain by differences in valgrind 
vs. perf.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Milian Wolff
mwolff added a comment.


  In D11487#230913 , @jtamate wrote:
  
  > In D11487#230895 , @mwolff wrote:
  >
  > > Looking at `TextLineData::addAttribute`, it doesn't seem to sort the data 
- how can you be sure that the attributes list really is sorted by `offset + 
length`? I think this should be done manually there, or at least asserted. The 
old code apparently also relied on the order of attributes after all.
  >
  >
  > For what I've seen in my traces, the attributes are sorted.
  >  How do you assert on the order of a list? I do not know how to do that.
  
  
  You assert that newly added items actually sort-after the last added item in 
the vector
  
  >> @cullmann @dhaumann: can we have overlapping attributes? If so, then this 
opens another can of worms when we start to use binary searches over *ranges*.
  >> 
  >> Note that we could think about an alternative (could also be combined with 
your patch) approach to speeding up this function: 
  >> Pass around the previously found position. 
  >>  The way I guess this hotspot arises is that some function calls attribute 
on every pos in the line, linearly. We can leverage this and continue from 
where we left off the last time we returned.
  > 
  > The function is KateSpellCheckManager::spellCheckWrtHighlightingRanges in 
src/spellcheck/spellcheck.cpp(176)
  >  I tried this approach. Didn't work for me. :-(
  >  You have to store that information in a pointer, because the method is 
const.
  
  we can return the iterator and take it as an argument again
  
  >> Really, I'd like to get my hands on one of your large XML files to play 
around with this myself :)
  > 
  > I'll try to generate one :with no private data ;-)
  
  thanks :)

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Jaime Torres Amate
jtamate added a comment.


  I'm sorry, the previous file lines are 10 times longer than expected.
  This one {F5761669 }can be managed by 
kate, not only by okteta.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Jaime Torres Amate
jtamate added a comment.


  Uploaded a similar file F5761614: fake.xml.gz 

  It is created using http://interglacial.com/~sburke/pub/rtf2xml.html from a 
rtf with two images and some text.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Jaime Torres Amate
jtamate added a comment.


  In D11487#230895 , @mwolff wrote:
  
  > Looking at `TextLineData::addAttribute`, it doesn't seem to sort the data - 
how can you be sure that the attributes list really is sorted by `offset + 
length`? I think this should be done manually there, or at least asserted. The 
old code apparently also relied on the order of attributes after all.
  
  
  For what I've seen in my traces, the attributes are sorted.
  How do you assert on the order of a list? I do not know how to do that.
  
  > @cullmann @dhaumann: can we have overlapping attributes? If so, then this 
opens another can of worms when we start to use binary searches over *ranges*.
  > 
  > Note that we could think about an alternative (could also be combined with 
your patch) approach to speeding up this function: 
  > Pass around the previously found position. 
  >  The way I guess this hotspot arises is that some function calls attribute 
on every pos in the line, linearly. We can leverage this and continue from 
where we left off the last time we returned.
  
  The function is KateSpellCheckManager::spellCheckWrtHighlightingRanges in 
src/spellcheck/spellcheck.cpp(176)
  I tried this approach. Didn't work for me. :-(
  You have to store that information in a pointer, because the method is const.
  
  > Really, I'd like to get my hands on one of your large XML files to play 
around with this myself :)
  
  I'll try to generate one :with no private data ;-)

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Milian Wolff
mwolff added a subscriber: dhaumann.
mwolff added a comment.


  Looking at `TextLineData::addAttribute`, it doesn't seem to sort the data - 
how can you be sure that the attributes list really is sorted by `offset + 
length`? I think this should be done manually there, or at least asserted. The 
old code apparently also relied on the order of attributes after all.
  
  @cullmann @dhaumann: can we have overlapping attributes? If so, then this 
opens another can of worms when we start to use binary searches over *ranges*.
  
  Note that we could think about an alternative (could also be combined with 
your patch) approach to speeding up this function: Pass around the previously 
found position. The way I guess this hotspot arises is that some function calls 
attribute on every pos in the line, linearly. We can leverage this and continue 
from where we left off the last time we returned.
  
  Really, I'd like to get my hands on one of your large XML files to play 
around with this myself :)

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: dhaumann, mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Jaime Torres Amate
jtamate updated this revision to Diff 30134.
jtamate added a comment.


  > Huh, just specify -DCMAKE_INSTALL_PREFIX to the same path where you install 
the KF5 devel packages to, as a simple fix.
  
  Thanks, that worked.
  
  > Are you really compiling in release mode while measuring this?
  
  Yes. cmake-options -DCMAKE_BUILD_TYPE=RelWithDebInfo
  
  I misunderstood the role of the comparison function, Now I have it right.

REPOSITORY
  R39 KTextEditor

CHANGES SINCE LAST UPDATE
  https://phabricator.kde.org/D11487?vs=30117=30134

REVISION DETAIL
  https://phabricator.kde.org/D11487

AFFECTED FILES
  src/buffer/katetextline.cpp
  src/buffer/katetextline.h

To: jtamate, #frameworks, #kate
Cc: mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars, dhaumann


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Milian Wolff
mwolff added a comment.


  In D11487#230791 , @jtamate wrote:
  
  > In D11487#230755 , @mwolff wrote:
  >
  > > yes, definitely don't roll your own lower_bound - use the STL provided 
one. Are you really compiling in release mode while measuring this? Also, I can 
only repeat myself in saying that you shouldn't use callgrind for performance 
measurements anymore, perf/hotspot should give you a much better view of where 
the CPU cycles are spent.
  >
  >
  > One of the thinks I like about using callgrind, well, kcachegrind, is that 
I have the tree of calls to one method, that I don't have in perf report.
  >  As I said, I've been unable to compile hotspot yet, because I don't have 
the KF5 devel libraries installed from the distro, and I do not know how to 
specify an alternative directory in hotspsot cmake.
  
  
  Huh, just specify `-DCMAKE_INSTALL_PREFIX` to the same path where you install 
the KF5 devel packages to, as a simple fix. Or try to set `CMAKE_LIBRARY_PATH`, 
CMAKE_INCLUDE_PATH` and `CMAKE_PREFIX_PATH` env vars. Or well, just use the 
appimage?
  
  > I'll try again with std::lower_bound, I'll take a look at folding, but I 
was unable to make it work last time.
  >  There is a big difference between lower_bound and this implementation: 
lower_bound does a < checking, while this does a <=.
  
  Do a lower_bound and then increment from there, or maybe try to use 
upper_bound? I'm not acquainted with the exact semantics you are looking for, 
but in general: Don't reinvent the wheel, but rather try to aggregate existing 
building blocks.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars, dhaumann


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Jaime Torres Amate
jtamate added a comment.


  In D11487#230755 , @mwolff wrote:
  
  > yes, definitely don't roll your own lower_bound - use the STL provided one. 
Are you really compiling in release mode while measuring this? Also, I can only 
repeat myself in saying that you shouldn't use callgrind for performance 
measurements anymore, perf/hotspot should give you a much better view of where 
the CPU cycles are spent.
  
  
  One of the thinks I like about using callgrind, well, kcachegrind, is that I 
have the tree of calls to one method, that I don't have in perf report.
  As I said, I've been unable to compile hotspot yet, because I don't have the 
KF5 devel libraries installed from the distro, and I do not know how to specify 
an alternative directory in hotspsot cmake.
  
  I'll try again with std::lower_bound, I'll take a look at folding, but I was 
unable to make it work last time.
  There is a big difference between lower_bound and this implementation: 
lower_bound does a < checking, while this does a <=.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars, dhaumann


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Milian Wolff
mwolff added a comment.


  yes, definitely don't roll your own lower_bound - use the STL provided one. 
Are you really compiling in release mode while measuring this? Also, I can only 
repeat myself in saying that you shouldn't use callgrind for performance 
measurements anymore, perf/hotspot should give you a much better view of where 
the CPU cycles are spent.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars, dhaumann


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Christoph Cullmann
cullmann added a comment.


  Ok, that is true, for the "degenerated" very long line case it makes sense to 
do the binary search.
  But I don't think you should do that on your own, lower_bound is good enough, 
we use that for other things like the folding, too, if I am not mistaken.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars, dhaumann


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Jaime Torres Amate
jtamate updated this revision to Diff 30117.
jtamate edited the summary of this revision.
jtamate added a comment.


  The problem with attribute() is that with a long line and word wrapping 
enabled,
  when the long line is shown, it is called  as many times as the length of the 
line,
  in my test case, 566.039 times.
  
  The list is not always so small...In my test file, the longest list has 7257 
items.
  Therefore, previously it had to go trough, in average,  3628 items 566.039 
times.
  
  Kate and Okteta are the only tools that can handle such files and be 
responsive to user input.

REPOSITORY
  R39 KTextEditor

CHANGES SINCE LAST UPDATE
  https://phabricator.kde.org/D11487?vs=29975=30117

REVISION DETAIL
  https://phabricator.kde.org/D11487

AFFECTED FILES
  src/buffer/katetextline.cpp
  src/buffer/katetextline.h

To: jtamate, #frameworks, #kate
Cc: mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars, dhaumann


D11487: optimization of TextLineData::attribute

2018-03-21 Thread Christoph Cullmann
cullmann added a comment.


  I still am not that convinced that binary search helps for vectors which 
normally have only << 10-20 elements for normal long text lines.
  But I would be ok with some range based for variant, as the loop with index 
increment is even ugly without any speed in mind ;=)

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars, dhaumann


D11487: optimization of TextLineData::attribute

2018-03-20 Thread Jaime Torres Amate
jtamate added a comment.


  In D11487#229906 , @mwolff wrote:
  
  > can you publish the test file and how you measured it? Then I can test it 
with perf too, to confirm the impact.
  
  
  I'm sorry, I can't. It contains too much private information to be removed.
  But it was created using the python program rtf2xml.py

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars, dhaumann


D11487: optimization of TextLineData::attribute

2018-03-20 Thread Jaime Torres Amate
jtamate added a comment.


  In D11487#229889 , @cullmann wrote:
  
  > Hmm, does it really get that much faster or is the stuff just moved to the 
subroutines?
  
  
  The sum of the subroutines time is lower than the routine time as shown in 
kcachegrind.
  
  > I would not have thought that something is faster than simple linear scan 
for a short vector.
  
  I'm sorry. The third patch is totally wrong, I changed the >= comparison 
sides in my mental model. :-(

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars, dhaumann


D11487: optimization of TextLineData::attribute

2018-03-20 Thread Milian Wolff
mwolff added a comment.


  can you publish the test file and how you measured it? Then I can test it 
with perf too, to confirm the impact.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: mwolff, cullmann, michaelh, kevinapavew, ngraham, demsking, sars, dhaumann


D11487: optimization of TextLineData::attribute

2018-03-20 Thread Christoph Cullmann
cullmann added a comment.


  Hmm, does it really get that much faster or is the stuff just moved to the 
subroutines?
  I would not have thought that something is faster than simple linear scan for 
a short vector.

REPOSITORY
  R39 KTextEditor

REVISION DETAIL
  https://phabricator.kde.org/D11487

To: jtamate, #frameworks, #kate
Cc: cullmann, michaelh, kevinapavew, ngraham, demsking, sars, dhaumann


D11487: optimization of TextLineData::attribute

2018-03-20 Thread Jaime Torres Amate
jtamate updated this revision to Diff 29975.
jtamate retitled this revision from "simple optimization of 
TextLineData::attribute" to "optimization of TextLineData::attribute".
jtamate edited the summary of this revision.
jtamate edited the test plan for this revision.
jtamate added a comment.


  As m_attributesList is sorted, otherwise the original implementation will not 
work, 
  do a binary search looking for the lowest item that has offset >= pos.
  
  It gets even better, 0,32% of cpu in callgrind.

REPOSITORY
  R39 KTextEditor

CHANGES SINCE LAST UPDATE
  https://phabricator.kde.org/D11487?vs=29925=29975

REVISION DETAIL
  https://phabricator.kde.org/D11487

AFFECTED FILES
  src/buffer/katetextline.h

To: jtamate, #frameworks, #kate
Cc: cullmann, michaelh, kevinapavew, ngraham, demsking, sars, dhaumann